Split string with respect to REGEX in the text file

Test_File.txt (4.2 KB)

I have attached a text file (PFA) that contains tabular data.

The fields are separated by | (pipes), and I want to treat a record as complete only when it ends with a match for a particular regex.

FOR EXAMPLE:- :slight_smile:
Something …|0535E000|Interco Bureau Europe|000000|100700914268|30-APR-19|I/C-BUREAU ITALY|1050.33|XC-B ITALY frm
MCRC||||Receivab
les|O2K_TO_AMG|000

No matter how many lines a record spans, all I know is that its last characters are |000.

I have a For loop that iterates through the attached text file, and I was able to split it on newlines. The problem is that some fields wrap onto the next line, so I need an end-of-line regex to decide where each record really ends.

I don't know how to use regex. All I need is to find the pattern (|000) at the end of a line, and until it appears I need to keep iterating and appending the data to a single row.
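The loop described above can be sketched roughly as follows. This is only an illustration in Python, not UiPath code; the function name `join_wrapped_lines`, the sample data, and the choice to join wrapped pieces with no separator are all assumptions.

```python
import re

# Assumed terminator: a pipe followed by exactly three alphanumerics
# at the very end of the text collected so far.
END_OF_RECORD = re.compile(r"\|[A-Za-z0-9]{3}$")

def join_wrapped_lines(text):
    """Merge physical lines into records that end with the terminator."""
    records = []
    buffer = ""
    for line in text.splitlines():
        buffer += line  # wrapped pieces are joined with no separator (assumption)
        if END_OF_RECORD.search(buffer):
            records.append(buffer)
            buffer = ""
    if buffer:  # leftover text that never reached a terminator
        records.append(buffer)
    return records

# Made-up sample shaped like the example in the post.
sample = (
    "0535E000|Interco Bureau Europe|1050.33|XC-B ITALY frm\n"
    "MCRC||||Receivab\n"
    "les|O2K_TO_AMG|000\n"
    "0535E001|Other|CA0"
)
for record in join_wrapped_lines(sample):
    print(record)
```

Note that a wrapped line that happens to break right after a three-character field would be treated as a record end, so the rule is only as reliable as the data.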

Please help, it is urgent.

Thanks and Regards,
@hacky

Hi @hacky,

Your content is already delimited, so you can pass the data directly into the Generate Data Table activity. The result will be a DataTable, and you can iterate over it with your own logic.

Happy Automation :smiley:

@SamanGuruge,

Dude, I have tried all that, and it is not what I need.

I just need a regex solution that finds “|000” as the last characters of a line, where “000” can be any alphanumeric characters.

So you need to check whether a line ends with “|000”, right?

Yes, that's all. Just tell me how to use regex to make it more dynamic.

Use this link to create a regex that gets what I want, that's all: the last part ("|000").

But in real data, instead of 0 it can be any alphanumeric character.

How about this pattern? Here I'm looking for a pipe followed by digits, just before a newline:
(\|\d{1,})(?=\n)

@SamanGuruge

Thanks mate!

But can you please make it “pipe, three alphanumeric characters, and a newline”? That's all.

Alphanumeric means it can be “|CA0[next line]”, “|000[next line]”, etc.
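One way to write “pipe, three alphanumeric characters, and a newline” is sketched below in Python for illustration. The name `LAST_FIELD` and the sample text are made up. The pipe is escaped because a bare | means alternation in a regex, and the newline sits in a lookahead so it is checked but not included in the match.

```python
import re

# Escaped pipe, exactly three alphanumerics, then a lookahead for the newline.
LAST_FIELD = re.compile(r"\|[A-Za-z0-9]{3}(?=\n)")

text = "a|b|CA0\nc|d|000\n"
matches = [m.group() for m in LAST_FIELD.finditer(text)]
print(matches)  # ['|CA0', '|000']
```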

Use this one

@SamanGuruge

Please find the screenshot of the results that I am getting:

As you can see, it is also highlighting other characters that are not needed.

Also, as the last line has no (\n) character, the regex is not matching it.

Could you please fix these two cases?
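A possible way to cover the last line as well is to let the lookahead accept either a line break or the end of the input. The sketch below is Python and only illustrative; the sample text is invented, and `\r?` is included on the assumption that the file may use Windows line endings.

```python
import re

# The lookahead accepts "\n", "\r\n", or the end of the input, so the final
# line matches too. Only the pipe and three characters are part of the match.
LAST_FIELD = re.compile(r"\|[A-Za-z0-9]{3}(?=\r?\n|\Z)")

text = "a|b|CA0\r\nc|d|000\ne|f|T09"  # the last line has no newline
matches = [m.group() for m in LAST_FIELD.finditer(text)]
print(matches)  # ['|CA0', '|000', '|T09']
```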

Regards,
@hacky

@mukeshkala, can you please help?
@Arpit_Kesharwani
@Karthik_Kulkarni

@hacky Is this what you are looking for?

@Arpit_Kesharwani

Thanks, this is what I was looking for.

I changed your regex to (\|[a-zA-Z0-9]{3}$) and it's working fine.

Your approach gave me the idea.

@Arpit_Kesharwani, @supermanPunch, @Palaniyappan

Can you please take this text file, Test_File.txt (4.2 KB), and use the regex (\|[a-zA-Z0-9]{3}$) to split the input so that we get the lines as described earlier in this thread?

I am not getting the correct results when I split the string on this regex in UiPath. If you look at the provided link, you will see that the regex works fine there, but in the UiPath regex viewer the results are not correct.
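One possible explanation, which this thread does not confirm, is line endings. In Python, `$` with `re.MULTILINE` matches just before a `\n` but not before `\r\n`. I believe .NET's `RegexOptions.Multiline` behaves the same way, and without that option `$` only matches at the very end of the input. The sketch below uses made-up sample text to show the effect and a tolerant variant.

```python
import re

# Strict: "$" must come directly after the three characters.
PATTERN_STRICT = re.compile(r"\|[A-Za-z0-9]{3}$", re.MULTILINE)
# Tolerant: allow an optional carriage return before the line end.
PATTERN_CR_TOLERANT = re.compile(r"\|[A-Za-z0-9]{3}(?=\r?$)", re.MULTILINE)

windows_text = "a|b|CA0\r\nc|d|000\r\ne|f|T09"  # assumed CRLF line endings

strict = PATTERN_STRICT.findall(windows_text)
tolerant = PATTERN_CR_TOLERANT.findall(windows_text)
print(strict)    # ['|T09']
print(tolerant)  # ['|CA0', '|000', '|T09']
```

If the file was pasted into an online tester, the browser may have normalised the line endings to `\n`, which would explain why the same pattern looks fine there.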

Regards,
hacky

@hacky Do you want the other line (starting with ||), which is part of the previous line, to be appended back to that previous line?

@supermanPunch

Yes, that is the intent of splitting the string on the regex.

The regex is simply the one discussed in this thread.

Thanks in advance.

Regards,
@supermanPunch

@hacky Do you need the output as a DataTable? And why is the "Regards" mentioning me? :rofl:

Oops, sorry mate!!! :smile: :smile: :smile: :rofl: :rofl: :rofl: :joy:

I was in a hurry and my fingers slipped!

Anyway, you got it right, but that part needs more data massaging, which I already have ready.

For now I just need this one module, where the robot works out what the last field of each line is in the input file (a string variable). I need a regex that identifies the last field, splits the string at each match, and keeps appending the records to a new string variable (let's say strNewResult). This module is only concerned with these steps. So if the regex can determine the last field, we can keep appending the records to strNewResult and we will be good to go.

I hope you understood my intentions.

Once we have strNewResult, I have a separate loop that handles the rest of the processing. That is already in place.

Thanks and Regards,
@hacky (this time it's correct, mate…lol)

@hacky If every line that needs to be appended to the previous line starts with a | symbol, then I think I already have the solution.
Check this workflow, and let me know if it does not give the expected output.
Text_To_Datatable.zip (18.4 KB)
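The attached workflow is not reproduced here, so the following is only a guess at the rule it relies on: any line that starts with | is treated as a continuation of the previous line. It is written in Python for illustration, and the function name `merge_pipe_continuations` and the sample lines are hypothetical.

```python
def merge_pipe_continuations(lines):
    """Append each line that starts with '|' to the line before it."""
    merged = []
    for line in lines:
        if line.startswith("|") and merged:
            merged[-1] += line  # continuation of the previous record
        else:
            merged.append(line)  # start of a new record
    return merged

lines = ["a|b", "||c|000", "d|e|CA0"]
print(merge_pipe_continuations(lines))  # ['a|b||c|000', 'd|e|CA0']
```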

@supermanPunch

Thanks, this looks good!

But there is one challenge.
In real-world data, a line that needs to be appended to the previous line does not necessarily start with |. That is why I stopped worrying about how a line starts and began using a regex to identify the last field and use it as a separator.

As you can see, the input text file can be messed up in any number of ways, and the only pattern I could come up with is that the last field looks like (|000 and a new line), where

the last field will be “|CA0”, “|000”, “|T09”, etc., followed by a newline (if it's the last line, there is no newline character).

So the pattern I worked out for the last field is: a “|” pipe, followed by three alphanumerics, and then a newline character (except on the last line).

The regex I came up with was: (\|[a-zA-Z0-9]{3}$)

What I am unsure about is how to use this regex as a separator for the data.
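One way to use the terminator as a separator is to match whole records across the full string instead of splitting line by line. The sketch below is Python and only illustrative. The names `RECORD` and `split_records` and the sample data are assumptions, and `str_new_result` simply mirrors the strNewResult variable mentioned earlier in the thread.

```python
import re

# A record is the shortest run of characters (line breaks included) that ends
# with a pipe plus three alphanumerics at a line end or at the end of input.
RECORD = re.compile(r".+?\|[A-Za-z0-9]{3}(?=\r?\n|\Z)", re.DOTALL)

def split_records(text):
    records = []
    for match in RECORD.finditer(text):
        # Remove the line breaks that fall inside or in front of the record.
        records.append(re.sub(r"[\r\n]+", "", match.group()))
    return records

raw = "x|ITALY frm\nMCRC||Receivab\nles|AMG|000\ny|z|CA0"  # made-up sample
str_new_result = "\n".join(split_records(raw))
print(str_new_result)
```

Any trailing text that never reaches a terminator is dropped by this approach, so it may be worth checking for leftovers separately.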

Please give your inputs on this.

Thanks and Regards,
@hacky

@hacky Actually, I have been working on this since you put up that post. It was quite a challenging task :sweat_smile:, but I was not able to use the regex to get the data as needed. Instead, I used two lists to check the data, splitting it and adding each || line to the previous line.

It does look possible with regex, considering that the starting field follows that month-and-year pattern, but I hit a blocker while implementing it that way. I will post that regex so you can understand it and adapt it to get the output you need.