How can I extract data from a text file?

Hi there,
Hope you are doing well!

The following is sample input data from a Notepad text file:

“NO | Kranti 123456 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore
NO | Kranti 123111 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore
NO | Kranti 123112 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore
NO | Kranti 123113 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore
NO | Kranti 123114 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore
NO | Kranti 123115 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore
NO | Kranti 123116 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore”

Output:
Kranti 123456
Kranti 123111
Kranti 123112
Kranti 123113
Kranti 123114
Kranti 123115
Kranti 123116

In the actual file there are several other lines between each of the lines shown above. I have shared only the lines that I need to extract data from …
Can you please help me get the exact output from the sample input?

Thank you in advance !!

Hi @kmaddikatla

You can try a Split expression, taking the second field and trimming the surrounding spaces:

Split("NO | Kranti 123456 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore","|")(1).Trim
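To apply the same idea to the whole file rather than to one hard-coded line, something like the sketch below might work. It assumes the wanted lines can be recognised because they start with "NO |" (that filter is my assumption, based only on the sample input), and the variable names are just for illustration. In a workflow the text would come from a Read Text File activity instead of a literal.

```vb
Imports System
Imports System.Linq
Imports Microsoft.VisualBasic

Module SplitExample
    Sub Main()
        ' Sample text with an unrelated line in between, as described in the question.
        Dim fileText As String =
            "NO | Kranti 123456 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore" & vbLf &
            "some unrelated line" & vbLf &
            "NO | Kranti 123111 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore"

        ' Split into lines, keep only lines starting with "NO |" (assumed marker),
        ' then take the second pipe-separated field and trim it.
        Dim result As String() = fileText.
            Split({vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries).
            Where(Function(line) line.TrimStart().StartsWith("NO |")).
            Select(Function(line) line.Split("|"c)(1).Trim()).
            ToArray()

        ' Prints:
        ' Kranti 123456
        ' Kranti 123111
        Console.WriteLine(String.Join(Environment.NewLine, result))
    End Sub
End Module
```

The filter matters because a line without any pipe would make the index (1) throw an exception.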

Regards
Gokul

Hi @kmaddikatla

You can iterate through each line of the data and use the following regex to extract the value:

Kranti\s\d{6}
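As a rough sketch of how this pattern could be used in plain VB.NET: Regex.Matches can also be run once over the whole text, so looping line by line is optional. The sample text and variable names here are only for illustration.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module RegexExample
    Sub Main()
        ' Two of the sample lines with an unrelated line between them.
        Dim input As String =
            "NO | Kranti 123456 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore" & vbLf &
            "some unrelated line" & vbLf &
            "NO | Kranti 123111 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore"

        ' "Kranti", one whitespace character, then exactly six digits.
        ' Prints:
        ' Kranti 123456
        ' Kranti 123111
        For Each m As Match In Regex.Matches(input, "Kranti\s\d{6}")
            Console.WriteLine(m.Value)
        Next
    End Sub
End Module
```

Note that this pattern only works while the name is literally "Kranti" and the number has six digits. If either can vary, the pattern would need adjusting.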

Hope this helps,
Best Regards.

@kmaddikatla

You can use this directly in an Assign activity. It gives you a string array of the matched values.

I added a little caution to the regex: it takes the data between pipes and also checks that the closing pipe is followed by DATE.

StrArr = System.Text.RegularExpressions.Regex.Matches(str,"(?<=\|).*(?=\| DATE)").Select(function(x) x.Value.Trim).ToArray

Hope this helps

Cheers

This works fine on my local machine, but when I try it on another machine it shows an error:

"Select is not a member of System.Text.RegularExpressions.MatchCollection"

Thanks in Advance @Anil_G

Do you mean by using For Each Row in a DataTable? @arjunshenoy

Thanks in Advance !!

@kmaddikatla

Then just add .ToArray and use it like this:

System.Text.RegularExpressions.Regex.Matches(str,"(?<=\|).*(?=\| DATE)").ToArray.Select(function(x) x.Value.Trim).ToArray

cheers

Now it shows this error @Anil_G:

"ToArray is not a member of System.Text.RegularExpressions.MatchCollection"

I am also wondering what the difference between the two machines is, since both use the same package versions and the same version of Studio (Enterprise).

@kmaddikatla

Please try this; it should work:

System.Text.RegularExpressions.Regex.Matches(str,"(?<=\|).*(?=\| DATE)").Cast(Of System.Text.RegularExpressions.Match).Select(function(x) x.Value.Trim).ToArray
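One possible explanation for the difference between the two machines, which nothing in this thread confirms, is the target framework of the project. On .NET Framework, MatchCollection implements only the non-generic IEnumerable, so LINQ methods such as Select and ToArray are not available on it directly. On .NET Core 2.0 and later it also implements IEnumerable(Of Match). Cast(Of Match) works on both. A minimal standalone sketch, with sample text and variable names chosen only for illustration:

```vb
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module CastExample
    Sub Main()
        Dim str As String =
            "NO | Kranti 123456 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore" & vbLf &
            "NO | Kranti 123111 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore"

        ' Cast(Of Match) turns the non-generic collection into IEnumerable(Of Match),
        ' which makes Select and ToArray available regardless of the framework.
        Dim strArr As String() = Regex.Matches(str, "(?<=\|).*(?=\| DATE)").
            Cast(Of Match)().
            Select(Function(m) m.Value.Trim()).
            ToArray()

        ' Prints: Kranti 123456, Kranti 123111
        Console.WriteLine(String.Join(", ", strArr))
    End Sub
End Module
```

If the two machines open the project with different compatibility settings, that alone could produce the "is not a member" errors above even with identical package versions.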

cheers

@kmaddikatla

Since the text data seems to be structured, you can easily convert it to a DataTable and then perform the required actions on it.

Best Regards.

Hi @kmaddikatla ,

As mentioned by @arjunshenoy, first check which built-in activities you could use directly, and verify that they work for all of your data samples.

The suggestion was to use the Generate Data Table activity. It is flexible: you can specify the delimiters that it should recognise as column separators and row separators.

Let us know whether you were able to use the Generate Data Table activity, and share your feedback on it.
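For comparison, here is a rough plain VB.NET sketch of the same idea, with "|" as the column separator and a newline as the row separator. This is not the activity itself. The column names, and the rule that only lines with exactly five fields are kept, are my assumptions based on the sample input.

```vb
Imports System
Imports System.Data
Imports System.Linq
Imports Microsoft.VisualBasic

Module DataTableExample
    Sub Main()
        Dim fileText As String =
            "NO | Kranti 123456 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore" & vbLf &
            "some unrelated line" & vbLf &
            "NO | Kranti 123111 | DATE 022423 | Kra 1111111111 | Karnataka, Bangalore"

        ' Five string columns named Column1..Column5 (hypothetical names).
        Dim dt As New DataTable()
        For i As Integer = 1 To 5
            dt.Columns.Add("Column" & i.ToString(), GetType(String))
        Next

        ' One row per line; lines without exactly five fields are skipped.
        For Each line As String In fileText.Split({vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)
            Dim fields As String() = line.Split("|"c)
            If fields.Length = 5 Then
                Dim values As Object() = fields.Select(Function(f) CObj(f.Trim())).ToArray()
                dt.Rows.Add(values)
            End If
        Next

        ' Prints:
        ' Kranti 123456
        ' Kranti 123111
        For Each row As DataRow In dt.Rows
            Console.WriteLine(row("Column2"))
        Next
    End Sub
End Module
```

Once the data is in a DataTable, the second column holds the required values and the unrelated lines have already been filtered out.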
