How to loop over a large text file?

Hi there

I have a text file that contains a little over 150,000 lines, and I need to check a keyword against each line. With a For Each activity this takes forever, so I am looking for a faster solution.

In my opinion the fastest solution is regex, but I am not sure how to extract the lines where my keyword matches.

Hi @hacback17 ,

Could you please share a sample text file with a few lines of text, along with the keyword, so that we can develop a solution for you?

Kind Regards,
Ashwin A.K

Hi @hacback17, this documentation could help: https://docs.uipath.com/activities/docs/matches

There is an example too.

Let us know if you have any questions.

Hi @hacback17,

Are you able to open this file in an editor like Notepad++? If so, you can use its search functionality and check whether you get the result. I have used this feature to search for a keyword across multiple files (though not files this big).

I’m not sure if this will be faster; 150k lines is quite a bit from an RPA perspective. Depending on the number of keys you need to match against, you might just be stuck with a slow script.

You might want to try this and see if it improves performance:
I assume you read the entire text file into a DataTable.
With the Lookup Data Table activity you can search for a specific key in a specific column.

One of the outputs is the RowIndex, and optionally also the value of another column at the same row index. (So it works somewhat like an Excel VLOOKUP.)

This will look up the value ‘MyLookupValue’ in column ‘myLookupColumnName’ of DT_MyTable.
It returns the row index where it is found (‘MyRowIndex’) and the value found on that row in column ‘myResultColumName’.
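
For readers who want to see the idea outside of Studio, here is a rough plain C# sketch of a lookup like the one described above. It is not how the Lookup Data Table activity is implemented; the class and method names (`LookupSketch`, `FindRowIndex`) and the sample rows are made up for illustration, and returning -1 when nothing is found is a choice made for this sketch.

```csharp
using System;
using System.Data;

public static class LookupSketch
{
    // Returns the index of the first row whose lookupColumn cell equals
    // lookupValue (case-sensitive, whole-cell comparison), or -1 if none does.
    public static int FindRowIndex(DataTable table, string lookupColumn, string lookupValue)
    {
        for (int i = 0; i < table.Rows.Count; i++)
        {
            string cell = Convert.ToString(table.Rows[i][lookupColumn]);
            if (string.Equals(cell, lookupValue, StringComparison.Ordinal))
            {
                return i;
            }
        }
        return -1;
    }

    public static void Main()
    {
        // Hypothetical sample data using the column names from the post.
        var dt = new DataTable();
        dt.Columns.Add("myLookupColumnName", typeof(string));
        dt.Columns.Add("myResultColumName", typeof(string));
        dt.Rows.Add("A", "first");
        dt.Rows.Add("MyLookupValue", "second");

        int rowIndex = FindRowIndex(dt, "myLookupColumnName", "MyLookupValue");
        if (rowIndex >= 0)
        {
            // Value from the other column on the same row.
            Console.WriteLine($"{rowIndex}: {dt.Rows[rowIndex]["myResultColumName"]}");
        }
    }
}
```

With the sample rows this prints `1: second`. Note that the sketch compares whole cell values; finding a keyword inside longer lines would need a substring check instead.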

Hope this helps!

150k lines is really not that much. I’ve tested this and found that you need to get into the millions of rows before seeing performance changes.

What is in the text file? Is it CSV? Or just regular text? What is your goal? Is it just to see if the value exists, or are you trying to find out which row? Something else?

Hi @hacback17 ,

Please check this one.

Regards
Balamurugan.S

Thanks! I will take a look at it.

SOLVED:

Thanks a lot. A number of great solutions and suggestions were shared here and on my LinkedIn post.

In the end, I went with this C# snippet (thanks to Nived N for sharing it on my LinkedIn post), which solves my problem:
arr_lines = System.Text.RegularExpressions.Regex.Matches(str_text, "^.*word1.*", System.Text.RegularExpressions.RegexOptions.Multiline).Cast<System.Text.RegularExpressions.Match>().Select(m => m.Value).ToArray()

VB.NET equivalent:
arr_lines = System.Text.RegularExpressions.Regex.Matches(str_text, "^.*word1.*", System.Text.RegularExpressions.RegexOptions.Multiline).Cast(Of System.Text.RegularExpressions.Match).Select(Function(m) m.Value).ToArray()
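
Two caveats with the pattern above may be worth knowing. The keyword is interpreted as a regex, so a keyword containing characters such as `.` or `(` would need escaping, and with Windows line endings `.*` also captures the trailing `\r` of each line. A minimal C# sketch that addresses both, assuming the keyword should match literally (the names `KeywordLines` and `FindLines` and the sample text are made up for illustration):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordLines
{
    // Returns every line of `text` that contains `keyword` as a literal substring.
    public static string[] FindLines(string text, string keyword)
    {
        // Regex.Escape makes characters like '.' or '(' match literally.
        // [^\r\n]* instead of .* keeps a trailing '\r' out of the results.
        string pattern = "^[^\r\n]*" + Regex.Escape(keyword) + "[^\r\n]*";
        return Regex.Matches(text, pattern, RegexOptions.Multiline)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Hypothetical sample with Windows line endings.
        string sample = "alpha word1 beta\r\nno match here\r\nword1 again";
        foreach (string line in FindLines(sample, "word1"))
        {
            Console.WriteLine(line);
        }
    }
}
```

For the sample text this prints `alpha word1 beta` and `word1 again`, without a stray carriage return at the end of the first line.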

I didn’t use any pre-made packages because they seemed like overkill for a small project like the one I am developing.
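
In the same no-extra-packages spirit, a plain substring check may also be enough when the keyword is a fixed string rather than a pattern. The sketch below streams the file line by line with `File.ReadLines` instead of loading it into one string; the names `StreamedSearch` and `MatchingLines` and the temporary sample file are assumptions for illustration, and whether this is faster than the regex for a given file would need measuring.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class StreamedSearch
{
    // Lazily yields the lines of the file at `path` that contain `keyword`
    // (case-sensitive, like the regex without RegexOptions.IgnoreCase).
    public static IEnumerable<string> MatchingLines(string path, string keyword)
    {
        return File.ReadLines(path)
                   .Where(line => line.IndexOf(keyword, StringComparison.Ordinal) >= 0);
    }

    public static void Main()
    {
        // Write a small hypothetical sample file so the sketch is self-contained.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "alpha word1 beta", "no match here", "word1 again" });
        try
        {
            foreach (string line in MatchingLines(path, "word1"))
            {
                Console.WriteLine(line);
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Appending `.ToArray()` to the result of `MatchingLines` gives an array like `arr_lines`.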

Once again, I want to thank each and every one of you for your efforts.
