I have a text file that contains a little over 150,000 lines, and I need to check a keyword against each line. I noticed it takes a million years with a For Each activity, so I am looking for a faster solution.
In my opinion the fastest solution is regex, but I am not sure how to extract the lines where my keyword matches.
Are you able to open this file in an editor like Notepad++? If yes, you can use the search functionality of that software and check whether you get the result. I have used this feature to search for a keyword across multiple files (though not files this big).
I’m not sure this will be faster; 150k lines is quite a lot from an RPA perspective, and depending on the amount of keys you need to match against it, you might just be stuck with a slow script.
You might want to try this and see if it increases performance:
I assume you read the entire txt file into a datatable.
By using Lookup Data Table activity you can search a specific key in a specific column.
One of the outputs is the RowIndex, and optionally also the value of another column in that same row. (So it works kind of like an Excel VLOOKUP.)
This will look up the value ‘MyLookupValue’ in column ‘myLookupColumnName’ of DT_MyTable.
It returns the row number where it is found (‘MyRowIndex’) and the value found on that row in column ‘myResultColumnName’.
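For anyone curious what that lookup does under the hood, the same row-index behaviour can be sketched in plain C# with LINQ over a DataTable. The table contents, column names and lookup value below are illustrative placeholders matching the names used above, not anything from an actual workflow:

```csharp
using System;
using System.Data;
using System.Linq;

class LookupDemo
{
    static void Main()
    {
        // A small DataTable standing in for DT_MyTable (illustrative data).
        var dt = new DataTable();
        dt.Columns.Add("myLookupColumnName", typeof(string));
        dt.Columns.Add("myResultColumnName", typeof(string));
        dt.Rows.Add("foo", "first");
        dt.Rows.Add("MyLookupValue", "second");

        // Find the first row whose lookup column matches,
        // like the Lookup Data Table activity does.
        DataRow match = dt.AsEnumerable()
            .FirstOrDefault(r => r.Field<string>("myLookupColumnName") == "MyLookupValue");

        // Row index where the value was found, and the result-column value on that row.
        int myRowIndex = match != null ? dt.Rows.IndexOf(match) : -1;
        string result = match != null ? match.Field<string>("myResultColumnName") : null;

        Console.WriteLine(myRowIndex); // 1
        Console.WriteLine(result);     // second
    }
}
```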
What is in the text file? Is it CSV? Or just regular text? What is your goal? Is it just to see if the value exists, or are you trying to find out which row? Something else?
Thanks a lot. A number of great solutions and suggestions were shared here and on my LinkedIn post.
At the end, I am resorting to this C# code snippet (thanks to Nived N for sharing it on my LinkedIn post) that solves my problem: arr_lines = System.Text.RegularExpressions.Regex.Matches(str_text, "^.*word1.*", RegexOptions.Multiline).Cast<Match>().Select(m => m.ToString()).ToArray()
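For anyone who wants to try the regex approach outside a workflow, here is a self-contained sketch. The sample text and the keyword "word1" are illustrative; in practice str_text would come from something like File.ReadAllText. Note the namespace is System.Text.RegularExpressions (plural):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class KeywordLines
{
    static void Main()
    {
        // Stand-in for the file contents; normally File.ReadAllText(path).
        string str_text = "alpha word1 beta\nno match here\nword1 again";

        // With RegexOptions.Multiline, ^ matches at the start of every line,
        // so "^.*word1.*" captures each whole line containing the keyword
        // in a single pass, instead of looping line by line.
        string[] arr_lines = Regex.Matches(str_text, "^.*word1.*", RegexOptions.Multiline)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();

        foreach (var line in arr_lines)
            Console.WriteLine(line);
        // prints:
        // alpha word1 beta
        // word1 again
    }
}
```

One thing to watch for: with Windows line endings (\r\n), the matched lines will carry a trailing \r, since . matches everything except \n; trimming each result handles that.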