Removing top x rows from large csv/txt file

Hi all. I have some text/csv files which have extra rows before the actual data that needs to be imported. For some information about the files, they are approximately 1.5 million rows each and there are four of them. In order to use these with UiPath, I need to remove the top x rows and then import the file into a datatable.

I currently have the following in UiPath to remove the top rows:


However, with this method it can take 5 minutes per file to loop through, remove the top 2 rows and then save the file (not efficient at all).

Are there any other methods that can be used? I have seen this post about a different method (below); however, I am not sure how to utilise it.

Any help will be greatly appreciated.

Hello! A quick thing you can try before looking further for a solution would be this: https://go.uipath.com/component/read-extra-large-spreadsheets
It’s been developed to work with big files, so maybe it can reduce the time you have now.
PS: Since this is for .xlsx, you would need to convert the CSVs to xlsx. If this is not a suitable solution, let us know and we can look for more options.

Hi @q-z, unfortunately using anything other than a csv is not possible (system limitation + file sizes).

You could do this in code with the StreamReader class.

The process you have is likely taking a long time because it reads all of the lines before deleting the ones you don't want. The StreamReader reads them line by line, so you can get it to stop once it reaches the line you want and move on to the next file, making the whole process much quicker.

Here is an example.

Select the file provided when prompted. Review the input from the file versus what is output by the code in the console. You can see any lines containing “REMOVE” are removed.

RemoveLinesCode.zip (1.4 KB)
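For anyone reading without the attachment, here is a minimal sketch of how the StreamReader approach could be applied to the original question (skipping the top x rows). This is not the code from the zip file, which filters on “REMOVE”. The module name, the `RemoveTopRows` helper, its parameters and the file paths are all made up for illustration.

```vbnet
Imports System.IO

Module RemoveTopRowsSketch
    ' Hypothetical helper: copies inputPath to outputPath,
    ' leaving out the first rowsToSkip lines.
    Sub RemoveTopRows(inputPath As String, outputPath As String, rowsToSkip As Integer)
        Using reader As New StreamReader(inputPath)
            Using writer As New StreamWriter(outputPath)
                ' Read and discard the unwanted leading rows one at a time.
                Dim skipped As Integer = 0
                While skipped < rowsToSkip AndAlso reader.ReadLine() IsNot Nothing
                    skipped += 1
                End While

                ' Stream the remaining rows straight to the output file,
                ' so the whole file is never held in memory.
                Dim line As String = reader.ReadLine()
                While line IsNot Nothing
                    writer.WriteLine(line)
                    line = reader.ReadLine()
                End While
            End Using
        End Using
    End Sub

    Sub Main()
        ' Placeholder paths; replace with your own files.
        RemoveTopRows("input.csv", "output.csv", 2)
    End Sub
End Module
```

The output path needs to be different from the input path here, because the input file is still open for reading while the output is being written.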

Thank you for your assistance @ronanpeter. I have ended up using the following method which provides nearly instant processing.

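' Read every line of the input file into a list
Dim lines As List(Of String) = System.IO.File.ReadAllLines(FilePath1).ToList
' Remove the first row (index starts at 0)
lines.RemoveAt(0)
' Write the remaining rows to the output file
System.IO.File.WriteAllLines(OutputFile, lines)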

For those who require some assistance, I used the Invoke Code activity, placed this code inside it, and then set the arguments for FilePath1 and OutputFile.
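If more than one row needs to go, a variation that might be worth trying is `File.ReadLines` combined with LINQ's `Skip`. `ReadLines` enumerates the file lazily instead of loading every line into a list first. The sketch below is a standalone module so it can be run outside UiPath; `rowsToRemove` and the file names are assumptions, not part of the original workflow.

```vbnet
Imports System.IO
Imports System.Linq

Module SkipTopRowsSketch
    Sub Main()
        ' Placeholder values; in UiPath these would be Invoke Code arguments.
        Dim filePath1 As String = "input.csv"
        Dim outputFile As String = "output.csv"
        Dim rowsToRemove As Integer = 2 ' hypothetical extra argument

        ' ReadLines yields lines lazily; Skip drops the first rowsToRemove of them.
        ' outputFile must differ from filePath1, since the input stays open while writing.
        File.WriteAllLines(outputFile, File.ReadLines(filePath1).Skip(rowsToRemove))
    End Sub
End Module
```

Inside an Invoke Code activity only the `File.WriteAllLines(...)` line should be needed, with `rowsToRemove` passed in as an additional In argument alongside FilePath1 and OutputFile.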
