Excel Read/Append Range in Chunks


#1

For my process, I need to read lines from an excel file, filter it, then append it to another excel file.
However, the excel files I’m working with have >100,000 lines, which causes UiPath to crash when it gets to Read/Append Range.

I am trying to split the Read and Append activities into 20,000 line chunks to prevent crashing.
But when I split the Read Range, only the first Read Range will have the headers, subsequent Read Ranges won’t have headers.
Does anything have a good way to do this?

Thank you.


#2

@in006, Take your first 20,000 line datatable’s variable say dt. Now use dt.Clone in assign activity for rest of the excel that has only headers where lines can be added.

Regards,
Dominic :slight_smile:


#3

Alternatively use dt.Clear or ClearDataTable activity coupled with MergeDataTable, if previous rows are not needed anymore.


#4

Thank you for your responses.
I need the headers in each 20,000 chunk so that I can filter each table based on the value in a column.

I tried using dt.Clone, but I’m still running into problems.
Below is what I did:

  1. read first 20,000 lines with headers to dt1
  2. assigned dt2 = dt1.Clone (i did the same for dt3, dt4, dt5)
  3. read the remaining 20,000 line chunks into dt2, dt3, dt4, dt5 (without headers)
  4. filter each dt using dt.Select("‘column name’ = ‘xxxx’), but at this step I get an error saying “Cannot find column [column name]”

It seems like my headers from dt1 did not get passed to dt2, dt3, dt4, dt5.
Am I using dt.Clone incorrectly?

Thanks


#5

Your use seems correct.
A rarely explicitly mentioned thing is that ReadXXX activities return a new DataTable, no matter if the one you passed was already initialized.

So if you need to work on ranges, there are a couple of things you might try:

  1. Ignore column names and go with indexes. Probably switching to LINQ would make it easier than trying to fit .Select into it.
  2. Add column names from a template after each read.
    Something like this - ReaddColumnNames.xaml (10.5 KB)
  3. Combine above into both. Depends on how you want it to work.

Point 2 will probably be easiest, especially since you already have the range incrementation done as you said (I’ve added a basic example of that for benefit of other users).


#6

Hi andrzej,
Thanks for your advice.
I got this to work using dt.Clone.

What I did:

  1. Read first 20,000 lines with header into dt1.
  2. Filter dt1 by column value, and store the results into another datatable
  3. Assign dt1=dt1.Clone
  4. Read next 20,000 lines into the cloned dt1.
  5. Repeat with subsequent 20,000 lines.

Using this method I was able to filter each chunk by a value in a column.


#7

Hi All,

Could anybody shed some light if the reading excel in chunks could be done using UiPath or should it be done using custom code? If it could be done using Uipath, could you please share a sample workflow? Your help will be greatly appreciated. :slight_smile:

Thanks