How to make changes on a lot of rows without using For Each activity?

My excel file contains approx 16000 rows.
I need to go through all the rows and perform various actions on each row(usually assigning rows to variables, cutting numbers, etc.)

I’m using For Each row, but that takes a lot of time (almost 45 minutes for all rows) which is unacceptable.

What could be a faster and more efficient way?

Thanks in advance.

using and not using For each on large data set is a hot topic now on the forum.

But, If the actions you need to perform are a lot then you can create a excel file out of it and run a macro to do the actions, excel application scope and macro can help.

That can also be done using LinQ but first we may need to see the type of operation getting performed on rows.

But I see you mentioned assigning rows to a variable, do you mean you create individual data row elements for all the rows?

1 Like

Usually 16000 rows won’t take that much time even if we go with for each row
I m wondering what kind of operations are carried out on each row

And is it necessary to process all the rows
If possible you can Initially filter the records that has to be processed and then take it with FOR EACH ROW activity

There are many other approaches to handle such a large where we can either split and keep the input either as multiple excel files in a folder or keep all the records in a QUEUE in orchestrator with a help of DISPATCHER process and let multiple bots to process it simultaneously which will eventually reduce the time taken to process those records

@bp777

it would be helpfully if you could tell us the exact actions and on how was it implemented
we assume that the excel file is read in with read range and the option preserve format is NOT activated.

Before shifting away from one approach to another it is always recommended to clear what is to achieve, what is the expected output.

Thanks for support

I did some tests and had to get to 1 million rows before any noticeable performance difference appeared, and even then it was only a few seconds. 5 million rows still only took 7 seconds for a for each with column value update.

Stop worrying about avoiding For Each.

https://forum.uipath.com/t/there-is-no-need-to-try-to-avoid-for-each-loops-on-large-amounts-of-data/

@postwick

how was the test including the various actions?

Show us your code. Tell us the exact steps you are performing on each iteration of the loop.

Hi, sorry for the late response.
There are few more assign activities, usually using the Substring method.


So, this part of the process takes too much time. Almost one hour for looping through the ~16.000 rows and performing various activities on each row.

Show us each individual thing it is doing, and the expression being processed. For example, the row["Serial_Numb… Assign, what’s the expression on the right? Show us each expression. Maybe there is a more efficient way.

row(“Serial Number_New”) = row(“Serial Number”).ToString.Remove(0,2)
keyStr = row(“Terminal ID”)
storeKey = row(“Terminal ID”).ToString.Substring(0, row(“Terminal ID”).ToString.Length - 4)
secondStKey= row(“Terminal ID”).ToString.Substring(row(“Terminal ID”).ToString.Length-3, 3)

As you can see, those are all simple expressions.

this is an update, can be done within for each or has a potential to handle it on helper column with a datacolumn expression

dtData.Columns(1).Expression = "Substring(Column1,3,Len(Column1) - 3)"

this is retrieval. Depending on what is done it can be done with a LINQ. But when it is used within the loop a seperate Retrieval LINQ will not bring a benefit

Compared with the topic title we do see more retrievals as changes.

1 Like

Do you advise using LINQ and expressions before anything?

No, I have never stated this. It was asked what will be done with the extracted values within the process

1 Like