We have a large excel file that we need to process (over 4 million records, attached below is a small sample). The aim of our automation is to search through all the records that have negative value. And for each one of these:
Take its correspondent TRX_DATE, ITEM_ID & ORGANIZATION_ID and then search in ALL PREVIOUS THREE (3) DAYS records if it corresponds to another record within the same ORGANIZATION_ID & same ITEM_ID & same PRIMARY_ QUANTITY (but here positive)
If no such records are found, add the original record (the one with a negative value) to a new data table.
Ok there are times when I do envy the forum members for the fantastic challenges they get to work on, like this one!
Nonetheless, here are two things I would evaluate before using excel as input source to an automation.
1. Data binning
Even before we start, I would always advice that any table should always have a unique reference column. May be an ID column which is an auto incremented column. Another advantage will be the ability to containerize your data. This technique is called Data Binning. Using Data Binning you can think of your 4 million data points as may be 100000 and that all the logic you will use will be repeated 40 times.
When you read tables in UiPath they are stored in memory and to ensure that your robot PC does not just hogging the storage disk to compensate for RAM, data binning is the way to go for large dataset. It does not make the code faster, but it improves stability of your automation. You could also try this approach with your Linq expression.
2. Input excel data into a SQL table
Another alternative is to upload your excel file contents to a SQL table and perform the same operations via an SQL Procedure. SQL performance will usually be superior than excel/UiPath. In the end, the cleaned data can be retrieved by the bot/robot to do further processing (Robot calls the stored procedure).
I would still go by this as mentioned by @jeevith
Instead of having excel with such a large dataset as input source we can look for possibilities to hold those data in a DATABASE
It will be way more easier to get or manipulate data being a sql database
But if we don’t have that option then we can try with other approaches
Have a view on this examples with CSV and Datatables maybe this will help: