Comparing two excels with 400K rows each

I have 2 Excel files which has data around 400k+. I need to get distinct of 4 columns for DT1 and then check if a particular row value in distinct .DT1 is available in DT2. If row value is available in DT2, then compare the row values with DT1 and if a condition is satisfied then add a status to the row as Success.
I was able to do the use case as test with very limited data and it works as expected. As the rows are huge it takes for ever to complete the run.
I have - Read 2 excels and added them to DT1 and DT2. Did a distinct on few columns and saved it to DT1. then for each row in DT2 i placed a for each row in dt1 and started comparing with If dt2row(“ColName”)tostring = dt1row(“ColName”) then do this.
I need help to speed up the process.

Hello, maybe you can split your Dt2 in 2 dt with 200k items and do your operation in parallel with Dt2.1 and Dt2.2 ?

Hi @jvarghese9689

Can you please look into the below thread?

Hope this will be useful. Thank you.

The bot is taking 1hr to process 2k rows,even though we do parallel process we are looking at long hours to process

Hi Jobin, I tried playing around with the Query, but i guess am doing something wrong with the query. Attached is the Screenshot of what i want it to be a query/ or optimized approach.
I have tried with parallel process, still takes time. Trying with merge and join tables now.


The best solution for this is LinQ to get it done. I am not much aware of LinQ.

Adding @kirankumar.mahanthi1 @Yoichi @pravin_calvin


I agree, I am trying to get a linq query and its getting confusing for me.


Just Share some sample data with expected Output sample With us
Based in this WE will Help you on a LINQ or alternative solution approach