Matching multiple row data between 2 large excel files (using DTs)

Hello, I have looked on the forum but cannot solve this problem. I have two very large data sets I need to compare in an efficient way. My plan is to create two data tables DT1 (original data ) and DT2 (results data ). These will be formatted the same way and basically I want to search with datatable 1 row for a match in data table 2. If no match is found for the row I want the values written in DT3 and exported to excel.

I have done a lot of searching but cannot find a solid solution. Attached is an example of the data columns and the results on DT3 for a non match. Data in all columns needs to match, Basically I am try to sniff out if any data was transferred incorrectly.

I know how to get the DTs up and just need help with the script to search for the entire row. I feel like creating varibles for each column item will take forever and am trying to find a more efficient route.

Thanks Example.xlsx (9.8 KB)

Use LINQ to get unmatched records from 2 Datatables and store into 3rd Datatable.

Syntax:

dt_UnCommonRows = dt1.AsEnumerable().Except(dt2.AsEnumerable(),System.Data.DataRowComparer.Default).CopyToDataTable

Thanks
@jarrettd88

1 Like

Is it possible to use multiple varibles when doing this ? I somewhat understand the code but is it possible to point to certian column data “MemberidDT1” “MemberIDdt2” “TypeDT1” “typeDT2” etc ?

Sorry but i have to ask where your very large datasets are coming from, because if it is database, you would better do this lookup in there and not bring all data to uipath and do your filtering after… Also is better to avoid linq expressions that look easy but will do a lot of conversions and could make this slower and processing/memory intensive…

the data is being imported from two excel files with over 26k rows on each sheet

ok, so no other way :slight_smile: Still you cant use Except because it will use all columns for the comparison, but you can use regular Were condition…

Do you have any resources on that ?

I basically want to run a for each loop on an entire row in DT1 and then filter to an exact match on DT2 matches require no action but if no match is found in the entire DT2 I want the row written to DT3 for output and review. I have seen many similar solutions but cant get the mapping quite down for some reason.

There will be 2 ways to choose:

  1. loop dt1 and run this to see if this row is present in dt2
    foundRows = dt2.Select("[colum1]='"+row("column1").tostring+"' AND [column 2]='"+row("column1").tostring+"'")
    IF foundRows.Count>0
    Add row to dt3 using foundRows(0).ItemArray
  2. Use Linq that will be able to find all rows that matches in a single query without need for the first loop (but can be slower, need to test)
    Dim vLINQ = (From DT1 In vDT1.AsEnumerable Join DT2 In vDT2.AsEnumerable On DT1.Field(Of String)("Key") Equals DT2.Field(Of String)("Key") Select new with { .Key = DT1.Field(Of String)("Key"), .Data1 = DT1.Field(Of String)("Data1"), .Data2 = DT1.Field(Of String)("Data2"), .Data3 = DT2.Field(Of String)("Data3"), .Data4 = DT2.Field(Of String)("Data4") }).ToList()

thank you for your help I will work on these as well. I am still very new to LINQ in the second option could you highlight the variables that need to be changed I get the gist but am messing up the syntax a lot .

Thanks again for your help .

1 Like

Hi @bcorrea,

I have a similar issue. However, I tried different solution as yours unsuccessfully.
On the right = dt1; on the left = dt2
I should to compare rows value of the column “Domaine valorisation” (dt1) with all rows value of the column “centre de coûts” (dt2). If its equal, i need to extract corresponded value in the column “Société” (dt2) and add it in the corresponded row of column “groupe valorisation” (dt1)
For example, for the first row of dt1 I need to have the value “520” in the column “groupe valorisation”
You can see my workflow test below the tables.
Do you know how I can resolve this issue please ?
.dt1.dt1 dt2

logic2

Instead of that IF condition, you can use Lookup Data Table instead, see here https://docs.uipath.com/activities/docs/lookup-data-table

Thanks @bcorrea

Finally, I found the solution here :

really? i dont see a relation in that post…

Sorry It’s a topic mistake, because I posted 2 different topics.
But yes, I found the solution as below: