Join DataTables to find rows with differences

Hi all,

I have 2 datatables which I want to compare, and return only the rows that contain different data in 1 or many columns.

For example,

dt1
ID | FirstName | LastName
12345 | John | Lennon
23456 | Paul | McCartney

dt2
ID | FirstName | LastName
12345 | John | Smith
23456 | Paul McCartney

I want the returned datatable to be only the rows where the ID is matched, but contains different data (for example LastName here, but ANY of the fields different to the original)

Result =
ID | FirstName | LastName
12345 | John | Smith

What is the best way to achieve this? Join DataTable activity or is Linq query going to be more effective?

The idea is that I will receive a data file each day which is also hosted in a sharepoint list, and I want to check for differences, and only update rows in the sharepoint list which have a new detail.

HI,

How about the following?

 arrDr = dt2.AsEnumerable.Except(dt1.AsEnumerable,DataRowComparer.Default).Where(Function(r2) dt1.AsEnumerable.Any(Function(r1) r1("ID").ToString=r2("ID").ToString)).ToArray()

Sample
Sample20240911-3.zip (10.5 KB)

Regards,

Join Data Table will get you the rows where ID matches. Then you could use Remove Duplicate Rows to get rid of rows where the data is the same.

I tried using Join Datatable on “ID” = “ID” from both dt, but result is just all rows because all the ID are the same.

If all the IDs are the same then what are you trying to do? Why would all the IDs be the same? Also, did you set it to left join? That way you get all rows from the left datatable and only matching data from the right.

This is because I am maintaining some employee data. Their ID will always be the same, but I may need to update their details (Job title, status, etc) as new data files come through. So I need to identify them on the ID but then update any of the fields with new info.

In this example, I was expecting to identify that “12345” has an updated LastName value, so we will update that

I tried left join, but my result datatable is just the dt1 and a blank dt2 alongside it

I tried this, but I seem to only be getting a resulting datatable which contains all rows - basically a clone of dt1. Whereas I expect only a datatable containing 1 row in this example for ID “12345” showing the row with new LastName value

@Jon_G

  1. Do a join datatable with all columns you want to check with and use a full join or outer join
  2. Now after the join use filter datatable and filter for empty value on a column from dt2 so that you will get all non matched data present in dt1

Cheers

You may try this solution:
table1.AsEnumerable.Except(table2.AsEnumerable, DataRowComparer.Default).Union(table2.AsEnumerable.Except(table1.AsEnumerable, DataRowComparer.Default)).GroupBy(Function(r) r(“ID”))(1).CopyToDataTable

This is close, unfortunately it only returns 1 record depending on the column index I use in the end of the syntax. (the number “1”)