Extracting values that appear in one data column and not in the other data table

Hi all, hope you can help me solve this:
I have 2 data tables from 2 different sheets.
I need to go through the data tables and focus on 1 column from each table: “Customers” from dt1 and “Customer names” from dt2. I need to extract the additional names that are in “Costumers” and not in “Customer names”.

Please note:

  • There will[probably] duplicate names on both columns that should be removed.

  • Not sure if the names will be parallel to each other…

Please advise.
Thank you!

HI,

Can you share input sample and expected output as file? It’s no problem if dummy data.

Regards,

Hi @hp321

You can try this way:

1. Read Excel Files
   - Read Range Workbook (Excel 1) → dt1
   - Read Range Workbook (Excel 2) → dt2

2. Remove Duplicates
   - Remove Duplicate Rows (dt1)
   - Remove Duplicate Rows (dt2)

3. Merge DataTables
   - Merge DataTable (dt1 and dt2) based on "Customers" and "Customer names"

4. Filter Differences
   - Filter Data Table (Merged DataTable)
     - Filter condition: Customer names is an empty string

5. Write Output
   - Write Range (Result DataTable)

Hope it helps

@hp321

Please try this in assign activity

Dt1 and dt2 are the datatables containing each table data

requireddt = Dt1.AsEnumerable.Where(function(x) Not dt2.AsEnumerable.Any(function(y) y("CustomerNames").ToString.Equals(x("Customer").ToString)).Distinct(function(x) x("Customer").ToString).CopyToDataTable

If you think there might not be 1 rows also which is not matched then to avoid exception first use if condition and instead of .CopyToDatatable use .Count>0 and then on the then side use .CopyToDataTable and on else side you have no unmatched rows at all

Cheers

Hi @hp321

Can you try with this

dtResult = (From row1 In dt1.AsEnumerable()
Let custName = row1(“Customers”).ToString().Trim()
Where Not dt2.AsEnumerable().Any(Function(row2) row2(“Customer names”).ToString().Trim().Equals(custName))
Select dt1.Clone().LoadDataRow(New Object() {custName}, False)
).CopyToDataTable().DefaultView.ToTable(True)

Thank you @Yoichi ,

Here is dummy data. Expected output is Dan and Kathy. [after removing duplicates, these 2 names don’t appear in the first column of the 2nd dt]
DummyData.xlsx (9.6 KB)

Hi,

How about the following? This returns string array.

arrResult = dt1.AsEnumerable.Select(Function(r) r("Cleint name").ToString).Except(dt2.AsEnumerable.Select(Function(r) r("Client").ToString)).ToArray()

Sample
Sample20231226-3.zip (9.9 KB)

Regards,

Hi @hp321

Use the below in Invoke code acitivity:

' Assuming dt1 and dt2 are your DataTables
Dim distinctCustomers_dt1 = dt1.AsEnumerable().Select(Function(row) row("MEMBER NAME").ToString()).Distinct().ToList()
Dim distinctCustomerNames_dt2 = dt2.AsEnumerable().Select(Function(row) row("Patient").ToString()).Distinct().ToList()

' Find additional names
Dim additionalNames = distinctCustomers_dt1.Except(distinctCustomerNames_dt2).ToList()

' Create a new DataTable to store the additional names
 resultDataTable = New DataTable("AdditionalNames")
resultDataTable.Columns.Add("AdditionalName", GetType(String))

' Add additional names to the DataTable
For Each name In additionalNames
    resultDataTable.Rows.Add(name)
Next

Below are the invoked arguments

Sequence8.xaml (8.9 KB)

Output:

Hope it helps!!

Hi all,
Any ideas why the first dt is null?

@hp321

Check if you have passed the correct sheetname

Regards

Thank you @Parvathy, @sanjay3, @Anil_G, @Yoichi for your helpful answers!
You each really helped me!

2 Likes

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.