How to find missing rows in present in one Excel and not the other

Hello,

I have want compare two separate cols from two excel files and:

  1. Write the missing records (those in one Excel and not in the other one) to a third Excel File
  2. if there are duplicates records in 1 above, get a count of the duplicates.

First Excel with the col to be read for comparison
The image shows a spreadsheet with a single column labeled "CAT_NO" that contains a list of numbers ranging from 1 to 7, with duplicate entries of 2, 4, and 6. (Captioned by AI)
Second Excel with the col to be read for comparison
2ndSorce

I have tried two methods with no luck, but while trying the third method, I got this error
error

Here is the code throwing the error

Attached is my workflow for reference.

getRecsNotInBothExcelSheets.zip (18.4 KB)

@Yomi_Oluwadara,

Read both the excel data using ready range activity.

Remove additional column from second datatable.

Compare both datatable with below code

Thanks,
Ashok :slightly_smiling_face:

@ashokkarale

While running the workflow, I got an exception from the invoke code activity. I checked the argument and they look great

Hi @Yomi_Oluwadara , From my point of view, there is no need for invoke code, as this would make it harder for someone that does not have a background in coding to understand. Find the alternative below:

FYI, this would be the output:
image

Step 1

  • Same as above, obtain the information in 2 separate DTs, and initialize a Dictionary of String, Int, where we will keep the duplicate keys and its count:

image

Step 2

  • Lets start with finding the missing values, using the linq you tried to use above, just tweaked it a bit to make it work, and not fail if there is no results (CopyToDatatable will throw a SE if there is no results)
dt_Source2.AsEnumerable.Where(Function(x) NOT dt_Source1.AsEnumerable.Select(Function(y) y("CAT_NO").ToString).Cast(Of String).ToArray.Contains(x("Cat_Acct_Num").ToString)).AsDataView.ToTable

image

Step 3

  • Finding duplicates if the activity above found any missing row. Looping the resulting rows in a for each, and storing the number of occurrences in a dictionary to be used later (Can be enhanced with a Linq, but for understanding purposes, for each is easier)
  • Assign Save To:
dict_DuplicateCounts("Duplicate key"+CurrentRow("Cat_Acct_Num").ToString)
  • Assign Value to Save:
dt_Source2.AsEnumerable.Where(Function(x) x("Cat_Acct_Num").ToString.Equals(CurrentRow("Cat_Acct_Num").ToString)).Count

image

if there are no duplicates, the .AsDataview.ToTable will return an empty datatable, therefore there will be no loops on the for each, and the dictionary will return empty, while if you leave it as above, with the CopyToDatatable, you need to catch the error.

Hope this helps! Attaching sequence for your reference.

BR,
Ignasi

Sequence2.xaml (10.2 KB)
TestData.xlsx (9.2 KB)

1 Like

The last solution worked as expected

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.