Compare excel sheet with DataTable and write unique rows to Excel1 and duplicated to Excel2

Hi guys,

I have few PDF files.
For each file i am read OCR Text and this text i am adding to data row.
Next i am creating two excel files (one for unique rows, other one for duplicates)

let’s say i have 3 PDF and data which i got using OCR is:
pdf1: [AA, BB, CC] unique one → write in ExcelUnique
pdf2: [AA, BB, RR] unique → write in ExcelUnique
pdf3: [AA, BB, CC] duplicated with pdf1 - write in ExcelDuplicates

Now if i check for each row in DT_ExcelUnique and compare it with row from DT dons’t work as i expected because:

First run, excel is empty (0 rows) and row from DT(pdf1) i unique so row is saving in ExcelUnique.xlsx as first row
Second run, excel contains 1 row but row from DT(pdf2) is also unique so row is saving in ExcelUnique.xslx as second row.
Third run, excel contains 2 rows (from pdf1 and pdf2) (so for each loop will execute twice):

  • for first row in ExcelUnique, row from DT(pdf3) is duplicated with pdf1 - so duplicate will write in ExcelDuplicates.xlsx
    for second row in ExcelUnique, row from DT(pdf3) is different so it means its unique - so this row will save to ExcelUnique - which is wrong because first row from excel already is e qual to this value and i don’t need it.

How to fix it? Any clue?

image

@luksusq
maybe it is easier to handle when collecting all data rows into 1 Datatable and doing the unique, duplicate analysis later. for finding the duplicates, uniques or first from duplicate find some starter help here:
FindDupsUniquesFirstFromGroup_By1Col.xaml (8.0 KB)

1 Like