Removing duplicates by comparing two excels

Hi all,
I want to remove the duplicate rows of one excel by comparing with another excel.
Scenario is there are two excels, In 1st excel in ‘A’ column there are list of values which need to compare with another list of values present in another excel.If there are same values then i need to remove that row in the excel.
Can anyone please provide me a solution to solve this issue.

Thanks in Advance

hey @Shivaraju,
you first need to get two DataTables by reading both the excel sheets. Also, you need two nested(one inside other) foreach loops (first one for iterating through rows of first DataTable and second one to iterate through rows of second DataTable). Now, inside the body of inner foreach take an “if” acitivity.
The condition you need to give here is "if data in all the cells of the ith row of first DataTable is equal to the data in all the cells of jth row of second DataTable using “AND” because rows should match exactly.
Further, in the body of “if” activity use “remove data row” activity to remove that common row from second DataTable. And then write this DataTable to the second excel file.

Hope this helps.
Thanks

2 Likes

@Shivaraju Another approach is like that …

  1. Get 2 Datatables as Yogesh said.
  2. Merge that 2 Datatables
  3. Remove Duplicate Rows
  4. Write your new Datatable to an excel file with Write Range.

Let us know which ways work better to you or if you have any questions,

1 Like

Thank you @yogesh.chand and @carmen

Excel1.xlsx (10.1 KB)
Excel2.xlsx (9.9 KB)

I tried above two methods, but unable to delete the duplicate records.
Can anyone please provide me a hint to solve this issue.

what did you try? Could you share your workflow? Because I just did what I said to you and it is working ok.

@Shivaraju,

Can you please check this link for removing duplicates rows in excel in details. Please let me know still you’re facing this problem.

Thanks,
Arunachalam.

2 Likes

RemoveDuplicates.zip (17.7 KB)
Could you please check it once…

It is working fine to me … I cannot see the problem …did you get any error? the only thing is that you shoud add maybe the header to the column in the final excel

Compare Two data tables & get matches.
Here the Dt_OutPut contains the rows which is matched.

Dt_OutPut=(From row In dt1.AsEnumerable() where (From a In dt1.AsEnumerable() Join b In dt2.AsEnumerable() On a(“ColName_Dt1”).ToString() Equals b(“colName_Dt2”).ToString() select a).Contains(row) select row).CopyToDataTable()

Note: In the above query , Dt1 and Dt2 is compared and it took the matched rows from Dt1. Incase if u want to Get the matched rows from Dt2 then instead of select a you have to give select b.

To get Unmatched rows from dt1

dt1=dt1.AsEnumerable().Except(Dt_OutPut.AsEnumerable(),DataRowComparer.Default).CopyToDataTable.

To get Unmatched rows from dt2

dt2=dt2.AsEnumerable().Except(Dt_OutPut.AsEnumerable(),DataRowComparer.Default).CopyToDataTable.