How to identify duplicate rows

Dear Experts

I have an excel file with duplicate items in it
If duplicate items exist, then remark with “Duplicate” in Column Test

May I know how can make it happen?

D365-2.xlsx (10.7 KB)

Thanks

@yangyq10,

If you are getting the updated data in dtResult, just use Write Range activity to write it back to the excel.

@ashokkarale

It doesn’t allow me to do so


o

@yangyq10

Type this in the Write Range activity.

dtResult.CopyToDataTable

@ashokkarale

I am sorry, I revised the question…

try below in Invoke code activity

Dim dt_Temp As System.Data.DataTable
dt_Temp = dt_Test.Copy
dt_Test.AsEnumerable.ToList.ForEach(Sub(row)
row(1) = If( (From rw In dt_Temp.AsEnumerable Group rw By k=rw(0).ToString Into grp=Group Where grp(0)(0).ToString = row(0).ToString Select grp.count)(0) >1,"Duplicate","")
End Sub)

@AkshaySandhu

Thank you. I have 2 questions

  1. I can’t write it back correctly, do you know why?

  2. If the target column in different column, how to change the number in coding accoridngly?


@yangyq10,

Check if there are any blank columns in your datatable before the data columns.

row(1).ToString in this we are referring column by index of it, if you want to refer it by name you should use row("Column1").ToString

@AkshaySandhu

I try when column position is different and result not correct

Can you help show me which index I should change?

I only need to identify duplicate items in only one column which is marked with pink color

D365-2.xlsx (11.0 KB)


Hi @yangyq10

Use this below method

nameColumnIndex as int32 = InputDT.Columns.IndexOf("K1")

duplicateColumnIndex as int32 = InputDT.Columns.IndexOf("Test")

InputDt as Datatable=

 (From row In InputDT.AsEnumerable()

           Let isDuplicate = If(InputDT.AsEnumerable().Count(Function(r) r(nameColumnIndex).ToString().Equals(row(nameColumnIndex).ToString())) > 1, "Duplicate", "")

           Select InputDT.Clone().LoadDataRow(row.ItemArray.Take(duplicateColumnIndex).Concat({isDuplicate}).ToArray(), False)).CopyToDataTable()

Try below code:

Dim dt_Temp As System.Data.DataTable
dt_Temp = dt_Test.Copy
dt_Test.AsEnumerable.ToList.ForEach(Sub(row)
row("B") = If((From rw In dt_Temp.AsEnumerable Group rw By k=rw("A").ToString Into grp=Group Where grp(0)("A").ToString = row("A").ToString Select grp.count)(0) >1,"Duplicate","")
End Sub)

In above row("A").ToString represent ID (“G000150436000011” etc.) column and row("B") represent Duplicate status column (“Duplicate” or “” (Empty String))

1 Like

@AkshaySandhu

Sorry for late reply. I keep testing this morning
Yes, it works!
Thank you~

@sarvesh.b

Thank you, let me try it later :smile:

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.