LinQ to find duplicates

I have an excel file with following column :
Column A : Reference number,
Column B : ID number,
Column C : email id and
Column D : subject

Now i want to use LinQ formula to find second and greater than second duplicate from Column A and if duplicate found write in Column E as duplicate Reference number Then I want same to be done for Column B and if duplicate found write in Column E as duplicate ID number.

Plz help with sample code.

Hi @Omkar_Shete2

  1. Read the Excel file and store the data in a DataTable variable named dtData.
  2. LINQ query to find second and greater than second duplicate in Column A
duplicatesColumnA = From row In dtData.AsEnumerable()
                        Group By referenceNumber = row("Reference number")
                        Into grp = Group
                        Where grp.Count() > 1
                        Select New With {
                            .ReferenceNumber = referenceNumber,
                            .Count = grp.Count()
                        }
  1. Update Column E with “Duplicate Reference number” for second and greater than second duplicates in Column A
For Each duplicate In duplicatesColumnA
    referenceNumber = duplicate.ReferenceNumber.ToString()
   count = duplicate.Count

 For Each row In dtData.AsEnumerable().Where(Function(r) r("Reference number").ToString() = referenceNumber).Skip(1)
        row("Column E") = "Duplicate Reference number"
  1. LINQ query to find second and greater than second duplicate in Column B
duplicatesColumnB = From row In dtData.AsEnumerable()
                        Group By idNumber = row("ID number")
                        Into grp = Group
                        Where grp.Count() > 1
                        Select New With {
                            .IDNumber = idNumber,
                            .Count = grp.Count()
                        }
  1. Update Column E with “Duplicate ID number” for second and greater than second duplicates in Column B
For Each duplicate In duplicatesColumnB
    idNumber = duplicate.IDNumber.ToString()
    count = duplicate.Count

    For Each row In dtData.AsEnumerable().Where(Function(r) r("ID number").ToString() = idNumber).Skip(1)
        row("Column E") = "Duplicate ID number"
  1. Write the updated DataTable back to the Excel file

Hope it helps!!

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.