Remove the duplicates in Excel based on two column values

Hi Everyone, I want to remove the duplicate records in an Excel file by comparing the values of 2 columns and only keep unique records,

image
here is the file now I want to remove Duplicate records from the file where Invoice number and Supplier Number column values are same. Ex. in the pic I want to remove 4th row as row 3 and row 4 have same supplier and invoice number. Rest the data will be kept as it is
Can someone please provide me the query

@Vineet_Joshi

1. Read Range (Output: DataTable)
2. Assign uniqueRows = (From row In dataTable.AsEnumerable()
                       Group row By key = New With {Key .Column1 = row.Field(Of String)("Column1"), Key .Column2 = row.Field(Of String)("Column2")} Into Group
                       Where Group.Count() = 1
                       Select Group.First()).CopyToDataTable()
3. Write Range (Input: uniqueRows)

Hi @Vineet_Joshi

Try this

distinctDataTable = yourDataTable.AsEnumerable() _
    .GroupBy(Function(row) New With {
        Key .Col1 = row.Field(Of String)("Column1"),
        Key .Col2 = row.Field(Of String)("Column2")
    }) _
    .Select(Function(group) group.First()) _
    .CopyToDataTable()

Hi @Vineet_Joshi

Try this:

dt_Output = (From row In dtInput.AsEnumerable()
                Group row By Key = New With {
                    Key .Column1 = row.Field(Of String)("Column1"),
                    Key .Column2 = row.Field(Of String)("Column2")
                } Into Group
                Where Group.Count() = 1
                Select Group.First()).CopyToDataTable()

dt_Output is of DataType System.Data.DataTable
Hope it helps!!

getting this issue "Unable to cast object of type ‘System.Double’ to type ‘System.String’

Hi @Vineet_Joshi

Try this:

dt_Output = (From row In dtInput.AsEnumerable()
                Group row By Key = New With {
                    Key .Column1 = row.Field(Of Double)("Column1"),
                    Key .Column2 = row.Field(Of Double)("Column2")
                } Into Group
                Where Group.Count() = 1
                Select Group.First()).CopyToDataTable()

Hope it helps!!

@Vineet_Joshi

Change the Datatype to double if you have number data

row.Field(Of Double)("Column1")

the data is in string format

Hi @Vineet_Joshi

If possible could you share a excel sample and expected output.

Regards

@Vineet_Joshi

Please share the sample excel file

image
here is the file now I want to remove Duplicate records from the file where Invoice number and Supplier Number column values are same. Ex. in the pic I want to remove 4th row as row 3 and row 4 have same supplier and invoice number

Hi @Vineet_Joshi

→ Read Range Workbook
image
Output-> InputDt

→ Use syntax in Assign:

dtExtractedText = (From row In InputDt.AsEnumerable()
                  Group row By Key = New With {
                      Key .InvoiceNumber = row.Field(Of Object)("Invoice Number"),
                      Key .SupplierNumber = row.Field(Of String)("Supplier Number")
                  } Into Group
                  Select Group.First()).CopyToDataTable()

=> Write Range Workbook dtExtractedText
image

Please try this @Vineet_Joshi

Hope it helps!!

Hi @Vineet_Joshi

Try this

DataTable.AsEnumerable() _
    .GroupBy(Function(row) New With {
        Key .InvoiceNumber = row.Field(Of Object)("Invoice Number"),
        Key .SupplierName = row.Field(Of String)("Supplier Name")
    }) _
    .Select(Function(Group) Group.First()) _
    .CopyToDataTable()

Input:

image

Output:

image

Regards,

@Vineet_Joshi

Please try this, it is working

@Vineet_Joshi

uniqueRows = (From row In dataTable.AsEnumerable()
                       Group row By key = New With {Key .Column1 = row.Field(Of Object)("Invoice Number"), Key .Column2 = row.Field(Of String)("Supplie Name")} Into Group
                       Where Group.Count() = 1
                       Select Group.First()).CopyToDataTable()

Output:

image

please share the sequence once @rlgandu

Hi @rlgandu , can you please share with me the sequence file? I need it for verification Thanks

Hi @Vineet_Joshi

Check the below workflow
Sequence33.xaml (10.7 KB)
AAAA.xlsx (9.6 KB)

Sheet1 is Input and Sheet3 is Output.

Regards

@Vineet_Joshi

Sequence10.zip (1.9 KB)

@Parvathy @lrtetala @rlgandu please share with me the screenshots of the arguments passed and properties I am not able to open the workflows which you shared