Remove Duplicate Rows based on Column

Hi,

I have this excel sheet where I want to:

  1. Remove all duplicate rows based on the column named “PO”.
  2. Keep all the duplicate rows ONLY based on the column named “PO”.

I tried the first part to remove duplicate rows but it is keeping a record of one of each row and and deleting the rest.
I used this:
dt.AsEnumerable().GroupBy(Function(x) convert.ToString(x.Field(of object)(“PO”))).SelectMany(function(gp) gp.ToArray().Take(1))

Modified_ASOS_June.xlsx (21.8 KB)

Can you help me please?
RemoveDuplicateRows (1).xaml (6.4 KB)

You can use LINQ statements to do remove dups.
Like
dt.AsEnumerable().GroupBy(Function(r) r.Field(of String)("<Col Name>")).Select(Function(s) s.First()).CopyToDataTable - This can be used to remove all dups with in the specified column.

dt.DefaultView.ToTable(True) - It will create a new Datatable with unique rows with respect to every column.

Both return a new Data table with no dups.

4 Likes

Thanks for the help!

dt.DefaultView.ToTable(True) - It will create a new Datatable with unique rows with respect to every column.

How can I get unique rows with respect to a SPECIFIC column then?

Hi…you can try like this…

InputDT.DefaultView.ToTable(True,“Column Name”)

1 Like

Please check with this first LINQ statement @Yudhisteer_Chintaram1

How do we distinct the range based on one column and keep another column intact.

Hey I have another solution:

You can group unique values (removing duplicates) for each column and finally update them in a new table.
Example:

  1. Build DataTable with duplicates: vDt1
    Table vDt1 with duplicates
    Note: vDt2 is a copy of vDt1 but empty

  2. For each():
    List of items= vDT1.AsEnumerable().GroupBy(Function(x) x(“Valor1”).ToString)
    Item: groupby_result

    Body:
    First, you have to analyze if there are rows in the vDt1 <= Number of item of the group
    If : vDt1.Rows.Count <= vIntRowValue
    Then: ADD DataRow with empy values
    Finally
    Assign:
    vDtDatosOCR.Rows(vIntRowValue)(“Value1”) = groupby_result.Key.ToString

Finally, we obtain the table without duplicates.
vDt1 - Non duplicates on column "Value1"

Note: You have to do the same for each column
Greetings