Remove Duplicate - Single Column

Hi,

I am having difficulty removing duplicates. My Excel sheet has 30 columns, and if a value is duplicated in one specific column, the duplicate rows should be deleted even if the other columns do not match.

I used Dt2.DefaultView.ToTable(True, "Column1"), which removes duplicates on Column1 as required but writes back only Column1; the other 29 columns are not written back to the sheet.

I also used Dt2.DefaultView.ToTable(True, "Column1", "Column2", "Column3", … "Column30"), which writes back all 30 columns but removes a row only when its data matches another row across all columns. This does not meet my requirement.

Regards,
Balaji Nama

@balaji.nama
Can you share the Excel file or some sample data? Please also describe the input and the expected output, along with the samples.

Thanks

Hi,

I cannot share the data, but I can provide details on what I am working on. The input is a report downloaded from a tool; it has 30 columns and 4000 rows. I need to remove duplicates based on one specific column and delete all the duplicate rows, irrespective of whether the data in the other 29 columns is duplicated. However, I cannot get the code to remove duplicates based on one column only; it considers all columns, so the output is not correct.
Dt2.DefaultView.ToTable(True, "Column1")

Regards,
Balaji Nama

@balaji.nama
If the data is confidential, that is fine; do not upload it.
You can still share dummy data and explain the requirement using that.

What I have understood so far:

Col1,col,col3
A,B,C
A,CD,s
CD,g,o

CD is the value that marks a row as a duplicate.
But CD could appear in any column; we do not know which. Given the number of columns, a flexible, dynamic approach is needed.

Can you confirm whether I have understood the requirement correctly? Thanks

Referring to the example below:
Col1,col,col3
A,B,C
A,CD,s
CD,g,o

I have tried the 2 expressions below:

  1. Using an Assign activity - Dt2.DefaultView.ToTable(True, "Col1", "col", "col3")
    The output is all 3 rows and all 3 columns: rows 1 and 2 have the same Col1 value, but their col and col3 values differ, so neither row is treated as a duplicate.

  2. Using an Assign activity - Dt2.DefaultView.ToTable(True, "Col1")
    The duplicates are removed and only 2 rows are returned, but only the Col1 data is included.

  3. Expected Output -
    Col1,col,col3
    A,B,C
    CD,g,o
    2 rows and 3 columns
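
One way the expected output above could be produced is to group the rows by Col1 and keep the first row of each group, which preserves all columns. The following is a minimal sketch under that assumption, not a confirmed solution from this thread; the module and the helper name KeepFirstPerKey are hypothetical, and the table is rebuilt from the 3-row example.

```vb
Imports System
Imports System.Data
Imports System.Linq

Module DedupeByOneColumn
    ' Hypothetical helper: keeps the first row for each distinct value
    ' in keyColumn and returns those rows with all columns intact.
    Function KeepFirstPerKey(source As DataTable, keyColumn As String) As DataTable
        Dim firstRows = source.AsEnumerable().
            GroupBy(Function(r) Convert.ToString(r(keyColumn))).
            Select(Function(g) g.First()).
            ToList()
        ' CopyToDataTable throws on an empty sequence, so return an empty clone instead.
        If firstRows.Count = 0 Then Return source.Clone()
        Return firstRows.CopyToDataTable()
    End Function

    Sub Main()
        ' Sample data taken from the example in this thread.
        Dim dt2 As New DataTable()
        dt2.Columns.Add("Col1")
        dt2.Columns.Add("col")
        dt2.Columns.Add("col3")
        dt2.Rows.Add("A", "B", "C")
        dt2.Rows.Add("A", "CD", "s")
        dt2.Rows.Add("CD", "g", "o")

        Dim result = KeepFirstPerKey(dt2, "Col1")
        For Each row As DataRow In result.Rows
            Console.WriteLine(String.Join(",", row.ItemArray))
        Next
        ' Prints:
        ' A,B,C
        ' CD,g,o
    End Sub
End Module
```

In an Assign activity the same idea would typically be a single expression, for example Dt2.AsEnumerable().GroupBy(Function(r) r("Col1").ToString()).Select(Function(g) g.First()).CopyToDataTable(). Note that this keeps the first occurrence of each Col1 value; if a different row should survive, the selection inside each group would need to change.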

@balaji.nama

Expected Output -
Col1,col,col3
A,B,C
CD,g,o
2 rows and 3 columns

I take this as the following requirement:
Find all rows where any value from Col1 is present in the other columns.

For this scenario, find some starter help here:
balaji.nama.xaml (9.6 KB)

Refer to dtResult for the filtered data.
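
For readers without access to the attachment, here is a rough sketch of one possible reading of the requirement stated above: collect every Col1 value, then drop any row whose other columns contain one of those values. This is an assumption about the intent, not a description of what the attached workflow does; the module and the helper name DropRowsContainingKeyValues are hypothetical, and only the variable name dtResult comes from the post.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module FilterByKeyValues
    ' Hypothetical helper: removes rows in which any column other than
    ' keyColumn holds a value that also occurs somewhere in keyColumn.
    Function DropRowsContainingKeyValues(source As DataTable, keyColumn As String) As DataTable
        ' All values that occur in the key column.
        Dim keyValues As New HashSet(Of String)(
            source.AsEnumerable().Select(Function(r) Convert.ToString(r(keyColumn))))

        ' Every column except the key column, resolved dynamically.
        Dim otherColumns = source.Columns.Cast(Of DataColumn)().
            Where(Function(c) c.ColumnName <> keyColumn).
            ToList()

        Dim kept = source.AsEnumerable().
            Where(Function(r) Not otherColumns.Any(
                Function(c) keyValues.Contains(Convert.ToString(r(c))))).
            ToList()

        ' CopyToDataTable throws on an empty sequence, so return an empty clone instead.
        If kept.Count = 0 Then Return source.Clone()
        Return kept.CopyToDataTable()
    End Function

    Sub Main()
        ' Sample data taken from the example in this thread.
        Dim dt2 As New DataTable()
        dt2.Columns.Add("Col1")
        dt2.Columns.Add("col")
        dt2.Columns.Add("col3")
        dt2.Rows.Add("A", "B", "C")
        dt2.Rows.Add("A", "CD", "s")
        dt2.Rows.Add("CD", "g", "o")

        Dim dtResult = DropRowsContainingKeyValues(dt2, "Col1")
        For Each row As DataRow In dtResult.Rows
            Console.WriteLine(String.Join(",", row.ItemArray))
        Next
        ' Prints:
        ' A,B,C
        ' CD,g,o
    End Sub
End Module
```

Because the columns are enumerated at run time, the sketch does not depend on the number of columns, so it would apply unchanged to a 30-column table. On the sample data it gives the same 2 rows as a plain de-duplication on Col1, but the two approaches can differ on other data.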
