Remove duplicates from a specific column

Shailesh123 · September 3, 2019, 1:22pm

Hello everyone ,
I have to remove duplicates only from a specific column so that my Bot does not pick the repeated value in the next iteration. Currently I am using Invoice key and Unique reference to login to an application make a search . The Invoice Key has duplicates whereas Unique reference doesn’t have duplicates. How can I eliminate duplicates only from Invoice Key column keeping the actual Data table structure intact without deleting other rows in the Data table ?

Any sort of input would be much appreciated . Thanks !

chenderson · September 3, 2019, 1:34pm

Good morning Shailesh,

Does this post help resolve your question?

balupad14 · September 3, 2019, 1:39pm

Hi @Shailesh123,
In this link I have shown 2 way to get the distinct values from datatable.

Regards
Balamurugan.S

Shailesh123 · September 3, 2019, 1:43pm

@chenderson-Good morning . The mentioned post would be helpful if I want to remove all the other row values as well. In my current scenario I do not want to remove values from other rows. I just want to delete duplicates from a Column so that it does not effect rest of the other values from my Data table. Does that give you an idea of what I actually needed?

chenderson · September 3, 2019, 1:52pm

I’m still a little unclear Shailesh.

If row 1 and 2 have the same value in ColumnA, do you want the value in row 2 (for this column) set to a blank/empty string?

Shailesh123 · September 3, 2019, 2:15pm

Hi @chenderson- in my scenario only out of 50 columns, only one Column is having duplicate values and I want to delete/empty those cells which have repeated values from that specific column only . How can i do that is what I meant to ask.

Regards
Shailesh

Dave · September 3, 2019, 2:17pm

so you have a datatable with 50 rows and 50 column. There is a duplicate value in column 1 rows 1 and row 2. So you want the output to have 50 rows, but the value in row 2, column 1 should be blank? Or did you want to completely remove row 2 altogether?

Shailesh123 · September 3, 2019, 2:19pm

@Dave - I do not want to remove row 2 completely . I just want to remove value in row 2 column 1 which is duplicate and nothing else.

Shailesh123 · September 3, 2019, 2:27pm

I do not want to remove row 2 completely . I just want to remove value in row 2 column 1 which is duplicate and nothing else

Dave · September 3, 2019, 2:29pm

I think you can use LINQ to achieve this although I haven’t tested it. I am assuming the field you are comparing is of type string. I am assuming your datatable is a variable called dt

Create a variable of type Enumerable i’ll call RowsWithDuplicates

Assign RowsWithDuplicates = dt.AsEnumerable().GroupBy(Function(x) x.Field(Of String)(“YourColumnName”)).Select(Function(y) y.First())

Now use a for each activity to iterate through and update the column with an assign activity. Assign row.item(“YourColumnName”) = string.empty

I haven’t tested, so syntax might be off

Shailesh123 · September 3, 2019, 2:54pm

@Dave I am unable to iterate through it as it a generic collection while I need is a datatable as the output .

Dave · September 3, 2019, 3:07pm

Well I did forget to include that you wanted duplicates. The code above is selecting everything EXCEPT the duplicates. I had a chance to test this one and it works:

Assign RowsWithDuplicates = dt.AsEnumerable().GroupBy(Function(x) x.Field(Of String)(“YourColumnName”)).Where(Function(y) y.Count() > 1).Select(Function(y) y.First)

This is of type IEnumerable<DataRow>

Then in a for each loop, change the TypeArgument to DataRow.

Assign row.item(“YourColumnName”) = string.empty

I have tested it and this will work. It iterates through the collection of datarows directly from your datatable. Your output is still the original datatable. Here is a sample .xaml
Shailesh123.xaml (9.4 KB)

EDIT: I realized after posting that this will only work if there are only 2 rows that are duplicated. If it is 3+ it will still leave duplicates. This can be resolved by putting it all within a while loop, or updating the assign activity to get IEnumerable<iEnumberable<datarow>> type and doubling the for each loop. I attached the updated .xaml showing the second method here: Shailesh123-updated.xaml (10.7 KB)

Dave · September 3, 2019, 6:59pm

@Shailesh123 did the solution provided work for you?

Shailesh123 · September 4, 2019, 5:38am

@Dave Hello Dave it did partially solve my problem . But now I want to read the relevant value of row 2 . For example : if Row1 has a Key I want to basically read Row2 as a Value for Row1.
So that when I pass Row1 value as a variable , I can read Row2 value as another variable which is equal to row1 .

Dave · September 4, 2019, 12:53pm

Do you always want the row below it? If so, do it in a for each row loop.

For each row in dt
If (put your condition here to check for key value)
Assign row.item("YourColumnName") = dt.Rows(dt.rows.indexof(row)+1).item("YourColumnName").ToString

This will loop through the entire table and assign the value from the row below it into the current row for the column specified. Keep in mind if the last row meets your if condition it will throw an error, because there is no row below it to grab the value from

sharathraju489 · November 26, 2019, 5:19pm

Here is a detailed article on that
# How To Remove Duplicates From Datatable In UiPath

Regards,

Topic		Replies	Views
Remove duplicates in DT for specific columns and keep all column and values Studio datatable , excel , studio , question , activities_panel	5	3482	August 27, 2021
Remove of duplicates from a datatable column Studio studio , question , activities_panel	11	1141	February 2, 2023
How to remove duplicate values in a specified column in a datatable? Studio activities , question	5	495	July 30, 2023
Selection of Datatable Rows Activities uiautomation , question	2	106	May 22, 2024
Remove rows or duplicates from datatable Help	5	7288	November 12, 2019

Remove duplicates from a specific column

Related topics