Remove duplicates from a datatable column

Hi All,
I have a datatable with close to 53 columns. I need to remove duplicates from one particular column, based on logic involving two other columns. Below are the relevant columns from the input file:

| Sequence No | Layer Number | Company Name |
|---|---|---|
| 3 | 2 | Syndicate |
| 3 | 3 | Hannover |
| 3 | 3 | Syndicate |
| 3 | 4 | Hannover |
| 3 | 4 | Swiss |
| 4 | 2 | Syndicate |
| 4 | 3 | Hannover |
| 4 | 3 | Syndicate |
| 4 | 4 | Hannover |
| 4 | 4 | Swiss |

In this data I have to remove the duplicates, if any, at each layer. For example, at layer 2 we have two occurrences of Syndicate, so we should keep only one, and it should be the one with the highest sequence number. I am badly stuck with this, so any suggestions are welcome.

Hi @Yoichi ,
Any suggestions from your end would be of great help.

@Ritika_Singh

Please try this. I sorted on the two columns you need, then grouped on the third column and took the first row from each group, since that row has the highest number.

Input:
image

Code:
dt.AsEnumerable.OrderByDescending(function(x) Cint(x("Column2").ToString)).ThenByDescending(function(x) x("Column3").ToString).GroupBy(function(x) x("Column3").ToString).Select(function(x) x.First).CopyToDataTable

Output:
image

Hope this helps

cheers

Hi @Ritika_Singh ,
Please go through the topic below; you should find your solution there:
How To Remove Duplicates From Datatable In UiPath

Please mark this as the solution if it is useful.

Regards
Mohini
Happy Automation…!!!

Hi, thank you for your response. I have one more condition to validate: there will not always be duplicates at a given layer number. Is there a way to first find out whether there are duplicates, and only then remove them?

@Ritika_Singh

If you look at the input and output: where there is a duplicate it is removed, and where there is none the row is kept as it is. So rows without duplicates also appear in the output.

In the screenshot, spr is not a duplicate, but it is still present in the output.

Do you want to know separately whether a row is a duplicate or not?
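
If a separate check is wanted, one way to think about it is to count how many rows share the same (Layer Number, Company Name) pair and flag any pair that occurs more than once. The following is only a rough sketch in plain Python, not a UiPath expression; the function name `find_duplicate_keys` and the tuple layout are hypothetical, and the rows are copied from the sample in the question.

```python
from collections import Counter

# Sample rows as (sequence_no, layer_number, company_name), copied from the question.
rows = [
    (3, 2, "Syndicate"), (3, 3, "Hannover"), (3, 3, "Syndicate"),
    (3, 4, "Hannover"), (3, 4, "Swiss"),
    (4, 2, "Syndicate"), (4, 3, "Hannover"), (4, 3, "Syndicate"),
    (4, 4, "Hannover"), (4, 4, "Swiss"),
]

def find_duplicate_keys(rows):
    """Return the (layer, company) pairs that occur in more than one row."""
    counts = Counter((layer, company) for _, layer, company in rows)
    return {key for key, n in counts.items() if n > 1}

duplicates = find_duplicate_keys(rows)
has_duplicates = bool(duplicates)  # True when at least one pair repeats
```

When `has_duplicates` is false there is nothing to remove, and the table can be passed through unchanged.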

Cheers

Hi, this is not working as per my requirement. If a value at a particular layer also appears in another layer, it should not be eliminated; we need to check for duplicates within each layer. For example:
Layer 2: abc
Layer 2: def
Layer 2: abc

Output:

Layer 2: abc
Layer 2: def
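
The difference between deduplicating on the company alone and deduplicating within each layer can be illustrated with a small plain-Python sketch. The helper `dedupe` is hypothetical, and the last sample row (layer 3, abc) is an assumed extra row added only to show the cross-layer case; it is not part of the example above.

```python
# Rows as (layer_number, company_name). The final row is an assumption
# added to show a value that repeats in a different layer.
rows = [(2, "abc"), (2, "def"), (2, "abc"), (3, "abc")]

def dedupe(rows, key):
    """Keep the first row seen for each distinct key, preserving order."""
    seen = set()
    kept = []
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            kept.append(row)
    return kept

# Key on company only: the layer 3 "abc" row is dropped as well.
by_company = dedupe(rows, key=lambda r: r[1])

# Key on (layer, company): only the repeat inside layer 2 is dropped.
by_layer_and_company = dedupe(rows, key=lambda r: (r[0], r[1]))
```

Keying on the pair keeps a company that legitimately appears in several layers, which is the behaviour asked for here.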

@Ritika_Singh

Could you please share an Excel file with the input and the expected output?

As far as I can see, with the current query, if there are two identical values at the same layer it picks only one.

I did the OrderBy on Layer; you can change it to Sequence if needed.

cheers

Actually, there are multiple scenarios that can come up in the datatable. I need to find a common fix: if there are duplicates it handles them, and if not it does nothing. For example:

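Input:

| Sequence No | Layer Number | Company Name |
|---|---|---|
| 3 | 2 | Syndicate |
| 3 | 3 | Hannover |
| 3 | 3 | Syndicate |
| 3 | 4 | Hannover |
| 3 | 4 | Swiss |
| 4 | 2 | Syndicate |
| 4 | 3 | Hannover |
| 4 | 3 | Syndicate |
| 4 | 4 | Hannover |
| 4 | 4 | Swiss |
| 3 | 5 | Swiss |

Output:

| Sequence No | Layer Number | Company Name |
|---|---|---|
| 4 | 2 | Syndicate |
| 4 | 3 | Hannover |
| 4 | 3 | Syndicate |
| 4 | 4 | Hannover |
| 4 | 4 | Swiss |
| 3 | 5 | Swiss |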

@Ritika_Singh

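Please try this. It sorts by Sequence No in descending order, groups on Layer Number and Company Name together, and takes the first row of each group, so duplicates are removed only within the same layer and the row with the highest Sequence No is kept. This assumes the column is named "Company Name" as in your sample; the "|" separator keeps the combined key unambiguous.

dt.AsEnumerable.OrderByDescending(Function(x) CInt(x("Sequence No").ToString)).ThenByDescending(Function(x) x("Company Name").ToString).GroupBy(Function(x) x("Layer Number").ToString + "|" + x("Company Name").ToString).Select(Function(g) g.First).CopyToDataTable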
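
To sanity-check the expected output against the sample input, the rule "one row per (Layer Number, Company Name), keeping the highest Sequence No" can be sketched in plain Python. This is only an illustration of the logic, not the UiPath expression itself; the function name `keep_highest_sequence` is hypothetical, and the final sort by layer and company is an assumption made to match the order of the expected output.

```python
# Sample rows as (sequence_no, layer_number, company_name), copied from the input above.
rows = [
    (3, 2, "Syndicate"), (3, 3, "Hannover"), (3, 3, "Syndicate"),
    (3, 4, "Hannover"), (3, 4, "Swiss"),
    (4, 2, "Syndicate"), (4, 3, "Hannover"), (4, 3, "Syndicate"),
    (4, 4, "Hannover"), (4, 4, "Swiss"),
    (3, 5, "Swiss"),
]

def keep_highest_sequence(rows):
    """Keep one row per (layer, company): the one with the highest sequence number."""
    best = {}
    for seq, layer, company in rows:
        key = (layer, company)
        if key not in best or seq > best[key][0]:
            best[key] = (seq, layer, company)
    # Sort by layer, then company, so the result is easy to compare by eye.
    return sorted(best.values(), key=lambda r: (r[1], r[2]))

result = keep_highest_sequence(rows)
for row in result:
    print(row)
```

The row for layer 5 has no duplicate, so it passes through unchanged with its original sequence number.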

Hope this helps

cheers
