How to output duplicates and retain one copy in the data table?

I have a data table, SourceDT. I want to capture the duplicate rows and store them in DT2, while retaining one row of each duplicate in SourceDT.

INPUT →

SourceDT:
Unique Identifier | Name
123A | Zeus
789C | Poseidon
123A | Zeus
123A | Zeus
456B | Athena
789C | Poseidon

OUTPUT →

DT2:
Unique Identifier | Name
123A | Zeus
123A | Zeus
789C | Poseidon

New SourceDT:
Unique Identifier | Name
123A | Zeus
789C | Poseidon
456B | Athena

How can I implement this in LINQ or using UiPath activities?

Hi @anthonyjr

Try this for your NewSourceDT:

NewSourceDT = SourceDT.AsEnumerable.GroupBy(Function(row) row("ColumnName")).Select(Function(grp) grp.First).CopyToDataTable

For duplicates:
If you want to find the duplicate values in a particular column of a table, then use this in an Assign activity:

List_variable = yourdatatablename.AsEnumerable().Select(Function(a) a.Field(Of String)("yourcolumnname").ToString).ToArray().GroupBy(Function(x) x).Where(Function(y) y.Count() > 1).ToList()
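On the variable type: because the expression above ends with GroupBy(...).Where(...).ToList(), it appears to return a List(Of IGrouping(Of String, String)) rather than a list of strings. Below is a minimal sketch, assuming a plain List(Of String) of the duplicated values is what is wanted; it projects each group's Key before calling ToList. The sample table and the names dt and dupValues are assumptions for illustration only.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module DuplicateValuesSketch
    Sub Main()
        ' Hypothetical sample table mirroring the data in the question.
        Dim dt As New DataTable()
        dt.Columns.Add("Unique Identifier", GetType(String))
        dt.Columns.Add("Name", GetType(String))
        dt.Rows.Add("123A", "Zeus")
        dt.Rows.Add("789C", "Poseidon")
        dt.Rows.Add("123A", "Zeus")
        dt.Rows.Add("456B", "Athena")
        dt.Rows.Add("789C", "Poseidon")

        ' Group the column values, keep groups with more than one member,
        ' then take each group's key so the result is a List(Of String).
        Dim dupValues As List(Of String) =
            dt.AsEnumerable().
                Select(Function(r) r.Field(Of String)("Unique Identifier")).
                GroupBy(Function(v) v).
                Where(Function(g) g.Count() > 1).
                Select(Function(g) g.Key).
                ToList()

        ' Each duplicated identifier is listed once.
        Console.WriteLine(String.Join(", ", dupValues))
    End Sub
End Module
```

In a UiPath Assign activity, the matching variable type for this variant would be System.Collections.Generic.List(Of String).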


@anthonyjr

Welcome to our UiPath community.

Try the LINQ query below to get the required output.

dtOutput = ( From row in sourceDT Group row by a = row("Unique Identifier").ToString.Trim, b = row("Name").ToString.Trim into grp = Group Where grp.Count > 1 Select grp.First).CopyToDataTable


What data type should I browse for List_variable?

Hi @anthonyjr ,

Is this the expected output?

image

If so, then here is a workflow I’ve developed for you →
SplitDuplicates.xaml (8.3 KB)

Along with the code used for each operation →
Duplicates:

dt_sampleData.AsEnumerable().
	GroupBy(Function(g) String.Join("",g.ItemArray.Select(Function(s) s.ToString))).
	Where(Function(w) w.Count()>1).
	SelectMany(Function(sm) sm.Skip(1)).CopyToDataTable()

Unique:

dt_sampleData.AsEnumerable().Distinct(DataRowComparer.Default).CopyToDataTable()

This assumes you want to compare the entire row.
If that's not the case, I can create another sequence for you.
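A side note on the whole-row key used in the Duplicates expression: joining the cell values with an empty separator can make two different rows produce the same key (for example "AB" + "C" and "A" + "BC"). The sketch below shows the effect and one possible workaround, assuming a separator character that does not occur in the data. RowKey and Separator are hypothetical names, not part of the attached workflow.

```vb
Imports System
Imports System.Data
Imports System.Linq

Module RowKeySketch
    ' Assumption: the ASCII unit separator (code 31) never appears in cell values.
    Private ReadOnly Separator As String = ChrW(31).ToString()

    ' Hypothetical helper: builds a grouping key from every cell in the row.
    Function RowKey(row As DataRow) As String
        Return String.Join(Separator, row.ItemArray.Select(Function(v) v.ToString()))
    End Function

    Sub Main()
        Dim dt As New DataTable()
        dt.Columns.Add("A", GetType(String))
        dt.Columns.Add("B", GetType(String))
        dt.Rows.Add("AB", "C")
        dt.Rows.Add("A", "BC")

        ' With an empty separator both rows collapse to the key "ABC".
        Dim groupsNoSeparator As Integer =
            dt.AsEnumerable().
                GroupBy(Function(r) String.Join("", r.ItemArray.Select(Function(v) v.ToString()))).
                Count()

        ' With a separator the two rows keep distinct keys.
        Dim groupsWithSeparator As Integer =
            dt.AsEnumerable().
                GroupBy(Function(r) RowKey(r)).
                Count()

        Console.WriteLine(groupsNoSeparator)   ' 1 group: the rows are wrongly treated as duplicates
        Console.WriteLine(groupsWithSeparator) ' 2 groups: the rows stay distinct
    End Sub
End Module
```

For the sample data in this thread the empty separator gives the same result; the difference only matters when cell boundaries can shift between columns.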

Kind Regards,
Ashwin A.K


Sorry, I made a mistake in my post. I tried your code for duplicates with the data table below, but it only outputs one copy of each duplicate. I hope you can still help me with this.

SourceDT:
Unique Identifier | Name
123A | ZeusA
789C | PoseidonA
123A | ZeusB
123A | ZeusC
456B | AthenaA
789C | PoseidonB

Expected DT2:
Unique Identifier | Name
123A | ZeusB
123A | ZeusC
789C | PoseidonB

Hi @anthonyjr ,

I’m not sure if I’ve understood your query correctly.
Let's assume there are three duplicate rows: you want the last two rows to be transferred to the second datatable, correct?

Also, when we say “duplicate”, are you referring to duplicates in the first column, or does the entire row have to be treated as a single unit during comparison?

Kind Regards,
Ashwin A.K

Yes, I want the last two rows to be transferred to the second datatable.

Also yes, I’m referring to duplicates in “Unique Identifier” Column.

Best Regards,
Anthony Jr.

Did you get a chance to try my LINQ query above?

@anthonyjr

Hi @anthonyjr ,

Alright then, could you confirm whether this is the expected output for Unique (the last table)?

image

Duplicates->

dt_sampleData.AsEnumerable().
	GroupBy(Function(g) g("Unique Identifier").ToString).
	Where(Function(w) w.Count()>1).
	SelectMany(Function(sm) sm.Skip(1)).CopyToDataTable()

Unique->

dt_sampleData.AsEnumerable().GroupBy(Function(g) g("Unique Identifier").ToString).Select(Function(s) s.First()).CopyToDataTable()
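One caveat worth noting for the two expressions above: CopyToDataTable throws an InvalidOperationException when the sequence contains no rows, which would happen for the Duplicates query whenever no identifier is repeated. Below is a minimal sketch of a guarded version, assuming the same "Unique Identifier" column. The sample rows come from the follow-up question; the names sourceDT, dt2 and newSourceDT are illustrative.

```vb
Imports System
Imports System.Data
Imports System.Linq

Module SplitDuplicatesSketch
    Sub Main()
        ' Sample data taken from the follow-up question in this thread.
        Dim sourceDT As New DataTable()
        sourceDT.Columns.Add("Unique Identifier", GetType(String))
        sourceDT.Columns.Add("Name", GetType(String))
        sourceDT.Rows.Add("123A", "ZeusA")
        sourceDT.Rows.Add("789C", "PoseidonA")
        sourceDT.Rows.Add("123A", "ZeusB")
        sourceDT.Rows.Add("123A", "ZeusC")
        sourceDT.Rows.Add("456B", "AthenaA")
        sourceDT.Rows.Add("789C", "PoseidonB")

        ' Group once by identifier and reuse the groups for both outputs.
        Dim groups = sourceDT.AsEnumerable().
            GroupBy(Function(r) r("Unique Identifier").ToString()).
            ToList()

        ' Every row after the first in each group is an extra copy.
        Dim extraRows = groups.SelectMany(Function(g) g.Skip(1)).ToList()
        ' The first row of each group is the copy that is kept.
        Dim firstRows = groups.Select(Function(g) g.First()).ToList()

        ' CopyToDataTable throws on an empty sequence, so fall back to
        ' Clone(), which returns an empty table with the same columns.
        Dim dt2 As DataTable =
            If(extraRows.Count > 0, extraRows.CopyToDataTable(), sourceDT.Clone())
        Dim newSourceDT As DataTable =
            If(firstRows.Count > 0, firstRows.CopyToDataTable(), sourceDT.Clone())

        Console.WriteLine("DT2:")
        For Each r In dt2.AsEnumerable()
            Console.WriteLine(r("Unique Identifier").ToString() & " | " & r("Name").ToString())
        Next

        Console.WriteLine("New SourceDT:")
        For Each r In newSourceDT.AsEnumerable()
            Console.WriteLine(r("Unique Identifier").ToString() & " | " & r("Name").ToString())
        Next
    End Sub
End Module
```

With the sample rows, DT2 should contain ZeusB, ZeusC and PoseidonB, and the new SourceDT should contain ZeusA, PoseidonA and AthenaA, since GroupBy keeps groups and their rows in first-seen order.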

SplitDuplicates_v1.xaml (8.3 KB)

Kind Regards,
Ashwin A.K

I did but I get an error.

Maybe it's because I’m using the wrong data type for “List_variable”. I used the List data type; what should the correct one be?

Thank you so much, sir @ashwin.ashok! Your code is clean and easy to understand.
Your solution is detailed too. Thanks again!


FindDuplicates.xaml (8.4 KB)

Sorry for the late reply. Here you go, a small sample, @anthonyjr.

Thanks
Nikhil
