Skip duplicates from database

I receive five separate Excel files, and occasionally the same record appears in more than one of them, so I would like to skip duplicate records when loading them into the database. I have attached the results from the database: rows 4 and 7 are duplicates that came from different files. Please assist. Thank you.

@pabaleloh,

You can use the Remove Duplicate Rows activity to remove duplicate rows.

If you look at the screenshot, there are no actual duplicate rows: for rows 4 and 7 the timestamps differ, as do the last two columns.

@pabaleloh just search the forums and you’ll find how to remove duplicate values from a datatable.

@pabaleloh

Use it like this. As far as I can see, only the number before the timestamp is common.

dt.AsEnumerable().GroupBy(Function(x) x(0).ToString().Trim().Split(" "c).First()).Select(Function(g) g.First()).CopyToDataTable()

cheers

Which one do you want to retain, 4 or 7? The other columns have different values. Is it okay to lose those values?
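
To make the choice concrete, here is a minimal sketch of keeping either the first or the last row per key. It assumes the key is the number before the timestamp in the first column. The table, the column names `Ref` and `Source`, and the sample values are hypothetical and are not taken from the screenshot.

```vbnet
Imports System
Imports System.Data
Imports System.Linq

Module DedupSketch
    Sub Main()
        ' Hypothetical sample data: the first column holds "<number> <timestamp>".
        Dim dt As New DataTable()
        dt.Columns.Add("Ref", GetType(String))
        dt.Columns.Add("Source", GetType(String))
        dt.Rows.Add("1001 2023-01-05 08:00", "file1.xlsx")
        dt.Rows.Add("1002 2023-01-05 08:05", "file2.xlsx")
        dt.Rows.Add("1001 2023-01-06 09:30", "file3.xlsx")

        ' Key = the number before the timestamp (first space-separated token).
        Dim keyOf As Func(Of DataRow, String) =
            Function(r) r(0).ToString().Trim().Split(" "c).First()

        ' Keep the first row encountered for each key.
        Dim keepFirst As DataTable =
            dt.AsEnumerable().GroupBy(keyOf).Select(Function(g) g.First()).CopyToDataTable()

        ' Keep the last row encountered for each key instead.
        Dim keepLast As DataTable =
            dt.AsEnumerable().GroupBy(keyOf).Select(Function(g) g.Last()).CopyToDataTable()

        Console.WriteLine(keepFirst.Rows.Count)
        Console.WriteLine(keepLast.Rows(0)("Source"))
    End Sub
End Module
```

Either way, the values in the other columns of the dropped row are lost, so the choice between `First()` and `Last()` depends on which file should win.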

Hi @pabaleloh,
you can go with @Anil_G's approach, or use the syntax below. Note that it returns the rows whose key appears more than once in the table, so it identifies the duplicates rather than removing them.
(From p In dt.Select() Where (From q In dt.Select() Where Convert.ToString(q("ColumnName")).Contains(Convert.ToString(p("ColumnName")).Split(" "c)(0).ToString()) Select q).ToArray.Count > 1 Select p).ToArray.CopyToDataTable()

Regards,
Arivu

Hey @pabaleloh, can you try the expression below to remove duplicate rows?

dtSheet.AsEnumerable().GroupBy(Function(x) x.Field(Of String)("Col")).Select(Function(y) y.First()).CopyToDataTable()