Duplicate entries with Different Timestamp in Excel

Hi Everyone,

I am looking for ways to manage this problem.

I have an excel containing duplicate data. I want to remove the duplicate data and retain only the row with the latest timestamp.

Please refer on the data below:

image

As you can see the data have 2 entries fro Andro Carlo

I want to remove the data row with timestamp of 2/27/2021 16:44 and retain the 2/27/2021 16:50

Thanks and Advance

@Alvin_Apostol1 - Could you please share the sample sheet (after removing the sensitive/confidential data)??

Hi @prasath17 I edited now the data

@Alvin_Apostol1 - It is the image…That’s not helpful. Please attach the sample sheet. The reason why I am asking is, I would like to see how the timestamp column has been showing in the excel?

SampleData.xlsx (8.7 KB)

Hi @prasath17 please see attached for the sample sheet.

Hello @Alvin_Apostol1
Maybe you can try this method. Although I think this is not the best solution, it can solve your problem.
First, I read range and sort datatable (Descending), then I get the closet timestamp.
Second, I remove duplicate ColumnA then delete ColumnA which is empty.

Forum0520.zip (9.0 KB)

@Alvin_Apostol1 - Please try like this…

image

Read Range ==> Preserve Format is ticked…(Please let me know if this is slowing down your process when you run against real data)

DtInput is the Ouput variable from the Read Range…

 Dtinput.AsEnumerable().Groupby(Function(a) a.Field(Of String)(“Column1”).ToString).Select(Function(x) x.last()).CopyToDatatable()

Output

image

Hope this helps…

Hi @Alvin_Apostol1

Try below linq expression. Change Min/Max function accordingly

InDT.AsEnumerable.GroupBy(Function(r) r("email")).SelectMany(Function (g) g.Where(Function(r) DateTime.ParseExact(r("Time").ToString,"dd/MM/yyyy HH:mm",Nothing)=g.min(Function(c) DateTime.ParseExact(c("Time").ToString,"dd/MM/yyyy HH:mm",Nothing)))).CopyToDataTable()