I know I’m making this more complicated than necessary, but I cannot figure out the solution!
How do I iterate through a data table to delete duplicate names while keeping the newest record? There are about 50 rows of data.
John Taylor 07/14/2019
John Taylor 08/ 12/2019
John Taylor 08/12/2019
The list isn’t huge and there could be more than a couple duplicates for the same name.
This process is data scraping from a web site, doing some filtering to keep only the columns I need, and I have it sort the table so that the duplicate names are grouped together.
Sort the datatable on date and then get the unique keys using the DefaultView property:
Datatable.DefaultView.ToTable(<Boolean value to remove duplicates>, "<column name>")
In your case,
Datatable.DefaultView.ToTable(True, "Name")
where the parameters in ToTable denote:
True => remove duplicates
"Name" => reference column name to remove duplicates.
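For anyone trying the DefaultView suggestion above, here is a minimal sketch of what it does outside of a workflow. The sample table, the module name, and the "Date" column name are assumptions made up for illustration; only "Name" comes from the reply. Note that ToTable with a column list returns just the listed columns, so with "Name" alone the date is not carried into the result.

```vb
Imports System
Imports System.Data

Module DistinctNamesSketch
    Sub Main()
        ' Hypothetical sample table mirroring the example rows in this thread.
        Dim dt As New DataTable()
        dt.Columns.Add("Name", GetType(String))
        dt.Columns.Add("Date", GetType(DateTime))
        dt.Rows.Add("John Taylor", New DateTime(2019, 7, 14))
        dt.Rows.Add("John Taylor", New DateTime(2019, 8, 12))

        ' Sort newest first, then ask the view for distinct values of "Name".
        dt.DefaultView.Sort = "[Date] DESC"
        Dim names As DataTable = dt.DefaultView.ToTable(True, "Name")

        ' The result has one row per distinct name and only the "Name" column.
        Console.WriteLine("Rows: " & names.Rows.Count.ToString())
        Console.WriteLine("Columns: " & names.Columns.Count.ToString())
    End Sub
End Module
```

So this gives a clean list of unique names, but it probably is not enough on its own if the row with the most recent date has to be kept.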
Thank you for your help! I actually used the Remove Duplicate Rows activity to remove duplicates. The incoming data is a bit messy.
I’m needing to keep the row with the most recent date –
Resource Name | Modified Date
John Taylor | 07/14/2019
John Taylor | 08/12/2019
Like in this example I need to delete the row with John Taylor 07/14/2019.
So the result would be
John Taylor | 08/12/2019.
And there are about 50 rows that I need the RPA to sort through.
Were you able to do it? I have a similar problem :c
Did you find a solution?
I am having the same issue
You can try adjusting the following. It worked for me.
dtRaw.AsEnumerable().GroupBy(Function(r) r("Name").ToString).Select(Function(g) g.OrderBy(Function(r) DateTime.ParseExact(r("Date").ToString, "MM/dd/yyyy", System.Globalization.CultureInfo.InvariantCulture)).Last).CopyToDataTable()
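In case it helps to see that expression spelled out, below is a rough standalone sketch of the same idea: group by name, order each group by the parsed date, keep the last row. The sample rows, the module name, and the dtNewest variable are assumptions for illustration, and the Trim calls are an addition that is not in the expression above. It assumes the dates are stored as text in MM/dd/yyyy format.

```vb
Imports System
Imports System.Data
Imports System.Globalization
Imports System.Linq

Module KeepNewestPerName
    Sub Main()
        ' Hypothetical sample data; dates are kept as strings, as in scraped tables.
        Dim dtRaw As New DataTable()
        dtRaw.Columns.Add("Name", GetType(String))
        dtRaw.Columns.Add("Date", GetType(String))
        dtRaw.Rows.Add("John Taylor", "07/14/2019")
        dtRaw.Rows.Add("John Taylor", "08/12/2019")

        ' Group rows by name, sort each group by parsed date, keep the newest row.
        Dim dtNewest As DataTable = dtRaw.AsEnumerable().
            GroupBy(Function(r) r("Name").ToString().Trim()).
            Select(Function(g) g.OrderBy(
                Function(r) DateTime.ParseExact(r("Date").ToString().Trim(),
                                                "MM/dd/yyyy",
                                                CultureInfo.InvariantCulture)).Last()).
            CopyToDataTable()

        ' Print one line per remaining row.
        For Each row As DataRow In dtNewest.Rows
            Console.WriteLine(row("Name").ToString() & " | " & row("Date").ToString())
        Next
    End Sub
End Module
```

One thing to watch with messy input: ParseExact throws a FormatException when a value does not match the format exactly, for example a date with a space inside it like "08/ 12/2019", so those values would need cleaning before this runs.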