I need to remove duplicates from a DataTable using LINQ. Specifically, I need to filter out all duplicates for a single column, maintaining the initial Datatable strucutre. Other methods, such as DefaultView, do not work as they print out only one column (the one containing duplicates).
Any suggestion would be highly appreciated, code examples would be terrific.
Hey @ncompa @Lucas.Pimenta answer will also work he just posted a c# version of Lambda Expressions and workflow supported Vb version so to use that in studion you have to follow the way mentioned by @florin.stan
@Lucas.Pimenta answet will also work just some modification will be needed (VB version) so studion will take it
what uipath activity are you putting this in? i am trying to do the same thing getting rid of duplicates but am unsure where you put this code. i have used Assign with LINQ but it gives me compile errors with this snippet you have.
(From x In dtNew.AsEnumerable()
.GroupBy(Function(i) i.Field(Of String)(“Employee ID”))
.Select(Function(g) g.First))
.CopyToDataTable()
Error is : ‘)’ expecteed but i dont see where. I added paran’s at the end and before From and after First
If i just use it the way you had it earlier without my additional paran’s it tells me “End of expression expected”
Awesome, that works, thank you! Additionally i need to add a second “ProjectID” column to the Select and GroupBy. I’m trying to add that in as well but cant seem to get it right. I’m not sure if i need to add all the Function stuff again and with different letters (i.e. “e” and “g” like you used) What would the syntax be for adding a second field to that query?
If you’re trying to first remove all employee id duplicates and then remove all project id duplicates, @Mr_JDavey’s solution is the one you want. If you’re looking to remove all duplicate pairs of employee id and project id, you can make use of anonymous types:
dataTable = dataTable.AsEnumerable()
.GroupBy(Function(i) New With {
Key.EmployeeID = i.Field(Of String)("Employee ID"),
Key.ProjectID = i.Field(Of String)("ProjectID")})
.Select(Function(g) g.First)
.CopyToDataTable()
that worked! Is it possible to specify returning only those fields in the select part?
I tried Select(Function(g(“Employee ID”), g(“ProjectID”)) but that didn’t work.
Thanks florin.stan - you’ve been a great help!
If you can help with the syntax on a follow-up question that would be great…
Now I have 2 datatables with the same structure and I want to write a linq to return where there is a record in dtNew but not in dtPrior based on comparing the 2 fields Employee ID and WBS Level3. I have it working with just one field but can’t seem to get it working with both fields. Here is what I have so far that isn’t working:
(From x In dtNew.AsEnumerable() Where not (From y In dtPrior.AsEnumerable() select y(“Employee ID”)).Contains(x(“Employee ID”)), y(“WBS Level3”)).Contains(x(“WBS Level3”)) select x).CopyToDataTable()
I would want the result to again be Employee ID and WBS Level 3