Remove Duplicates from one column based on a different column. Keeping the original

Hi all,
Hope you can help me solve this:
I have a Data Table where I need to remove duplicates in the column “Games” only if:
the column “Type” is NOT empty for that row.

Meaning: for each row in the DT we need to check if the “Type” value is not empty. If not empty then add to the “Games” column to remove duplicates.
In the end I need the original values back in the “Games” and “Type” column and Data table with the only change of the removed duplicates based on the “Type” column.

DummySampleRemoveDupBasedOnColumn.xlsx (9.1 KB)
Hope you can solve this,
Thank you!

Hello

Can you provide an output/final example?

Cheers

Steve

@hp321

Try this

Dt = Dt.AsEnumerable.GroupBy(function(x) x("Games").ToString + x("Type").ToString).SelectMany(function(x) If(IsNothing(x(0)("Type")) OrElse String.IsNullOrEmpty(x(0)("Type").Tostring),x,x.First.AsEnumerable)).CopyToDataTable

Cheers

Thank you @Steven_McKeering , the expected output is in the attached excel sheet.
Thank you @Anil_G. Please see the error I get:
image
Can you please advise?

I see the a description of the expected output but please add a screenshot with the highlighted rows you want to keep/remove.

When working with datatables it always easiest to show before and after examples.

You will have an answer very soon….

Plenty of great people who can help on the forums like @Anil_G :wink:

DummySampleRemoveDupBasedOnColumn.xlsx (10.8 KB)
image
The firs Cards should be removed because there are duplicate cards with at least one more with not an empty “Type”.

Thank you @Steven_McKeering , I added the file again and highlighted the row to be removed as well as a 2nd sheet for the output.

@hp321

Please try this

Dt.AsEnumerable.GroupBy(function(x) x("Games").ToString + x("Type").ToString).SelectMany(function(x) If(IsNothing(x(0)("Type")) OrElse String.IsNullOrEmpty(x(0)("Type").Tostring),x,x.Take(1))).CopyToDataTable

cheers

Thank you @Anil_G,
Now it runs, though still keeps some duplicates that have value in “Type”. note that “Type” is a date…Any suggestions?

@hp321

Use like this

Dt.AsEnumerable.GroupBy(function(x) x("Games").ToString + Cdate(x("Type").ToString).ToString("MMddyyyy")).SelectMany(function(x) If(IsNothing(x(0)("Type")) OrElse String.IsNullOrEmpty(x(0)("Type").Tostring),x,x.Take(1))).CopyToDataTable

cheers

Thank you @Anil_G,
I’ve modified the code you provided so that it won’t throw an error if value in “Type” is empty to this:

Dt.AsEnumerable.GroupBy(Function(x) x(“Games”).ToString + If(IsDBNull(x(“Type”)) OrElse String.IsNullOrEmpty(x(“Type”).ToString), “”, CDate(x(“Type”).ToString).ToString(“MMddyyyy”))).SelectMany(Function(x) If(IsNothing(x(0)(“Type”)) OrElse String.IsNullOrEmpty(x(0)(“Type”).ToString), x, x.Take(1))).CopyToDataTable

Though I’m still getting duplicate values in the “Games” column,even though they have value in “Type”…
Can you please help with this?

@hp321

Check how the dates are being read in the datatable…

or send the exact file with dummy ddata…as on my side I am getting as expected

cheers

Hi,
I’m trying to use the below in assign to dt_GoodData. [dt_ItemsEmpty is the original dt I’m reading]. Any ideas why this is throwing errors?

dt_ItemsEmpty.Clone()
For Each row As DataRow In dt_ItemsEmpty.Rows
	If Not String. IsNullOrEmpty (row("Type").ToString()) Then
		Dim game As String= row("Games").ToString()
		If Not dt_GoodData.AsEnumerable().Any(Function(x)x("Games").ToString()=game) Then
			dt_GoodData.ImportRow(row)
		End If
		Else
			dt_GoodData.ImportRow(row)
		End If
	Next row

Hi @hp321 ,

Use this code

In_DT.Clone()

Dim uniqueCombinations As New HashSet(Of String)()

Dim distinctRows As New DataTable()
distinctRows.Columns.Add(“Games”)
distinctRows.Columns.Add(“Type”)

For Each row As DataRow In In_DT.Rows
Dim games As String = row(“Games”).ToString()
Dim type As String = row(“Type”).ToString()

Dim combination As String = $"{games} {type}"

' Check if the combination is unique
If uniqueCombinations.Add(combination) Then
    ' If unique, add to distinctRows DataTable
    distinctRows.Rows.Add(games, type)
End If

Next

’ Assign the distinctRows DataTable to the Out argument
Out_DistinctRows1 = distinctRows

and parameters are

final output store in DT1.

i have checked the code and working fine. Still not get will do video and publish in My yt channel.

Regards,
Pavan Kumar

Hi @hp321 ,

if need linq then follow the below code

DT.AsEnumerable().GroupBy(Function(row) $“{row(“Games”)} {row(“Type”)}”).Select(Function(Group) Group.First()).CopyToDataTable()

image
This is input which i have used