Delete Duplicates in a Column in an Excel Sheet

I have the attached file, which contains a column named “To”. I need to delete the duplicates in this column.
Email Extracted Data.xlsx (12.0 KB)

I tried some of the examples I found:

DTeMails.AsEnumerable().GroupBy(Function(i) i.Field(Of String)("To")).Select(Function(g) g.First).CopyToDataTable
and dt = dt.AsEnumerable().GroupBy(Function(a) a.Field(Of String)("yourcolumnname").ToString).Select(Function(b) b.First()).CopyToDataTable()

but I couldn’t get either of them to work.
I’d appreciate your help with this.

@VegitlX_HuNteR,
Can you go through this link and see if it is helpful?

@VegitlX_HuNteR
The following could help:

  • a more precise definition of what deleting duplicates should mean:
    • deletion of entire duplicates, keeping one record…
  • the query syntax of LINQ

Let’s play with the following:
(From d in YourDataTableVar.AsEnumerable
Group d By k=d(2).toString.trim Into grp=Group
Where grp.Count > 1
Select grp.First()).CopyToDataTable

With this pattern we can control and adjust the filter:
Where grp.Count = 1
catches all non-duplicates. Removing the duplicates can then be done in a second step with the set operation Except.

Where grp.Count > 1
Select grp.First())

takes the first row from each group of duplicates.

Where grp.Count >= 1
Select grp.First())

takes all non-duplicates plus the first row from each group of duplicates.

So feel free to play around with it. If you need more help, tell us more about the requirements, with sample input and sample expected output.
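To make the three filters above concrete, here is a minimal sketch as a standalone VB.NET console program. The table contents are made up for illustration (the real workbook's data is not shown in this thread), and the column is assumed to be named "To". Note that CopyToDataTable throws an InvalidOperationException when the query returns no rows, so the sketch guards that call.

```vb
Imports System
Imports System.Data
Imports System.Linq

Module GroupFilterDemo
    Sub Main()
        ' Hypothetical sample data standing in for the "To" column.
        Dim dt As New DataTable()
        dt.Columns.Add("To", GetType(String))
        For Each addr In {"a@example.com", "b@example.com", "a@example.com ", "c@example.com"}
            dt.Rows.Add(addr)
        Next

        ' Count = 1: only the values that occur exactly once.
        Dim onlyUniques = (From d In dt.AsEnumerable()
                           Group d By k = d("To").ToString().Trim() Into grp = Group
                           Where grp.Count() = 1
                           Select grp.First()).ToList()

        ' Count > 1: the first row of every value that occurs more than once.
        Dim firstOfDuplicates = (From d In dt.AsEnumerable()
                                 Group d By k = d("To").ToString().Trim() Into grp = Group
                                 Where grp.Count() > 1
                                 Select grp.First()).ToList()

        ' Count >= 1: one row per distinct value, i.e. the deduplicated table.
        Dim deduplicated = (From d In dt.AsEnumerable()
                            Group d By k = d("To").ToString().Trim() Into grp = Group
                            Where grp.Count() >= 1
                            Select grp.First()).ToList()

        Console.WriteLine(onlyUniques.Count)       ' 2 (b and c)
        Console.WriteLine(firstOfDuplicates.Count) ' 1 (a; Trim merges the padded copy)
        Console.WriteLine(deduplicated.Count)      ' 3 (a, b, c)

        ' CopyToDataTable fails on an empty sequence, so fall back to an empty clone.
        Dim result As DataTable = If(deduplicated.Any(), deduplicated.CopyToDataTable(), dt.Clone())
        Console.WriteLine(result.Rows.Count)       ' 3
    End Sub
End Module
```

In a UiPath Assign activity, only the query expression itself would be used, with the workflow's own DataTable variable in place of dt.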


Hello,

Please find the attached xaml and do let me know if this works for you.

Nagi

Duplicate_Columns.xaml (7.5 KB) Email Extracted Data.xlsx (11.7 KB) Output.xlsx (8.9 KB)


I’m getting this error

@VegitlX_HuNteR
Set a proper column name in the Excel file, or use column index 2.
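As a rough sketch of that suggestion: look the column up by name when it exists, and fall back to an index otherwise. DedupeByColumn and the sample table are hypothetical and not taken from the attached workflow. The fallback index 2 matches the suggestion above, and null cells are assumed to group together as empty strings.

```vb
Imports System
Imports System.Data
Imports System.Linq

Module ColumnLookupDemo
    ' Hypothetical helper: deduplicate on a named column, or on fallbackIndex
    ' when the sheet was read without that header.
    Function DedupeByColumn(dt As DataTable, columnName As String, fallbackIndex As Integer) As DataTable
        Dim idx As Integer = If(dt.Columns.Contains(columnName),
                                dt.Columns(columnName).Ordinal,
                                fallbackIndex)

        ' Group on the trimmed cell text; null cells are treated as empty strings.
        Dim rows = dt.AsEnumerable().
            GroupBy(Function(r) If(r.IsNull(idx), String.Empty, r(idx).ToString().Trim())).
            Select(Function(g) g.First()).
            ToList()

        ' CopyToDataTable throws when there are no rows, so return an empty clone instead.
        Return If(rows.Any(), rows.CopyToDataTable(), dt.Clone())
    End Function

    Sub Main()
        ' Made-up table with no "To" header, as if the sheet had no column names.
        Dim dt As New DataTable()
        dt.Columns.Add("Column1", GetType(String))
        dt.Columns.Add("Column2", GetType(String))
        dt.Columns.Add("Column3", GetType(String))
        dt.Rows.Add("s1", "x@example.com", "a@example.com")
        dt.Rows.Add("s2", "y@example.com", "a@example.com")
        dt.Rows.Add("s3", "z@example.com", "b@example.com")

        Dim result = DedupeByColumn(dt, "To", 2)
        Console.WriteLine(result.Rows.Count) ' 2
    End Sub
End Module
```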

This is the Excel file

and this is the value of dtArray= dt.AsEnumerable().GroupBy(Function(x) x.Field(Of String)("To")).Select(Function(y) y.First()).ToArray()

@VegitlX_HuNteR
I don’t follow you.
The downloaded / shared Excel file didn’t have a “To” column name, but if you added it, that’s OK.
The LINQ:

dtArray= dt.AsEnumerable().GroupBy(Function(x) x.Field(Of String)("To")).Select(Function(y) y.First()).ToArray()

is different from the one in my post. So what is your question?

When I downloaded the Excel file, it didn’t have any column names, so I added column names and wrote the LINQ. Are you getting any error while running the attached xaml?

@VegitlX_HuNteR
Find some starter help below, showcasing how to:

  • retrieve duplicates
  • retrieve non-duplicates
  • retrieve non-duplicates and the first from each group of duplicates

StarterHelp here:
FindDupsUniquesFirstFromGroup_By1Col.xaml (8.4 KB)

The one provided by @nagireddy18 is now working. I recreated the workflow on my machine, ran it, and it worked.

Thanks everyone for your help.
