Duplicate rows in data table


#1

Hi all!
I have a data table with 23 columns and I need to find duplicate rows. For each duplicate row I need to add a flag on both identical rows on a 24th column. Whats the best approach? I have been searching the forum but I only find examples comparing one or two columns. I need to check all 23 columns. Is it possible to treat the 23 columns as one string like “Column1,Column2,…” and compare this string to other rows?
Any suggestions?

What I’ve done so far is to sort the data table on an ID and added the 24th column. Now I would like create the big string and loop through the data table. No idea how to do this though or if there are better ways to do this,…

Br
Cris


#2

@cristian_ivanoff

refer this post

Regards,
Mahesh


#3

Hi @cristian_ivanoff,

Refer this post,

DataTable

Regards,
Arivu


#4

Great! Thanks @arivu96 @MAHESH1

(From p in dt.Select() where( From q in dt.Select() where string.Join(",",q.ItemArray).Equals(string.Join(",",p.ItemArray)) Select q).ToArray.Count>1 Select p).ToArray.CopyToDataTable()

This returns a data table with all the duplicate rows. Any ideas how to just add a flag on a new extra column to show that the row is a duplicate? I would like to do a left join to my initial data table and just add a “D”.


#5

@cristian_ivanoff
Add DataColumn byusing Add DataColumn activity to dt name it Flag
Extract the Duplicate rows and store in a datatable dt1

Get the distinct values by using
dt2=dt1.DefaultView.ToTable(True)

Run for each row for datatable dt

in if condition
(From p in dt2.Select()
where string.Join(",",p.ItemArray).Equals(string.join(",",row.ItemArray)
Select p).ToList.Count>0

then Assign row(“Flag”)=“flag”

Regards,
Mahesh


#6

OMG! This is working perfect!

Thanks guys…Data tables are not so fun to work with :smiley:

Br
Cris


#7

Hi @cristian_ivanoff,
You can use “Remove duplicate rows” Activity also to remove the duplicates. Below I have attached a sample .

File : DatatableSample.zip (2.1 KB)

image

Regards
Balamurugan


Remove duplicates in excel
#8

hi,
please follow below tutorial to remove the duplicate rows in the data table.

thanks,
Karthik.


#9

Hi Guys,
If the duplicate names exist in column, how to select the first row of the duplicate column.

Thanks & Regards,
Jeevitha


#10

@JeevithaBA

Do the Group by for the Datatable based on column which is having duplicate values and take the first value of that group.

Let us take the datatable as dta as your original datatable

dtb =  (From p In dta.Select
          group p by ID=p("Column_ Name").ToString To Group
          Select Group(0)).ToArray.CopyToDataTable

So now dtb will contain the First Rows of Duplicate values.
Replace Column_Name with the Column name having Duplicate Values.

Regards,
Mahesh


#11

Hi Mahesh,
Thank you for your response,
I have tried to add that but iam getting error


#12

@JeevithaBA try below code. It might be useful.

DataTableName.AsEnumerable().GroupBy(Function(a) a.Field(Of String)(“column_name”)).Select(Function(b) b.First).CopyToDataTable()


#13

I have html table attached below, in that i need to click on first "Confirmation statement " , and " Annual return " row . I have tried to it ,But i got stucked with the logic pls help me outtest2.xaml (24.5 KB)


#14

Hi @JeevithaBA,

Have you tried this ?
dt=dt.DefaultView.ToTable(True,“Column1”)

Below it has conversation related this.

Regards
Balamurugan.S


#15

Is it possible to share link xaml file which you attached is having error


#16

Main.xaml (40.1 KB)


#17

Thank you for your response could u please check this

Thanks & Regards,
Jeevitha


#18

Yes i tried that


#19

@JeevithaBA

You first Extract the Table and store it in a datatable.
Then Filter And take only the First occurence of the rows, and add one more column which contains the row index of First Occurence.

Then run Foreachrow for the Filtered Datatable, and make a dynmaic selector based on the row Index stored in a column inside ForeachRow.

Regards,
Mahesh


#20

OK. thank you @MAHESH1.