How to count the presence of duplicate values in the excel file and pass that count value in another excel file?

How to count the presence of duplicate values in the excel file and pass that count value in another excel file?

Use a for each loop on the rows. assign the value to a variable to check other rows for duplicates

CheckDups = row.item(“ColumnName”).tostring

Then have another for each loop going through the same column and an IF statement. Obviously it’ll find itself as a match but you can handle this

IF row.item(“ColumnName”).tostring = CheckDups

THEN

Counter = Counter+1

Read Excel Data into DataTable, DT1.

Assign Activity,
DT2 = DT1.DefaultView.ToTable(true, "ColumnName")

Assign Activity,
DuplicateCount = DT1.Rows.Count - DT2.Rows.Count

5 Likes

Hi,
read the excel using excel application scope read range(dt1) and then build one more datatable of same number of columns using build datatable(dt2) and try to add the rows from dt1 to dt2 using for each of dt1.
And then for each of dt1 add one for each inside for each of dt1 and use that for dt2. use a if condition to check if rowdt2(0).equals(rowdt1(0)), if yes then increment the counter else it will check with other values.
Let us know if this helps.
Regards,
Pavan H

can you please type the sample code for this of how exactly to implement?

not the most elegant solution but it will provide the number of duplicates in a single column datatable

TestDups.xaml (11.6 KB)

@ronanpeter i have an excel file in which the names are duplicates so i have to select the name and print its occurence in new excel file
for ex:
Name :Sandeep Count: 10
like this.
thanks and regards ,
sandeep

This will give you that output

TestDups_Names.xaml (12.5 KB)

@ronanpeter it must take excel as input and in that excel file the second column is name. Please help.

Remove the first Generate DataTable activity and replace with a Excel Application scope to read your file and output it to the same Datatable (dt_Words).

image

The solution subsequently uses the column name (“Name”) for the checking of duplicates, so you shouldn’t need to do any other changes

Its giving me error as read range does not exist. Can you please use any random .xlsx file which contains multiple occurrence if same name in the same column and the count of those duplicates are shown?

Thanks and regards

TestDups_NamesExcel.xaml (15.1 KB)

Employees.xlsx (8.5 KB)

Ok. This is an excel version.

@ronanpeter can you tell me how to take both in different table like name in a different column and count of that name in a different column, likewise i’d like to know if we can use select query to get the duplicates and count of duplicates.

Thanks and regards,
Sandeep

You need to understand the For Each loops. They are looking at each row, but you tell it which column to cycle through.

row.Item(“Name”).ToString

So just update the column name to that which you are looking to count duplicates in.

As for the comparing that against another column for duplicates, you can just update the CheckDup variable in a similar way.

I don’t know what you mean by the select query question. If you are looking to use the information output, e.g “Name: Sandeep Count: 3”, then just use the data in the final datatable holding the count results (dt_TotalDups). It is already in place for you to output to another file or add to the existing file etc. Whatever you need.

This final example just adds the results to your existing excel in a new column. Please mark as solution

TestDups_NamesExcel (1).xaml (17.2 KB)

Employees.xlsx (9.4 KB)

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.