How do I identify duplicate values in an Excel column?

Hello All,

I want to identify if there is any duplicate values in column “ID”. And if there’s any, I want to write “Duplicate” in the “Remarks” column.

How do I achieve this ?

image

Hi @jijorhohim98

I have a question , Do you have any other condition to write Duplicate
because as per you screen shot you wrote Ajith 101 as duplicate while Ajay 101 is success

we can write Ajay 101 as Duplicate and ajith 101 as success

How to do you differentiate this name is duplicate and other is not you must have a condition for that can you tell that ?

Regards
Sudharsan

Hi @jijorhohim98

  1. Read the data into Datatable(dt) using Read range
  2. Add a data column as remarks if already not present
  3. Create a Datatable using build datatable(out_dt) as the first dt
  4. Loop through the dt using for each row in datatable
  5. inside use a if condition - out_dt.AsEnumerable.Select(Function(row) row(“ID”).ToString.Equals(CurrentRow(“ID”).ToString)).Count.Equals(0)
  6. In the yes side use add data row and add the values of current row to new and set remark as success on the No side set the current row values and the remark as Duplicate
  7. Write the out_dt to Excel usign write range

cheers

Hi @Sudharsan_Ka,

Name can have duplicate values. I just want to identify duplicates in the column “ID” and add remark as “DUPLICATE”

Yes i am talking about id only @jijorhohim98

Okay so you need to see if the id is already assigned to any name and if assigned the id assigned to other name should be Duplicate right?

Regards
Sudharsan

What if there is no duplicates and I want to leave the remarks cell blank ?

The column “ID” doesn’t have anything to do with the other columns. If a value is repeated in the column “ID”, I want to remark as “Duplicate”.

Hi @jijorhohim98

Then after the loop use another if before writing out_dt.AsEnumerable.Where(function(row) row(“Remarks”).Tostring.Contains(“Duplicate”)).Count.Equals(0)

In Yes don’t write in NO Write the data

cheers

Add duplicate remark for all the duplicates right?

Regards
Sudharsan

It didn’t work

Hi @jijorhohim98

Please post what dint work because thats very vague to understand what failed

cheers

Remarks part didn’t happen and no errors in the error list too.

Hi @jijorhohim98

Did you do the add row part?

if yes what did you pass in it?

cheers

I already have the remarks header

Hi @jijorhohim98

Yes I understand you have remarks column but in For loop did you add a add row ?

can u send a screenshot of your workflow

cheers

Hi @jijorhohim98 ,

Maybe An Alternate method using Linq GroupBy could work for your case :

OutputDT = (From r In DT1.AsEnumerable
Group r By k=r("ID").toString.Trim Into grp=Group
From g In grp.DefaultIfEmpty
Let Status =  If(grp.First.itemArray.SequenceEqual(g.itemArray),"Success","Duplicate")
Let ra = g.ItemArray.Take(g.ItemArray.Count-1).Append(Status).ToArray
Select OutputDT.Rows.Add(ra)).CopyToDataTable

Here, DT1 is the Input Datatable, OutputDT is the clone of the Input Datatable.

Visuals :
image

Debug Panel :
image

Let us know if this doesn’t work.

we focussed on:

only with suggestion from below

we assume that you can work on datatable base, once you had read in excel with e.g. a read range - dtData

Assign Activity:
arrDuplicatesRows | Array of DataRows - DataRow() =

(From d in dtData.AsEnumerable
Group d by k=d("ID").ToString().Trim() into grp=Group
Where grp.Count > 1
Select g = grp.Skip(1)).SelectMany(Function (x) x).toList

For each Activity: item in arrDuplicatesRows | TypeArgument: DataRow

  • Assign Activity: item(“REMARKS”) = “DUPLICATE”

As we have handled by grouping data have a look below. We can also do it within a NON-LINQ approach

when also Remark Sucess is to set, we can handle and adapt as well

what’s inside the build data table ?

@jijorhohim98 ,
It’s the same Sample Data (DT1) that you have provided.