Removing duplicates in the same column

Hi!

I'd like to remove rows whose item duplicates one in an earlier row, but I'm not sure what the best way to do it is. I have tried assigning rows = row.Item(0).ToString.Contains(Items) and then checking whether row 1 = row 2 to flag a duplicate.
But with 100 rows, comparing them pair by pair doesn't seem feasible.

image

In this example, I just want to remove James’s row because Apple has already been taken by Sam.

Please help!

This takes a bit of logic, but it's doable. Can you share your input Excel file and the output you want, @wootberries24?

cheers

Hey… @wootberries24

You can do something like this. Since you are checking for duplicates in the Items column, follow the steps below:

  1. Use a Read Range activity to read the Excel sheet into a datatable
  2. Sort the datatable by the Items column using the Sort Data Table activity
  3. Add a For Each Row activity
  4. Create a string variable (called ItemofPreviousRow below)

We need the string variable to hold the item value of the previous row while we are inside the For Each Row. Basically, we are comparing the item of the previous row with the item of the current row.

  5. In the For Each Row, first place an If activity to check this condition:

row("Items").ToString.Equals(ItemofPreviousRow)
If true, do nothing.
If false, add the row to another datatable using the Add Data Row activity (pass row.ItemArray as ArrayRow, because the row itself already belongs to the source datatable).

After the If activity, add an Assign activity that stores the item value of the current row in the string variable for the next iteration.

In this scenario, the new datatable that is created will have no duplicates
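
For anyone who wants to see the logic outside of Studio, here is a minimal sketch of the sort-then-compare idea in Python rather than UiPath activities. The "Name" column and the Ann/Banana row are made up for illustration, and `previous_item` stands in for the ItemofPreviousRow variable.

```python
def drop_duplicates_sorted(rows, key="Items"):
    """Sort by the item column, then keep a row only if its item
    differs from the item of the row just before it."""
    # sorted() is stable, so rows sharing an item keep their original order
    ordered = sorted(rows, key=lambda r: r[key])
    result = []
    previous_item = None  # no previous row yet (plays the role of ItemofPreviousRow)
    for row in ordered:
        if row[key] != previous_item:
            result.append(row)  # first row seen for this item
        previous_item = row[key]  # remember it for the next loop iteration
    return result


# Hypothetical sample data: only Sam, James and Apple come from the question
rows = [
    {"Name": "Sam", "Items": "Apple"},
    {"Name": "James", "Items": "Apple"},
    {"Name": "Ann", "Items": "Banana"},
]
deduped = drop_duplicates_sorted(rows)
```

Note that the result comes back in sorted order, not in the original row order, because the comparison with the previous row only works once equal items sit next to each other.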

Hi, I have attached the file. The 'Input' sheet holds the initial values and the 'Output' sheet holds the final values I want.

TestDuplicate.xlsx (9.6 KB)

Hi!

What value should I assign to ItemofPreviousRow?
After this run I will be clearing the data to a clean slate, so there is no need for a next iteration; it is a one-time run.

Do I first assign the first row as row1 = row.Item(0).ToString.Equals(items), and then use row("Items").ToString.Equals(row1) as ItemofPreviousRow?

Sorry I’m still quite confused, but I get the gist of it!

No, sorry for the confusion.

In the For Each Row, we first check whether row("Items").ToString is equal to our variable ItemofPreviousRow.

After the If activity, we assign row("Items").ToString to the variable.

By "the next iteration" I don't mean the next time you run the process; I mean the next iteration of the loop. The For Each Row activity iterates through all the rows of the datatable. In each iteration, we first check whether the current item equals the value held in the variable, then we assign the current item to the variable, and then we move on to the next iteration and do the same check.
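
To make the per-iteration behaviour concrete, here is a small Python trace (a sketch, not UiPath code) of what the variable holds on each pass of the loop. It assumes the items are already sorted and that the variable starts as an empty string, which in turn assumes no real item is empty; `item_of_previous_row` mirrors ItemofPreviousRow, and Banana is a made-up value.

```python
def trace_previous_item(sorted_items):
    """Return one (current, previous, is_duplicate) tuple per loop iteration."""
    steps = []
    item_of_previous_row = ""  # assumption: the variable starts out empty
    for current in sorted_items:
        # 1) the If check: compare the current item with the stored one
        is_duplicate = current == item_of_previous_row
        steps.append((current, item_of_previous_row, is_duplicate))
        # 2) the Assign after the If: store the current item for the next pass
        item_of_previous_row = current
    return steps


steps = trace_previous_item(["Apple", "Apple", "Banana"])
for current, previous, is_duplicate in steps:
    print(f"current={current!r} previous={previous!r} duplicate={is_duplicate}")
```

The variable needs no special starting expression: on the first row it is still empty, so the first row is never flagged as a duplicate.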

Hi,

What is the VB expression value of itemofPreviousRow?

Thank you so much!

It's just a variable, no special expression.

Hello @wootberries24,
you can do this without a For Each activity, using only an Assign activity with LINQ. Use this expression in the Assign activity:

(From roww In DT.AsEnumerable() Group roww By id = roww.Item("Items") Into gg = Group Select gg(0)).CopyToDataTable
This groups the rows by the required column and takes only the first row of each group, so the second and later rows with a repeated value are left out.
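
As a rough illustration of what the grouping does, here is the same "first row per item" idea sketched in Python (not the LINQ expression itself). The "Name" column and the Ann/Banana row are hypothetical sample data.

```python
def first_row_per_item(rows, key="Items"):
    """Keep only the first row seen for each distinct item value."""
    first_seen = {}
    for row in rows:
        # setdefault stores the row only if this item has not been seen yet
        first_seen.setdefault(row[key], row)
    # dicts preserve insertion order, so items appear in first-seen order
    return list(first_seen.values())


# Hypothetical sample data: only Sam, James and Apple come from the question
rows = [
    {"Name": "Sam", "Items": "Apple"},
    {"Name": "Ann", "Items": "Banana"},
    {"Name": "James", "Items": "Apple"},
]
unique_rows = first_row_per_item(rows)
```

Unlike the sort-and-compare approach, this needs no sorting, so the surviving rows stay in their original order and the earliest row for each item is the one that is kept.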

The workflow should look like this

Thank you so much! This worked for me :slight_smile:
