How to Remove Duplicate Rows in Excel?

I have tried many different variations of the activity “Remove Duplicate Rows”.
Every time I run it (I only want to give it 1 column to look at as the range), it doesn’t change anything.

Can someone please
1: Tell me how this activity is intended to work (Yes I’ve read the help page).
2: Give me, preferably, a simple solution. Anything that requires expressions beyond “if statements / conditions” is beyond me.

We would do this by grouping the data on the Description column value.

But let us know if you are looking for a StudioX-oriented approach.

My brain just melted reading this entire post you directed me to. Why is there so much technical stuff needed within the Expression editors? (I must apologise - I find this possible solution far too complex for me to realistically implement.)

Assign activity:

dtGrouped =

(From d In YourDataTableVar.AsEnumerable()
Group d By k = d("Description").ToString().ToUpper().Trim() Into grp = Group
Select r = grp.First()).CopyToDataTable()

That's all. Afterwards, write dtGrouped to another Excel worksheet.
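For anyone who finds the expression hard to read, here is a minimal Python sketch of the same idea, outside UiPath: normalize the Description (upper case, trimmed), and keep only the first row seen for each normalized value. It is not UiPath code; the function name `keep_first_per_description` and the sample rows are made up for illustration.

```python
def keep_first_per_description(rows, column="Description"):
    """Keep the first row for each distinct (upper-cased, trimmed) value in `column`."""
    seen = set()
    result = []
    for row in rows:
        # Same normalization as the expression: to string, upper case, trim.
        key = str(row[column]).upper().strip()
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Hypothetical sample data; the second row duplicates the first after normalization.
rows = [
    {"Description": "Widget", "Unique ID": "A1"},
    {"Description": " widget ", "Unique ID": ""},
    {"Description": "Gadget", "Unique ID": "B2"},
]
deduped = keep_first_per_description(rows)
```

Because the value is trimmed and upper-cased before comparing, rows that differ only in spacing or letter case count as duplicates.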

So, in light of my limitations, and being unable to use an activity for its intended use (removing duplicate rows), I will go with the current solution I came up with, and post the result if I am successful…

So, for each row where the Unique ID is empty (because the activity that adds new descriptions also removes the placeholder Unique IDs for each new description added):

Find all Activity is used. If any value is returned, then the original was detected, and therefore the current row is a duplicate.

If Unique ID is empty
    -> Then
    Find value = current row in Description
    If a value is returned
        -> Then
        Delete row = Current Row Description
        -> Else
        Add a new placeholder value into Unique ID
    -> Else

Can you also give us a sample Excel input and output?

From the given HowTo, you can maybe adapt the non-LINQ approach:

  • prepare a list of distinct ids / a datatable with distinct ids col/rows - dataDistinct
  • loop over dataDistinct
    • use current looped ID for filtering
    • process the filter result representing the group members (all rows of the same ID)
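The steps above can be sketched in plain Python (not UiPath code) roughly as follows. This is only an illustration under assumptions: the key column defaults to Description because that is what this thread groups on, "processing" a group here just means keeping its first member, and the name `process_groups` and the sample rows are hypothetical.

```python
def process_groups(rows, key_column="Description"):
    # Step 1: collect the distinct key values in first-seen order (dataDistinct).
    data_distinct = []
    for row in rows:
        if row[key_column] not in data_distinct:
            data_distinct.append(row[key_column])

    # Step 2: loop over the distinct values and filter the rows for each one.
    result = []
    for current_key in data_distinct:
        members = [row for row in rows if row[key_column] == current_key]
        # Step 3: process the group members; here we simply keep the first one.
        result.append(members[0])
    return result


# Hypothetical sample data: the third row repeats the first row's description.
sample_rows = [
    {"Description": "Bolt", "Unique ID": "ID-1"},
    {"Description": "Nut", "Unique ID": "ID-2"},
    {"Description": "Bolt", "Unique ID": ""},
]
unique_rows = process_groups(sample_rows)
```

In a workflow, each step would map to ordinary activities: one to build the distinct list, a loop over it, and a filter inside the loop.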

I just realised. The “Delete Row” Activity has a “What to delete” dropdown option called “All duplicate rows” :joy:.
If my current strange attempt doesn’t work, I’ll see if I can create a solution using that option.
It’s almost like the developer team is so big, they re-invented an activity within their own software.
(I take that back: remove and delete are 2 different things.)

I might end up doing this instead, since the Find Activity always returns a value:

For Each Row
If Unique ID is empty
    -> Then
    Fill in Placeholder Value to Current Row Unique ID (So that it doesn’t delete itself)
    For Each Row 2
        If Unique ID is empty & CurrentRow2 Description = CurrentRow Description
        -> Then
        Delete row = Current Row Description
        -> Else
    -> Else
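To check the logic of this nested loop outside UiPath, here is a rough Python sketch. It is an illustration only: the placeholder value `"PENDING"`, the function name `remove_new_duplicates`, and the sample rows are assumptions, and rows are marked as deleted and dropped at the end rather than removed mid-loop.

```python
PLACEHOLDER = "PENDING"  # hypothetical placeholder value for new rows


def remove_new_duplicates(rows):
    rows = [dict(row) for row in rows]  # work on copies, leave the input untouched
    deleted = set()  # indexes of rows marked for deletion
    for i, row in enumerate(rows):
        if i in deleted or row["Unique ID"] != "":
            continue
        # Fill in the placeholder first so the row does not match (delete) itself.
        row["Unique ID"] = PLACEHOLDER
        for j, other in enumerate(rows):
            if (j not in deleted
                    and other["Unique ID"] == ""
                    and other["Description"] == row["Description"]):
                deleted.add(j)
    return [row for i, row in enumerate(rows) if i not in deleted]


# Hypothetical sample data: three new "Bolt" rows, of which one should survive.
table = [
    {"Description": "Widget", "Unique ID": "A1"},
    {"Description": "Bolt", "Unique ID": ""},
    {"Description": "Bolt", "Unique ID": ""},
    {"Description": "Nut", "Unique ID": ""},
    {"Description": "Bolt", "Unique ID": ""},
]
cleaned = remove_new_duplicates(table)
```

Note that, like the outline above, this only compares new rows (empty Unique ID) with each other; a new row that repeats an existing row's description is not caught.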

Bad news: my attempt at using Manoj_Batra’s solution did not work, even though the bot didn’t complain about any errors.

Ok. Good news. The “Delete Rows” - All Duplicates Activity by itself works. (Didn’t yesterday when I added it to the monstrosity of a workflow I’ve been making.)

Anyway, I will attempt to build the solution around this activity, which is a pain, because I can only make it delete rows, as opposed to removing the rows entirely.

Step 1: Insert the Description twice during the main workflow process. (2 description columns)
(This data is within Tab3)

Step 2
For each row
Set Description 1 to DesVariable

For each Row2
If Description2 = DesVariable
→ Then: Place CurrentRow2 into, say, Tab4 (Remove CurrentRow2 to save time on the next duplicate search).
→ Else: Continue

Step 3
Delete Rows: All Visible rows (in Tab3)

Step 4
The new data won’t have a Unique ID. Therefore…
For Each Row - If Unique ID = “” (Equivalent of Null)
Then: Continue
Else: Delete row = CurrentRow

Step 5
Add the duplicate-free data to Tab1 with the rest of the Final Data by “Amend Write DataTable”.

This should work. I sincerely hope there are no problems with this.

Ok. After coming back to this problem with fresh enthusiasm, I decided to do what I did with the Delete rows activity: force it to work by making the simplest workflow possible, where it is the only activity to do anything within the workflow.

To my relief and frustration, the “Remove Duplicates” activity finally works again.
I think the issue was adding it to an overly complex workflow which inhibited its functionality in some way. I will now see if I can recreate my workflow in a simpler way.

Thank you for the help everyone. I hope these methods posted here by Manoj_Batra and Peter help people figure out how to remove duplicates.

I have yet to make the Remove Duplicates activity work within the workflow I intend it to be used for.

The point of this question was to have an unconventional way to remove duplicate rows that would be more successful than the multiple options UiPath provides. I may have marked the above as a solution, but please try the other posts by Pete and Manoj Batra.

I would not have come closer to a solution without their help.
Thank you everyone.

(My bot still doesn’t work :joy: :tired_face:. Every time I add the table extraction to the workflow, it stops working once I add the remove duplicates activity, and vice versa. Something else is wrong, but I can’t really make a post about it.)
