Identify the number of duplicate items in a column and their row indexes

Hi Everyone,

I would like to identify the number of duplicates in the column named "ID" below, along with their row indexes. I then need to use that number to iterate over the values.

ID
10001
10001
10001
10001
10001
10001
10002
10002
10002
10003
10003
10003
10003
10003
10004
10004
10004

Thanks
Zackariya

Correct me if I misunderstood the question.
The number of duplicates for 10001 is 4 in this example, isn't it?

dataTable.Select("ID = '10001'").Count

No…the total count here for 10001 is 6.

How do I iterate over it and find the counts of the other values as well?

Sorry, it was a typo.

Can you check this XAML? data.xaml (16.5 KB)

It may help you.

The attached solution uses a dictionary of distinct items and increments the value of each key whenever that item is found. It outputs the final key/value pairs to show how many instances of each item exist in the original list.

Main.xaml (10.6 KB)
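
For anyone who cannot open the attachment, here is a rough sketch of the same idea. It is written in Python only to show the logic and is not the contents of Main.xaml. The sample values are copied from the ID column in the question, and the function name `index_rows_by_id` is made up for this illustration. Instead of storing only a counter per key, it stores the row indexes, so the count is simply the length of each list.

```python
from collections import defaultdict

# Sample data copied from the ID column in the original question.
ids = ["10001"] * 6 + ["10002"] * 3 + ["10003"] * 5 + ["10004"] * 3


def index_rows_by_id(values):
    """Map each distinct value to the list of zero-based row indexes where it occurs."""
    rows_by_id = defaultdict(list)
    for row_index, value in enumerate(values):
        rows_by_id[value].append(row_index)
    return dict(rows_by_id)


rows_by_id = index_rows_by_id(ids)
for id_value, row_indexes in rows_by_id.items():
    # The number of occurrences is the length of the index list.
    print(id_value, "count:", len(row_indexes), "rows:", row_indexes)
```

In a UiPath workflow the same thing would presumably be a For Each Row loop that updates a dictionary variable, with the row index tracked by a counter.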

Your question suggests that you then want to do something with this information using the indexes. What exactly are you trying to achieve? I don't quite understand.

Hi @ronanpeter, thanks for your sample. Basically, my Excel file data is below.

In the ID column, I have to treat the repeated values as one record, copy the corresponding columns, and get their receipt date, receipt no and receipt amount to check against the PDF file using screen scraping.

For example, ID 900003 appears 3 times here and I have to consider those 3 rows as one record. I need to get the corresponding receipt date, receipt no and amount to cross-verify against the PDF file, and if they match I have to update the record as matched and verified. After that I have to move on to ID 900004, and so on. Can you give me an idea of how to achieve this?

Thanks
Zackariya

Your requirements need to be complete before I can provide a solution. If all you want to do is check whether, for example, a PDF contains the receipt_date, then all you would need to do is read the PDF to a string and then, in a For Each Row activity against your data, check whether the receipt date is contained in that string. Based on that, mark each row in your table as a match or no match.
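
As a minimal sketch of that check (Python, just to show the logic): the PDF text, the rows, and the column name `receipt_date` are all hypothetical stand-ins, and `mark_matches` is a name invented for this example. A plain substring test like this is crude and assumes the value is formatted identically in the PDF and the spreadsheet.

```python
# Hypothetical stand-in for the text read from the PDF.
pdf_text = "Receipt R-001 dated 2019-01-05 amount 150.00"

# Hypothetical rows; the column names are assumptions, not taken from the real file.
rows = [
    {"ID": "900003", "receipt_date": "2019-01-05"},
    {"ID": "900003", "receipt_date": "2019-02-10"},
]


def mark_matches(rows, text, column="receipt_date"):
    """Add a 'result' entry to each row: 'Match' if the column value occurs in text."""
    for row in rows:
        row["result"] = "Match" if row[column] in text else "No Match"
    return rows


mark_matches(rows, pdf_text)
for row in rows:
    print(row["ID"], row["receipt_date"], row["result"])
```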

However, I am assuming that there is more to your requirements here? Otherwise your request to find the number of occurrences of each ID is not exactly relevant as I explained above.

For example, do you have multiple PDFs you are checking against rather than just one? Is there a PDF for ID_90001, one for ID_90003, etc.? That would make more sense.

In such a scenario you would use the list of unique IDs I created in the workflow I provided. You would filter the datatable for each unique ID to grab just the rows related to each specific PDF and take it from there, comparing one against the other. You could write the Match/No Match result back to your original datatable using the concatenation of the ID and Receipt_No columns, which appears to provide a unique row ID.
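
Here is a rough sketch of that flow in Python, under the assumption that there is one PDF per ID. Everything in it is hypothetical: the rows, the `Amount` column, the `pdf_text_by_id` lookup (standing in for text already extracted from each PDF), and the function name `verify_by_id`. The results are keyed by the (ID, Receipt_No) pair so they can be written back to the matching row.

```python
from collections import defaultdict

# Hypothetical rows from the spreadsheet; values are invented for illustration.
rows = [
    {"ID": "900003", "Receipt_No": "R1", "Amount": "100.00"},
    {"ID": "900003", "Receipt_No": "R2", "Amount": "250.00"},
    {"ID": "900004", "Receipt_No": "R3", "Amount": "75.50"},
]

# Hypothetical text extracted from one PDF per ID.
pdf_text_by_id = {
    "900003": "R1 100.00 R2 999.99",
    "900004": "R3 75.50",
}


def verify_by_id(rows, pdf_text_by_id):
    """Group rows by ID, then check each row against the PDF text for its ID."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["ID"]].append(row)

    results = {}
    for id_value, group in groups.items():
        text = pdf_text_by_id.get(id_value, "")
        for row in group:
            # Crude substring check; real PDFs may need stricter parsing.
            matched = row["Receipt_No"] in text and row["Amount"] in text
            results[(id_value, row["Receipt_No"])] = "Match" if matched else "No Match"
    return results


results = verify_by_id(rows, pdf_text_by_id)
for key, outcome in results.items():
    print(key, outcome)
```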

Hello All,

I am in a similar situation, except I am just trying to calculate the total amount for each unique value in the first column.

In other words, I have a table with two columns: 1. Transaction Number, and 2. Amount.

There are multiple lines with the same Transaction Number, and I need to find the sum of the Amounts of each Transaction Number (i.e., the total amount for each unique transaction number).

Any assistance you can offer would be much appreciated - thanks!

@mattelstonschwartz
Give LINQ a try: use GroupBy and then apply one of the aggregate functions.
https://www.tutorialsteacher.com/linq/linq-grouping-operator-groupby-tolookup

My apologies - my level of coding experience is extremely low.

Are there any other solutions that rely exclusively on UiPath, as opposed to SQL or VBA?

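No problem. You can do it with plain UiPath activities:
Define a dictionary as a variable (for example Dictionary(Of String, Double))
Use column 1 as the dictionary key
Iterate over the datatable rows
Inside the iteration, use an Assign activity that starts from 0 when the key is new:
dictvar(row(0).ToString) =
If(dictvar.ContainsKey(row(0).ToString), dictvar(row(0).ToString), 0.0) + CDbl(row(1))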
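
To make the logic of the dictionary approach concrete, here is a small sketch in Python (not UiPath). The transaction numbers and amounts are invented sample data, and `total_by_transaction` is a name made up for this example. The important detail is that a key seen for the first time starts from 0 before its amount is added.

```python
# Invented sample data: (Transaction Number, Amount) pairs.
rows = [
    ("T001", 10.0),
    ("T002", 5.5),
    ("T001", 2.5),
    ("T003", 1.0),
    ("T002", 4.5),
]


def total_by_transaction(rows):
    """Sum the amounts for each distinct transaction number."""
    totals = {}
    for transaction_number, amount in rows:
        # .get(..., 0.0) handles the first time a transaction number appears.
        totals[transaction_number] = totals.get(transaction_number, 0.0) + amount
    return totals


totals = total_by_transaction(rows)
for transaction_number, total in totals.items():
    print(transaction_number, total)
```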