How to Compare CSVs having same Keywords

Hey guys.
I have two CSV files listing some products and their prices. I have to compare the CSVs based on the modelnumber/name and should write an output csv file displaying the same models with both the prices.

PS: The model number has alphabets,integers and special characters.
Out of the list of products, I need to compare those products which are similar in both the CSVs and should fetch their corresponding prices.

I guess I am stuck with the if condtition in the workflow. Totally confused how to compare the cells with the same names…

I have attached the screenshot of the input and desired result.
I am new to UIPATH and have just started learning things…
It would be great if some one could help me with this ASAP
It’s an URGENT work!!!
Thanks in Advance!!

1 Like
  1. Read both files by using Read CSV activity and put those data into different datatable (set different datatable as output of Read CSV activity, here I name them “dtCSV1” and “dtCSV2”)
    Make sure both datatable’s variable scope is set to “Main”.

  2. Use For Each Row to process all the data in dtCSV1 and use Get Row Item to get “product name” and “price” in CSV1 (here I set the product name to new variable “prodName1” and price to “price1”)

  3. You can filter dtCSV2 with “prodName1” by using DataTable.Select method, like this.
    arrayRowsOfCsv2 = dtCSV2.Select(“product name='” + prodName1 + “'”)
    (This returns array of DataRows and I set it to arrayRowsOfCsv2)
    [see other posts for more detail, there’re lot of posts related to filtering datatable]
    How to use (like ,= operator) with column in Datatable.Select

  4. Now you can get price in the CSV2. I assume there’s only one row that matches filter criteria, then you can get price by
    arrayRowsOfCsv2(0).Item(“Price”)

  5. Finally you can export it to CSV or Excel file. There’re 2 options in my mind. One is to use “Write Cell” for each product, it can be easy. The other is to build new datatable and set retrieved data to it, and then use Write CSV or Write Range to export the new datatable to a file.

I maybe have time to create sample flow later but anyway I just post this since you’re really in haste.

2 Likes

Thanx a lot @a-taniya. This was of great help to me. I am working on it. when you are free please do share the workflow with your idea, so that i can compare with mine. :slight_smile:

Put 2 CSV files under D:, then this flow generates newCSV.csv file in the same directory.
sample-2CSVs.zip (4.5 KB)

I’m using a bit older version of UiPath, I hope there’s no difference with your version…

2 Likes

@a-taniya hey, thanx for the help. The workflow is working fine. But what if the order of the product name is different in both CSVs? I tried changing the product name ie:in one csv it is "short pants xl and other it is “xl short pants”. Is there any way to compare the two csv based on a particular keyword. Let it be model number which may be same for both products. Here when we rearrange the words of product name INDEX ERROR pops up.

hmm…are there any regularities regarding product names in 2 CSV files?
For example, if in CSV1 product names are always “item name” + “size” while in CSV2 they are always “size” + “item name”.
I think it’s not so difficult when there’s regularity.

Or…are the product names in each file stable??
For example, Product A’s name in CSV1 is ALWAYS “short pants xl” and in CSV2 is always “short pants(xl)”.
Then I have an option; we can create dictionary file for storing relationship of product names in 2 CSV files.
We can get product name in CSV2 by searching with product name in CSV1 with the dictionary file, then we can filter CSV2 with the product name in CSV2.

(image of dictionary file)
prodNameInCsv1, prodNameInCsv2
short pants xl, short pants(xl)
shirt grey, grey shirt
long skirt, elegant skirt

1 Like

tanx for the healp @a-taniya. I will give a try with the dictionary method. I am new to this. if I am not getting it again, will ask for help again. :wink: :wink: . i hope you don’t mind it.

Sure :slight_smile:
I hope more experienced members give you better solution, though.
Since I’m not familiar with VB.

1 Like

or you can try to iterate throw the array and check if contains the product .

Assuming there is a strict definition of product.

@amithvs I’d go with similar to what @a-taniya proposed (add something to match values), but do it dynamically instead.

What we know/starting point:

  • Keys in both tables will consist of same words
  • Words in keys can be in mixed order
  • There is no definition of what is the “true” identifier in the key string

Some assumptions:

  • All words will always appear in both lists
  • There are no special characters (handling could be added separately to the HashCalculate)

What we could do:

  • Calculate a unique hash value for each product that is a product of partial hashes of each word (ignore case, use whitespace and dash (-) as separators)
  • Join both tables on new key
  • output results

Ending with something like this
(note - option strict on makes anonymous types unusable, hence the tuples)
HashCalculate.xaml (6.5 KB)
HashMatchTest.xaml (18.6 KB)
hashmatch.xlsx (9.2 KB)

Put all 3 in the same folder, run from HashMatchTest.
Result:

EDIT: One thing to note - it is theoretically possible that the hashing will coincide to produce the same value for different names. Not really likely (not worth worrying about, to be honest), but still possible.

3 Likes

Amazing !! this is pure gold. thanks!!!

1 Like

awesome… This is great!!> Thanx a lot buddies @andrzej.kniola @a-taniya @beesheep

Really awesome work! and that’s why such forums are stupendous most of the times!! A great opportunity to learn things for a newbie like me !! :slight_smile: