Help/Advice with converting data structure

Hi everyone. This is a part of my automation I have been struggling with, and I hope someone can share some insight.

I have an Excel sheet with amounts taken from different PDFs, written in this structure:

  • 1432,47
  • 2257,76
  • 404,9

My goal is to match each value with the PDF it is found in.

One part of my automation loops through the PDFs and returns the total amount of each, then saves the amount with the corresponding PDF file path into a dictionary. The amounts are string-manipulated so they come out in the same structure for further processing. I built a template using demo PDFs and it works, but I'm having a bit of trouble with the real data. The PDFs all have different structures. I'm able to pull the amount from each, but some additional data is extracted along with it, such as blank spaces and some extra numbers.

Output from the PDFs:
R 1,432,47
R2257.76
ZAR 403,90

What would be the best way to save the data, i.e. the amount and the file path? I would need to further process the amount into a structure that can be compared to the Excel file data.

Hi!

What you need to do is bring your data to the same standard so you can compare easily. I recommend converting your strings to decimals.
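Once both sides are decimals, the comparison itself is a lookup. Here is a minimal sketch in Python, for illustration only; the same logic should carry over to whatever language your automation uses. The file names are hypothetical and the amounts mirror the samples from the question, already converted. The dictionary is keyed by file path because two PDFs can share the same total, and dictionary keys must be unique.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical file names; amounts mirror the samples in the question,
# assumed to be already converted to Decimal.
pdf_amounts = {
    "invoice_a.pdf": Decimal("1432.47"),
    "invoice_b.pdf": Decimal("2257.76"),
    "invoice_c.pdf": Decimal("403.90"),
}
excel_amounts = [Decimal("1432.47"), Decimal("2257.76"), Decimal("404.9")]


def match_amounts(excel_amounts, pdf_amounts):
    """Return {excel amount: [paths of PDFs with that total]}."""
    # Invert path -> amount into amount -> list of paths, so that
    # duplicate totals do not overwrite each other.
    paths_by_amount = defaultdict(list)
    for path, amount in pdf_amounts.items():
        paths_by_amount[amount].append(path)
    # An empty list means no PDF carried that amount.
    return {amount: paths_by_amount.get(amount, []) for amount in excel_amounts}


matches = match_amounts(excel_amounts, pdf_amounts)
for amount, paths in matches.items():
    print(amount, paths if paths else "no matching PDF")
```

With the sample values, the first two Excel amounts each find one PDF, while 404.9 finds none because the PDF output shows 403,90.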

Your biggest problem is that you have data with different GroupSeparators and DecimalSeparators. In one case (1,432,47) you even have the same character as GroupSeparator and DecimalSeparator - that’s just stupid! (Is it 143247.00 or 1.43247 or 1432.47?!?)
No way we can convert without knowing which one is which. Try to clean up your data!

In the other cases you can identify what is the DecimalSeparator and convert accordingly. Refer to this post on how to do that: Can't convert currency Culture Info - #5 by loginerror
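If you can settle on one rule for the separators, the conversion could look roughly like the sketch below (Python, for illustration only). It rests on an assumption that you would have to confirm against your real data: an amount never has more than two decimal digits, so a final "." or "," followed by one or two digits is the DecimalSeparator and every other separator is a GroupSeparator. Under that rule the ambiguous 1,432,47 is read as 1432.47. The function name `parse_amount` is made up for this example.

```python
import re
from decimal import Decimal


def parse_amount(raw: str) -> Decimal:
    """Convert strings such as 'R 1,432,47' or 'R2257.76' to Decimal.

    ASSUMPTION: a trailing '.' or ',' followed by one or two digits is the
    decimal separator; all other separators are group separators.
    """
    # Drop currency codes, spaces and anything else that is not a digit
    # or one of the two separator characters.
    cleaned = re.sub(r"[^\d.,]", "", raw)

    # Look for a decimal part at the very end of the string.
    match = re.search(r"[.,](\d{1,2})$", cleaned)
    if match:
        whole, frac = cleaned[:match.start()], match.group(1)
    else:
        whole, frac = cleaned, ""

    # Whatever separators remain in the whole part are group separators.
    whole = re.sub(r"[.,]", "", whole)
    if not whole and not frac:
        raise ValueError(f"no digits found in {raw!r}")

    return Decimal(f"{whole or '0'}.{frac or '0'}")


for text in ["R 1,432,47", "R2257.76", "ZAR 403,90", "404,9"]:
    print(repr(text), "->", parse_amount(text))
```

Because the result is a Decimal, 403,90 and 403,9 compare as equal, which is what you want when checking against the Excel values.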

Happy automating

Hi!

Okay. As soon as I extract the amount from a PDF, it gets saved into a dictionary. The key would be the amount and the value would be the file path string.

How could I isolate the key amounts so I can process them further? Sometimes the extracted data varies and contains, for example:

  • Space value
  • Amount
  • number which is not needed

Thank you for your reply

Unfortunately I don’t really understand your requirements, and I’m going on vacation now. Hope someone else can help.
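On the follow-up question about isolating the amount when the extracted text also contains spaces or an unneeded number, one possible approach is a regular expression that picks out only the token shaped like an amount. The sketch below is in Python for illustration. It assumes the real amount always ends in a separator plus one or two decimal digits and that the unneeded number has no decimal part; the pattern, function name, file names and noisy strings are all hypothetical. It also keys the dictionary by file path rather than by amount, since two PDFs with the same total would otherwise overwrite each other.

```python
import re
from typing import Optional

# ASSUMPTION: an amount is digits, optional 3-digit groups, then a
# separator and one or two decimal digits. Plain integers are ignored.
AMOUNT_PATTERN = re.compile(r"(?<![\d.,])\d+(?:[.,]\d{3})*[.,]\d{1,2}(?!\d)")


def extract_amount_text(raw: str) -> Optional[str]:
    """Return the first amount-like token in raw, or None if there is none."""
    match = AMOUNT_PATTERN.search(raw)
    return match.group(0) if match else None


# Hypothetical extraction results: file path -> raw text pulled from the PDF.
extracted = {
    "invoice_a.pdf": "  R 1,432,47 ",
    "invoice_b.pdf": "R2257.76 12",
    "invoice_c.pdf": "ZAR 403,90",
}

# Key by file path so identical totals in two PDFs cannot collide.
amount_by_path = {
    path: extract_amount_text(text) for path, text in extracted.items()
}

for path, amount in amount_by_path.items():
    print(path, "->", amount)
```

The isolated strings still carry their original separators, so they would need to be converted to decimals before being compared with the Excel values.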