Hi everyone. This is a part of my automation I have been struggling with, and I hope someone can share some insight.
I have an Excel sheet with amounts taken from different PDFs, written in this structure:
My goal is to match each value with the PDF in which that number is found.
One part of my automation loops through the PDFs and returns the total amount from each, then saves each amount with the corresponding PDF file path in a dictionary. The amount values are string-manipulated so they all come out in the same structure for further processing. I built a template using demo PDFs and it works, but I'm having a bit of trouble with the real data. The PDFs all have different structures. I'm able to pull the amount from each, but some additional data is extracted along with it, such as blank spaces and some additional numbers.
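To make the clean-up step concrete, here is a minimal sketch in Python, for illustration only. The names `clean_amount` and `build_amounts`, the sample file paths, and the assumption that an amount looks like `1,234.56` (comma thousands separator, two decimals) are all hypothetical and would need adjusting to the real PDFs.

```python
import re
from decimal import Decimal

# Assumption: an amount has exactly two decimals and optional comma
# thousands separators, e.g. "1,234.56" or "1234.56".
AMOUNT_RE = re.compile(r"\d{1,3}(?:,\d{3})*\.\d{2}|\d+\.\d{2}")


def clean_amount(raw):
    """Return the first amount-looking number in the raw text, or None."""
    match = AMOUNT_RE.search(raw)
    if match is None:
        return None
    # Drop thousands separators so every amount has the same structure.
    return Decimal(match.group(0).replace(",", ""))


def build_amounts(extracted):
    """Map each file path to its cleaned amount (None if nothing matched)."""
    return {path: clean_amount(raw) for path, raw in extracted.items()}


# Hypothetical raw extraction results: file path -> noisy text.
extracted = {
    "invoices/a.pdf": "  Total: 1,234.56 \n 7 ",
    "invoices/b.pdf": "Amount due 10.00",
    "invoices/c.pdf": "no amount here",
}
amounts = build_amounts(extracted)
```

Taking the first match is only a guess; if the stray numbers come before the total, the last match might be the better choice.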
Output of the PDFs:
What would be the best way to save the Amount and File Path data? I would need to further process the Amount into a structure that can be compared to the Excel file data.
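One possible shape for the comparison, as a rough sketch in Python: keep a dictionary of file path to amount, then invert it so each Excel value can be looked up directly. The function name `match_amounts` and the sample paths and values are hypothetical, and the sketch assumes the amounts have already been cleaned into `Decimal` values.

```python
from collections import defaultdict
from decimal import Decimal


def match_amounts(excel_amounts, pdf_amounts):
    """Return a dict: Excel amount -> list of PDF paths with that amount."""
    # Invert path -> amount into amount -> [paths]; several PDFs may
    # share the same total, so keep a list per amount.
    by_amount = defaultdict(list)
    for path, amount in pdf_amounts.items():
        if amount is not None:  # skip PDFs where no amount was found
            by_amount[amount].append(path)
    # An empty list means no PDF matched that Excel value.
    return {value: by_amount.get(value, []) for value in excel_amounts}


# Hypothetical data for illustration.
pdf_amounts = {
    "invoices/a.pdf": Decimal("1234.56"),
    "invoices/b.pdf": Decimal("10.00"),
    "invoices/c.pdf": None,
}
excel_amounts = [Decimal("10"), Decimal("99.99")]
matches = match_amounts(excel_amounts, pdf_amounts)
```

Using `Decimal` rather than strings means `10` and `10.00` compare as equal, so small formatting differences between the Excel sheet and the PDFs would not break the match.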