Data Scraping Same Table from Different PDFs


I have successfully scraped a table from a PDF that contains credit card charges (date, company, amount) and written the table to Excel. I now need to do that across multiple PDFs. I have added a working “for each file” activity to start, but I cannot get the data scraping to work for any PDF other than the original (it just writes blanks to Excel for those PDFs). I have tried making the file name dynamic in UiExplorer, but that doesn’t help either. Any suggestions?

Many thanks!

Is the PDF structure exactly the same? How are you scraping the table? It’s hard to diagnose what’s going wrong without more information, such as example data.

Hi Dave,

The PDFs are structured exactly the same. The table I am scraping is on page 3/4 of the PDF in all instances. It is personal credit card data so I can’t share the exact files but I have taken screenshots of the tables I’m scraping. The scraping works when I set it up for one file, but always fails on the second file.

How exactly are you scraping the data? Are you reading it all as text and using string manipulation? Are you able to find a table element somehow?

We need to figure out what method you are using for the first one, and see whether it can be reused for a new file. Depending on how you are scraping, that may or may not be possible with your existing method.
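For reference, here is roughly what the "read it all as text and use string manipulation" route could look like. This is only a sketch in Python, not UiPath code, and it assumes each charge comes out of the PDF text on a single line as date, company, amount (for example `01/05/2020  ACME HARDWARE  $1,234.56`). The row pattern, the `parse_charges` name, and the sample text are all made up for illustration; your real statements may be laid out differently.

```python
import re
from decimal import Decimal

# Assumed row layout: MM/DD/YYYY, then the company name, then an amount
# with two decimals and an optional leading "$". Adjust to the real text.
ROW = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4})\s+"
    r"(?P<company>.+?)\s+"
    r"\$?(?P<amount>-?[\d,]+\.\d{2})$"
)

def parse_charges(page_text):
    """Return (date, company, amount) tuples for lines that look like charges."""
    rows = []
    for line in page_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            # Drop thousands separators before converting the amount.
            amount = Decimal(match["amount"].replace(",", ""))
            rows.append((match["date"], match["company"], amount))
    return rows

# Made-up sample text standing in for one page of a statement.
sample = """Account summary
01/05/2020  ACME HARDWARE  $1,234.56
01/07/2020  COFFEE SHOP 42  3.50
Page 3 of 4"""

for row in parse_charges(sample):
    print(row)
```

Because this works on the text alone, it does not depend on the window title or on any selector, so the same logic could run unchanged for every file in the loop.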

I am using Extract Structured Data to find certain elements, as shown below.

Ok, you have to show the properties that the Extract Structured Data activity is using. Compare the selector from the PDF that works to the PDF that doesn’t work. What is different between the two? Can you create a selector that works for both?

Or does it find the selector in the other PDFs but return no data? If so, you need to alter the properties of ‘Extract Structured Data’ so that it pulls the correct data.
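On the selector comparison, a common cause is a file name baked into the window title of the selector. The Python sketch below illustrates the idea outside UiPath: it finds the part where two selectors differ and replaces it with a `*` wildcard. The two selector strings are hypothetical (I have not seen yours), `split_on_difference` is a name I made up, and Python's `fnmatchcase` is only standing in for wildcard matching, so treat it as an approximation of how a real selector is validated.

```python
from fnmatch import fnmatchcase

def split_on_difference(a, b):
    """Split two strings into (common_prefix, a_middle, b_middle, common_suffix)."""
    limit = min(len(a), len(b))
    p = 0
    while p < limit and a[p] == b[p]:
        p += 1  # extend the shared prefix
    s = 0
    while s < limit - p and a[len(a) - 1 - s] == b[len(b) - 1 - s]:
        s += 1  # extend the shared suffix without overlapping the prefix
    return a[:p], a[p:len(a) - s], b[p:len(b) - s], a[len(a) - s:]

# Hypothetical selectors: only the file name in the title differs.
working = ("<wnd app='acrord32.exe' cls='AcrobatSDIWindow' "
           "title='statement_jan.pdf - Adobe Acrobat Reader DC' />")
failing = ("<wnd app='acrord32.exe' cls='AcrobatSDIWindow' "
           "title='statement_feb.pdf - Adobe Acrobat Reader DC' />")

prefix, mid_working, mid_failing, suffix = split_on_difference(working, failing)
print("differs:", repr(mid_working), "vs", repr(mid_failing))

# Replace the differing part with a wildcard and check both selectors match.
pattern = prefix + "*" + suffix
print(pattern)
print(fnmatchcase(working, pattern), fnmatchcase(failing, pattern))
```

If the only difference turns out to be the file name, a wildcard in the title (or a variable holding the current file name) is usually enough. If other attributes differ as well, those need the same treatment.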

It is very difficult to diagnose the issue with so little information, sorry.