Extract data from a standardised PDF form, especially the table, into Excel with each value in its own cell


I’m trying to extract the data from a table inside a standardised PDF form into Excel, and also extract the number shown beside the title "Job Code". The table needs to arrive in Excel with its column headers and with each value in its own cell, not everything in a single cell. Any suggestions for the best approach, please?

Hey @ciaramkm

Is it a digital PDF?
If yes, use data scraping.

If it is a scanned document, data scraping will not work. In that case, use OCR to get the text from each cell and convert it to a DataTable.
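
Whichever route produces the text, the Job Code number can then be pulled out with a regular expression. Here is a minimal Python sketch of the idea, assuming the form prints the label "Job Code" followed by an optional colon or dash and then digits. The sample text and the function name `extract_job_code` are made up for illustration, so adjust the pattern to the real form.

```python
import re

# Assumed layout: the label "Job Code", an optional colon or dash, then digits.
JOB_CODE_PATTERN = re.compile(r"Job\s*Code\s*[:\-]?\s*(\d+)", re.IGNORECASE)


def extract_job_code(text):
    """Return the digits that follow 'Job Code', or None if the label is absent."""
    match = JOB_CODE_PATTERN.search(text)
    return match.group(1) if match else None


# Hypothetical text standing in for the output of Read PDF Text or OCR.
sample_text = "Site Report\nJob Code: 48213\nItem  Qty  Unit Price\n"
job_code = extract_job_code(sample_text)
```

The code is returned as a string so that any leading zeros are kept.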

It will come in both digital and printed versions, so I will have to handle both scenarios. I will try the digital one first. For the digital part, would it basically be:

  1. Read PDF Text
  2. Data Scraping
  3. Extract Structured Data of the table part
  4. Excel Application Scope
  5. Append Range
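
To show what "headers plus one value per cell" means for the last steps, here is a rough Python sketch, not the UiPath workflow itself. It assumes the table text comes out with columns separated by two or more spaces and with the header as the first line, which may not match the real form. The sample table and the helper names are hypothetical, and it writes CSV (which Excel opens) to keep the example dependency-free.

```python
import csv
import io
import re


def rows_from_text(table_text):
    """Split each non-empty line into cells on runs of two or more spaces."""
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(r"\s{2,}", line))
    return rows


def write_csv(rows, handle):
    """Write every row, header line first, with one value per cell."""
    csv.writer(handle).writerows(rows)


# Hypothetical table text; the first line is treated as the column headers.
table_text = """
Item        Qty   Unit Price
Widget A    2     10.50
Widget B    5     3.20
"""

rows = rows_from_text(table_text)
buffer = io.StringIO()
write_csv(rows, buffer)
csv_output = buffer.getvalue()
```

Splitting on two or more spaces keeps values such as "Unit Price" together in one cell. Swapping `buffer` for a file opened with `newline=""` would give a file Excel can open directly.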

It doesn’t seem to work correctly unless the PDF window is open, and it also doesn’t output the column headers. Is there an activity to open the PDF window so that data scraping works properly?

1. For data scraping to work, the PDF needs to be open.
2. You can open the PDF using Start Process.
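
For anyone doing the same thing outside UiPath, the "open the PDF first" step is roughly the following in Python. This is only a sketch: it hands the file to the operating system's default viewer, the function names are made up, and `form.pdf` is a placeholder path.

```python
import subprocess
import sys


def open_command(pdf_path, platform=sys.platform):
    """Build the command that opens a file in the default viewer for a platform."""
    if platform.startswith("win"):
        # 'start' is a cmd built-in; the empty string is the window title argument.
        return ["cmd", "/c", "start", "", pdf_path]
    if platform == "darwin":
        return ["open", pdf_path]
    # Assumes a Linux desktop where xdg-open is installed.
    return ["xdg-open", pdf_path]


def open_pdf(pdf_path):
    """Launch the default PDF viewer without waiting for it to close."""
    return subprocess.Popen(open_command(pdf_path))


# Usage (not run here, since it would launch a viewer):
# open_pdf("form.pdf")
```

The viewer takes a moment to appear, so whatever reads the window afterwards should wait until it is actually visible.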

Thank you for your help 🙂

