How to Iterate Sheets in a PDF File to Extract Invoice Numbers and Dates

Hi all,
I would like some help with the following:

  1. I received a PDF file containing multiple invoices.
  2. I need to iterate through all the invoices to extract the invoice numbers, dates and PO numbers.

A rough idea I have after searching the web is to first read the PDF file and convert it to text, then search the text with Regex, and finally put the extracted invoice numbers, dates and PO numbers into a “datatable”.
I would be grateful for advice and guidance on how to achieve this, especially on iterating the Regex search and populating the “datatable” in sequence.
My_Invoice.pdf (177.2 KB)
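
For illustration, the read, search and populate steps described above could be sketched in Python roughly as follows. This is only a sketch under several assumptions: each page is assumed to hold one invoice, the field labels in the regex patterns ("Invoice No", "Date", "PO No") are guesses because the actual layout of My_Invoice.pdf is not known, the third-party `pypdf` package is assumed for text extraction, and a plain list of dictionaries stands in for the “datatable”. The names `PATTERNS`, `read_pages` and `extract_rows` are hypothetical.

```python
import re

# Hypothetical patterns: the labels are assumptions and must be adapted
# to the wording actually used in the invoices.
PATTERNS = {
    "invoice_no": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Za-z0-9-]+)", re.IGNORECASE
    ),
    "date": re.compile(
        r"Date\s*:?\s*(\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4})", re.IGNORECASE
    ),
    "po_no": re.compile(
        r"P\.?O\.?\s*(?:No\.?|Number|#)\s*:?\s*([A-Za-z0-9-]+)", re.IGNORECASE
    ),
}


def read_pages(pdf_path):
    """Return the text of each page (assumes the pypdf package is installed)."""
    from pypdf import PdfReader  # imported here so the rest works without pypdf

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


def extract_rows(page_texts):
    """Build one row per page; a field is None when its pattern is not found."""
    rows = []
    for page_no, text in enumerate(page_texts, start=1):
        row = {"page": page_no}
        for field, pattern in PATTERNS.items():
            match = pattern.search(text)
            row[field] = match.group(1) if match else None
        rows.append(row)
    return rows
```

Calling `extract_rows(read_pages("My_Invoice.pdf"))` would then give one row per page, in page order. If the PDF is a scanned image rather than text, plain text extraction will return little or nothing, and an OCR step would be needed first.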

Hello, take a look at this. It is a great PDF extraction tool that uses RPA, AI, OCR and machine learning, and it gives you a robust solution.
