Extract Data from one PDF file containing Multiple pages of Invoices

I am using AI Center to extract invoice data from multiple invoice types with the machine learning extractor. The issue is that each invoice PDF file contains multiple pages, each holding a different delivery / invoice, and every page needs to be read and extracted to Excel.

I want to find a way to work through each page, extract the invoice data, and write it to Excel, instead of having to split the file into individual pages and then merge the results back together.

I tried reading the PDF, then configured a Regex that matches a word at the bottom of each page, and set a number variable to increment the page number. I then set up a While condition to run through each page and write a row to Excel for each invoice page.

It found the right number of PDF pages, but instead of extracting the data from each of the 4 invoices in the PDF file, it added 4 rows to the Excel file, all containing the data from the first invoice page only. Essentially, it extracted the first invoice 4 times and ignored the other 3 pages.
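The symptom above is consistent with a loop whose counter advances while the extraction step still receives the whole document, so the first match is returned every time. The sketch below only illustrates that pattern in Python; it is not UiPath code, and the page text, field labels, and the `extract_first_invoice` helper are all hypothetical stand-ins.

```python
import re

# Toy stand-in for a 4-page PDF: one string per page (hypothetical data).
pages = [f"Invoice No: INV-{n:03d}\nTotal: {n * 100}.00\n" for n in range(1, 5)]
whole_document = "".join(pages)


def extract_first_invoice(text):
    """Return the first invoice number and total found in `text`."""
    number = re.search(r"Invoice No: (\S+)", text).group(1)
    total = re.search(r"Total: (\S+)", text).group(1)
    return number, total


# Symptom: the loop runs once per page, but every iteration is handed the
# whole document, so the first invoice is extracted each time.
repeated_rows = [extract_first_invoice(whole_document) for _ in range(len(pages))]

# Per-page version: the loop index selects the text passed to the extractor.
per_page_rows = [extract_first_invoice(pages[i]) for i in range(len(pages))]

print(repeated_rows)
print(per_page_rows)
```

If the workflow behaves like the first loop, the thing to check is whether the page number variable is actually used as an input to the read or extraction step, or only as the loop counter.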

Any help would be greatly appreciated. I am pretty new to UiPath, having used it for only about 20 days.

Hi @chris5163 ,

I do not think this is achievable without splitting the pages. The reason is that the DU Invoices model returns only one matched value for each field.

Otherwise we would end up using a custom DU model, which means labelling datasets, training, and then deploying the model again. I do not think that is recommended in your case.

It would be better to split the pages and achieve the desired result that way.

Also, if the documents are all PDF files, I do not think you need Regex to get the page count; you can use the Get PDF Page Count activity.
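As a rough sketch of the split-then-extract flow, the loop below uses the page count to drive one extraction per page and collects one row per invoice, so the split never has to be done by hand. It is written in Python purely as a language-neutral illustration. `split_page`, `extract_fields`, and the toy data are hypothetical placeholders for the page-split step and the single-page extractor, not UiPath APIs, and CSV text stands in for the Excel output.

```python
import csv
import io


def process_invoice_file(page_count, split_page, extract_fields):
    """Run a single-page extraction for every page; return one row per page.

    `split_page` and `extract_fields` are hypothetical callables standing in
    for the page-split step and the single-document extractor.
    """
    rows = []
    for page_number in range(1, page_count + 1):  # PDF pages are 1-based
        single_page = split_page(page_number)
        fields = extract_fields(single_page)
        rows.append({"page": page_number, **fields})
    return rows


def rows_to_csv(rows):
    """Serialise the rows to CSV text (a stand-in for writing Excel rows)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy data standing in for a 3-page supplier file.
fake_pages = {1: "INV-001|100.00", 2: "INV-002|250.50", 3: "INV-003|75.25"}


def fake_split(page_number):
    return fake_pages[page_number]


def fake_extract(page_text):
    number, total = page_text.split("|")
    return {"invoice_number": number, "total": total}


rows = process_invoice_file(len(fake_pages), fake_split, fake_extract)
csv_text = rows_to_csv(rows)
print(csv_text)
```

The point of the design is that splitting and merging happen inside the same loop, so the daily supplier file goes in once and a single table comes out.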

Yeah, I was hoping there was a way to do it without having to split the file each time and run the pages individually. The files are received from the suppliers daily as a single PDF and have to be totaled that way, so splitting just adds manual steps to the process, which is what we are trying to get away from by using UiPath.