How to read multiple invoices from 1 pdf file?

I have 1 pdf file with multiple invoices. Is it possible to extract information from all invoices using ML?

Hello @toffi.poffi ,

This is an interesting scenario.

First, if you have a lot of invoices, you can use an activity to split the PDF and create one PDF file per invoice.

Then, you can use a For Each (file) activity to treat each file as a transaction (REFramework template).
Then, you might be able to use Document Understanding, Regex, or String operations to get the data you want.
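
For the Regex route, here is a minimal sketch in Python (outside UiPath) of pulling two fields out of the text of one invoice. The field names, the patterns, and the `extract_fields` helper are all assumptions made up for illustration; real invoices will need patterns tuned to their own layout.

```python
import re

# Hypothetical patterns: adjust them to the wording used on your invoices.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    # \b keeps "Subtotal" from being picked up as "Total".
    "total": re.compile(
        r"\bTotal\s*:?\s*\$?\s*([0-9][0-9,]*\.?[0-9]*)", re.IGNORECASE
    ),
}


def extract_fields(text):
    """Return {field name: first match}, with None for fields not found."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


sample = "Invoice No: INV-1001\nDate: 2021-03-04\nTotal: $1,250.50"
print(extract_fields(sample))
```

The same patterns could be tried in a Matches activity, but test them against your own documents first.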

I haven’t tested the ML approach yet, but it might be possible.

I hope it helps.


It’s simple: put all the PDF files in one folder.
Then create a variable of type Array of String (for example, files)
and use an Assign activity to set it to:

Directory.GetFiles("C:\Users\*\*\UiPath\Form Question 2\Chethan","*.pdf")

Add a For Each activity and set its TypeArgument to String.

Pass the files variable into the For Each.

Now you can read multiple PDF files.

Note: when you give the path, you have to give the full path of the folder, starting from C:\user…
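
For anyone scripting the same folder loop outside UiPath, a rough Python equivalent might look like the sketch below. The `list_pdf_files` helper and the folder name in the usage note are hypothetical, not something from this thread.

```python
from pathlib import Path


def list_pdf_files(folder):
    """Return the PDF files directly inside folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

You would then loop over the result, for example `for pdf in list_pdf_files("invoices"): print(pdf.name)`, which plays the role of the For Each over the files variable.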

If this solves it, please mark it as the solution.

Hi, I also need to read all the information from 5 invoices in 1 PDF (5 pages). Is it possible?

You can use the Intelligent Keyword Classifier to split the document. Then you can simply iterate over its results and extract from each one. After splitting, you can also show the Classification Station to let the user confirm whether the split is correct. A flow would look like this:

He doesn’t have multiple PDF files. He has one PDF file that contains multiple invoices, and he wants to individually process the multiple invoices that are in the one PDF.

Intelligent Keyword Classifier can split a single PDF. It returns an array of ClassificationResult that you can iterate over, using each item to extract from a particular page range in that document. Remember that the Data Extraction Scope also accepts a “ClassificationResult” as input and will run the extraction on that specific region only. See the sample flow I posted above.
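
To make the page-range idea concrete, here is a simplified Python sketch of keyword-based splitting. It is not how the Intelligent Keyword Classifier works internally. It only assumes you already have the text of each page and that every invoice starts with a known phrase. The `split_into_invoices` name and the default `start_keyword` are made up for this example.

```python
def split_into_invoices(page_texts, start_keyword="invoice no"):
    """Group consecutive pages into (first_page, last_page) ranges.

    Page indexes are 0-based and inclusive. A new range starts on every
    page whose text contains start_keyword (case-insensitive). Pages that
    come before the first keyword hit form their own leading range.
    """
    ranges = []
    for index, text in enumerate(page_texts):
        starts_new = start_keyword.lower() in text.lower()
        if starts_new or not ranges:
            ranges.append([index, index])  # open a new range on this page
        else:
            ranges[-1][1] = index  # extend the current range
    return [tuple(r) for r in ranges]


pages = ["Invoice No: 1", "continued items", "Invoice No: 2"]
for first, last in split_into_invoices(pages):
    print(f"invoice on pages {first}-{last}")
```

For the case of 5 invoices on 5 pages, every page starts a new range, so each invoice gets its own single-page range to extract from.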

Use Document Understanding, bro. You can read all 5 pages.