Read multiple pages of a PDF file

Hello, I am writing because I have a question. We have a project in which the robot must read approximately 56 invoices, one per page, all in a single PDF file. The invoices differ slightly, for example 1 or 2 extra columns. Which tool would you recommend for this process? Document Understanding (DU), which I think is the one I would use, is paid per page, and we cannot afford that in this project.

Hello @IOrlando and welcome to the UiPath community! :slight_smile:

Could you please provide a sample of the invoice (if you can) and the elements you need to extract?

You might be able to use Regex to obtain the information you need.

Hello, thanks for your answer. This is one type of invoice that we can encounter; the others would have 1 or 2 fewer columns, we do not know in advance. I remind you that they are all in the same file. Thanks for your time.

Are you able to upload the PDF?

What are the expected fields/numbers/columns you need?

The data I need is the name, date and total.

I cannot upload the PDF because I am a new user (that is the notice I get).

Hello

As discussed (offline) you can use Regex to obtain the date, name and total of the invoice.

Here is the Regex101.com link for the date pattern.

I would suggest writing the information to Excel and processing a request for each line :slight_smile:
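
For anyone reading later, here is a rough sketch of the regex idea in Python (not UiPath code). The labels "Name:", "Date:" and "Total:", the date format, and the sample page text are all assumptions, since the real invoice was not posted; the patterns would need adjusting to the actual layout. It runs the same three patterns on every page and writes one row per invoice, so extra columns in the item table do not matter.

```python
import csv
import io
import re

# Hypothetical patterns: the labels and the dd/mm/yyyy format are assumptions.
DATE_RE = re.compile(r"Date:\s*(\d{2}/\d{2}/\d{4})")
NAME_RE = re.compile(r"Name:\s*(.+)")
TOTAL_RE = re.compile(r"Total:\s*\$?\s*([\d.,]+)")


def extract_fields(page_text):
    """Return date, name and total for one page (None when a field is missing)."""
    def first(pattern):
        match = pattern.search(page_text)
        return match.group(1).strip() if match else None

    return {"date": first(DATE_RE), "name": first(NAME_RE), "total": first(TOTAL_RE)}


def pages_to_csv(pages):
    """Build CSV text with one row per page (one invoice per page)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["page", "date", "name", "total"])
    writer.writeheader()
    for number, text in enumerate(pages, start=1):
        writer.writerow({"page": number, **extract_fields(text)})
    return buffer.getvalue()


# Made-up page texts standing in for the text read from each PDF page.
sample_pages = [
    "Invoice\nName: Acme Ltd\nDate: 03/02/2021\nItem Qty Price\nTotal: 1,250.00",
    "Invoice\nName: Beta SA\nDate: 15/02/2021\nItem Qty Price Tax\nTotal: $ 980.50",
]
print(pages_to_csv(sample_pages))
```

In UiPath the same patterns could go into a Matches activity inside a loop over the pages, with the rows written to Excel instead of CSV.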

Hi, I have been working with Document Understanding and I am stuck. I am trying to extract table information from a native PDF. The PDF has 15 pages, and every page has a table with the same structure but different data. I am only able to extract tabular data from one page (whichever one I select in Manage Template). How can I extract tabular information from all the pages of a single PDF?

Hello

You need to get the PDF read into UiPath.

Go to Manage Packages -> click Official -> search for “PDF” -> install UiPath.PDF.Activities.

Then insert a “Read PDF Text” activity and point it at the file.

(I am assuming you won’t need OCR)

Then use a Write Line activity to check the text that was read.
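
Once the text is available page by page, the table on each page can be parsed with the same logic in a loop. Below is a minimal Python sketch of that idea (not UiPath code). It assumes the text of each page is already in a list, that the table starts after a header line beginning with "Item", that columns are separated by two or more spaces, and that a blank line ends the table; all of these are guesses about the layout.

```python
import re


def parse_table(page_text, header_start="Item"):
    """Collect the rows that follow the header line on one page.

    Assumption: columns are separated by 2+ spaces and a blank line ends the table.
    """
    rows = []
    in_table = False
    for line in page_text.splitlines():
        if line.startswith(header_start):
            in_table = True
            continue
        if in_table:
            if not line.strip():
                break
            rows.append(re.split(r"\s{2,}", line.strip()))
    return rows


def parse_all_pages(pages):
    """Apply the same table parser to every page and tag rows with the page number."""
    all_rows = []
    for page_number, text in enumerate(pages, start=1):
        for row in parse_table(text):
            all_rows.append([page_number] + row)
    return all_rows


# Made-up page texts with the same table structure but different data.
pages = [
    "Report\nItem  Qty  Price\nBolt  10  2.50\nNut  5  1.00\n\nPage 1",
    "Report\nItem  Qty  Price\nWasher  7  0.30\n\nPage 2",
]
for row in parse_all_pages(pages):
    print(row)
```

In UiPath, reading one page at a time should be possible by setting the Range property of Read PDF Text inside a loop over the page numbers.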