Read multiple pages of a PDF file

IOrlando · August 5, 2020, 12:07am

Hello, I am writing because I have a question. We have a project in which the robot must read approximately 56 invoices, one per page, all in a single PDF file. The invoices differ slightly from one another, for example by one or two extra columns. Which tool would you recommend for this process? I think I would use Document Understanding (DU), but it is paid per purchased page and we cannot afford that in this project.

Steven_McKeering · August 5, 2020, 12:19am

Hello @IOrlando and welcome to the UiPath community!

Could you please provide a sample of the invoice (if you can) and the expected elements you need to obtain.

You might be able to use Regex to obtain the information you need.

IOrlando · August 5, 2020, 12:27am

Hello, thanks for your answer. This is one type of invoice that we can find. The others would have one or two fewer columns; we do not know exactly.

I remind you that they are all in the same file. Thanks for your time.

Steven_McKeering · August 5, 2020, 12:28am

Are you able to upload the PDF?

What are the expected fields/numbers/columns you need?

IOrlando · August 5, 2020, 12:30am

The data I need is the name, date and total.

I cannot upload the PDF because my account is new (that is the notice that appears).

IOrlando · August 5, 2020, 12:52am

Steven_McKeering · August 5, 2020, 3:42am

Hello

As discussed (offline) you can use Regex to obtain the date, name and total of the invoice.

Here is the link to the date pattern on the Regex101.com site.

I would suggest writing the information to Excel and processing a request for each line.
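As a rough illustration of the Regex approach, here is a minimal Python sketch that pulls the name, date and total out of each page's text and writes one row per invoice. The labels (`Name:`, `Date:`, `Total:`), the date format and the sample pages are all assumptions, since the real invoice layout is not shown in this thread; the patterns would need adjusting to the actual documents. It writes CSV, which Excel can open, rather than a native workbook.

```python
import csv
import io
import re

# Hypothetical patterns: the real labels and formats depend on the invoice layout.
DATE_RE = re.compile(r"Date:\s*(\d{2}/\d{2}/\d{4})")
NAME_RE = re.compile(r"Name:\s*(.+)")
TOTAL_RE = re.compile(r"Total:\s*\$?([\d,]+\.\d{2})")


def extract_fields(page_text):
    """Return name, date and total for one invoice page (None if a field is missing)."""
    def first(pattern):
        match = pattern.search(page_text)
        return match.group(1).strip() if match else None

    return {"name": first(NAME_RE), "date": first(DATE_RE), "total": first(TOTAL_RE)}


def invoices_to_csv(pages, out):
    """Write one CSV row per page so each invoice can be processed line by line."""
    writer = csv.DictWriter(out, fieldnames=["page", "name", "date", "total"])
    writer.writeheader()
    for number, text in enumerate(pages, start=1):
        writer.writerow({"page": number, **extract_fields(text)})


# Made-up sample pages; the second has an extra column, which the patterns ignore.
sample_pages = [
    "Invoice\nName: Acme Ltd\nDate: 05/08/2020\nItem  Qty  Price\nTotal: $1,250.00",
    "Invoice\nName: Globex\nDate: 06/08/2020\nItem  Qty  Discount  Price\nTotal: 980.50",
]

buffer = io.StringIO()
invoices_to_csv(sample_pages, buffer)
print(buffer.getvalue())
```

Because the patterns look for labelled values rather than column positions, one or two extra columns in the line-item table should not affect them, provided the labels themselves stay the same.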

sugam90 · September 1, 2020, 12:46pm

Hi, I have been working on Document Understanding and I am stuck. I am trying to extract table information from a native PDF. The PDF has 15 pages, and every page has a table with the same structure but different data. I am only able to extract tabular data from one page (whichever one I select in Manage Template). How can I extract tabular information from all the pages of a single PDF?

Steven_McKeering · September 1, 2020, 9:13pm

Hello

You need to get the PDF read into UiPath.

Go to Manage Packages → click Official → search for “PDF” → install.

Then insert and use a “Read PDF” activity on the file.

(I am assuming you won’t need OCR)

Then use a Write Line activity to check the text that was read.
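If the text is needed page by page rather than as one block, one option is to loop over the page numbers and read a single page on each pass. The Python sketch below only shows the shape of that loop. `read_page` is a hypothetical stand-in for whatever reads one page (in UiPath, possibly a Read PDF Text activity with its Range set to the page number), and `fake_read_page`, the file name and the page count are made up for the example.

```python
from typing import Callable, Dict


def read_all_pages(
    path: str,
    page_count: int,
    read_page: Callable[[str, str], str],
) -> Dict[int, str]:
    """Read every page separately and return the text keyed by page number."""
    pages = {}
    for number in range(1, page_count + 1):
        # The page range is passed as a string such as "1", "2", ...
        pages[number] = read_page(path, str(number))
    return pages


def fake_read_page(path: str, page_range: str) -> str:
    """Placeholder reader used only to make the sketch runnable."""
    return f"table data from page {page_range} of {path}"


pages = read_all_pages("report.pdf", 15, fake_read_page)
for number, text in pages.items():
    print(number, text)
```

Keeping the text keyed by page number makes it straightforward to run the same table extraction on each of the 15 pages in turn.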

Arshpreet_Singh · April 25, 2022, 6:16pm

Hi Sugam, I am facing the same issue in Document Understanding. Were you able to get a solution for this? Any help would be much appreciated. Thank you.

Rahul_Unnikrishnan · April 26, 2022, 3:48am

Hello,

Did you try the table extraction method? If it is not working, please give CV table extraction a try. It should work well here.
