How to get information on each page of PDF

Hi!

I have a multi-page PDF, and I need to take a specific piece of information from each page and create an Excel spreadsheet containing that info.

To make things easier to understand, let’s say the PDF contains multiple invoices in one file. Each page is a new invoice with a new invoice number. The layout is the same on each page. I need to take the “invoice number” from each page and compile them into one list.

Thanks in advance!

Hi,

If you can extract the invoice number using the Read PDF Text activity and string manipulation, the following should help.

Regards,

Ok, I can try that. I was just trying to avoid string manipulation since I’m not very familiar with coding or UI Path.
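
For anyone in the same position, the string manipulation mentioned above can be fairly small. The sketch below is in Python rather than UiPath, and it is only an illustration of the idea, not the method from the earlier reply: it assumes the text of each page has already been read into a string, and that each page contains a label like "Invoice Number:" followed by the number. The label wording, the number format, the function name `extract_invoice_numbers`, and the sample data are all hypothetical.

```python
import re

# Assumption: each page's text contains a label such as "Invoice Number: INV-1001".
# The label wording and number format are hypothetical; adjust the pattern to the real layout.
INVOICE_PATTERN = re.compile(
    r"Invoice\s+Number\s*[:#]?\s*([A-Za-z0-9-]+)", re.IGNORECASE
)


def extract_invoice_numbers(page_texts):
    """Return one entry per page: the invoice number found, or None if there is no match."""
    results = []
    for text in page_texts:
        match = INVOICE_PATTERN.search(text)
        results.append(match.group(1) if match else None)
    return results


# Made-up sample data standing in for the text read from each PDF page.
pages = [
    "ACME Corp\nInvoice Number: INV-1001\nTotal: 50.00",
    "ACME Corp\nInvoice Number: INV-1002\nTotal: 75.00",
]
invoice_numbers = extract_invoice_numbers(pages)
print(invoice_numbers)
```

Because the layout is the same on every page, one pattern should be enough for the whole file. Pages where nothing matches come back as `None`, which makes it easy to spot a page that needs a closer look.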

Hi @heblightning

You can try the Document Understanding feature in UiPath.

I’m going to try this method:

Use the page count (an Int32) to drive a For Each activity that attaches to the open PDF. On each page, it takes the information I need and appends it to an Excel document, then uses the hotkey activity (Space) to go to the next page and repeat the process.

Did you try Document Understanding with the form extractors to extract data from the different pages, or did you try Read PDF with OCR?

I did not try Document Understanding. I was not able to find it.

I ended up using PDF Page Count to get the total number of pages as an Int32.
I then made a While loop with the condition totalPages >= counter.
Inside the loop, I used Click to double-click the information I needed, then Hotkey Ctrl + C, then Hotkey Right (so it moved to the next page).
From there I attached to a Notepad window and used Hotkey Ctrl + V, then Hotkey Enter.

So it copied the necessary information from each page and pasted it into a new text file, one entry per line. Now I just need to figure out how to get that text into an Excel file.
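
One possible way to handle that last step, sketched in Python rather than UiPath: read the text file line by line and write each value as a row of a CSV file, which Excel can open directly. This assumes the text file holds one copied value per line, as the Ctrl + V and Enter sequence would produce. The function name `lines_to_csv`, the column header, and the file names in the usage note are hypothetical.

```python
import csv


def lines_to_csv(text_path, csv_path, header="Invoice Number"):
    """Copy each non-empty line of a text file into a one-column CSV file.

    Returns the number of values written (not counting the header row).
    """
    # Read the pasted values, skipping blank lines and trimming whitespace.
    with open(text_path, encoding="utf-8") as src:
        values = [line.strip() for line in src if line.strip()]

    # newline="" lets the csv module control line endings, as its docs recommend.
    with open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow([header])
        writer.writerows([value] for value in values)

    return len(values)
```

A call such as `lines_to_csv("invoices.txt", "invoices.csv")` would produce a file with a header row followed by one invoice number per row. Inside UiPath the same idea would be built from activities instead of code, but the logic is the same: read the lines, then write them out as rows.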
