UiPath only able to read blocks of text in PDF instead of specific values

UiPath can read PDF text, but in the PDFs I am working with it only recognizes blocks of text rather than distinct values. Below is a screenshot of an example where I am trying to capture the Meter #, and the bot can only output the block of text indicated by the red box. Is there a better way to pull specific values out of a PDF like this, instead of pulling in a block of text and using regex to extract the values I want?

I would post the file, but new users don’t have the ability to post files yet.

Any help is greatly appreciated!

Did you use the Read PDF Text activity, or the one with OCR?

Hey @bcorrea!

I have used both methods. The issue is that the PDF document is 70 or more pages long, so when I use Read PDF Text, for example, the output is a single string that includes every page of the document. Additionally, I still need to pick out specific values, so even if it is all in text format, I would still need to use regex to pick it apart.

If you know which page the text is on, you can set the Range property so that only that page is read.

I should clarify the ultimate goal is to find specified text on each page of the document.

OK, what exactly am I seeing in your screenshot? The PDF itself? If your PDF is text-based (in Acrobat Reader you can search for a word like "meter" and it gets found), this should be easy. You just need to decide: do you need to know which page each value was found on, or do you just need a collection of values?
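To make that choice concrete, here is a minimal sketch in Python (not UiPath itself) of pulling a labelled value out of per-page text. It assumes the text of each page is already available as a separate string, for example by reading one page at a time, and it assumes the label looks like "Meter #: 12345678". The pattern, the helper names `find_meter_numbers` and `flatten`, and the sample pages are all made up for illustration, so adjust them to the real document.

```python
import re

# Assumed label format, e.g. "Meter #: 12345678" or "Meter # 87654-A".
# This pattern is a guess; adapt it to how the value appears in the real PDF text.
METER_PATTERN = re.compile(r"Meter\s*#\s*:?\s*([A-Za-z0-9-]+)")


def find_meter_numbers(page_texts):
    """Return {page_number: [values]} for every page with at least one match.

    page_texts is a list of strings, one per page, in page order.
    Page numbers start at 1.
    """
    results = {}
    for page_number, text in enumerate(page_texts, start=1):
        matches = METER_PATTERN.findall(text)
        if matches:
            results[page_number] = matches
    return results


def flatten(results):
    """Collapse the per-page mapping into one list of values, in page order."""
    return [value for page in sorted(results) for value in results[page]]


# Hypothetical sample pages standing in for the text read from the PDF.
sample_pages = [
    "Account 001\nMeter #: 12345678\nUsage 450 kWh",
    "Summary page with no meter information",
    "Account 002\nMeter # 87654-A\nUsage 120 kWh",
]

by_page = find_meter_numbers(sample_pages)
all_values = flatten(by_page)
```

Keeping the per-page mapping answers "which page was each value found on", and `flatten` gives the plain collection if the page does not matter.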

The screenshot shows a block of text that UiPath is able to detect when I screen scrape. A full page looks like the following:

I would, for instance, be trying to find Meter # on each page of the 70+ page document.

Oh, so you are scraping directly from the PDF opened in something like Adobe Acrobat? Then it will be easy to get only the meter information: use the Anchor Base activity with a simple Get Text and be happy.

Hi @Robert_Schauer,
I came across the document you shared in a previous topic. It was really helpful!
I’m working on a similar task and was wondering if you could please share more related documents or guidance.
It would be a big help. Thank you in advance!