Best Practice to Get Partial Data From PDF

pdf
activities

#1

Been following Academy’s Foundation Training and don’t really get it.

  1. From the training I got this understanding, am I on the right path?

There are many ways we can read PDF.

If you want to read all of it: use the Read PDF Text activity for text data and Read PDF with OCR for image data.
If you want to read only part of it, use Get Text (with or without an anchor).

  2. I noticed Get Text can only read the data when the required data is visible on screen (application is open, application window not minimized, not overlapped by another app). If I understand correctly, this invites many possibilities of error when running the workflow. What if I want to read the Invoice No on page 1 and the Total on page 3? Should I scroll manually, as described in How to scroll down the pdf page?

Is this how it is supposed to work in UiPath (which is why I have to make sure the application is open, not minimized, not overlapped, wrap things in Try Catch, etc.), or is there a better way to do this? (I read about using another input/output method such as Simulate Click or Send Window Message, but these do not seem to be available for Get Text.) What am I missing?

Hoping for some enlightenment.


#2

Hi all,

I also ran into this issue with Get Text only retrieving the visible parts of a PDF. Any suggestions or best practices on how to approach such situations? (Scrolling seems rather error-prone to me…)

I too would really appreciate some inputs :slight_smile: thanks!


#3

Hi,
I found it practical to read all the text from the PDF and then retrieve the required data using RegEx.
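
A minimal sketch of the idea, written in Python only for illustration. The sample text, the field labels ("Invoice No", "Total") and the function name `extract_fields` are assumptions, not taken from a real invoice; in a workflow the text would be whatever the PDF-reading step returns, and the same patterns could likely be applied with UiPath's Matches activity.

```python
import re

# Hypothetical text standing in for the full output of a "read all PDF text" step.
SAMPLE_TEXT = """ACME Ltd.
Invoice No: INV-2041
(line items on the following pages)
Subtotal: 1,100.00
Total: 1,234.50
"""

# Assumed label formats; adjust the patterns to the real document layout.
INVOICE_NO = re.compile(r"Invoice\s+No\.?\s*:?\s*(\S+)", re.IGNORECASE)
# Anchored to the start of a line so that "Subtotal" is not matched.
TOTAL = re.compile(r"^\s*Total\s*:?\s*([\d.,]+)", re.IGNORECASE | re.MULTILINE)


def extract_fields(text):
    """Return the invoice number and total found in text, or None for each if absent."""
    invoice = INVOICE_NO.search(text)
    total = TOTAL.search(text)
    return {
        "invoice_no": invoice.group(1) if invoice else None,
        "total": total.group(1) if total else None,
    }


if __name__ == "__main__":
    print(extract_fields(SAMPLE_TEXT))
```

Because the whole text is read in one go, it does not matter which page a value is on or whether the PDF window is visible.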

Cheers
Josef


#4

Thank you for your advice, @J0ska.
Are there any references on how to do this in practice?