Lesson 10 Practice 1: Can't find individual elements in a PDF

This lesson is about retrieving elements from a PDF document.

First I use Attach Window on the Acrobat Reader DC window, then Anchor Base and Find Element. The walkthrough implies that Find Element should be able to select the date label in the PDF document, but I only seem to be able to select the entire document rather than individual elements within it.

Maybe it's the PDF viewer I'm using (Acrobat Reader DC, a free download); perhaps it doesn't grant access to individual elements within a PDF. Is anyone aware of the viewer requirements for PDF scraping to work?

Same problem here. Can anyone explain this for us?

There are many ways to do this: Read PDF, data scraping, OCR, or an element scope.

When I work with PDFs, I first use Read PDF and then Find Element, etc.
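As a rough illustration of the "read the text first, then look for the value" idea, here is a minimal Python sketch. It is not UiPath code and not the lesson's method: it assumes the PDF text has already been extracted into a plain string, and the function name `find_labelled_value`, the sample text, and the "Label: value" layout are all hypothetical.

```python
import re
from typing import Optional


def find_labelled_value(pdf_text: str, label: str) -> Optional[str]:
    """Return the text that follows `label` on the same line, or None.

    Assumes (hypothetically) that the extracted text keeps each
    "Label: value" pair on a single line.
    """
    pattern = re.compile(
        rf"^{re.escape(label)}\s*:?\s*(.+?)\s*$",
        re.MULTILINE | re.IGNORECASE,
    )
    match = pattern.search(pdf_text)
    return match.group(1) if match else None


# Made-up sample standing in for text read from a PDF.
sample_text = "Invoice\nDate: 2018-03-14\nTotal: 120.00\n"

if __name__ == "__main__":
    print(find_labelled_value(sample_text, "Date"))
```

This only works when the extracted text preserves the label and its value on one line; if the reader returns the text in a different order, the pattern would need adjusting.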

If you use Adobe Acrobat Reader, go to Edit → Accessibility → Change Reading Options (or press Ctrl+Shift+5).
You have to change the reading order; after that, the selector will pick up individual elements of your PDF rather than the entire panel.

I forgot to say: you have to choose the left-to-right, top-to-bottom reading order.

It doesn’t work for me either.
It seems that the ‘Find Element’ and ‘Get Text’ activities don’t work with cloud PDF readers like Adobe Reader DC (I also tried Foxit).
Can you recommend a solution for lesson 10? I believe the majority of users would use free PDF reader software.

Hi Anna!

Have you tried using the Find Image activity?

If the PDF is opened with Adobe Acrobat, there may be a few steps to take before you can extract specific elements with the UiPath Studio activities. Start Acrobat and press Ctrl+K to open the Preferences dialog. Select Reading from the categories in the left panel. If the Reading Order dropdown is set to the option Acrobat recommends, “Infer reading order from document”, the whole document is exposed as a single selector, so this needs to change. In the dropdown, select “Use reading order in raw print stream”, which exposes each individual element in the PDF. Then click Accessibility in the left panel. In the Other Accessibility Options section, uncheck the first two boxes, “Use document structure for tab order when no explicit tab order is specified” and “Enable assistive technology support”, and click OK.
