Extracting Specific PDF Elements (Anchor Base/Find Element only selects Entire Page)

ADP Master Control Sample.pdf (449.4 KB)

Here is a sample of a PDF I’m trying to extract data from. I’m trying to get the 401k contribution per person for the entire document (it starts on page 8, right side). I want to select the “Y 401k” text with Find Element and then use Get Text on the number to its left for the contribution amount. However, when I try to select “Y 401k” for Find Element, the entire page is selected.

Does anyone think getting the 401k contribution per person is something UiPath can do with this type of PDF? Or is there a way to grab just the specific element and not the entire page? I’ve also tried Find Image, but it gives me gibberish results.

You can follow these steps:

  1. Open Web recording.

  2. Go to Text > Scrape > Scrape Relative.

  3. Select the text “Y 401K”, click “Indicate” to indicate the relative region, and select the numeric part on the left.

  4. UiPath selects the “Google OCR” method. Update the settings as below:

  5. It will extract the numeric value.

This works perfectly for one occurrence. Thank you!

However, I’m having trouble figuring out how to get it to find this for all occurrences on all pages. I can only get it to grab one. It will need to go through every page and get them all (sometimes hundreds of pages).

Is this possible? It would help me SO much to get this working. Thank you
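
One option, sketched here outside of UiPath and only under assumptions: if the PDF has a text layer (so that something like the Read PDF Text activity returns the page text), every occurrence could be picked out with a regular expression instead of scraping the screen. The function name `find_401k_amounts`, the pattern, and the sample text are all made up for illustration; the real report layout may need a different pattern.

```python
import re
from decimal import Decimal

# Assumed layout: an amount such as "1,050.25" followed on the same line by
# the marker "Y 401K". This is a guess based on the description in the post.
PATTERN = re.compile(r"(-?[\d,]+\.\d{2})[ \t]+Y[ \t]*401K", re.IGNORECASE)


def find_401k_amounts(pages):
    """Return (page_number, amount) for every match.

    `pages` is a list of strings, one per page of already extracted text.
    """
    results = []
    for page_number, text in enumerate(pages, start=1):
        for match in PATTERN.finditer(text):
            amount = Decimal(match.group(1).replace(",", ""))
            results.append((page_number, amount))
    return results


# Made-up sample text standing in for two extracted pages.
sample_pages = [
    "DOE, JANE   125.00 Y 401K\nSMITH, BOB  1,050.25 Y 401K",
    "No contributions on this page",
]
matches = find_401k_amounts(sample_pages)
```

Because this works on text rather than on screen images, it would not depend on OCR at all, but it only helps if the page text can actually be read from the file.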

Hi @ds56,

You can send the hotkey Ctrl + F, then type the text “y 401 k” and press Find until the “Element Exists” activity finds:

image

So you will know that you have to loop the scraping logic 30 times.
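
The loop idea can be sketched in Python as below. This is only an illustration of the control flow, not UiPath code: `find_next` and `scrape_amount` are hypothetical stand-ins for the “press Find, then check Element Exists” step and the Scrape Relative step, and `max_hits` is an assumed safety limit so the loop cannot run forever.

```python
def scrape_all(find_next, scrape_amount, max_hits=1000):
    """Collect one scraped value per search hit.

    find_next: hypothetical callable, returns True while another match exists.
    scrape_amount: hypothetical callable, returns the value next to the match.
    """
    values = []
    while len(values) < max_hits and find_next():
        values.append(scrape_amount())
    return values


# Simulated run: three search hits, then no more matches.
hits = iter([True, True, True, False])
amounts = iter(["125.00", "1,050.25", "80.00"])
result = scrape_all(lambda: next(hits), lambda: next(amounts))
```

With this shape the number of iterations does not need to be known in advance; the loop stops as soon as the search reports no further match.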

Hope that solves it.

Thanks! I created a Do While loop to go through the searches and the screen scraping. Unfortunately, I found that Scrape Relative worked well for a couple of them, but most of the Google OCR results are either the wrong number, a number that doesn’t exist on the sheet, or missing the first part of the number.

It looks like there’s at least an 80% chance of errors, so it doesn’t look like UiPath will be able to help with this data extraction.
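
If OCR stays in the picture, one partial safeguard is to check that each result at least looks like a dollar amount and to set aside the rest for manual review. The sketch below assumes amounts are formatted like “1,050.25”; the names `AMOUNT` and `split_ocr_results` and the sample values are hypothetical. Note that this only catches malformed strings. A well-formed but wrong number would still pass.

```python
import re

# Assumed format: digits with optional thousands separators and two decimals.
AMOUNT = re.compile(r"^\d{1,3}(,\d{3})*\.\d{2}$")


def split_ocr_results(raw_values):
    """Split OCR strings into plausible amounts and suspect values."""
    ok, suspect = [], []
    for raw in raw_values:
        text = raw.strip()
        if AMOUNT.match(text):
            ok.append(text)
        else:
            suspect.append(text)
    return ok, suspect


# Made-up OCR output: two clean values, two damaged ones, one with padding.
ok, suspect = split_ocr_results(["125.00", "1,050.25", "l25.O0", "50.2", " 80.00 "])
```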

Unless you have another idea?

Could anyone please look into this issue? I’d appreciate it.