Relative Scraping


#1

Hi,

I have a problem getting values from a PDF document. When I try to use the Get Text activity, the whole PDF document gets selected (blue), so there are no individual elements to select for the numbers.

I have used Relative Scraping from the Citrix recording. It works fine, but the PDF document contains a table, and the numbers I want are below the table. If the number of table rows stays the same, getting the value works fine. But if the number of table rows changes, it picks up the value from the wrong place, from inside the table. So the Get OCR Text activity doesn’t stay in sync with the Find Image activity’s coordinates. I have attached a picture of the PDF. As the image for the Find Image activity I use the words “KELA tilausva…4,5%” on the left side, and the value that Get OCR Text is pointed at is “-2.03” on the right side.

The Set Clipping Region activity has Direction set to “TRANSLATE” and Size set to (-957, 0, 832, 0).

In the picture the table has only one row, but other reports might have 10 or more…

-mikko


#2

If at all possible, try to get the report in a different format as that’d be your best course of action.

Since it isn’t recognizing the table, have you tried using the ‘Read PDF Text’ activity? If that doesn’t work, then have you tried the ‘Read PDF With OCR’ activity?

Both of these will output a single string of the PDF (or portion of the PDF, if you change the input range). Then you can use string manipulation to find your value.
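As a rough illustration of that string manipulation, here is a minimal sketch in Python rather than in a UiPath workflow. The sample text, the label "Adjustment 4,5%" and the helper name `find_value_after_label` are all made up for the example, and it assumes the extracted text keeps the label and its value on the same line, which may not hold for every PDF. The idea is to search for the label text instead of relying on a fixed position, so the number of table rows above it no longer matters.

```python
import re

# Matches numbers such as 12, -2.03 or 4,5 (dot or comma as decimal separator).
NUMBER = re.compile(r"-?\d+(?:[.,]\d+)?")

def find_value_after_label(text, label):
    """Return the last number that follows `label` on its line, or None."""
    for line in text.splitlines():
        if label in line:
            # Only inspect what comes after the label, so digits inside
            # the label itself (for example a percentage) are ignored.
            rest = line.split(label, 1)[1]
            matches = NUMBER.findall(rest)
            if matches:
                return float(matches[-1].replace(",", "."))
    return None

# Hypothetical text, standing in for the string the PDF activity outputs.
sample = """Item A 10.00
Item B 12.50
Adjustment 4,5% -2.03
"""

print(find_value_after_label(sample, "Adjustment 4,5%"))
```

The same approach should carry over to the string functions available inside the workflow: split the output into lines, find the line containing the label, and take the number that follows it.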

You’ll definitely need to play around with the settings if you need to use OCR, & it won’t be 100% accurate. The text looks pretty standardized though, so it might end up ok as long as the alternating white/gray doesn’t screw things up.


#3

Got it working. I recorded it again and it started working 🙂

I also tried the Read PDF Text and Read PDF With OCR activities. I think they might be a better solution than messing around with Acrobat. I just need to study string manipulation; maybe the UiPath forum has some threads about it.

Thanks for the help, Dave!

-mikko