Image PDF scraping

qwerty123 · October 9, 2017, 7:26am

Hi

I need to scrape some details from a scanned (image) PDF. I used the Anchor Base activity with Find Image (pointing to the label) as the anchor and Get OCR Text as the activity to extract the value.

However, when I execute this flow, I get an error saying “Value does not fall within the expected range.”
The error comes from the Get OCR Text activity.

Is it even possible to use Anchor Base with a scanned image PDF?

Please suggest.

ovi · October 9, 2017, 9:51am

Hi!

Could you share a sample of your scanned PDF? I think this error might occur because it doesn’t recognize what you are indicating.

Tiberiu_Niculescu · October 9, 2017, 11:14am

Try using the “Read PDF With OCR” activity to get the full text, then do some string manipulation using Substring and other methods to get the needed details.
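
To illustrate the substring idea, here is a minimal sketch in Python rather than in UiPath's own VB.NET expressions (the same approach maps to IndexOf and Substring there). The function name text_after_label and the sample OCR text are made up for the example; real OCR output is usually noisier, so treat this as a starting point, not a finished solution.

```python
def text_after_label(full_text, label, stop="\n"):
    """Return the text that follows `label`, up to the next `stop` string.

    Returns None when the label is not found in the text.
    """
    start = full_text.find(label)
    if start == -1:
        return None
    start += len(label)
    end = full_text.find(stop, start)
    if end == -1:
        end = len(full_text)
    # Drop the separator after the label (colon, spaces) and stray line endings.
    return full_text[start:end].strip(" :\t\r")


# Hypothetical OCR output; a real scan will rarely be this clean.
sample_text = "Invoice No: 12345\nDate: 09/10/2017\nTotal: 250.00\n"

invoice_no = text_after_label(sample_text, "Invoice No")
invoice_date = text_after_label(sample_text, "Date")
missing = text_after_label(sample_text, "Customer")

print(invoice_no, invoice_date, missing)
```

If the label is misread by the OCR engine, the lookup returns None, so the workflow should check for that case before using the value.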

Chaithanya · January 11, 2018, 10:54am

Do you have any examples of string manipulation? If so, please attach them here.

Chaithanya · January 18, 2018, 5:28am

Hi, I am new to UiPath and need some help extracting text from a scanned image PDF stored in a particular location. The PDF document contains images with a similar structure, and I need to extract specific text (i.e. Name, Age, Father Name…) from each image. The extracted information should be stored in a .txt file or an Excel file.
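
One possible way to approach this, sketched in Python under several assumptions: the OCR text of each page is already available as a string, every field sits on its own line as "Label: value", and CSV is an acceptable output (Excel opens it directly). The field list comes from the post above; the function names and sample pages are invented for illustration.

```python
import csv
import io
import re

FIELDS = ["Name", "Age", "Father Name"]


def extract_fields(page_text):
    """Pick each labelled value out of one page of OCR text."""
    record = {}
    for field in FIELDS:
        # Anchor at the start of a line so "Name" does not match "Father Name".
        pattern = rf"^[ \t]*{re.escape(field)}[ \t]*[:\-]?[ \t]*(.*)$"
        match = re.search(pattern, page_text, flags=re.IGNORECASE | re.MULTILINE)
        record[field] = match.group(1).strip() if match else ""
    return record


def write_records(records, file_obj):
    """Write the extracted records as CSV with one header row."""
    writer = csv.DictWriter(file_obj, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


# Hypothetical OCR text for two pages with the same layout.
pages = [
    "Name: Ravi Kumar\nAge: 34\nFather Name: Suresh Kumar\n",
    "Name - Anita Rao\nAge : 29\nFather Name: Mohan Rao\n",
]

records = [extract_fields(page) for page in pages]

# An in-memory buffer keeps the sketch self-contained; a real run would
# pass a file opened with open(path, "w", newline="") instead.
buffer = io.StringIO()
write_records(records, buffer)
print(buffer.getvalue())
```

A field the OCR failed to read ends up as an empty cell, which makes bad pages easy to spot in the output.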

harjyot123 · June 14, 2018, 1:07am

Hi,

I am trying to scrape data from a digital PDF. I have tried different OCR methods and data scraping, but I am unable to identify which check boxes are checked. Any ideas?
