PDF OCR scrape partial region?

  • What’s the sequence of activities to tell the robot roughly where to look for a certain piece of text and then scan it using OCR?
  • Also, how do I extend this to look for certain features, so the robot gets better at finding where the text is?
  • To open the PDF in IE, do I use Open Browser or Open Application?
  • When doing this, does it scan what’s on the screen after opening the program used to open the PDF, or does it perform OCR on the document itself?
  • Do I need to include behaviour to scroll through the PDF if I want to scan the document at 100% size, and how do I do this?
  • Sometimes IE opens the PDF with an additional side frame from Acrobat. Will this be an issue?

The PDF is a scanned printed document that was originally typed in MS Word and follows a certain template. I need to extract some variable text from it. I will be processing several of these documents, and the exact location to be scanned might vary slightly due to printing etc. (I won’t be posting a sample document, though.)

  • Use the ‘Open Application’ activity to open the PDF reader.
    In the Properties pane - Argument: location of the PDF file
    File Name: location of AcroRd32.exe
  • Use any one of the OCR methods to read the data from the PDF.
  • If your information is at the bottom of the page, send the ‘pgdn’ key using the ‘Hot keys’ activity.
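
For comparison, the first step above (launch the reader executable with the PDF path as its argument) can be sketched outside UiPath in Python. This is only an illustration of the "file name plus argument" idea, not what the Open Application activity does internally. The function names are hypothetical, and "AcroRd32.exe" and "scan.pdf" stand in for the full paths on your machine.

```python
import subprocess


def build_reader_command(reader_exe, pdf_path):
    """Return the argument list: the executable first, then the PDF as its argument."""
    return [str(reader_exe), str(pdf_path)]


def open_pdf(reader_exe, pdf_path):
    """Launch the reader on the PDF without waiting for it to exit."""
    return subprocess.Popen(build_reader_command(reader_exe, pdf_path))


# Placeholder names; replace them with the full paths on your system.
cmd = build_reader_command("AcroRd32.exe", "scan.pdf")
print(cmd)
# To actually launch the reader: open_pdf("AcroRd32.exe", "scan.pdf")
```

Passing the command as a list rather than a single string avoids quoting problems when either path contains spaces.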

I am currently looking at the output of Scrape Relative. After Find Image, how does Get OCR Text know where to scan? How does Find Image pass location info to the Get OCR Text activity? Currently the Clipping Region input is empty.
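
To make the question concrete, here is how I picture the geometry of relative scraping, as a minimal Python sketch. It is not UiPath's implementation. It assumes the anchor image has been located as a rectangle in screen pixels and that the text sits at a fixed offset from it. `Rect`, `region_relative_to_anchor` and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in screen pixels (origin at top-left)."""
    left: int
    top: int
    width: int
    height: int


def region_relative_to_anchor(anchor, dx, dy, width, height):
    """Return the region to OCR, positioned relative to the anchor.

    dx and dy are offsets from the anchor's top-left corner to the
    region's top-left corner; width and height are the region size.
    """
    return Rect(anchor.left + dx, anchor.top + dy, width, height)


# Hypothetical values: a label found at (120, 300), with the text to
# read starting 10 px to the right of the label.
anchor = Rect(left=120, top=300, width=80, height=20)
region = region_relative_to_anchor(anchor, dx=anchor.width + 10, dy=0,
                                   width=200, height=20)
print(region)
```

Because the region is derived from wherever the anchor was found, it moves with the anchor when the scan is shifted on the page.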

What other activities should I know about so that the bot knows what to look for, can specify the region to be scanned, and actually scans it?

How do I account for minor variations in case the document is a bit off in one direction?
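
One approach I can think of is to scan a slightly larger box than the nominal one and then pick the wanted text out of the OCR result. Below is a minimal Python sketch of that padding step, assuming pixel coordinates with the origin at the top-left of the page. `pad_region`, the margin and the page size are hypothetical values for illustration.

```python
def pad_region(left, top, width, height, margin, page_width, page_height):
    """Grow a region by `margin` pixels on every side, clamped to the page.

    Returns (left, top, width, height) of the enlarged region.
    """
    new_left = max(0, left - margin)
    new_top = max(0, top - margin)
    new_right = min(page_width, left + width + margin)
    new_bottom = min(page_height, top + height + margin)
    return new_left, new_top, new_right - new_left, new_bottom - new_top


# Hypothetical numbers: a 200x20 box with a 15 px tolerance on a
# 1240x1754 page image.
padded = pad_region(210, 300, 200, 20, margin=15,
                    page_width=1240, page_height=1754)
print(padded)
```

The trade-off is that a larger margin tolerates more drift but is more likely to pull in neighbouring text, which then has to be filtered out of the OCR output.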

What if I press Page Down and the part I want to scan is partially cut off?
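
What I would ideally want, instead of fixed Page Down steps, is to scroll to a position where the target region is fully inside the visible area. This is a minimal Python sketch of that calculation. It assumes the viewer's scroll position can be set in pixels, which may not be possible in practice. `scroll_offset_for` and the numbers are hypothetical.

```python
def scroll_offset_for(region_top, region_height, viewport_height, page_height):
    """Return a vertical scroll offset that shows the whole region.

    The region is centred in the viewport where possible, and the
    offset is clamped so the view never scrolls past the page ends.
    """
    if region_height > viewport_height:
        raise ValueError("region is taller than the viewport")
    centred = region_top + region_height // 2 - viewport_height // 2
    max_offset = max(0, page_height - viewport_height)
    return min(max(centred, 0), max_offset)


# Hypothetical numbers: a 40 px tall region at y=1000 on a 1754 px page,
# viewed through an 800 px tall window.
offset = scroll_offset_for(1000, 40, 800, 1754)
print(offset)
```

With the offset chosen this way, the region can only be cut off when it is taller than the viewport itself, which the function rejects.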

Sometimes IE opens the PDF with an additional side frame from Acrobat. Will this be an issue?

How do I set the zoom to 100%? I only see the Zoom In and Zoom Out buttons, but no zoom percentage value.