OCR on a PDF: Targeting using Coordinates not accurate

Alkaros · December 12, 2016, 1:23am

That might be a poorly worded title, so here is the problem in a little more detail.

I have a large number of PDFs, all in the same format, that I want to read and output their information to a CSV. My sequence opens each PDF in PDF-XChange Viewer.

So for example, a chunk of the PDF looks like the below:

I use the recording mode to try and screen scrape, as seen below:

Using the google OCR, this works fine:

But when I run it in my sequence, it doesn’t return the same value shown above. It seems to scrape a similarly sized box a few pixels lower, i.e.:
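
To make the symptom concrete, here is a minimal Python sketch of how a region stored relative to a window could land a few pixels lower at run time. This is only an illustration under assumptions: the `Rect` type, the `to_screen` helper, and all the numbers are hypothetical, and nothing here is confirmed to be how the recorder or the OCR activity actually resolves coordinates.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    left: int
    top: int
    width: int
    height: int


def to_screen(window_origin: Tuple[int, int], region: Rect) -> Rect:
    """Translate a window-relative region into absolute screen coordinates."""
    origin_x, origin_y = window_origin
    return Rect(origin_x + region.left, origin_y + region.top,
                region.width, region.height)


# Hypothetical region, recorded relative to the viewer window.
recorded = Rect(left=120, top=300, width=200, height=24)

# Made-up window origins: the content area starts 6 px lower at run time.
at_record_time = to_screen((0, 0), recorded)
at_run_time = to_screen((0, 6), recorded)

# Same box size, but shifted down by the difference in origin.
vertical_drift = at_run_time.top - at_record_time.top
```

If something like this were the cause, the box would keep its size and only move, which matches what I am seeing.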

Should I be approaching this differently?

I’ve put the screenshots in an album, as I cannot attach more than one image as a new user. They seem to have ended up out of order:

[Album] imgur.com

beesheep · December 13, 2016, 4:03pm

Hello,

There is something about it here.

The user there needs to read the PDF and extract the invoice amount. There is a XAML file; could you please try it out and let us know your findings?

regards…
