Read Full PDF Text using OCR vs. using Anchor-Based Approach

deepak_upreti · March 15, 2018, 8:05pm

Hi Team,

I have multiple scanned PDFs (containing invoice numbers and other information) that need to be processed automatically. The objective is to fetch multiple values from each PDF and process them. I have tried two different approaches:

Read Full PDF Text Using Google OCR
The extracted text is not 100% accurate, and it contains many spurious characters. It is also difficult to come up with a generic method for pulling values out of the output string, because the position of each value changes from one PDF to the next.
Anchor Base
This requires the PDF to be open on the system. The anchor relies on the position and dimensions of the Get Text region, which depend on the screen resolution or size and on the version or type of PDF reader used.
So it is very unlikely that a XAML file that works on my system will also work on other systems.
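
To make the problem with the first approach concrete: one way to avoid depending on position in the OCR output is to search for a label and take the value that follows it. The Python sketch below is only an illustration of that idea, not a UiPath activity. The field names, the label patterns, and the sample text are all hypothetical and would have to be adapted to the real invoices.

```python
import re

# Hypothetical label patterns (assumptions, not taken from any real invoice).
# Each pattern looks for a label and captures the value that follows it,
# so the position of the value in the OCR string does not matter.
FIELD_PATTERNS = {
    # The captured value must contain at least one digit, so a following
    # word such as "Date" is not mistaken for an invoice number.
    "invoice_number": re.compile(
        r"invoice\s*(?:no|number|#)?\s*[.:#-]*\s*([A-Z0-9/-]*\d[A-Z0-9/-]*)",
        re.IGNORECASE,
    ),
    "invoice_date": re.compile(
        r"date\s*[.:-]*\s*(\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4})",
        re.IGNORECASE,
    ),
}


def extract_fields(ocr_text):
    """Return a dict of field name -> extracted value (None if not found)."""
    # Collapse whitespace runs so OCR line breaks do not split a label
    # from its value.
    text = re.sub(r"\s+", " ", ocr_text)
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


# Made-up OCR output with stray characters and line breaks.
sample = "ACME Ltd ~~ \n Invoice No : INV-20418 \n Date: 15/03/2018 | Total 120.00"
fields = extract_fields(sample)
```

This still fails when OCR misreads the label itself, so it reduces the dependence on position but does not solve the accuracy problem.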

Please let me know in which scenarios I should read the full PDF text with OCR and in which I should use the anchor-based approach. Also, are there any limitations in UiPath when extracting data from scanned PDFs?

Cheers!
