Extracting the data from image based pdf

learner9 · March 20, 2020, 8:10am

Hi,
I have tried different OCR engines to extract the data from an image-based PDF, but I can't get accurate data. Is there another way to do it?

shalinisettu · March 20, 2020, 8:13am

@learner9 use the extract image from PDF activity…

Palaniyappan · March 20, 2020, 8:14am

Hi, we can try this:
To keep it simple, start with the READ PDF WITH OCR activity and try different values for its scale property.
It gives us a string as output.
Then we can manipulate the string with either the Split or the Regex method.
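
As a rough illustration of the Regex step, here is a minimal Python sketch (not a UiPath workflow). The sample OCR text, the field names, and the patterns are all hypothetical; real OCR output will differ and the patterns would need adjusting to the actual document layout.

```python
import re

# Hypothetical OCR output; the string returned by a real OCR step will differ.
ocr_text = """Invoice No: INV-00123
Date : 20/03/2020
Total Amount: 1,250.50
"""

# Hypothetical field patterns. \s* tolerates the stray spaces OCR often inserts.
PATTERNS = {
    "invoice_no": r"Invoice\s*No\s*:\s*(\S+)",
    "date": r"Date\s*:\s*(\d{2}/\d{2}/\d{4})",
    "total": r"Total\s*Amount\s*:\s*([\d,]+\.\d{2})",
}


def extract_fields(text):
    """Return a dict of field name -> captured value (None if not found)."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        fields[name] = match.group(1) if match else None
    return fields


fields = extract_fields(ocr_text)
print(fields)
```

A field that comes back as None is a useful signal that the OCR misread the label, which is a good moment to retry with a different scale or engine before loosening the pattern.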

Or

We can use ABBYY FlexiCapture to get the data from the PDF, but it needs a license.

Cheers @learner9

learner9 · March 20, 2020, 8:23am

I have tried all the OCR engines, but none of them gives accurate data. I also tried ABBYY FlexiCapture and found the same issue. Could you suggest a different OCR?

Palaniyappan · March 20, 2020, 8:45am

Did you try changing the scale property of the OCR engines and checking the output?
Cheers @learner9
