Read PDF with OCR


#1

Hi everyone, I am using the “Read PDF with OCR” activity to read a scanned PDF, but the activity only reads the first page of the file. Do I have to change any properties to read the entire PDF?


#2

@ash_kettchup, hope you followed this:

Range - should be either “All” or left empty to read the entire PDF.

Many more references from forum…

Regards,
Dominic :slight_smile:


#3

Your suggestions helped me, thank you.

But in my scope of work I need to extract data from different PDF files.

The issue I am facing is:

Some PDF files give good output with native scraping, while others require OCR. Is there any way I can build logic so that the bot takes any PDF in a folder as input and decides by itself which way of reading it is better (native or OCR)?


#4

@ash_kettchup, may we know what your classification of good/not good is based on?

For example, scenario-wise:

  1. Readable pdf - good
  2. Scanned pdf - not good

BTW, is there any chance of knowing the PDF type in advance (based on the filename or something else)? If so, we can handle each type accordingly.

Regards,
Dominic :slight_smile:


#5

Thank you @Dominic. There are both readable and scanned PDFs.
There is no chance of knowing the PDF type until we open it.
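
Since the type is only known after opening the file, one possible approach is to always try native extraction first and fall back to OCR when it returns too little text. Below is a minimal Python sketch of that decision logic, not a UiPath workflow. The names `read_native` and `read_ocr` are hypothetical stand-ins for the two reading steps, the page count is assumed to be known, and the threshold of 50 characters per page is an arbitrary starting value that would need tuning on real files.

```python
from typing import Callable, Tuple


def looks_like_real_text(text: str, page_count: int,
                         min_chars_per_page: int = 50) -> bool:
    """Heuristic: native extraction 'worked' if it produced enough
    letters/digits per page. The threshold is an assumption."""
    if page_count <= 0:
        return False
    useful = sum(1 for ch in text if ch.isalnum())
    return useful / page_count >= min_chars_per_page


def read_pdf(path: str, page_count: int,
             read_native: Callable[[str], str],
             read_ocr: Callable[[str], str],
             min_chars_per_page: int = 50) -> Tuple[str, str]:
    """Try native extraction first; fall back to OCR if the result is sparse.

    read_native and read_ocr are hypothetical callables supplied by the
    caller. Returns (method_used, text).
    """
    native_text = read_native(path)
    if looks_like_real_text(native_text, page_count, min_chars_per_page):
        return "native", native_text
    return "ocr", read_ocr(path)


if __name__ == "__main__":
    # Stub readers that simulate a scanned PDF (no native text layer).
    method, text = read_pdf(
        "scanned.pdf", 3,
        read_native=lambda p: "",
        read_ocr=lambda p: "text recovered by OCR",
    )
    print(method, "->", text)
```

In a UiPath workflow the same idea would probably be a Read PDF Text activity, followed by an If activity that checks the output string, with Read PDF with OCR in the branch taken when the text is empty or very short.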