Check if document is a scanned pdf or not

SenzoD · May 18, 2020, 8:37am

Hi All,

I am working with PDF documents. About 90% of the time they are machine-generated, so I do not need OCR to extract text from them. Sometimes they are scanned documents, and for those I have set up Intelligent OCR to extract the required text, which works fine.

Now I want my robot to first check whether the document is scanned and, if it is, route it to the Intelligent OCR workflow for data extraction; otherwise it should just use the normal Read PDF workflow. How can I achieve this?

supermanPunch · May 18, 2020, 8:59am

@SenzoD If the Read PDF Text activity returns an empty string for scanned documents, you can start the Intelligent OCR workflow in that case. Does it return an empty value?
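
A minimal sketch of that check, written in Python only to illustrate the logic (in Studio the same idea would presumably be an If activity with a condition such as `String.IsNullOrWhiteSpace(pdfText)` on the Read PDF Text output). The names `needs_ocr`, `choose_workflow`, and `min_chars`, as well as the workflow labels, are hypothetical and not part of any UiPath API. The `min_chars` threshold is an assumption for scanned files that still yield a few stray characters.

```python
from typing import Optional


def needs_ocr(extracted_text: Optional[str], min_chars: int = 1) -> bool:
    """Return True when the text layer looks empty, which suggests a scanned PDF.

    extracted_text is whatever the plain (non-OCR) PDF text read returned.
    min_chars is a hypothetical threshold: fewer visible characters than this
    is treated as "no usable text layer".
    """
    if extracted_text is None:
        return True
    # Drop all whitespace (spaces, newlines, form feeds) before counting.
    visible = "".join(extracted_text.split())
    return len(visible) < min_chars


def choose_workflow(extracted_text: Optional[str]) -> str:
    """Pick the (hypothetically named) workflow to run for this document."""
    return "intelligent_ocr" if needs_ocr(extracted_text) else "read_pdf_text"


if __name__ == "__main__":
    print(choose_workflow("Invoice total: 42"))
    print(choose_workflow(" \n\f "))
```

Checking for whitespace-only output rather than a strictly empty string is a deliberate choice here, since a text read of a scanned page may return line breaks or page separators instead of nothing at all. Whether that happens with your documents is something to confirm by testing.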

SenzoD · May 18, 2020, 9:01am

@supermanPunch, thanks man, that sounds like a good idea. I have not checked whether it returns an empty string, but I will test it and let you know.

r.rydzewski · December 17, 2021, 9:30am

???
