Two types of PDF format

Kuenzang · February 4, 2019, 4:46am

Hello experts,
There are two types of PDF: text-based (native) PDFs, and documents that were scanned and then converted to PDF. Is there any logic to tell these two types apart?
example11-23-2018_DELEGAIT-MULTIRATIONAL_RETAINER_Billing.pdf (179.3 KB)

anil5 · February 4, 2019, 4:51am

Refer to the link below.

anil5 · February 4, 2019, 4:53am

Native PDF - Data Scraping can be used to extract the data.
Scanned PDF - Use the Read PDF With OCR activity.

If you are familiar with regular expressions, you can use Regex on the text extracted from both native and scanned PDFs to pull out the data.
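
As a rough illustration of the Regex idea, here is a minimal Python sketch (not UiPath workflow code). It assumes the PDF text has already been extracted into a string, whether from a native or a scanned file. The sample text, field names, and patterns are all made up for the example and would need adjusting to the real document layout.

```python
import re

# Hypothetical text, as it might come back from either extraction route.
sample_text = """Invoice No: INV-1042
Date: 11/23/2018
Total: 1,250.00"""

# Hypothetical patterns; adjust them to the layout of the real document.
PATTERNS = {
    "invoice_no": re.compile(r"Invoice No:\s*(\S+)"),
    "date": re.compile(r"Date:\s*(\d{2}/\d{2}/\d{4})"),
    "total": re.compile(r"Total:\s*([\d,]+\.\d{2})"),
}


def extract_fields(text):
    """Return the first match for each pattern, or None if it is absent."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


print(extract_fields(sample_text))
```

OCR output tends to be noisier than native text, so patterns for scanned PDFs usually need to be more tolerant of extra spaces or misread characters.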

Kuenzang · February 4, 2019, 4:59am

Can we separate these two types of PDF with a condition?

Kuenzang · February 4, 2019, 5:00am

How will the robot recognize whether it is a scanned PDF or a native PDF?

anil5 · February 4, 2019, 6:17am

Hi,

Use the Read PDF Text activity and check the length of the output string. If the length is greater than 0, it is a native PDF; otherwise it is a scanned PDF.
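
The same check can be sketched in Python (again, not UiPath workflow code). It assumes the text has already been read from the PDF by a plain text extractor with no OCR. The function name `classify_pdf` and the `min_chars` threshold are hypothetical, and trimming whitespace is an extra precaution beyond the plain length check, so that stray blank lines are not counted as text.

```python
def classify_pdf(extracted_text, min_chars=1):
    """Classify a PDF as 'native' or 'scanned' from its extracted text.

    extracted_text: whatever the plain (non-OCR) text extraction returned.
    min_chars: hypothetical threshold; 1 mirrors the "length > 0" check.
    """
    # Strip whitespace so blank lines or form feeds do not count as text.
    meaningful = (extracted_text or "").strip()
    return "native" if len(meaningful) >= min_chars else "scanned"


print(classify_pdf("Invoice No: INV-1042"))
print(classify_pdf("  \n\x0c"))
```

In a workflow, the "native" branch would carry on with the text already read, and the "scanned" branch would hand the file to OCR. A scanned PDF that carries a hidden text layer would come out as "native" under this check.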

Kuenzang · February 4, 2019, 6:35am

Thank you so much.

system · February 7, 2019, 6:35am

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
