There are two types of PDF: one is text-based (native), and the other is scanned and then converted to PDF. Is there any logic to tell these two types apart?
example11-23-2018_DELEGAIT-MULTIRATIONAL_RETAINER_Billing.pdf (179.3 KB)
Refer to the link below.
Native PDF - Data Scraping can be used to extract the data.
Scanned PDF - Use the Read PDF With OCR activity.
If you are familiar with regular expressions, you can use Regex on the text extracted from both native and scanned PDFs to pull out the data.
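To illustrate the Regex suggestion, here is a minimal Python sketch that pulls dates out of text that has already been extracted from a PDF. The pattern, the `find_dates` helper, and the sample sentence are all hypothetical and only show the idea. In a real workflow the pattern would depend on the fields you need, and text coming from OCR may contain misread characters, so a looser pattern may be needed there.

```python
import re

# Hypothetical pattern: dates written as MM-DD-YYYY (or M-D-YYYY).
DATE_PATTERN = re.compile(r"\b\d{1,2}-\d{1,2}-\d{4}\b")


def find_dates(text: str) -> list:
    """Return every date-like substring found in the extracted text."""
    return DATE_PATTERN.findall(text)


if __name__ == "__main__":
    # Made-up sample text standing in for the output of a PDF read step.
    sample = "Billing period ending 11-23-2018, payment due 12-07-2018."
    print(find_dates(sample))
```

The same function works on text from either kind of PDF, because it only looks at the string and not at how the string was produced.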
Can we separate these two types of PDF with a condition?
How will the robot recognize whether a PDF is scanned or native?
Use the Read PDF Text activity and check the length of the output string. If the length is greater than 0, it's a native PDF; otherwise it's a scanned PDF. A scanned PDF usually has no text layer, so the activity returns an empty string for it.
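The check above can be sketched in Python as follows. This is only an illustration of the condition, not UiPath code: `classify_pdf_text` and its `min_chars` parameter are hypothetical names, and the function assumes the text has already been read from the PDF by some earlier step. Trimming whitespace first is an added assumption, so that an output holding only spaces or line breaks is still treated as empty.

```python
from typing import Optional


def classify_pdf_text(extracted_text: Optional[str], min_chars: int = 1) -> str:
    """Return "native" if plain text extraction produced characters, else "scanned".

    extracted_text: the string returned by a plain (non-OCR) PDF text read.
    min_chars: hypothetical threshold; 1 matches the "length greater than 0" rule.
    """
    # Treat a missing result the same as an empty one, and ignore pure whitespace.
    meaningful = (extracted_text or "").strip()
    return "native" if len(meaningful) >= min_chars else "scanned"


if __name__ == "__main__":
    # Made-up strings standing in for the output of the read step.
    print(classify_pdf_text("Invoice total: 450.00"))
    print(classify_pdf_text("  \n"))
```

In a workflow, the result would drive an If condition: stay with the plain text output for a native PDF, and switch to the OCR read for a scanned one.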
Thank you so much.