Hello Everyone,
I am using the Read PDF with OCR activity because my PDF is an image, so I want to convert it to text, but I am not getting proper output :(. Could anyone please help me out?
Regards,
Hemal
Hi
OCR is never fully reliable, so we can’t expect a 100% correct result. Licensed OCR engines that run primarily in the cloud, like Google Cloud OCR or ABBYY Cloud, tend to give better results.
However, do note that they are paid.
Yes, true. So what do we do? What if there is handwritten text?
Hello Hemal,
From my perspective, OCR is just a last-resort option when it comes to automation. In my own tests, only 3% of the PDFs were scanned successfully. If you have more than one PDF, do the math (haha). Therefore I would suggest (depending on how many PDFs you have) copying and pasting the content into a Word document and reading it from there.
Have a great day!
Handwritten text cannot be processed by OCR, as OCR engines work well only for electronic text. You can try looking for a suitable ICR (Intelligent Character Recognition) engine.
However, I haven’t worked with one yet.
For ICR, try Parascript. I have not integrated it with RPA before, but I used it successfully in BPM applications.
It works best when the recognized text can be checked against something like a table or a confined set of parameters, such as an address database or a list of possible values. Names, emails, etc. are very difficult, as there are infinite possibilities.
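To illustrate the point about checking recognized text against a list of possible values, here is a rough Python sketch. It is not Parascript's API or anything built into the RPA tool; it only uses `difflib` from the Python standard library. The `KNOWN_CITIES` list, the `snap_to_known_value` helper, and the 0.7 cutoff are all made up for this example, so you would need to adjust them for your own data.

```python
from difflib import get_close_matches

# Hypothetical list of valid values, e.g. loaded from an address database.
KNOWN_CITIES = ["Mumbai", "Ahmedabad", "Bengaluru", "Hyderabad", "Chennai"]


def snap_to_known_value(raw_text, candidates, cutoff=0.7):
    """Return the candidate closest to the OCR/ICR output, or None.

    The comparison is case-insensitive. The cutoff (0.0 to 1.0) is an
    assumed threshold: higher values accept only near-exact matches.
    """
    # Map lowercase forms back to the original spelling of each candidate.
    lookup = {c.lower(): c for c in candidates}
    matches = get_close_matches(
        raw_text.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


if __name__ == "__main__":
    # A typical misread: the engine confused "a" with "o".
    print(snap_to_known_value("Ahmedabod", KNOWN_CITIES))
    # Text that resembles nothing in the list is rejected.
    print(snap_to_known_value("xyz123", KNOWN_CITIES))
```

This only helps for fields with a closed set of valid values. For free-form fields such as names or emails there is nothing to match against, which is why those remain difficult.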