Hi,
I’m using the ReadPDFText activity to read PDF documents that contain non-English text.
It works well, but when I try to read a digitally signed PDF document (an invoice), the output is mostly gibberish.
Attached is the main part of the output as a text file (part of the output was OK, but I removed it because it contains private information): FileContent.txt (1.8 KB)
Is there a way to correctly extract the text from a digitally signed PDF? Please advise.
Thank you. I tried Read PDF With OCR, with both Google OCR and Microsoft OCR, but the result contains only a small part of the content.
Is there a chance that other OCR engines would give better results? How can I tell which OCR engine I need?