I’m trying to read the text from a PDF using the Read PDF With OCR activity from UiPath.PDF.Activities with the Microsoft OCR engine. The problem is that it seems to read only the last page, or only the first 5 or 6 pages. Because I’m reading multiple files with different numbers of pages, I cannot hard-code the range. I saw some other topics with similar issues, but no solutions.
I have tried the ranges “1-end”, “All”, “1-11”, “1-18”, and “1-”+Pagecount.tostring.
All of them give the same incomplete result.
I do notice some faulted Microsoft OCR activities in my output, but no exception messages.
I know I could work around this by reading each page separately in a loop and then joining the text together, but I would prefer to have this working as it is supposed to.
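For anyone who does want the per-page workaround, the idea can be sketched roughly as below. This is a Python illustration, not a UiPath workflow: `read_page` is a hypothetical stand-in for one Read PDF With OCR call whose Range is set to a single page (for example "3"), and `read_pdf_page_by_page` is a name made up for this sketch. It also records which pages faulted, which may help narrow down the silent failures mentioned above.

```python
from typing import Callable, List, Tuple


def read_pdf_page_by_page(
    page_count: int,
    read_page: Callable[[str], str],
) -> Tuple[str, List[int]]:
    """Read pages 1..page_count one at a time and join the text.

    read_page is a hypothetical callable (an assumption of this sketch):
    it takes a single-page range string such as "3" and returns the
    OCR text for that page, or raises if the OCR call faults.
    """
    texts: List[str] = []
    failed_pages: List[int] = []

    for page in range(1, page_count + 1):
        try:
            texts.append(read_page(str(page)))
        except Exception:
            # Keep going, but remember which page faulted so it can be
            # retried or inspected later instead of silently vanishing.
            failed_pages.append(page)
            texts.append("")

    return "\n".join(texts), failed_pages
```

In a workflow, the same shape would be a loop from 1 to the page count, with each iteration's output appended to a string variable and any faulted page number added to a list.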
Thank you for your help
By the way, for me the issue where it only reads the first page was fixed by changing where I take the text from: instead of the output of the OCR engine, I now use the output of the Read PDF With OCR activity.
Have you tried any OCR engines other than the Microsoft one? I think it would be a good idea to try OmniPage, Google OCR, etc. and compare the results. Not every engine can successfully OCR everything in a document, and the outcome depends on multiple factors, so it’s always worth trying different engines to see whether the output changes.
We have used the Read PDF With OCR activity on PDFs with any number of pages, and it read all of the pages.