Read PDF with OCR only reads couple of pages or last page

Barend · September 30, 2020, 9:45am

Hi,

I’m trying to read the text from a PDF using the Read PDF With OCR activity from UiPath.PDF.Activities with the Microsoft OCR engine. The problem is that it seems to read only the last page, or only the first 5 or 6 pages. Because I’m reading multiple files with different numbers of pages, I cannot hard-code the range. I saw some other topics with similar issues, but no solutions.

I have tried the ranges “1-end”, “All”, “1-11”, “1-18”, and “1-”+Pagecount.tostring.
All give the same incomplete result.
I do notice some faulted Microsoft OCR activities in my output, but no exception messages.

My microsoft ocr engine properties:

My workflow:

I know I could work around this by reading each page separately in a loop and then adding the text together, but I would prefer to have this working as it is supposed to.
Thank you for your help.
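
For reference, the per-page workaround mentioned above could look roughly like the sketch below. It is written in Python rather than as a UiPath workflow, and `read_all_pages` and `ocr_page` are hypothetical names: `ocr_page` stands in for whatever reads a single page (for example, one Read PDF With OCR call with the range set to a single page number). Nothing here is a UiPath API.

```python
from typing import Callable, List, Tuple


def read_all_pages(
    page_count: int,
    ocr_page: Callable[[int], str],
) -> Tuple[str, List[int]]:
    """OCR pages 1..page_count one at a time and join the results.

    ocr_page is a hypothetical callable that returns the text of one page.
    Pages whose OCR call raises are skipped and reported, so a single
    faulted page does not silently truncate the whole document.
    """
    texts: List[str] = []
    failed: List[int] = []
    for page in range(1, page_count + 1):
        try:
            texts.append(ocr_page(page))
        except Exception:
            # Record the page number instead of stopping the loop.
            failed.append(page)
    return "\n".join(texts), failed
```

Returning the list of failed pages alongside the text makes it visible which pages the engine could not read, which the single-call version in this thread does not show.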

Pablito · October 1, 2020, 9:35am

Hi @Barend,
Have you tried experimenting with other engines, changing the DPI, etc.?

MARIODC · April 19, 2021, 6:32pm

Hi,

Any updates?

I have a similar scenario, but with OCR Framework: it is reading only the first 6 pages…

Vidu_Senanayake · July 27, 2021, 11:15am

By the way, I fixed the issue where it only reads the first page by changing where I read the text from: instead of taking it from the OCR engine, I now take it from the Read PDF With OCR activity.

sonaliaggarwal47 · July 27, 2021, 2:03pm

Hi @Barend,

Have you tried any OCR engines other than the Microsoft one? I think it would be a good idea to try OmniPage, Google OCR, etc. and compare the results. Not every engine can successfully OCR everything in a document, and the outcome depends on multiple factors, so it is always worth trying different engines to see whether the output changes.

We have used the Read PDF With OCR activity on PDFs with varying numbers of pages, and it worked for all of them.

Regards
Sonali

Barend · July 29, 2021, 12:19pm

Hi, I think I fixed it by using a different engine, but that was a long time ago, so I don’t remember the exact solution.

system · August 1, 2021, 12:20pm

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
