OCR works slow

renjutom · April 20, 2020, 10:56am

Hi,

I’m trying to extract specific data from PDFs using the Get OCR Text method and insert that data into an Excel sheet. I have to fetch data from multiple PDFs, each containing multiple pages, and extracting the data from each PDF takes too much time. Is there any way to reduce the time it takes to fetch data from a PDF using the OCR method?

Kind Regards,
Renju

Ioana_Gligan · April 20, 2020, 11:23am

What OCR engine are you using? I recommend you try several and then decide on the best one for your purpose, in terms of both speed and quality.

renjutom · April 20, 2020, 11:46am

I’m using Microsoft OCR and Tesseract OCR. Both take a long time to execute.

Ioana_Gligan · April 20, 2020, 11:50am

There are better alternatives to Get OCR Text if you are looking for the entire text of a PDF document.
For this purpose, you should try the “Read PDF Text” or “Read PDF With OCR” activities from the UiPath.PDF.Activities package.
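
To illustrate why this can help with speed, here is a rough Python sketch of the general idea (not UiPath code, and not something taken from the activities themselves): read the text layer of each page first, and only fall back to OCR when a page has almost no text. The function names `read_text_layer` and `run_ocr` and the `min_chars` threshold are hypothetical placeholders for whatever reader and OCR engine you actually use.

```python
from typing import Any, Callable, List, Sequence, Tuple


def read_pages(
    pages: Sequence[Any],
    read_text_layer: Callable[[Any], str],  # hypothetical: fast text-layer reader
    run_ocr: Callable[[Any], str],          # hypothetical: slow OCR engine call
    min_chars: int = 20,                    # assumed threshold, tune for your files
) -> Tuple[List[str], int]:
    """Return the text of every page and how many pages needed OCR."""
    texts: List[str] = []
    ocr_pages = 0
    for page in pages:
        # Cheap path: use the embedded text layer when the page has one.
        text = read_text_layer(page)
        if len(text.strip()) < min_chars:
            # Expensive path: the page is probably a scan, so run OCR on it.
            text = run_ocr(page)
            ocr_pages += 1
        texts.append(text)
    return texts, ocr_pages


if __name__ == "__main__":
    # Stand-in pages: one with a text layer, one that is only a scanned image.
    demo_pages = [
        {"layer": "Invoice number 12345, total due 99.00", "scan": ""},
        {"layer": "", "scan": "Scanned page text"},
    ]
    texts, ocr_pages = read_pages(
        demo_pages,
        read_text_layer=lambda p: p["layer"],
        run_ocr=lambda p: p["scan"],
    )
    print(texts, ocr_pages)
```

In a UiPath workflow the same decision would be built from the activities named above rather than from Python; the sketch only shows the control flow.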

If you want to extract very specific data, I recommend you use the UiPath.IntelligentOCR.Activities package and the Document Understanding framework that UiPath offers.

You can have a look at a sample workflow here: How to use the IntelligentOCR Package.

Hope this helps,

Ioana
