OCR works slowly

Hi,

I’m trying to extract specific data from PDFs using the Get OCR Text activity and insert that data into an Excel sheet. I have to fetch data from multiple PDFs, each containing multiple pages, and extracting the data from each PDF takes too long. Is there any way to reduce the time it takes to fetch data from a PDF using OCR?

Kind Regards,
Renju

What OCR engine are you using? I recommend you try several and then decide which one is best for your purpose, in terms of both speed and quality.
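One way to compare engines fairly is to run each one on the same few sample pages and time it. The sketch below is plain Python, not a UiPath workflow, and only illustrates the idea: `benchmark_engines` and the callables passed in are hypothetical stand-ins for whatever OCR engines you are testing. The character count is only a rough sanity check, so you would still need to inspect the extracted text to judge quality.

```python
import time


def benchmark_engines(engines, pages):
    """Time each OCR engine on the same sample pages.

    engines: dict mapping an engine name to a callable(page) -> str.
             These callables are hypothetical stand-ins for real engines.
    pages:   list of sample pages, in whatever form the callables accept.

    Returns a list of (name, seconds, total_chars), fastest engine first.
    """
    results = []
    for name, run_ocr in engines.items():
        start = time.perf_counter()
        texts = [run_ocr(page) for page in pages]
        elapsed = time.perf_counter() - start
        # Total characters returned: a crude signal that the engine read something.
        total_chars = sum(len(text) for text in texts)
        results.append((name, elapsed, total_chars))
    return sorted(results, key=lambda item: item[1])
```

Running this on three or four representative pages is usually enough to see which engine is worth using for the full batch.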

I’m using Microsoft OCR and Tesseract OCR. Both take a long time to execute.

There are better alternatives to Get OCR Text if you are looking for the entire text of a PDF document.
For this purpose, you should try the “Read PDF Text” or “Read PDF With OCR” activities from the UiPath.PDF.Activities package.
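The reason this can be faster is that OCR is only needed for pages that are scanned images; if a page already has a text layer, it can be read directly. Below is a minimal Python sketch of that decision logic, outside UiPath. The function name, the two callables, and the `min_chars` threshold are all assumptions made for illustration, not part of any UiPath package.

```python
def extract_text(pages, read_text_layer, run_ocr, min_chars=20):
    """Read each page's text layer and fall back to OCR only when needed.

    read_text_layer: hypothetical callable(page) -> str or None (fast path).
    run_ocr:         hypothetical callable(page) -> str (slow path).
    min_chars:       assumed threshold; pages with less text than this
                     are treated as scanned images.

    Returns (texts, ocr_count), where ocr_count is how many pages needed OCR.
    """
    texts = []
    ocr_count = 0
    for page in pages:
        text = read_text_layer(page) or ""
        if len(text.strip()) < min_chars:
            # Little or no text layer: treat the page as a scan and run OCR.
            text = run_ocr(page)
            ocr_count += 1
        texts.append(text)
    return texts, ocr_count
```

In a UiPath workflow, the equivalent would be to try “Read PDF Text” first and call “Read PDF With OCR” only when the result comes back empty or nearly empty.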

If you want to extract very specific data, I recommend you use the UiPath.IntelligentOCR.Activities package and the Document Understanding framework that UiPath offers.

You can have a look at a sample workflow here: How to use the IntelligentOCR Package.

Hope this helps,

Ioana
