Unable to Scrape anything using Tesseract OCR

Dear All,

A related post, "Tesseract OCR not working (standalone + Screen Scraper)", has been extremely helpful for digging deeper. I am posting my details here rather than in that thread in case it is considered closed.

I am having issues with Tesseract OCR similar to those in the referenced post. However, the fix that worked there does not seem to work in my case. I would greatly appreciate any insights you may have.

#UiPath Studio Community 2019.6.0
#.NETFramework,Version=v4.6.1
#Windows 7 Ultimate

Below are the steps I have carried out as mentioned in this thread:

1) Installed the debug build package as suggested by @florinszilagyi

With the debug build in place, the Vision Host logs show that the Tesseract engine fails at run time with the following error:

"10:37:38.4147 Info Starting scrape. Image size: 195150.
10:37:38.8668 Info Scrape options: {“ExtractWords”:false,“Timeout”:null,“ComputeSkewAngle”:false,“Profile”:3,“Language”:“eng”,“Scale”:2.0,“FilterRegion”:null,“Engine”:0,“EngineImplClass”:null,“EngineOptions”:“{"AllowedCharacters":"","DeniedCharacters":"","Invert":false}”}
10:37:38.9478 Info Input language:eng, translated language:eng
10:37:38.9598 Info Getting tesseract language for C:\USERS\ADMINISTRATOR.NUGET\PACKAGES\UIPATH.VISION\1.6.0\BUILD\tessdata and language eng
10:37:38.9908 Warn Cannot initialize TesseractEngine with provided path C:\USERS\ADMINISTRATOR.NUGET\PACKAGES\UIPATH.VISION\1.6.0\BUILD\tessdata and language eng
10:37:39.0758 Error Error initializing Tesseract Engine.
10:37:39.0998 Fatal UiPath.Vision.OCR.OCRException: TessErrorLoadEngine ---> UiPath.Vision.OCR.OCRException: TessErrorLoadEngine
   at UiPath.Vision.Engines.TesseractLegacyEngine.Initialize()
   --- End of inner exception stack trace ---"

2) Removed Read-only status of the tessdata folder.

3) I am logged in as the administrator, so the runs execute with admin rights.

4) My build path is “C:\USERS\ADMINISTRATOR.NUGET\PACKAGES\UIPATH.VISION\1.6.0\BUILD”. Hence the special-character issue probably does not apply in my case.

5) The file “eng.traineddata” (as shown below) is present in the “C:\USERS\ADMINISTRATOR.NUGET\PACKAGES\UIPATH.VISION\1.6.0\BUILD” path.

6) I have tried re-copying the “eng.traineddata” file into the tessdata folder as suggested by @florinszilagyi.

7) While I have installed most of the available OCR-related packages (especially those from UiPath), I could not install the “Tesseract OCR” package by Google shown below:

Please help.
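
In case it helps anyone reproduce the file checks from steps 4 to 6, here is a minimal Python sketch of what I verified by hand. It is not part of UiPath or Tesseract; the function name `check_tessdata` is made up for illustration, and treating "special characters" as "non-ASCII characters in the path" is my assumption about what the other thread meant. It only looks at the folder on disk and says nothing about why the engine itself fails to load.

```python
import os
from pathlib import Path


def check_tessdata(tessdata_dir, language="eng"):
    """Return a list of problems found in a tessdata folder (empty list = none).

    Hypothetical helper for illustration only; it inspects the file system
    and does not call Tesseract or UiPath.
    """
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        return [f"tessdata folder not found: {folder}"]

    problems = []

    # Assumption: the "special character" issue means non-ASCII characters in the path.
    if not str(folder).isascii():
        problems.append(f"path contains non-ASCII characters: {folder}")

    # Tesseract language data is stored as <language>.traineddata.
    model = folder / f"{language}.traineddata"
    if not model.is_file():
        problems.append(f"missing language file: {model}")
    elif model.stat().st_size == 0:
        problems.append(f"language file is empty: {model}")
    elif not os.access(model, os.R_OK):
        problems.append(f"language file is not readable: {model}")

    return problems


if __name__ == "__main__":
    # Path copied from the log above; adjust it for your own machine.
    path = r"C:\USERS\ADMINISTRATOR.NUGET\PACKAGES\UIPATH.VISION\1.6.0\BUILD\tessdata"
    issues = check_tessdata(path, "eng")
    print("\n".join(issues) if issues else "No file-level problems found.")
```

An empty result would only mean the language file is present, non-empty and readable at the path the log reports, so the cause would have to be elsewhere.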
