How to add Polish language in Tesseract OCR

Hi all,

I need to add the Polish language to Tesseract OCR in UiPath. I have already added the Polish traineddata file to the tessdata folder, following the instructions in Installing OCR Languages, but it doesn’t work.
I put the file in C:\Program Files\UiPath\Studio\tessdata and also in C:\Users\username\.nuget\packages\uipath.vision\3.1.4\build\tessdata

I’m constantly getting errors.


Can someone please help me with this issue?

Hey,
Polish letters work fine in this activity. Can you check whether you have the Polish keyboard and language installed on your PC?

Does the user running the Studio and Robot process have access to the training file?
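
One quick way to check this is a small script run under the same Windows account as the Robot. This is only a sketch: the function name `can_read_traineddata` is made up for illustration, and it tests plain file access, not whether the OCR engine actually picks the file up.

```python
from pathlib import Path

def can_read_traineddata(path):
    """Return True if `path` is an existing file that the current user can open for reading."""
    p = Path(path)
    if not p.is_file():
        return False
    try:
        with p.open("rb") as fh:
            fh.read(1)  # force an actual read, not just a metadata lookup
        return True
    except OSError:
        # Covers PermissionError and other I/O failures.
        return False
```

Call it with the full path to pol.traineddata in whichever tessdata folder you used.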

Yes, it has access to the training file

When I leave the language field empty, it doesn’t read the special letters of the Polish alphabet. The problem is that I need those letters for client addresses and names.

I’m using this activity and it works fine for my example with the letter “ł”.
Which package are you using?


Hi

Upgrade your packages and give it a try.

Go to the Design tab -> Manage Packages -> All Packages, upgrade all the available packages, and try again.

Cheers @mkriznjak

Same activity package… but which OCR do you use in it?

Ok.
So, I left the language property empty and it works fine for “ł”. Do you have the issue only with “ó”, or with other letters as well?
Which operating system do you have, and which regional settings? I will try to check on a similar environment.


Hi,

Do you use Community Edition? If so, can you try putting the trained data in the following location?

C:\Users\[UserName]\AppData\Local\Programs\UiPath\Studio\net461\tessdata\pol.traineddata

It works for the Japanese training data.

Regards,


Hi,

No, it’s not the Community Edition of UiPath.

But thanks!

The issue arises when I need to read city and street names and client surnames. I noticed it when the bot tried to read the city Łódź and read it as Lodz.
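
A quick way to spot this kind of loss in OCR output is to look for Polish-specific letters in the recognized text. The sketch below is only an illustration: `polish_letters_in` is a hypothetical helper, not part of UiPath or Tesseract, and it only inspects a string you pass in.

```python
import unicodedata

# Letters of the Polish alphabet that carry diacritics.
POLISH_LETTERS = set("ąćęłńóśźżĄĆĘŁŃÓŚŹŻ")

def polish_letters_in(text):
    """Return the Polish-specific letters found in `text`, in order of first appearance."""
    # Normalize so that a base letter plus a combining accent is compared as one character.
    text = unicodedata.normalize("NFC", text)
    seen = []
    for ch in text:
        if ch in POLISH_LETTERS and ch not in seen:
            seen.append(ch)
    return seen

print(polish_letters_in("Łódź"))  # ['Ł', 'ó', 'ź']
print(polish_letters_in("Lodz"))  # []
```

An empty result on text that should contain such letters suggests the Polish language data is not being used.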

OS is Windows 10 Pro


Could you check this option?


Same as yours.

Hey @mkriznjak,
I think I found the solution. It worked well on my PC.
First, go to this folder:
C:\Users\userName\.nuget\packages\uipath.vision\3.1.4\build\net461\tessdata
and copy the Polish language file into it:
pol.zip (8.6 MB)
Then restart UiPath Studio and everything should work fine.
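
If you prefer to script the copy step, here is a minimal sketch. It assumes the attachment has already been extracted to a pol.traineddata file (an assumption about what the zip contains), and `install_traineddata` is a made-up helper name. It creates the target folder if it is missing.

```python
import shutil
from pathlib import Path

def install_traineddata(src_file, tessdata_dir):
    """Copy a .traineddata file into `tessdata_dir` and return the destination path."""
    src = Path(src_file)
    if src.suffix != ".traineddata":
        # Guards against copying the zip archive itself instead of its contents.
        raise ValueError(f"expected a .traineddata file, got {src.name}")
    dest_dir = Path(tessdata_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    return Path(shutil.copy2(src, dest_dir / src.name))
```

Pass the extracted file as the first argument and the net461\tessdata folder above as the second.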



I hope it helps you.


Hi,

In my environment (21.10.4 Enterprise), it also works with the following path. If it still doesn’t work, can you try this?

C:\Program Files\UiPath\Studio\net461\tessdata


Regards,

OK. I have 21.4.5 Enterprise, and I found this package in C:\Users\userName\.nuget\packages\uipath.vision.…
I also checked on Community Edition, and the location was the same as in Enterprise.
I wonder what it depends on?


Hi,

In my environment, all of the following work.

Enterprise(21.10.4)

C:\Program Files\UiPath\Studio\net461\tessdata
C:\Users\[UserName]\.nuget\packages\uipath.vision\3.1.4\build\net461\tessdata

CE(21.10.4)

C:\Users\[UserName]\AppData\Local\Programs\UiPath\Studio\net461\tessdata
C:\Users\[UserName]\.nuget\packages\uipath.vision\3.1.4\build\net461\tessdata 

However, the following doesn’t work, even though the UiPath guide at Installing OCR Languages shows it.

Enterprise(21.10.4)

C:\Program Files\UiPath\Studio\tessdata

I suppose it depends on the internal logic of UiAutomation. If the path shown in the UiPath guide doesn’t work, we need some trial and error to find the right location.
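
To make that trial and error less tedious, a small script can report which of the folders that worked for me already contain the file. This is a rough sketch: the folder list covers only what I tested (21.10.4 with UiPath.Vision 3.1.4), other versions may differ, and the names `reported_dirs` and `find_traineddata` are made up for illustration.

```python
from pathlib import Path

def reported_dirs(home, program_files):
    """Build the tessdata folders reported as working in this thread (version-specific)."""
    home, program_files = Path(home), Path(program_files)
    return [
        # Enterprise install
        program_files / "UiPath" / "Studio" / "net461" / "tessdata",
        # Community Edition install
        home / "AppData" / "Local" / "Programs" / "UiPath" / "Studio" / "net461" / "tessdata",
        # NuGet package cache (both editions)
        home / ".nuget" / "packages" / "uipath.vision" / "3.1.4" / "build" / "net461" / "tessdata",
    ]

def find_traineddata(dirs, lang="pol"):
    """Return the folders from `dirs` that contain <lang>.traineddata."""
    name = f"{lang}.traineddata"
    return [Path(d) for d in dirs if (Path(d) / name).is_file()]

if __name__ == "__main__":
    # C:\Program Files is assumed here; adjust if Studio is installed elsewhere.
    for folder in find_traineddata(reported_dirs(Path.home(), r"C:\Program Files")):
        print(folder)
```

If nothing is printed, the file is not in any of the listed folders.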

Regards,

Hi @pikorpa

It’s working. I had placed the tessdata folder in C:\Users\username\.nuget\packages\uipath.vision\3.1.4\build\tessdata, but I needed to put it here:

C:\Users\userName\.nuget\packages\uipath.vision\3.1.4\build\net461\tessdata

Thanks a lot!

