Hello,
I’m trying to extract text in Arabic from an image, but I get the following error:
“Tesseract OCR: Error performing OCR: InvalidInputLanguage”
How can I fix this?
That's indeed strange.
It seems the help link on the activity is broken, but the UiPath Documentation is here
It links to a GitHub page for the supported languages
‘ara’ seems valid.
Have you tried other languages to see if they also give a similar error? If some work and others don't, that would be odd.
Hi @Samira_Rahme,
You have to check whether the Arabic language is added to Tesseract:
If it isn't, download it from the official repository: GitHub - tesseract-ocr/tessdata: Trained models with fast variant of the "best" LSTM models + legacy models
Then place it inside the tessdata folder.
Hi @Samira_Rahme, to fix the “InvalidInputLanguage” error in Tesseract OCR for Arabic, download the ara.traineddata file from the Tesseract GitHub and place it inside the tessdata folder of your UiPath or Tesseract install. Restart Studio and make sure the Language is set to ara. This should solve the issue, since Tesseract needs the correct language file to work with Arabic text.
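If it helps, here is a small Python sketch for checking which tessdata folders actually contain ara.traineddata. The function name find_traineddata and the folder paths in the usage note are placeholders of mine, not anything UiPath or Tesseract ships, so substitute the folders that exist on your machine.

```python
from pathlib import Path


def find_traineddata(tessdata_dirs, lang="ara"):
    """Return the folders from tessdata_dirs that contain <lang>.traineddata.

    tessdata_dirs is a list of candidate folders supplied by the caller;
    this helper is hypothetical and only checks that the file exists.
    """
    filename = f"{lang}.traineddata"
    return [Path(d) for d in tessdata_dirs if (Path(d) / filename).is_file()]
```

For example, find_traineddata([r"C:\path\to\first\tessdata", r"C:\path\to\second\tessdata"]) returns only the folders where the file is present. An empty list means the language file is in none of the folders you listed.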
I have done this, but the error still persists.
Hi @Samira_Rahme, if you’ve added the Arabic language file but still get the error, try these:
Check you put the file in the right tessdata folder (sometimes it needs to be in your user AppData path, not just Program Files).
Make sure you downloaded the real .traineddata file, not a webpage or something else.
Restart UiPath Studio and your PC after adding the file.
If you still have problems, try other OCR engines like Google OCR for Arabic as a temporary workaround.
These steps usually fix the problem. Let me know if you want more help!
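For the second check above (a saved webpage instead of the real file), a rough heuristic in Python is to look at the first bytes of the download. A real .traineddata file is binary, while a saved webpage usually starts with an HTML tag. The name looks_like_html is my own, and this is only a sketch: it catches the common "saved the GitHub page" mistake, but it does not prove the file is a valid model.

```python
from pathlib import Path


def looks_like_html(path, probe_bytes=512):
    """Return True if the file starts like an HTML page rather than binary data.

    Heuristic only: reads the first probe_bytes bytes, skips leading
    whitespace, and checks for a doctype or <html> opening tag.
    """
    with Path(path).open("rb") as f:
        head = f.read(probe_bytes).lstrip().lower()
    return head.startswith((b"<!doctype", b"<html"))
```

If looks_like_html returns True for your ara.traineddata, download the file again using the raw download option on the repository page rather than saving the page itself.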