Tesseract OCR Combine Languages

Hi everyone,

I am using the Read PDF with OCR activity with Tesseract OCR on a bilingual document (Greek and English).

Individually, both languages work perfectly in the Language property field (“ell” works, and “eng” works). However, I need to use them simultaneously.

When I try standard Tesseract multi-language syntax, UiPath throws an “Invalid input language” error:

“ell+eng” (Error)
“ell eng” (Error)
“ell, eng” (Error)

It seems the UiPath activity validation strictly expects a single language code and rejects the + delimiter.

Is there a workaround to pass multiple languages to Tesseract without running the OCR activity twice?

Thanks!

Hi @andreas.theodoridis

You could try the following.

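Instead of using the activity directly, you can call the Tesseract command-line tool, which accepts several languages joined with +. Note that the Tesseract CLI reads image files (PNG, TIFF, etc.) rather than PDFs, so the PDF pages need to be exported to images first.

tesseract page.png output -l ell+eng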

Use Start Process or Invoke Code, and pass -l ell+eng as an argument.
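
If it helps to see the call spelled out, here is a rough Python sketch of the same idea. It is only an illustration, not the UiPath activity itself: the helper names (build_tesseract_command, run_ocr) and the file names are made up for the example, and it assumes tesseract is on PATH with both the ell and eng language data installed, and that the page has already been exported to an image.

```python
import shutil
import subprocess
from typing import List, Sequence


def build_tesseract_command(image_path: str, output_base: str,
                            languages: Sequence[str]) -> List[str]:
    """Build the argument list for a multi-language Tesseract call.

    Hypothetical helper: language codes are joined with '+',
    which is the syntax the Tesseract CLI expects after -l.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    return ["tesseract", image_path, output_base, "-l", "+".join(languages)]


def run_ocr(image_path: str, output_base: str,
            languages: Sequence[str]) -> str:
    """Run Tesseract and return the recognized text.

    Assumes tesseract is installed and on PATH. Tesseract writes
    its plain-text result to '<output_base>.txt'.
    """
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract executable not found on PATH")
    subprocess.run(
        build_tesseract_command(image_path, output_base, languages),
        check=True,  # raise CalledProcessError if Tesseract fails
    )
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read()
```

For example, run_ocr("page-1.png", "page-1", ["ell", "eng"]) would run tesseract page-1.png page-1 -l ell+eng and read back page-1.txt. In Start Process, the same pieces map to the file name (tesseract) and the arguments string.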

Thanks,
Christopher