UiPath Extended Languages OCR Public Preview

We’re excited to announce our latest OCR engine, the UiPath Extended Languages OCR, which can digitize documents in over 200 languages. This new OCR engine performs significantly better for Chinese, Japanese, and Korean than our current OCR engines. Additionally, it can now process documents in Thai, Vietnamese, all major languages of India, Greek, and languages that use the Cyrillic alphabet.

We recommend starting with the UiPath Document OCR when selecting an OCR engine. However, if you encounter documents in a language that the UiPath Document OCR does not support or you notice that it is not performing well, you can switch to one of our other OCR engines, such as the UiPath Extended Languages OCR. For more information about supported languages, navigate to our documentation: Document Understanding - OCR.
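The selection rule above can be sketched as a small helper. This is only an illustration of the decision logic, not part of any UiPath API: the function name, the `observed_quality` score, and the `min_quality` cutoff are all hypothetical, and the list of languages supported by the UiPath Document OCR must be supplied by you from the official documentation.

```python
from typing import Iterable, Optional

DOCUMENT_OCR = "UiPath Document OCR"
EXTENDED_LANGUAGES_OCR = "UiPath Extended Languages OCR"


def choose_ocr_engine(
    document_language: str,
    document_ocr_languages: Iterable[str],
    observed_quality: Optional[float] = None,
    min_quality: float = 0.8,
) -> str:
    """Return the name of the OCR engine to try for a document.

    document_ocr_languages: languages you have confirmed as supported by the
        UiPath Document OCR (caller-supplied; nothing is hard-coded here).
    observed_quality: your own 0..1 score from a trial run with the Document
        OCR, or None if you have not tested yet (hypothetical metric).
    min_quality: hypothetical cutoff below which you switch engines.
    """
    supported = {lang.lower() for lang in document_ocr_languages}

    # Language not covered by the default engine: switch engines.
    if document_language.lower() not in supported:
        return EXTENDED_LANGUAGES_OCR

    # Language is covered, but the trial run was not good enough: switch engines.
    if observed_quality is not None and observed_quality < min_quality:
        return EXTENDED_LANGUAGES_OCR

    # Otherwise stay with the recommended starting point.
    return DOCUMENT_OCR
```

The default engine is always tried first, and the helper only falls back when the language is unsupported or your own measured quality drops below the cutoff you choose.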

How to use the UiPath Extended Languages OCR in Studio

The Extended Languages OCR can be installed using IntelligentOCR package version 6.18.0-preview.

Once installed, search for the Extended Languages OCR in the Activities pane. Drag and drop it into the Digitize Document activity and run your Document Understanding workflow to test it.

The endpoint and API key are automatically populated in the Project Settings.

How to use the UiPath Extended Languages OCR for document annotation

When creating a project, click Advanced Options to select the Extended Languages OCR. Both Classic and Modern projects will digitize documents using this OCR.

The new OCR is also available in Document Manager in AI Center.

Current limitations

  • Once a document is imported into Document Manager/Document Type sessions, it cannot be re-digitized without losing the annotation information. The only way to re-digitize documents is to export and reimport them.
  • UiPath Extended Languages OCR is not yet available on Automation Suite, and there is currently no exact timeline for when it will become available.