UiPath Document Understanding for CPU - in Preview for On Premises deployments

Hello to all friends of Document Understanding!

One of the most important characteristics of any OCR engine is its speed on CPU, and over the past year we have made significant investments to improve the speed of UiPath Document OCR when running on CPU. This has been a common request, especially from customers with On Premises deployments of AI Center or Automation Suite.

We have heard these requests, and today we are releasing the UiPathDocumentOCR_CPU ML Package for AI Center On Premises in Preview! For connected On Premises deployments, it should become visible automatically. For offline deployments, follow these instructions.

This is an ML Package which can be deployed in exactly the same way as the tried and trusted UiPathDocumentOCR ML Package, with the following differences:

  • it will only run on CPU
  • it is optimized to run on CPU, so you should see a 3-4x speedup when running in a workflow, and a 5-10x speedup when using it to import documents into Document Manager
  • accuracy is slightly lower than that of the UiPathDocumentOCR ML Package, and similar to that of the OCR.LocalServer Studio package
  • handwriting support is not included in this release; it is available only in the Cloud version at this time

This should help to increase the throughput of your Document Understanding automations.

Happy automating!

Your friendly Document Understanding team.


Can UiPath quantify the difference in accuracy, i.e. the share of evaluation predictions that are correct against ground truth?

This will depend a lot on the kinds of documents you throw at it. On hard documents, like crumpled receipts photographed with a smartphone in poor lighting, the accuracy will be 2-3% lower. On relatively clean scanned or native PDFs, the accuracy will be about the same, or maybe only 0.1-0.3% lower. If you have a combination of the above, then the overall accuracy might be about 1% lower. On our internal test sets, this means that the number of errors increases by anywhere between 5% and 30%: less for clean docs, more for hard-to-read docs.
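
To see how a small drop in accuracy can turn into a noticeably larger relative increase in errors, here is a rough sketch of the arithmetic. The baseline accuracies in it are hypothetical values picked only for illustration, not measured UiPath figures, and the function name is made up for this example.

```python
def relative_error_increase(baseline_accuracy: float, accuracy_drop: float) -> float:
    """Return the relative increase in the number of errors, as a fraction.

    Both arguments are fractions: 0.98 means 98% accuracy and
    0.002 means a drop of 0.2 percentage points.
    """
    if not 0.0 <= baseline_accuracy < 1.0:
        raise ValueError("baseline_accuracy must be in [0, 1)")
    if accuracy_drop < 0.0:
        raise ValueError("accuracy_drop must be non-negative")
    # The error rate is whatever the engine does not get right.
    baseline_error = 1.0 - baseline_accuracy
    # Every point of lost accuracy is a point of added error.
    return accuracy_drop / baseline_error


# Hypothetical baselines (assumptions, not numbers from the test sets).
scenarios = [
    ("clean scans", 0.98, 0.002),  # assumed 98% baseline, 0.2 points lower
    ("hard photos", 0.90, 0.025),  # assumed 90% baseline, 2.5 points lower
]

for name, accuracy, drop in scenarios:
    increase = relative_error_increase(accuracy, drop)
    print(f"{name}: about {increase:.0%} more errors")
```

The higher your baseline accuracy, the fewer errors you start with, so the same absolute drop shows up as a larger relative increase. To estimate the impact on your own documents, plug in an accuracy measured on your own ground truth.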

Best,
Alex.
