This topic goes in-depth about the improvements to Document Understanding. To read about other products, please navigate to the main topic here.
We’re happy to share our latest work, with all the bits & pieces coming together in our 21.10 release. It brings enhancements to existing functionality (from letting users view only a relevant page range in the Validation Station, to performance & experience improvements in file digitization), a Document Understanding Process Studio Template, a redesigned Taxonomy Manager, and new functionality that lets users run the UiPath OCR on their local machine. Keep reading for details.
It is finally out! We have redesigned the Taxonomy Manager to enhance your experience: you can now manage document types and fields more easily, organized by categories and groups, and on top of that define custom colours & hotkeys for fields, so that working with them in the Validation and Classification Station is better than ever!
Have you ever wanted to start from a best practices template when creating a Document Understanding Process? Or just have a template which you would then modify as per your needs (without adding all DU Activities, one by one)? Well, you can finally do so!
The Document Understanding Process is now available as a Studio Template - get it & modify it for your needs - it’s the best way to use the framework!
With this release, the “Create Document Validation Action” activity can send only the relevant page range to the Validation Station in the Cloud, so that Knowledge Workers using the Action Center can focus on the part of the document they care about, instead of seeing extra pages which they do not need to review or process!
This also improves performance, as only the relevant document pages, DOM and extraction results travel to the Action Center and back to the “Wait For Document Validation Action And Resume” activity, where they are picked up for further processing.
Note: The ShowOnlyRelevantPageRange option can only be configured on the “Create Document Validation Action” activity; the “Wait For Document Validation Action And Resume” activity uses the same value that was configured there.
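To make the idea of a relevant page range concrete, here is a minimal Python sketch of the filtering it implies. It only illustrates the concept and is not how the activities are implemented: `ExtractedField`, `restrict_to_page_range` and the sample data are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    # Hypothetical stand-in for one extraction result.
    name: str
    value: str
    page: int  # zero-based index of the page the value was found on


def restrict_to_page_range(pages, fields, first_page, last_page):
    """Keep only the pages and fields inside [first_page, last_page], inclusive."""
    kept_pages = pages[first_page:last_page + 1]
    kept_fields = [f for f in fields if first_page <= f.page <= last_page]
    return kept_pages, kept_fields


# A four-page file in which only pages 1-2 belong to the invoice under review.
pages = ["cover letter", "invoice p1", "invoice p2", "terms"]
fields = [
    ExtractedField("sender", "ACME", 0),
    ExtractedField("invoice_number", "INV-001", 1),
    ExtractedField("total", "120.00", 2),
]

kept_pages, kept_fields = restrict_to_page_range(pages, fields, 1, 2)
```

In this sketch the reviewer would see two pages and two fields instead of four pages and three fields, which is also why less data has to travel back and forth.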
We have worked on improving the general performance of the product. Among other things, RPA Developers can now use locally installed OCR engine packages for Documents and Screen Scraping: the two corresponding OCR activities (Screen OCR and Document OCR) offer a “UseLocalServer” option, which runs the engine on the robot’s local machine instead of requiring an AI Center hosted model. To use the local server mode, the UiPath.ComputerVision.LocalServer package (for Screen OCR) or the UiPath.DocumentUnderstanding.OCR.LocalServer package (for Document OCR) must be installed.
In addition to the Output Folder, we are now providing the possibility of sending data from Validation Station directly to a Dataset in your AI Center Tenant.
When using the Dataset option, the activity sends the data to a folder called “fine-tune” within the selected Dataset, which is where the Data Manager expects it and imports it from. The Data Manager removes the data after import, so set both the AI Center Dataset and the local folder property if you want to keep a backup.
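The reason for keeping a local copy can be shown with a small Python simulation of an import that consumes its input. This is a sketch under stated assumptions: the temporary folders only stand in for the Dataset’s “fine-tune” folder and the local Output Folder, and `export_validated_data` and `import_and_remove` are hypothetical helpers, not AI Center or Data Manager APIs.

```python
import shutil
import tempfile
from pathlib import Path


def export_validated_data(payload: str, filename: str, targets: list[Path]) -> None:
    """Write the same validated-data file to every target folder."""
    for folder in targets:
        folder.mkdir(parents=True, exist_ok=True)
        (folder / filename).write_text(payload, encoding="utf-8")


def import_and_remove(fine_tune_folder: Path) -> list[str]:
    """Mimic an import that consumes its input: note each file, then delete it."""
    imported = []
    for path in sorted(fine_tune_folder.iterdir()):
        imported.append(path.name)
        path.unlink()
    return imported


root = Path(tempfile.mkdtemp())
dataset_folder = root / "dataset" / "fine-tune"  # stand-in for the Dataset folder
backup_folder = root / "local_output"            # stand-in for the Output Folder

export_validated_data('{"total": "120.00"}', "doc1.json", [dataset_folder, backup_folder])
imported = import_and_remove(dataset_folder)

remaining_in_dataset = [p.name for p in dataset_folder.iterdir()]
remaining_in_backup = [p.name for p in backup_folder.iterdir()]
shutil.rmtree(root)  # clean up the temporary folders
```

After the simulated import the “fine-tune” stand-in is empty, while the local folder still holds the file.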
We have adapted our packages to support the .NET 5 framework:
We have improved the digitization results and the boxes around them, to ensure a better user experience when using the digitization algorithms both in the Data Manager and in the workflow.