UiPath Community 2022.10 Stable Release - Document Understanding

Monica_Secelean · October 12, 2022, 9:20am

This topic goes in-depth about the improvements in Document Understanding. To read about other products, please navigate to the main topic here.

We’re happy to share with you our latest work - all the bits & pieces coming together in our 22.10 release.

Migration to .NET 5 cross-platform

With this release, we’ve finished migrating the packages that contribute to the Document Understanding functionality to .NET 5 cross-platform, enabling their usage on Linux robots. We’ve migrated everything except for Omnipage, for which Linux support is currently missing and will come in a future release.

Capability enhancements

Besides the above, we’ve made several improvements to our existing capabilities.
With the newest release of the Digitize activity, UiPath Document OCR is preselected as the default OCR engine, so users can get up & running with our own engine right away and only need to change it if needed.
We have also worked on exposing the CJK OCR as a service, so it is now available for usage both in Studio and in the Data Manager.

In order to be able to gather more usage data when it comes to users performing Validation Operations in Action Center, we now report the following when working with the Create Validation and Classification Actions Activities:

AssignedToUser
CreatorUser
DeleterUser
LastModifierUser
CompletedByUser

Besides the above, we’ve improved the Validation Station, which now reports the confidence score for each table entry, and the digitization algorithm, by fixing reported bugs.

Digitization & Extraction algorithm improvements

We also focused on enhancing the digitization algorithm, which means that the digitization of native PDFs may be faster & more accurate than before (applied when the “Apply OCR on PDF” flag is set to “auto” or “false”). Should you find that this is not the case, please bring it to our attention and report any issues.
Besides that, we introduced a “hybrid OCR” approach, which enhances our current “auto” option for digitizing documents by processing the document in 2 steps when a native PDF is identified:

Extract the native text
Are there images identified? => Cool, OCR them & extract the text from them as well.
In this way, one benefits from the best results for native PDFs, by natively extracting the printed text & OCRing the images.
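The two steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual Digitize implementation: the Page class, the digitize_hybrid function and the fake_ocr placeholder are all made-up names, and a real OCR engine would take the place of fake_ocr.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Page:
    """Hypothetical stand-in for one page of a native PDF."""
    native_text: str = ""                               # text embedded in the PDF
    images: List[bytes] = field(default_factory=list)   # raster images on the page


def digitize_hybrid(pages: List[Page], ocr: Callable[[bytes], str]) -> List[str]:
    """Return one text string per page using the two-step approach."""
    results = []
    for page in pages:
        parts = []
        # Step 1: take the native text as-is, without running OCR on it.
        if page.native_text:
            parts.append(page.native_text)
        # Step 2: OCR only the embedded images and append their text.
        for image in page.images:
            text = ocr(image)
            if text:
                parts.append(text)
        results.append("\n".join(parts))
    return results


def fake_ocr(image: bytes) -> str:
    """Placeholder OCR engine: decodes bytes instead of recognizing pixels."""
    return image.decode("utf-8")


pages = [
    Page(native_text="Invoice 123", images=[b"Total: 40 EUR"]),
    Page(native_text="Terms and conditions"),
]
print(digitize_hybrid(pages, fake_ocr))
```

The point of the design is that the OCR engine is only called for image content, so printed text that is already present in the PDF is never re-recognized.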

Furthermore, we have refactored the Extraction Result object, to enable more flexible and user-friendly usage of it within the workflow by adding a new, simplified way to represent tables: they are now stored separately from other fields, in a flatter structure and with methods that help you access the data. Separately, we have also added new methods that make it easier to consume and modify data in the extraction result.
And to support the work with multi-value fields, we enhanced the ML Extractor & the ML Extractor Trainer, so that these (multi-value) fields sent from the Document Manager can be consumed and used in the activities.
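To give a feeling for what "tables stored separately, in a flatter structure, with helper methods" can look like, here is a small Python sketch. It is an assumption-based illustration only: Cell, FlatTable, ResultSketch and their methods are hypothetical names and do not mirror the real Extraction Result object or its API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cell:
    value: str
    confidence: float  # per-cell confidence score


@dataclass
class FlatTable:
    """Hypothetical flat table: a header row plus plain rows of cells."""
    columns: List[str]
    rows: List[List[Cell]] = field(default_factory=list)

    def cell(self, row: int, column: str) -> Cell:
        """Look up one cell by row index and column name."""
        return self.rows[row][self.columns.index(column)]

    def column_values(self, column: str) -> List[str]:
        """Return the values of one column, top to bottom."""
        idx = self.columns.index(column)
        return [row[idx].value for row in self.rows]


@dataclass
class ResultSketch:
    """Hypothetical result: simple fields and tables kept apart."""
    fields: Dict[str, List[str]] = field(default_factory=dict)  # a field may hold several values
    tables: Dict[str, FlatTable] = field(default_factory=dict)


result = ResultSketch(
    fields={"invoice-no": ["123"], "po-no": ["PO-1", "PO-2"]},
    tables={
        "items": FlatTable(
            columns=["description", "amount"],
            rows=[
                [Cell("Paper", 0.98), Cell("10.00", 0.91)],
                [Cell("Ink", 0.95), Cell("30.00", 0.64)],
            ],
        )
    },
)

items = result.tables["items"]
print(items.column_values("amount"))
print(items.cell(1, "amount").confidence)
```

In this sketch a multi-value field is simply a list of values, and a table cell is reached by row index and column name instead of walking nested field values.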

And if the above was not enough, we are happy to report that with this release all our activities are .NET 6 compatible.

Please tell us what you think

Which features excite you the most, and which are you going to try right away? We want to hear what you think! Please use the button below

Thong_Mai_Tr_ng_Hoang · October 19, 2022, 1:50am

Hi, after upgrading the activities packages I’m unable to drag the Classifier into the Classifier Scope; I get an error.

Only the Machine Classifier can be dragged in.

Ioana_Ungureanu · October 19, 2022, 2:28pm

Thanks for noticing! We have a fix that will be available in the next version.

Senne_Symons · October 24, 2022, 12:42pm

Is the Hybrid OCR feature already part of the activity or not? Because it is mentioned as “we are working on”, so not really clear if it is already available.

Monica_Secelean · October 25, 2022, 5:04pm

Makes sense, I will update the post. The hybrid OCR is out and should be working; please report it if that is not the case.

balaraman.ramiya · October 26, 2022, 7:47am

As I do not see a note here: is there any plan to bring this output format to the Digitization Service as well? (Document Understanding - Digitization Service Public Preview)

Monica_Secelean · October 28, 2022, 4:16pm

The plan is to expose all DU capabilities as REST APIs and have them work with the same objects as the DU RPA Framework. Would this be something you’re looking for?

balaraman.ramiya · October 31, 2022, 9:31am

Yes, you are right. Do we have “detailed” documentation about these APIs, with examples? Also, is there any plan to provide this functionality in Apps, similar to how Apps is integrated with data services?

Regards
Balram

Monica_Secelean · October 31, 2022, 3:26pm

@balaraman.ramiya we have something detailed for the Digitization Service here - however, please note it’s still in preview and we are working on it as we speak.
With regards to Apps: any particular scenario you are looking for? We are considering an integration and currently looking at use cases, so your feedback is welcome.

luchovelez · November 4, 2022, 12:16pm

I got this message:
Data Extraction Scope: Request CorrelationId: 646f36f3-613d-42ad-a3f8-bdb038922c67
Request PredictionId: Len69ZQho9HXhcT+fiBPw0rHChpZm9rAh95+SBthi4A=_0e96325a-1d3f-4b8d-97af-44b3ab05f47c
Invalid server response.
Http Response Code: 520
Http Response Content:

It was working fine previously and now it throws that error.

It seems like the Invoice extractor is failing or was changed.

Monica_Secelean · November 4, 2022, 12:53pm

@alexcabuz do you know anything about this?

alexcabuz · November 7, 2022, 10:47am

No. Is this using public endpoints, like https://du.uipath.com/ie/invoices ? Or a skill deployed in AI Center?

Alex.

balaraman.ramiya · November 10, 2022, 7:37am

Thanks for sharing and for the update. The documentation requires improvement, with samples. For Apps use cases, I have nothing specific at hand to share. I can see this feature potentially enabling Claim Processing, Point of Sale KYC verification, Event Registration, etc., using filled forms captured on demand as scans or photos. (This would minimize robot usage for extracting the information, especially for the out-of-the-box models and DU.)

Regards,
Balram

srividhya_ramamurthy · January 27, 2023, 3:01pm

Hi Monica, we have built an end-to-end automation on UiPath using the Document Understanding framework. When we tried deploying it to serverless robots, we understood that it has to be built as a cross-platform project, and many of the packages, including Document Understanding, are not compatible with that. How can we tackle this issue and run the bot on the serverless platform, or on a Linux VM provided by our organization?

Monica_Secelean · February 3, 2023, 3:22pm

Hello @srividhya_ramamurthy ,

You are indeed right, you would need cross-platform activities, which the DU team is currently working on. We have some as part of the Document Understanding package, but not all of them, and this is still a work in progress. What activities are you looking for?

Monica

srividhya_ramamurthy · February 7, 2023, 4:46pm

Hi Monica,

We need the Taxonomy, Classification and Extraction activities. Also, please let me know about the REST services of the Document Understanding framework and how we can use them, because our ultimate aim is to run our bot in the serverless cloud, which demands cross-platform compatible packages. Since web APIs are well supported in cross-platform projects, we feel the Document Understanding framework as a service will be the best choice for us. Please guide us on it.

Monica_Secelean · February 10, 2023, 10:26am

I’m happy to report we already have a version for Extraction, and the Classification one is in progress. With regards to Taxonomy: we advise using Forms AI for defining the extraction model, for which the Taxonomy is automatically generated.

We are currently working on exposing DU as a service - keep an eye out, we’re planning to preview something soon.

Hope this helps,
Monica

karina_bravo · May 19, 2023, 12:18pm

Hi Alex

I have the same issue and I’m using ML Skills deployed in AI Center.
This is the error message:
Error Message: Request CorrelationId: 4269b4d7-83cb-4767-95b0-1a17e3c11ea0 Request PredictionId: 4vNnn56bM/U7lZPn/6sClLX9hZJC7LBRUOCvolm42qU=_7514867d-b210-4a4e-841f-e0be39673ed5 Invalid server response. Http Response Code: 520 Http Response Content:
