I have two main questions about the Document Understanding roadmap:
1. Legacy DU Deprecation:
Is there an official plan or timeline for deprecating Legacy Document Understanding (AI Center)? Will custom extractors and automated training pipelines remain supported, or should we expect migration requirements soon?
2. Modern DU Data Upload API:
Will Modern (Specialized) DU support an API for programmatically uploading labeled/annotated datasets similar to the dataset APIs in Legacy DU? Is there a roadmap for enabling a fully automated data ingestion and retraining pipeline in Modern DU?
Our team relies on automated training and versioning for IDP projects. We want to future-proof our architecture and align with UiPath's strategy.
Appreciate any official input or roadmap insights from UiPath engineers or product managers.
Currently, deprecation has been announced only for the OOTB models, not for the pre-existing models. There is a chance that those will eventually be sunset as well, but there is no official confirmation yet.
I've read through the IXP documentation, and from what I can tell, it currently focuses more on governance and integration than on exposing new model control capabilities directly. I don't currently have access to IXP, so I'd like to ask:
Will IXP provide a more flexible or automated approach to retraining models or uploading labeled data?
I've reviewed this thread, and it aligns with my initial plan. However, my actual goal is to upload labeled data into Modern DU via API.
Thanks, I think this is what I'm looking for. I've read the documentation, but it's still a bit unclear:
Does retraining always require manual review in the "Exceptions for review" section before taking effect, or is it automatically applied unless differences are detected?
There is no labelling as such here. The prompt helps identify the value from the document. To get better values, we need to improve the prompt rather than retrain or upload new documents. When a new document comes in, you can upload it, check whether the predictions are correct, and update the prompt if needed. There is no annotation or labelling.
As mentioned, there is currently no API for this.
As of now, yes, that is the new way forward. A document goes to exception or human review only when there is a need for it, and in those cases manual review is required.
I understand that IXP's "Unstructured & Complex Documents" capability is fully prompt-based (there's no labeling or retraining involved), which makes sense for generative use cases. However, is the same true for Specialized DU, given that it still requires retraining? Will IXP provide feedback learning for Specialized DU using Generative Extraction?
I'm sorry, I hope we're on the same page. I meant the following scenario:
A document is already validated in Validation Station
[UiPath Documentation] UiPath.DocumentUnderstanding.Activities : all documents that were processed using this activity package and were validated in Validation Station are collected automatically and can be used for retraining.
Then used in Project Extractor Trainer to trigger retraining in Modern DU.
[UiPath Documentation] UiPath.IntelligentOCR.Activities (starting with version 6.25.0-preview): to retrain documents processed using this activity package, use the Document Understanding Project Extractor Trainer activity in your workflow. This allows documents to be collected for retraining purposes.
The same document is reviewed (for the second time) in the Build section under "Exceptions for review"
[UiPath Documentation]
Note:
Collected documents are not automatically included in the training set. You need to review the documents and confirm their addition in the training set to retrain your model.
Does this mean we need to manually validate a document twice before retraining?
If you look at IXP, Specialized DU simply redirects to Modern DU, so it works the same way, which is where the next scenario comes in.
Now, coming to Modern DU:
In Validation Station, the user or business user validates the results.
For the DU exception review, ideally a developer or data analyst would review and process the document.
Yes, it is needed twice. In Validation Station there might be no change at all, or an extracted value might be corrected, or a value might be added that is not even present in the document, or a value might have been wrongly identified because of OCR issues. So the validation results flow through, but they do not confirm the exception review.
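To make the two review steps concrete, here is a minimal Python sketch of the document lifecycle as it is described in this thread. It is only an illustrative model, not UiPath code: the `Stage`, `Document`, `validate`, `collect`, and `confirm_for_training` names are all hypothetical and do not correspond to any UiPath API or activity. The only point is that a document validated in Validation Station is collected but does not reach the training set until someone confirms it under "Exceptions for review".

```python
from dataclasses import dataclass
from enum import Enum, auto


class Stage(Enum):
    """Hypothetical lifecycle stages, named after the steps in this thread."""
    PROCESSED = auto()        # extracted by the workflow, not yet validated
    VALIDATED = auto()        # first review: business user in Validation Station
    COLLECTED = auto()        # waiting under "Exceptions for review"
    IN_TRAINING_SET = auto()  # second review: confirmed for retraining


@dataclass
class Document:
    name: str
    stage: Stage = Stage.PROCESSED


def _advance(doc: Document, expected: Stage, target: Stage) -> None:
    # Enforce the order of the steps; skipping a step is an error.
    if doc.stage is not expected:
        raise ValueError(f"{doc.name}: expected {expected.name}, got {doc.stage.name}")
    doc.stage = target


def validate(doc: Document) -> None:
    """First manual step: validation in Validation Station."""
    _advance(doc, Stage.PROCESSED, Stage.VALIDATED)


def collect(doc: Document) -> None:
    """The validated document is collected for possible retraining."""
    _advance(doc, Stage.VALIDATED, Stage.COLLECTED)


def confirm_for_training(doc: Document) -> None:
    """Second manual step: confirmation under "Exceptions for review"."""
    _advance(doc, Stage.COLLECTED, Stage.IN_TRAINING_SET)


def training_set(docs: list[Document]) -> list[str]:
    """Only confirmed documents count toward retraining."""
    return [d.name for d in docs if d.stage is Stage.IN_TRAINING_SET]


docs = [Document("invoice_001.pdf"), Document("invoice_002.pdf")]
for d in docs:
    validate(d)
    collect(d)

# Only the first document gets the second review.
confirm_for_training(docs[0])
print(training_set(docs))  # ['invoice_001.pdf']
```

In this model, `invoice_002.pdf` was validated and collected but stays out of the training set, which matches the documentation note that collected documents are not automatically included.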