Future of Legacy Document Understanding & Labeled Data Upload API for Modern DU

Hi UiPath Team,

I have two main questions about the Document Understanding roadmap:
1. Legacy DU Deprecation:
Is there an official plan or timeline for deprecating Legacy Document Understanding (AI Center)? Will custom extractors and automated training pipelines remain supported, or should we expect migration requirements soon?
2. Modern DU Data Upload API:
Will Modern (Specialized) DU support an API for programmatically uploading labeled/annotated datasets similar to the dataset APIs in Legacy DU? Is there a roadmap for enabling a fully automated data ingestion and retraining pipeline in Modern DU?

Our team relies on automated training and versioning for IDP projects. We want to future-proof our architecture and align with UiPath’s strategy.

Appreciate any official input or roadmap insights from UiPath engineers or product managers.

Thank you!
Azeem

@azeem_r

  1. Currently, deprecation has been announced only for the OOTB models, not for the pre-existing models. There is a chance that those will be sunset as well, but there is no official confirmation yet.
  2. In the new experience there are two options: Modern and IXP. IXP supports prompts directly and will be the next big change. In Modern, training works differently from the classic version. There is a proposed workaround: How To Implement A Document Understanding Training Loop On Cross Platform Activities or Studio Web…
  3. There is also a newly released retraining feature: Document Understanding - Retrain extractors
  4. An API for uploading annotated datasets is not available as of now. The currently available APIs are covered in: Document Understanding - Use the Discovery APIs

cheers

Hi @Anil_G,

Thanks for the clarification.

I’ve read through the IXP documentation, and from what I can tell, it currently focuses more on governance and integration rather than exposing new model control capabilities directly. I don’t currently have access to IXP, so I’d like to ask:
Will IXP provide a more flexible or automated approach to retraining models or uploading labeled data?

I’ve reviewed this thread, and it aligns with my initial plan. However, my actual goal is to upload labeled data into Modern DU via API.

Thanks, I think this is what I’m looking for. I’ve read the documentation, but it’s still a bit unclear:
Does retraining always require manual review in the ‘Exceptions for review’ section before taking effect, or is it automatically applied unless differences are detected?

Regards,
Azeem

@azeem_r

There is no labelling as such here. The prompt identifies the values in the document, so you improve the prompt to get the right values instead of retraining or uploading new documents. When a new document arrives, you can upload it, check whether the predictions are correct, and update the prompt if needed. There is no annotation or labelling.
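The "check the predictions, then update the prompt" step can be sketched as a small comparison helper. This is only an illustration under assumptions: the function name and the field names are hypothetical, and the predicted values are assumed to have been copied out of the platform by hand, since no API for this is confirmed in this thread.

```python
from typing import Dict, List


def fields_needing_prompt_update(predicted: Dict[str, str],
                                 expected: Dict[str, str]) -> List[str]:
    """Return the fields whose predicted value differs from the expected one.

    Both dictionaries map a field name to its value. A field missing from
    `predicted` counts as a mismatch.
    """
    return sorted(
        field for field, value in expected.items()
        if predicted.get(field) != value
    )


# Hypothetical values for one newly uploaded document.
predicted = {"invoice_number": "INV-001", "total": "120.00"}
expected = {"invoice_number": "INV-001", "total": "125.00", "currency": "EUR"}

# Fields listed here are candidates for a prompt revision.
to_fix = fields_needing_prompt_update(predicted, expected)
```

An empty result would suggest the current prompt is good enough for that document.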

As mentioned, there is currently no API for this.

As of now, yes, that is the way to go. A document lands in exceptions or human review only when there is a need for it, so manual review is required.

cheers

Thanks for the quick response.

I understand that IXP’s “Unstructured & Complex Documents” capability is fully prompt-based (there’s no labeling or retraining involved), which makes sense for generative use cases. However, is this also the case for Specialized DU, given that it still requires retraining? Will IXP provide feedback learning for Specialized DU using Generative Extraction?

Sorry, I hope we’re on the same page. I meant the following scenario:

  • A document is already validated in Validation Station

[UiPath Documentation]
UiPath.DocumentUnderstanding.Activities : all documents that were processed using this activity package and were validated in Validation Station are collected automatically and can be used for retraining.

  • The document is then used in the Project Extractor Trainer to trigger retraining in Modern DU.

[UiPath Documentation]
UiPath.IntelligentOCR.Activities (starting with version 6.25.0-preview): to retrain documents processed using this activity package, use the Document Understanding Project Extractor Trainer activity in your workflow. This allows documents to be collected for retraining purposes.

  • The same document is then reviewed a second time in the Build section under ‘Exceptions for review’.

[UiPath Documentation]
Note:
Collected documents are not automatically included in the training set. You need to review the documents and confirm their addition in the training set to retrain your model.

Does this mean we need to manually validate a document twice before retraining?

Regards,
Azeem

@azeem_r

In IXP, Specialized DU simply redirects to Modern DU, so it works the same way, which brings us to your next scenario.

Now, coming to Modern DU:

  1. In Validation Station, the business user validates the results.
  2. In the DU exception review, ideally a developer or data analyst reviews and processes the document.

Yes, two reviews are needed. In Validation Station there might be no change at all, or a value might be corrected or added that is not actually present in the document, or a value might have been misidentified because of OCR issues. So the validation results flow through, but they do not count as confirmation in the exception review.
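To make the two review steps concrete, here is a minimal conceptual model of the flow discussed in this thread. It is not UiPath code and calls no UiPath API; the stage names, the `Document` class, and the functions are all hypothetical and only mirror the sequence "validated in Validation Station, collected, confirmed under 'Exceptions for review'".

```python
from dataclasses import dataclass
from enum import Enum, auto


class Stage(Enum):
    EXTRACTED = auto()         # extraction results exist, nothing reviewed yet
    VALIDATED = auto()         # confirmed by a business user in Validation Station
    COLLECTED = auto()         # waiting under 'Exceptions for review'
    IN_TRAINING_SET = auto()   # confirmed by a developer or data analyst


@dataclass
class Document:
    name: str
    stage: Stage = Stage.EXTRACTED


def _advance(doc: Document, required: Stage, target: Stage) -> None:
    """Move a document to `target`, but only from the `required` stage."""
    if doc.stage is not required:
        raise ValueError(f"{doc.name}: expected {required.name}, got {doc.stage.name}")
    doc.stage = target


def validate(doc: Document) -> None:
    """First review: the business user validates the extraction results."""
    _advance(doc, Stage.EXTRACTED, Stage.VALIDATED)


def collect(doc: Document) -> None:
    """The validated document is collected for retraining purposes."""
    _advance(doc, Stage.VALIDATED, Stage.COLLECTED)


def confirm_for_training(doc: Document) -> None:
    """Second review: the document is confirmed for the training set."""
    _advance(doc, Stage.COLLECTED, Stage.IN_TRAINING_SET)


def training_set(docs):
    """Only documents that passed both reviews take part in retraining."""
    return [d.name for d in docs if d.stage is Stage.IN_TRAINING_SET]
```

In this model a document that has only been validated and collected never shows up in `training_set`, which matches the documentation note that collected documents are not automatically included in the training set.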

cheers

Got it. Appreciate all the insights!

Cheers,
Azeem
