Train User Validated Data in Action Center that has extra information that's not in the document

Subham_Don · April 15, 2024, 9:07am

Hi Everyone,

I have an ML package trained on top of the OOB Invoice model. The project is built using the Document Understanding Template. Currently the boolean flag for training on user-validated data in the Orchestrator asset is turned off, but I’m planning to turn it on for fine-tuning once we reach a satisfactory level of confidence. I have a few questions about this.

My taxonomy has column fields like description, qty, and unit price. When the DU extraction runs, it extracts these fields without problems. However, certain documents don’t have a unit price, just the description and amount. In such cases, the user goes into Action Center and manually adds the value for unit price. There may also be cases where the user manually adds additional charges as line items in the ‘Items’ table. In Action Center, to include values that are not in the document, you need to point the field to a dummy value in the document and then change its extracted value.

This works fine for the process, but it might cause issues when fine-tuning the model if this validated data is used for training.

Is there any way to mitigate this?

Keegan_Kosasih · April 16, 2024, 10:02am

Hi @Subham_Don,

I assume you are using document validation actions for the user to add additional charges as line items.

How about creating a separate form action that triggers a small sub-process to save the additional charges as line items in a text file that the main process can consume?

I share your concern that pointing the field to a dummy value could cause confusion in the future.
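
To illustrate the side-file idea, here is a minimal sketch in Python. In a real project this would be built with workflow activities rather than Python, and everything named here (the JSON layout, the file name, `save_additional_charges`, `merge_line_items`) is a made-up assumption for illustration, not part of any UiPath API. The point is only that the extra charges never touch the validated extraction data, so that data still matches what is printed on the document.

```python
import json
import tempfile
from pathlib import Path


def save_additional_charges(path, charges):
    """Write the charges entered in the (hypothetical) form action as JSON."""
    Path(path).write_text(json.dumps(charges, indent=2), encoding="utf-8")


def merge_line_items(extracted_items, path):
    """Return the extracted items followed by any charges from the side file.

    The extracted items themselves are not modified.
    """
    side_file = Path(path)
    if not side_file.exists():
        return list(extracted_items)
    charges = json.loads(side_file.read_text(encoding="utf-8"))
    return list(extracted_items) + charges


# Hypothetical data: one line item from extraction, one charge typed by the user.
extracted = [{"description": "Widget", "qty": 2, "amount": 40.0}]
charges = [{"description": "Freight", "qty": 1, "amount": 15.0}]

with tempfile.TemporaryDirectory() as tmp:
    side_path = Path(tmp) / "inv-001_charges.json"
    save_additional_charges(side_path, charges)
    merged = merge_line_items(extracted, side_path)

print(len(merged))  # 2
```

The main process would use the merged list downstream, while only the untouched extraction results would be candidates for training.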

Jon_Smith · April 16, 2024, 10:11am

If you’ve added ‘dummy’ values to your data to substitute for data missing in the form, then it is indeed not fit for retraining the model, and you should not use it for training.

Subham_Don · April 16, 2024, 2:50pm

Is there any alternative to fix this issue?

Jon_Smith · April 16, 2024, 3:29pm

Well, you need to filter out the ones that have dummy values somehow and exclude them from the training loop.
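
As a rough sketch of what that filtering could look like, assuming the workflow itself records which fields the user filled in by hand: the `ValidatedDocument` record and its `manually_added_fields` marker below are hypothetical and do not correspond to any UiPath export format. The idea is simply to tag a document at validation time and keep tagged documents out of the training set.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ValidatedDocument:
    """Hypothetical record for one document validated in Action Center."""
    document_id: str
    # Fields whose value the user typed in although it is not printed on
    # the document (for example a missing unit price). Assumed to be
    # recorded by the workflow; this is not a built-in marker.
    manually_added_fields: List[str] = field(default_factory=list)


def split_for_training(
    documents: List[ValidatedDocument],
) -> Tuple[List[ValidatedDocument], List[ValidatedDocument]]:
    """Return (usable, excluded) based on the manual-entry marker."""
    usable, excluded = [], []
    for doc in documents:
        if doc.manually_added_fields:
            excluded.append(doc)
        else:
            usable.append(doc)
    return usable, excluded


docs = [
    ValidatedDocument("inv-001"),
    ValidatedDocument("inv-002", ["unit price"]),
    ValidatedDocument("inv-003", ["Items: additional charge"]),
]
usable, excluded = split_for_training(docs)
print([d.document_id for d in usable])    # ['inv-001']
print([d.document_id for d in excluded])  # ['inv-002', 'inv-003']
```

Only the documents in `usable` would then be passed on for retraining.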

Subham_Don · April 16, 2024, 4:32pm

I went through the documentation for Validation Station and found that there’s an option to add a value to a field without referencing anything in the document. But I don’t see this option in Action Center or in Validation Station. Is there a setting to enable it? If I can enable it for the line items as well, I think it will resolve the issue.
