Train User Validated Data in Action Center that has extra information that's not in the document

Hi Everyone,

I have an ML package trained on top of the OOB Invoice model. The project is built using the Document Understanding Template. Currently the boolean flag in the Orchestrator Asset for training on user-validated data is turned off, but I’m planning to turn it on for fine-tuning once we reach a satisfactory level of confidence. I have a few questions about this.

My taxonomy has column fields like description, qty, unit price. When the DU extraction runs, it extracts these fields fine. But there are certain documents that don’t have any unit price, just the description and amount. In such cases, the user goes into Action Center and manually adds the values for unit price. There may also be cases where the user manually adds additional charges as line items in the ‘Items’ table. In Action Center, to include such values that are not in the document, you need to point the field to a dummy value in the document and then change its extracted value.

This works fine for the process. But this might cause issues in fine tuning the model if this validated data is used for training.

Is there any way to mitigate this?

hi @Subham_Don ,

I assume you are using document validation actions for the user to add additional charges as line items.

How about creating a separate form action that triggers a small sub-process to save the additional charges as line items in a text file that the main process can consume?
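To illustrate the idea, here is a minimal Python sketch of the "consume" step. Everything in it is an assumption for illustration: the side file is taken to hold one JSON object per line, and the field names (`document_id`, `description`, `amount`, `source`) and the function `merge_additional_charges` are hypothetical, not part of any UiPath API. In a real project this logic would live in the workflow itself.

```python
import json

def merge_additional_charges(extracted_items, side_file_lines, document_id):
    """Return the extracted line items plus any manual charges for this document.

    extracted_items: list of dicts produced by the extraction (left unmodified).
    side_file_lines: iterable of JSON strings written by the form-action sub-process
                     (hypothetical format: one JSON object per line).
    """
    merged = list(extracted_items)  # copy, so the extraction result stays clean
    for line in side_file_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the side file
        record = json.loads(line)
        if record.get("document_id") == document_id:
            merged.append({
                "description": record["description"],
                "amount": record["amount"],
                "source": "manual",  # marks the row as not coming from the document
            })
    return merged
```

The point of this design is that the manual charges never pass through the validation action, so the validated data sent for training only contains values that actually appear in the document.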

I share your concern that pointing the field to a dummy value could cause confusion in the future.

If you’ve added ‘dummy’ values to your data to substitute for data missing in the form, then it is indeed not fit for retraining the model, and you should not use it for training.

Any alternative to fix this issue?

Well, you need to filter out the documents that have dummy values somehow and exclude them from the training loop.
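One possible way to do that filtering, sketched in Python under stated assumptions: each validated document is available as a dict with a `fields` list, and each field carries the text at the referenced location (`referenced_text`) and the value the user confirmed (`validated_value`). These names and the record layout are hypothetical, not the actual export format. The heuristic is that a small OCR correction still resembles the referenced text, while a value attached to a dummy reference usually does not. The 0.5 threshold is an arbitrary starting point that would need tuning.

```python
from difflib import SequenceMatcher

def looks_like_dummy(field, threshold=0.5):
    """True when the validated value barely resembles the text it points at."""
    referenced = field["referenced_text"].strip().lower()
    validated = field["validated_value"].strip().lower()
    if not validated:
        return False  # empty field: nothing was labelled
    if not referenced:
        return True   # a value with no supporting text on the page
    similarity = SequenceMatcher(None, referenced, validated).ratio()
    return similarity < threshold

def split_for_training(documents, threshold=0.5):
    """Split documents into (keep, exclude) based on the dummy-value heuristic."""
    keep, exclude = [], []
    for doc in documents:
        if any(looks_like_dummy(f, threshold) for f in doc["fields"]):
            exclude.append(doc)
        else:
            keep.append(doc)
    return keep, exclude
```

This is only a heuristic: a heavily corrected value can be flagged by mistake, so the excluded set is worth a manual review before discarding it.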

I went through the documentation for Validation Station and found that there’s an option to add a value to a field without referencing it. But I don’t see this option in either Action Center or Validation Station. Is there a setting to enable this? If I can enable it for the line items as well, I think it will resolve the issue.