Hi Everyone,
I have an ML package trained on top of the OOB Invoice model. The project is built using the Document Understanding Template. Currently, the boolean flag in the Orchestrator Asset for training on user-validated data is turned off, but I’m planning to turn it on for fine-tuning once we reach a satisfactory level of confidence. I have a few questions about this.
My taxonomy has column fields like description, qty, and unit price. When the DU extraction runs, it extracts these fields fine. But there are certain documents that don’t have a unit price, just the description and amount. In such cases, the user goes into Action Center and manually adds the value for unit price. There may also be cases where the user manually adds additional charges as line items in the ‘Items’ table. In Action Center, to include values that are not in the document, you need to point the field to a dummy value in the document and then change its extracted value.
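This works fine for the process. But it might cause issues when fine-tuning the model if this validated data is used for training, because those fields would be labeled at a location in the document that doesn’t actually contain their value.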
Is there any way to mitigate this?