How to use and train a custom ML model in Document Understanding

I am using Document Understanding for my invoice processing use case.
I have 100+ different invoice formats and need to extract invoice information such as invoice number, date, total amount, and supplier name. I also want to extract the tabular data present in the invoice.

When I try to extract data using the default Machine Learning Extractor, it often gives me unexpected results. Only sometimes does it give the expected result.

How can I use my own custom ML model for extraction in Document Understanding?
@Lahiru.Fernando @Palaniyappan

Hi @netri

The scenario is clear. The only thing I’m not clear on is what you mean when you say it sometimes gives an unexpected result.
Do you mean that it gives you different values, or does it run into errors?

If we take one invoice,

  • if you run the process a couple of times, do you get the same output, or does it differ?
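
If it helps, here is a minimal sketch of that check in Python, assuming you copy the extracted field values from each run into a dictionary yourself. The field names, the values, and the `diff_runs` helper are made up for illustration; none of this is part of any UiPath API.

```python
def diff_runs(run_a, run_b):
    """Return {field: (value_a, value_b)} for every field whose value differs."""
    fields = sorted(set(run_a) | set(run_b))
    return {
        f: (run_a.get(f), run_b.get(f))
        for f in fields
        if run_a.get(f) != run_b.get(f)
    }


# Hypothetical values noted down from two runs on the same invoice.
run_1 = {"invoice_number": "INV-001", "date": "2020-05-01", "total": "1200.00"}
run_2 = {"invoice_number": "INV-001", "date": "2020-05-01", "total": "1200.00"}

differences = diff_runs(run_1, run_2)
if differences:
    # Each entry shows the field name and the value seen in each run.
    print("Runs differ:", differences)
else:
    print("Both runs returned the same fields and values.")
```

An empty result means the extractor is consistent for that invoice, so the problem is accuracy rather than run-to-run variation.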

When dealing with multiple documents, extraction might work perfectly for some invoices and miss some fields for others. The way to improve that is to retrain the model on the training data that can be collected using the Train Extractors Scope.

Hi Everyone,

I have the same problem: all of my templates return unexpected results.

Example: it returns CUSTOMER CODE when the field is configured as VENDOR CODE (India Invoices Enterprise Endpoint).

If it doesn’t extract those details, that’s fine! But getting a different field’s value instead is confusing.

Since I have a lot of document types (which are not in a standard format) I don’t think the automatic extractor that UiPath provides will be able to extract the data I want. Is there a method where I can train the ML model by tagging the data I want to extract?

Also, I am unable to find any tutorials on how to use the “Train Extractors Scope” activity. Can you link a tutorial or some documentation on that?

I also need this: how do I train the model? I went to AI Fabric > Out of the box packages > Invoices, but running the pipeline requires a dataset. Where do I get the dataset?

Hi @gtanjr

You have to build the dataset using the Data Manager tool. Data Manager is part of the Insider program, so you need to request access to it through Insider.