Training AI Model - Document Understanding

Vi15P · January 12, 2022, 10:59am

Hi

I am using the standard out-of-the-box packages for the invoice and purchase order models to extract information from PDFs. However, not all of the information (fields) is extracted. The PDFs are invoices and POs in various formats.

I have trained the models using 15 PDF documents of each type, and the evaluation score increased. Although some of the missing information is now being extracted, the model no longer extracts some fields that it extracted before the training.

How does the training work? What is the optimal number of documents on which the model has to be trained? Will it ever achieve 100% accuracy or extract all the fields?

alharethmazen · August 31, 2022, 2:59pm

Hello, did you get a solution for this problem? I have to train the model each time I run the project and select the fields manually every time.

sharon.palawandram · August 31, 2022, 3:13pm

You cannot expect 100% accuracy from any ML model, as there will always be some margin of error. How many variations of invoices do you see?

The number of invoices you need for training depends on the diversity of your invoices and POs. More diversity means you will probably have to train with more data.

If fields are no longer being extracted as before, there might be an imbalance in your training dataset. Try retraining the model with more data and keep iterating until you get a satisfactory model.
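
To check for that kind of imbalance before retraining, you could count how often each field is actually labeled across your training documents. The snippet below is only a rough sketch in plain Python, not a UiPath API: it assumes you have exported your labels into one dictionary per document, and the field names, the sample data and the 0.5 threshold are all made up for illustration.

```python
from collections import Counter


def field_coverage(labeled_docs):
    """Return, per field, the fraction of documents in which it is labeled."""
    counts = Counter()
    for doc in labeled_docs:
        # Count each field once per document and ignore empty values.
        counts.update({name for name, value in doc.items() if value})
    total = len(labeled_docs)
    return {name: counts[name] / total for name in counts} if total else {}


def underrepresented(coverage, threshold=0.5):
    """List fields labeled in fewer than `threshold` of the documents."""
    return sorted(name for name, share in coverage.items() if share < threshold)


# Hypothetical exported labels (assumption: one dict per training document).
docs = [
    {"invoice_no": "A-1", "total": "100.00", "po_number": "PO-7"},
    {"invoice_no": "A-2", "total": "250.00", "po_number": ""},
    {"invoice_no": "A-3", "total": "80.50"},
    {"invoice_no": "A-4", "total": "", "due_date": "2022-02-01"},
]

coverage = field_coverage(docs)
rare_fields = underrepresented(coverage)
print(coverage)
print(rare_fields)
```

Fields that show up in `rare_fields` are the ones for which it is probably worth collecting and labeling more examples before the next training run.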

Slgus · September 1, 2022, 6:42pm

Here is some information about the number of samples needed for training:

Training and Evaluation Pipelines (uipath.com)
