I’m facing some issues while using the cloud version of Document Understanding in AI Fabric. The results I’m getting aren’t accurate enough to be acceptable, even though I’m running a fairly simple test.
I intended to build an ML model on top of the out-of-the-box Document Understanding template and consume it from an RPA flow in UiPath Studio. I followed the documented guides to do this; here are the steps I took:
- I requested an Enterprise Trial version to get access to AI Fabric in Automation Cloud.
- I downloaded, installed and configured Data Manager and the OCR engine locally.
- I gathered 5 highly structured invoice documents with exactly the same layout and distribution of information.
- I labelled the documents in Data Manager after creating a single “regular field” for the final total amount of the invoice. Then I exported the results.
- In AI Fabric, I created a project and an ML package inside it (using the out-of-the-box templates).
- I uploaded the folder generated by the Data Manager export as a new dataset.
- I managed to run a training pipeline for that model using the dataset.
- Finally, I deployed the ML package as an ML Skill and selected it in the Machine Learning Extractor in the “Document Understanding Framework” flow in UiPath Studio.
After configuring the fields to be extracted by the ML activity (only the field “total”), I tested the flow with one invoice that has exactly the same layout, and even with one of the invoices used for training.
In the Validation Station step, the “total” field is filled with the door number from the address of the issuing entity.
I repeated the whole process, adding more invoices with some rotations and artificial changes to the content layout made in Photoshop.
The results are exactly the same. My question is whether I’m doing something wrong, or whether I’m not training the model with enough documents for it to work properly. I want to stress that this is a fairly simple test: 5 to 10 invoices that look exactly the same, and only one field to extract. I’m attaching one of the sample invoices used for training.
I hope you can help me with this issue. Thanks in advance!
Antel-Febrero.pdf (254.3 KB)