ML extractor trainer

Greetings UiPath community :slight_smile:

I am trying to extract data from some invoices using Document Understanding.
The problem I am having is that these invoices are not written in a clean, structured way: some are pictures converted to PDF, others are receipts, and others are tables without lines between the rows, where a single “record” sometimes spans 2 rows.

I have made a template for some of those cases (receipts, tables, invoices …), but I lack the training samples to make the extractor work correctly.

I see there is a Train Extractors Scope activity, but I have no idea how to use it. I can’t find much on the internet about it other than the documentation.

I don’t know what to enter for the private dataset or the dataset endpoint.

Any discussion is welcome! Thanks

Hi @Christodoulos

Would you be able to provide a few more details about your build? Which extractor are you using, and what kind of documents are you trying to extract?

To give you a heads-up, the activity you are using is the ML Extractor Trainer, which requires AI Center for the dataset, the ML Skill, and Document Manager.

Hope this helps somehow.

Regards.

Hey,

I have both invoices and receipts, and I am using 2 ML extractors with the public endpoints for those types.
My documents are not really “well written”, and the extractors commonly make mistakes (merging 2 rows, taking the amount from the row below, etc.). So I was thinking of just retraining the existing endpoints as I validate some of my own documents, to help them learn those “templates” and give better results on similar documents in the future.

The real problem is that I don’t have a license for AI Center, and I need to find out whether this “retraining” will actually improve the extractor drastically enough to be worth it.

Thanks