ML extractor trainer

Christodoulos · June 15, 2023, 9:26am

Greetings, UiPath community.

I am trying to extract data from some invoices using Document Understanding.
The problem I am having is that those invoices are not written in a clean, structured way: some are pictures converted to PDF, others are receipts, and others are tables without lines between the rows, where a “record” sometimes takes 2 rows.

I have made templates for some of those cases (receipts, tables, invoices, …), but I lack the training samples to make the extractor work correctly.

I see there is a Train Extractors Scope activity, but I have no idea how to use it. I can’t find much on the internet about it other than the documentation.

I don’t know what to enter for the private dataset or the dataset endpoint.

Any discussion is welcome! Thanks

DanRagh · June 22, 2023, 7:45am

Hi @Christodoulos

Would you be able to provide a few more details about your build? Which extractor are you using, and what kind of documents are you trying to extract from?

To give you a heads-up, the activity you are using is the ML extractor trainer, which requires AI Center for the dataset, the ML Skill, and Document Manager.

Hope this helps somehow.

Regards.

Christodoulos · June 22, 2023, 8:17am

Hey,

I have both invoices and receipts, and I am using 2 ML extractors with the public endpoints for those document types.
My documents are not really “well written”, and the extractors commonly make mistakes (merging 2 rows, taking the amount from the row below, etc.). So I was thinking of retraining the existing endpoints as I validate some of my own documents, so that they learn those “templates” and give better results on similar documents in the future.

The real problem is that I don’t have a license for AI Center, and I need to find out whether this “retraining” would actually improve the extractor drastically and be worth it.

Thanks
