Hi everyone!
I’m trying to train a model to extract data from PDF contracts using ML Skills.
Since I cannot access prod data, we started the training with only a few scrubbed contracts, but of course the extraction was not good.
So we ran the automation for some prod contracts, and the SMEs started to label them in Action Center (since the extraction was not getting good results). After a fair number of contracts were labeled, we exported the data and imported it to Data Manager.
The next step would be exporting it to a new dataset in AI Fabric to train a new model.
BUT…
Some fields are missing (maybe the SMEs forgot to label them). So I’d like to know whether we can label the missing fields ourselves in Data Manager and then run the training.
I ask this because I actually tried this, but I’m getting this error:
2023-05-26 17:46:35,357 - uipath_core.training_plugin:model_run:145 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Dataset Creation Failed
This is what I did to try to resolve the problem:
- Made sure we have more than 25 pages (we have about 1300, for 70 different fields)
- Checked that we have more than 10 pages labeled for each field (we do)
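
In case it helps anyone reproduce the two checks above, here is a minimal Python sketch of how they could be scripted. It does not read the Data Manager export directly: the `pages` dictionary (page id mapped to the set of field names labeled on that page) and the function name `check_label_coverage` are hypothetical and would have to be filled in from however you parse your own export. The two thresholds are simply the numbers from the list above.

```python
from collections import Counter

# Thresholds taken from the two checks listed above.
MIN_TOTAL_PAGES = 25
MIN_PAGES_PER_FIELD = 10


def check_label_coverage(pages, expected_fields):
    """Report whether the dataset passes the two size checks.

    pages: dict mapping a page id to the set of field names labeled on
           that page (hypothetical layout, not the Data Manager format).
    expected_fields: every field name the model should extract.
    """
    counts = Counter()
    for labeled_fields in pages.values():
        # Count each field at most once per page.
        counts.update(set(labeled_fields))

    # Fields that do NOT have more than MIN_PAGES_PER_FIELD labeled pages.
    under_labeled = {
        field: counts[field]
        for field in expected_fields
        if counts[field] <= MIN_PAGES_PER_FIELD
    }
    return {
        "enough_pages": len(pages) > MIN_TOTAL_PAGES,
        "under_labeled": under_labeled,
    }
```

An empty `under_labeled` result together with `enough_pages` being true would mean both checks pass, so the cause of the error would have to be something else.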