Label Action Center Data

Hi everyone!

I’m trying to train a model to extract data from PDF contracts using ML Skills.

Since I cannot access prod data, we started the training with only a few scrubbed contracts, but of course the extraction was not good.

So we ran the automation for some prod contracts, and the SMEs started to label them in Action Center (since the extraction was not getting good results). After a fair number of contracts were labeled, we exported the data and imported it to Data Manager.

The next step would be exporting it to a new dataset in AI Fabric to train a new model.

BUT…

Some fields are missing (maybe the SMEs forgot to label them). So I want to know whether we can label the missing fields in Data Manager and then run the training.

I ask this because I actually tried this, but I’m getting this error:
2023-05-26 17:46:35,357 - uipath_core.training_plugin:model_run:145 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Dataset Creation Failed

This is what I did to try to resolve the problem:

  • Confirmed that we have more than 25 pages (we have about 1300 for 70 different fields)
  • Checked that we have more than 10 pages labeled for each field (we do)
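
For anyone who wants to repeat the two checks above, here is a minimal sketch. It assumes the labeled data has already been loaded into a list with one set of labeled field names per page. That structure and the function name are assumptions for illustration only and do not reflect the actual Data Manager export format.

```python
from collections import Counter

MIN_TOTAL_PAGES = 25      # "more than 25 pages", as stated in the post
MIN_PAGES_PER_FIELD = 10  # "more than 10 pages labeled for each field"


def check_label_coverage(pages, fields):
    """Report whether the dataset meets both minimums.

    pages  -- list of sets; each set holds the field names labeled on one page
              (hypothetical in-memory structure, not a UiPath format)
    fields -- all field names defined for the dataset
    """
    wanted = set(fields)
    counts = Counter()
    for labeled in pages:
        # Count each field at most once per page, ignoring unknown names.
        counts.update(set(labeled) & wanted)

    # Fields that do not have MORE than the per-field minimum of labeled pages.
    under_labeled = {f: counts[f] for f in fields if counts[f] <= MIN_PAGES_PER_FIELD}

    return {
        "total_pages": len(pages),
        "enough_pages": len(pages) > MIN_TOTAL_PAGES,
        "under_labeled": under_labeled,
    }
```

Any field listed under "under_labeled" would be a candidate for the extra labeling in Data Manager, including fields that were never labeled at all (count 0).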

@Slgus

Are you training on minor version 0 or on an already trained model?

And did you add the new fields everywhere?

If you have a taxonomy created, the fields need to be added there as well.

cheers

Are you training on minor version 0 or on an already trained model?

Minor version 0, as it was the first training.

And did you add the new fields everywhere?

I’m not sure I understand the question. I added new fields and labeled them in the batch that was exported from Action Center. The other batch is part of the same environment in Data Manager, so it has the same fields.

If you have a taxonomy created, the fields need to be added there as well.

I understand, but that is only needed for the code to work. My problem is that the pipeline is not able to train the model. It returns “Dataset creation failed”.