How to resolve the "Dataset Creation Failed" error received while running a training pipeline for a Document Understanding package in AI Center?
This error generally occurs when the dataset has not been created correctly. Take care of the following points when creating a dataset for a Document Understanding model:
- Data labelling must be performed on all the documents using "Data Manager". Follow the steps given in Import Documents in Data Manager.
- At least 25 documents must be labelled in order to train the model, and each regular field must be labelled on at least 10 documents. Refer to the Data Manager documentation for details on using Data Manager.
- The export requirements must be met. Refer to Exporting of Documents in Data Manager.
- For a training pipeline, do not select the "Make this as test" check box. Import the documents, label them, and export them back without applying any filter. Then upload the exported folder to train the model.
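
The labelling thresholds in the list above (at least 25 labelled documents, and each regular field labelled on at least 10 documents) can be checked before exporting. The following is a minimal sketch, not part of Data Manager or AI Center: the function name `check_labelling` and the input structure (a dictionary mapping each document name to the set of field names labelled on it) are assumptions made for illustration, and you would need to build that dictionary yourself from your own records.

```python
from collections import Counter

# Thresholds taken from the guidance above.
MIN_LABELLED_DOCS = 25
MIN_DOCS_PER_FIELD = 10


def check_labelling(labels_by_doc, regular_fields):
    """Return a list of problems; an empty list means the counts look sufficient.

    labels_by_doc: dict mapping a document name to the set of field names
        labelled on it (hypothetical structure, not a Data Manager format).
    regular_fields: iterable of regular field names the model should extract.
    """
    problems = []

    # A document counts as labelled only if at least one field is labelled on it.
    labelled_docs = {doc: set(fields) for doc, fields in labels_by_doc.items() if fields}
    if len(labelled_docs) < MIN_LABELLED_DOCS:
        problems.append(
            f"only {len(labelled_docs)} labelled documents, "
            f"need at least {MIN_LABELLED_DOCS}"
        )

    # Count, for each field, the number of documents on which it is labelled.
    counts = Counter(field for fields in labelled_docs.values() for field in fields)
    for field in regular_fields:
        if counts[field] < MIN_DOCS_PER_FIELD:
            problems.append(
                f"field '{field}' is labelled on {counts[field]} documents, "
                f"need at least {MIN_DOCS_PER_FIELD}"
            )

    return problems


if __name__ == "__main__":
    # Illustrative data only: 25 documents, with 'date' labelled on just 9 of them.
    example = {
        f"doc{i}.pdf": {"total", "date"} if i < 9 else {"total"}
        for i in range(25)
    }
    for problem in check_labelling(example, ["total", "date"]):
        print(problem)
```

A check like this only confirms the counts. It does not verify the export requirements or the contents of the exported folder, so the points above still need to be followed in Data Manager itself.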