How to fix the error "Error Message: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure" encountered while running a pipeline?
Issue Description: The following error appears in the ML logs while running a pipeline.
Error Message: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure.
Resolution: Verify the following points:
- auto_retraining: this variable enables the Auto-retraining Loop. If it is set to True, the input dataset must be the export folder associated with the labeling session where the data is tagged. If it remains set to False, the input dataset must follow the expected dataset format.
- When creating a pipeline, select the dataset sub-folder rather than the top-level dataset, so that the pipeline picks up the correct data.
- Check that the dataset used for the training pipeline is the /export folder.
Note: When creating a new pipeline, do not select the complete dataset under "choose input dataset". The dataset structure should be: Dataset > Export > packageName.
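
The folder rule in the note above can be illustrated with a small check. This is only a sketch, not part of the product: the function name check_selected_folder, the example dataset name "invoices", and the case-insensitive match on "export" are assumptions made for illustration. It only inspects a path string and does not call any pipeline API.

```python
from pathlib import PurePosixPath


def check_selected_folder(dataset_name: str, selected_path: str) -> str:
    """Classify the folder chosen as pipeline input (hypothetical helper).

    Assumes the layout Dataset > Export > packageName described in the note,
    with paths written as "dataset/sub-folder".
    """
    parts = PurePosixPath(selected_path).parts

    # The selection must start at the named dataset.
    if not parts or parts[0] != dataset_name:
        return "selected folder is not inside the dataset"

    # Only the dataset itself was chosen, which the note warns against.
    if len(parts) == 1:
        return "whole dataset selected: choose the export sub-folder instead"

    # The expected choice is the export folder directly under the dataset.
    # Matching the name case-insensitively is an assumption of this sketch.
    if len(parts) == 2 and parts[1].lower() == "export":
        return "ok"

    return "selected folder is not the export folder of the dataset"


if __name__ == "__main__":
    for choice in ("invoices", "invoices/export", "invoices/raw"):
        print(choice, "->", check_selected_folder("invoices", choice))
```

A result other than "ok" suggests the pipeline input is the likely cause of the "Document type invoices not valid" error.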