How to fix the error "Document type invoices not valid, check that document type data is in dataset folder and follows folder structure" encountered while running a pipeline
Issue Description: The following error appears in the ML logs while running a pipeline.
Error Message: Document type invoices not valid, check that document type data is in the dataset folder and follows folder structure.
Resolution: Verify the following points:
- Check the auto_retraining variable, which enables completion of the Auto-retraining Loop. If the variable is set to True, the input dataset must be the export folder associated with the labeling session where the data is tagged. If the variable remains set to False, the input dataset must follow the expected dataset format.
- When creating a pipeline, select the dataset sub-folder so that the pipeline picks up the correct dataset.
- Check that the dataset used for the training pipeline is in the /export folder.
Note: When creating a new pipeline run, do not select the complete dataset in the "choose input dataset" field. The dataset structure should be: Dataset>Export>packageName.
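As an illustration only, the folder structure described in the note above can be checked before a pipeline is created. The following is a minimal sketch, not part of the product: the function name check_input_dataset, the "/"-separated path string, and the case-insensitive match on the export folder name are all assumptions made for this example. It checks only the Dataset>Export>packageName layout and does not model the auto_retraining setting.

```python
from pathlib import PurePosixPath


def check_input_dataset(selected_path: str) -> list[str]:
    """Return a list of problems found in the selected input dataset path.

    Hypothetical helper: assumes the selection is written as a
    "/"-separated path such as "Dataset/Export/packageName".
    """
    parts = PurePosixPath(selected_path.strip("/")).parts
    problems = []

    # Selecting only the top-level dataset means no sub-folder was chosen.
    if len(parts) <= 1:
        problems.append("complete dataset selected; choose a sub-folder")
        return problems

    # Assumption: the export folder name is matched case-insensitively.
    if parts[1].lower() != "export":
        problems.append(f"second level is '{parts[1]}', expected 'export'")

    # The note expects a package folder below the export folder.
    if len(parts) < 3:
        problems.append("no package folder selected under export")

    return problems


print(check_input_dataset("Dataset/Export/packageName"))  # → []
print(check_input_dataset("Dataset"))
```

An empty list means the path matches the layout from the note; any other result lists what to correct in the input dataset selection.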