Error Message "Document Type Invoices Not Valid, Check That Document Type Data Is In Dataset Folder And Follows Folder Structure"

How to fix the error "Document type invoices not valid, check that document type data is in dataset folder and follows folder structure" encountered while running a pipeline?

Issue Description: The following error appears in the ML logs while running a pipeline.

Error Message: Document type invoices not valid, check that document type data is in the dataset folder and follows folder structure.

Resolution: Verify the following points:

  • The auto_retraining variable enables completion of the Auto-retraining Loop. If it is set to True, the input dataset must be the export folder associated with the labeling session where the data is tagged. If it remains set to False, the input dataset must correspond to the expected dataset format.
  • When creating a pipeline, select the dataset sub-folder rather than the top-level dataset, so that the pipeline picks up the correct data.
  • Check that the dataset used for the training pipeline is in the /export folder.
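
The rule in the first point can be summarized as a small sketch. This is only an illustration of the logic described above: the function name expected_input_dataset and the returned descriptions are hypothetical and are not part of the product.

```python
def expected_input_dataset(auto_retraining: bool) -> str:
    """Describe which input dataset the pipeline expects (illustrative only)."""
    if auto_retraining:
        # Auto-retraining Loop: use the export folder of the labeling session.
        return "export folder of the labeling session where the data is tagged"
    # Otherwise the input dataset must follow the expected dataset format.
    return "dataset that follows the expected dataset format"
```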


Note: When creating a new pipeline, do not select the complete dataset in the "choose input dataset" field; select the sub-folder instead. The dataset structure should look like: Dataset>Export>packageName.
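
As a rough way to sanity-check a selected path before starting the pipeline, the structure above can be expressed as a small check. This is a minimal sketch, not a product feature: the function name is_valid_input_dataset is hypothetical, and it assumes the path is written with "/" separators, has exactly three levels, and that the export folder name should be matched case-insensitively (the article uses both "/export" and "Export").

```python
from pathlib import PurePosixPath


def is_valid_input_dataset(selected_path: str) -> bool:
    """Return True if the path looks like <Dataset>/export/<packageName>."""
    parts = PurePosixPath(selected_path).parts
    # Drop a leading "/" so absolute and relative paths are treated alike.
    if parts and parts[0] == "/":
        parts = parts[1:]
    # Expect exactly three levels: dataset, export folder, package folder.
    if len(parts) != 3:
        return False
    # Assumption: the middle folder is named "export", in any letter case.
    return parts[1].lower() == "export"
```

With this sketch, a path such as "invoices/export/invoices_pkg" (hypothetical names) passes the check, while selecting only the top-level "invoices" dataset does not, which mirrors the situation that triggers the error.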
