Error Message "Document Type Invoices Not Valid, Check That Document Type Data Is In Dataset Folder And Follows Folder Structure"

How to fix the error "Document type invoices not valid, check that document type data is in dataset folder and follows folder structure" encountered while running a pipeline?

Issue Description: The following error appears in the ML logs while running a pipeline.

Error Message: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure.

Resolution: Validate the following points:

  • The auto_retraining variable enables the Auto-retraining Loop. If it is set to True, the input dataset must be the export folder associated with the labeling session where the data is tagged. If it remains set to False, the input dataset must follow the expected dataset format.
  • When creating a pipeline, select the dataset sub-folder rather than the whole dataset, so that the pipeline picks up the correct data.
  • Check that the dataset used for the training pipeline is the /export folder.


Note: When creating a new pipeline, make sure not to select the complete dataset in the "choose input dataset" field. The dataset structure should be: Dataset>Export>packageName.
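
The folder checks above can be sketched as a small script that you run against the path you intend to select before starting the pipeline. This is an illustrative sketch only, not part of the product: the function name check_input_dataset and the example paths are hypothetical, and it assumes the selected folder is written as a slash-separated path beginning with the dataset name. It accepts either the export folder itself or any folder beneath it, since both forms appear in the guidance above.

```python
from pathlib import PurePosixPath


def check_input_dataset(selected: str, dataset_root: str) -> str:
    """Return "ok" or a short description of the problem with the selected folder.

    Hypothetical helper: it only inspects the path text, it does not
    contact any service or read any files.
    """
    root = PurePosixPath(dataset_root)
    path = PurePosixPath(selected)

    # Selecting the complete dataset is the mistake described in the Note.
    if path == root:
        return "whole dataset selected; choose a sub-folder instead"

    try:
        relative = path.relative_to(root)
    except ValueError:
        return "selected folder is not inside the dataset"

    # The first folder below the dataset should be the export folder
    # (compared case-insensitively, since both "Export" and "/export" are used).
    if relative.parts[0].lower() != "export":
        return "selected folder is not under the export folder"

    return "ok"


if __name__ == "__main__":
    # Example paths are placeholders, not real dataset names.
    for candidate in ("Invoices", "Invoices/raw", "Invoices/Export/packageName"):
        print(candidate, "->", check_input_dataset(candidate, "Invoices"))
```

If the script reports anything other than "ok", pick the sub-folder under the export folder in "choose input dataset" and run the pipeline again.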