Hi!
I created a pipeline to train on the newly created Data label, and I received the following error:
Train only of DocumentUnderstandingPackage 10.0 launched - Run c9d678c9-3b7f-42c6-af43-c734a7b88c28
Train only of DocumentUnderstandingPackage 10.0 scheduled - Run c9d678c9-3b7f-42c6-af43-c734a7b88c28
Train only of DocumentUnderstandingPackage 10.0 started - Run c9d678c9-3b7f-42c6-af43-c734a7b88c28
Train only of DocumentUnderstandingPackage 10.0 failed - Run c9d678c9-3b7f-42c6-af43-c734a7b88c28
Error Details : Pipeline failed due to ML Package Issue
2021-06-16 11:33:56,826 - uipath_core.trainer_run:main:66 - INFO: Starting training job...
2021-06-16 11:34:01,260 - uipath_core.storage.azure_storage_client:download:95 - INFO: Dataset from bucket folder training-7529d73a-c168-49b8-b630-cd58f97bb25a/a9d84d39-3bb8-49b7-ac04-969186501871/a90fb08e-6e6f-4a8f-9b44-5db98857b25b with size 43 downloaded successfully
2021-06-16 11:34:01,261 - uipath_core.training_plugin:train_model:109 - INFO: Start model training...
2021-06-16 11:34:01,261 - uipath_core.training_plugin:initialize_model:103 - INFO: Start model initialization...
2021-06-16 11:34:01,262 - root:_valid_doctype_folder_structure:63 - ERROR: images/ directory does not exist / is empty for {'name': 'default', 'folder': '', 'dataset': {'account_name': None, 'folder': '', 'path': '/microservice/dataset', 'dataloader_workers': 0, 'vocabulary_padding_id': 0, 'vocabulary_unknown_id': 1, 'text_pp_remove_symbols': False, 'text_pp_lemmatization': False, 'text_pp_remove_stop_words': False, 'word_embedding': 'unknown_id', 'max_words': 10000, 'max_image_size': [300, 300], 'date_format_classifier_data': ['receipts', 'invoices', 'invoices_au', 'invoices_india', 'utility_bills', 'purchase_orders', 'invoices_japan', 'unknown'], 'replace_patterns': ['date', 'number', 'checkbox'], 'doctype2id': {}, 'clftask2id': {}, 'id2clftask': {}, 'clf_tasks_by_doctype': defaultdict(<class 'list'>, {})}, 'path': '/microservice/dataset/', 'split': '/microservice/dataset/split.csv', 'schema': '/microservice/dataset/schema.json'} dataset
2021-06-16 11:34:01,263 - uipath_core.training_plugin:model_run:145 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
2021-06-16 11:34:01,267 - uipath_core.trainer_run:main:81 - ERROR: Training Job failed, error: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
Traceback (most recent call last):
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/trainer_run.py", line 76, in main
wrapper.run()
File "/microservice/training_wrapper.py", line 57, in run
return self.training_plugin.model_run()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 146, in model_run
raise e
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 138, in model_run
self.run_train_only()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 207, in run_train_only
self.train_model(self.local_dataset_directory)
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 111, in train_model
self.model.train(directory)
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 99, in model
self.initialize_model()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 105, in initialize_model
self._model = train.Main()
File "/microservice/train.py", line 24, in __init__
self.opt = self.get_options()
File "/microservice/train.py", line 105, in get_options
opt = preprocess.configure_options(opt)
File "<frozen extraction.model_tag.preprocess>", line 99, in configure_options
Exception: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
2021-06-16 11:34:10,093 - uipath_core.trainer_run:main:66 - INFO: Starting training job...
2021-06-16 11:34:13,486 - uipath_core.logs.upload_log_service:upload_logs_file:56 - INFO: Retry Training Triggered:
2021-06-16 11:34:14,395 - uipath_core.storage.azure_storage_client:download:95 - INFO: Dataset from bucket folder training-7529d73a-c168-49b8-b630-cd58f97bb25a/a9d84d39-3bb8-49b7-ac04-969186501871/a90fb08e-6e6f-4a8f-9b44-5db98857b25b with size 43 downloaded successfully
2021-06-16 11:34:14,395 - uipath_core.training_plugin:train_model:109 - INFO: Start model training...
2021-06-16 11:34:14,395 - uipath_core.training_plugin:initialize_model:103 - INFO: Start model initialization...
2021-06-16 11:34:14,397 - root:_valid_doctype_folder_structure:63 - ERROR: images/ directory does not exist / is empty for {'name': 'default', 'folder': '', 'dataset': {'account_name': None, 'folder': '', 'path': '/microservice/dataset', 'dataloader_workers': 0, 'vocabulary_padding_id': 0, 'vocabulary_unknown_id': 1, 'text_pp_remove_symbols': False, 'text_pp_lemmatization': False, 'text_pp_remove_stop_words': False, 'word_embedding': 'unknown_id', 'max_words': 10000, 'max_image_size': [300, 300], 'date_format_classifier_data': ['receipts', 'invoices', 'invoices_au', 'invoices_india', 'utility_bills', 'purchase_orders', 'invoices_japan', 'unknown'], 'replace_patterns': ['date', 'number', 'checkbox'], 'doctype2id': {}, 'clftask2id': {}, 'id2clftask': {}, 'clf_tasks_by_doctype': defaultdict(<class 'list'>, {})}, 'path': '/microservice/dataset/', 'split': '/microservice/dataset/split.csv', 'schema': '/microservice/dataset/schema.json'} dataset
2021-06-16 11:34:14,397 - uipath_core.training_plugin:model_run:145 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
2021-06-16 11:34:14,402 - uipath_core.trainer_run:main:81 - ERROR: Training Job failed, error: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
Traceback (most recent call last):
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/trainer_run.py", line 76, in main
wrapper.run()
File "/microservice/training_wrapper.py", line 57, in run
return self.training_plugin.model_run()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 146, in model_run
raise e
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 138, in model_run
self.run_train_only()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 207, in run_train_only
self.train_model(self.local_dataset_directory)
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 111, in train_model
self.model.train(directory)
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 99, in model
self.initialize_model()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 105, in initialize_model
self._model = train.Main()
File "/microservice/train.py", line 24, in __init__
self.opt = self.get_options()
File "/microservice/train.py", line 105, in get_options
opt = preprocess.configure_options(opt)
File "<frozen extraction.model_tag.preprocess>", line 99, in configure_options
Exception: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
I wanted to mention that the files were exported from Data Manager by clicking the Export button.
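In case it helps with diagnosing this, the check that fails in the log can be approximated locally on the exported folder. Below is a minimal sketch, assuming the dataset folder is expected to contain a non-empty images/ directory plus split.csv and schema.json at its top level. That layout is inferred only from the paths in the error message above, not from the package source, and the function name check_export_folder is hypothetical.

```python
from pathlib import Path


def check_export_folder(dataset_dir):
    """Return a list of problems found in a locally exported dataset folder.

    Assumption: the layout (images/, split.csv, schema.json) is inferred
    from the paths printed in the training log, not from the ML package code.
    """
    root = Path(dataset_dir)
    problems = []

    # The log complains that images/ "does not exist / is empty".
    images = root / "images"
    if not images.is_dir():
        problems.append("images/ directory does not exist")
    elif not any(images.iterdir()):
        problems.append("images/ directory is empty")

    # The log also references these two files at the dataset root.
    for name in ("split.csv", "schema.json"):
        if not (root / name).is_file():
            problems.append(f"{name} is missing")

    return problems
```

Calling check_export_folder on the unzipped export returns an empty list when all three items are present at the top level. If they turn up one level deeper, for example inside a subfolder created by the export, that could explain why the pipeline does not find them.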
Many thanks!