I’m trying to retrain the out-of-the-box Invoice ML model in my on-prem AI Center, using the training data generated by the Validation Station in Action Center.
This is the data structure generated by the validation station:
The Invoice Trainer activity is automatically uploading it from Studio to a DataSet, inside of a fine-tune folder:
I'm not using Data Manager at any step of the process, since I'm trying to retrain an out-of-the-box model and did not customize my labelling. Still, the pipeline is failing with the log below. What am I doing wrong?
Train only of Invoices 8.0 launched - Run 260f0253-58fc-42ca-9eda-cfdc4f787f76
Train only of Invoices 8.0 scheduled - Run 260f0253-58fc-42ca-9eda-cfdc4f787f76
Train only of Invoices 8.0 started - Run 260f0253-58fc-42ca-9eda-cfdc4f787f76
Train only of Invoices 8.0 failed - Run 260f0253-58fc-42ca-9eda-cfdc4f787f76
Error Details: Pipeline failed due to ML Package Issue
2021-12-09 16:31:01,462 - UiPath_core.trainer_run:main:66 - INFO: Starting training job…
2021-12-09 16:31:07,699 - UiPath_core.storage.local_storage_client:download:113 - INFO: Dataset from bucket folder training-7d8e37df-0ca0-4f11-bff6-37660dcfa5ee/53d39051-e17a-4b8a-bd75-a153260c534e/440fba9a-f7d1-4179-9e09-1409bd1faf85 with size 3 downloaded successfully
2021-12-09 16:31:07,700 - UiPath_core.training_plugin:train_model:109 - INFO: Start model training…
2021-12-09 16:31:07,700 - UiPath_core.training_plugin:initialize_model:103 - INFO: Start model initialization…
2021-12-09 16:31:07,702 - root:_valid_doctype_folder_structure:63 - ERROR: images/ directory does not exist / is empty for {‘name’: ‘invoices’, ‘folder’: ‘’, ‘language’: ‘en’, ‘dataset’: {‘account_name’: None, ‘folder’: ‘’, ‘path’: ‘/microservice/dataset’, ‘dataloader_workers’: 0, ‘vocabulary_padding_id’: 0, ‘vocabulary_unknown_id’: 1, ‘text_pp_remove_symbols’: False, ‘text_pp_lemmatization’: False, ‘text_pp_remove_stop_words’: False, ‘word_embedding’: ‘unknown_id’, ‘max_words’: 10000, ‘max_image_size’: [300, 300], ‘date_format_classifier_data’: [‘receipts’, ‘invoices’, ‘invoices_au’, ‘invoices_india’, ‘utility_bills’, ‘purchase_orders’, ‘invoices_japan’, ‘unknown’], ‘replace_patterns’: [‘date’, ‘number’, ‘checkbox’], ‘doctype2id’: {}, ‘clftask2id’: {}, ‘id2clftask’: {}, ‘clf_tasks_by_doctype’: defaultdict(<class ‘list’>, {})}, ‘path’: ‘/microservice/dataset/’, ‘split’: ‘/microservice/dataset/split.csv’, ‘schema’: ‘/microservice/dataset/schema.json’} dataset
2021-12-09 16:31:07,702 - UiPath_core.training_plugin:model_run:145 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure.
2021-12-09 16:31:07,709 - UiPath_core.trainer_run:main:81 - ERROR: Training Job failed, error: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure.
Traceback (most recent call last):
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/trainer_run.py”, line 76, in main
wrapper.run()
File “/microservice/training_wrapper.py”, line 57, in run
return self.training_plugin.model_run()
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 146, in model_run
raise e
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 138, in model_run
self.run_train_only()
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 207, in run_train_only
self.train_model(self.local_dataset_directory)
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 111, in train_model
self.model.train(directory)
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 99, in model
self.initialize_model()
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 105, in initialize_model
self._model = train.Main()
File “/microservice/train.py”, line 24, in __init__
self.opt = self.get_options()
File “/microservice/train.py”, line 105, in get_options
opt = preprocess.configure_options(opt)
File “”, line 99, in configure_options
Exception: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure.
2021-12-09 16:31:16,296 - UiPath_core.trainer_run:main:66 - INFO: Starting training job…
2021-12-09 16:31:16,296 - UiPath_core.trainer_run:main:66 - INFO: Starting training job…
2021-12-09 16:31:22,290 - UiPath_core.logs.upload_log_service:upload_logs_file:56 - INFO: Retry Training Triggered:
2021-12-09 16:31:22,412 - UiPath_core.storage.local_storage_client:download:113 - INFO: Dataset from bucket folder training-7d8e37df-0ca0-4f11-bff6-37660dcfa5ee/53d39051-e17a-4b8a-bd75-a153260c534e/440fba9a-f7d1-4179-9e09-1409bd1faf85 with size 3 downloaded successfully
2021-12-09 16:31:22,412 - UiPath_core.training_plugin:train_model:109 - INFO: Start model training…
2021-12-09 16:31:22,412 - UiPath_core.training_plugin:initialize_model:103 - INFO: Start model initialization…
2021-12-09 16:31:22,414 - root:_valid_doctype_folder_structure:63 - ERROR: images/ directory does not exist / is empty for {‘name’: ‘invoices’, ‘folder’: ‘’, ‘language’: ‘en’, ‘dataset’: {‘account_name’: None, ‘folder’: ‘’, ‘path’: ‘/microservice/dataset’, ‘dataloader_workers’: 0, ‘vocabulary_padding_id’: 0, ‘vocabulary_unknown_id’: 1, ‘text_pp_remove_symbols’: False, ‘text_pp_lemmatization’: False, ‘text_pp_remove_stop_words’: False, ‘word_embedding’: ‘unknown_id’, ‘max_words’: 10000, ‘max_image_size’: [300, 300], ‘date_format_classifier_data’: [‘receipts’, ‘invoices’, ‘invoices_au’, ‘invoices_india’, ‘utility_bills’, ‘purchase_orders’, ‘invoices_japan’, ‘unknown’], ‘replace_patterns’: [‘date’, ‘number’, ‘checkbox’], ‘doctype2id’: {}, ‘clftask2id’: {}, ‘id2clftask’: {}, ‘clf_tasks_by_doctype’: defaultdict(<class ‘list’>, {})}, ‘path’: ‘/microservice/dataset/’, ‘split’: ‘/microservice/dataset/split.csv’, ‘schema’: ‘/microservice/dataset/schema.json’} dataset
2021-12-09 16:31:22,414 - UiPath_core.training_plugin:model_run:145 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure.
2021-12-09 16:31:22,416 - UiPath_core.trainer_run:main:81 - ERROR: Training Job failed, error: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure.
Traceback (most recent call last):
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/trainer_run.py”, line 76, in main
wrapper.run()
File “/microservice/training_wrapper.py”, line 57, in run
return self.training_plugin.model_run()
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 146, in model_run
raise e
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 138, in model_run
self.run_train_only()
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 207, in run_train_only
self.train_model(self.local_dataset_directory)
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 111, in train_model
self.model.train(directory)
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 99, in model
self.initialize_model()
File “/home/aifabric/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py”, line 105, in initialize_model
self._model = train.Main()
File “/microservice/train.py”, line 24, in __init__
self.opt = self.get_options()
File “/microservice/train.py”, line 105, in get_options
opt = preprocess.configure_options(opt)
File “”, line 99, in configure_options
Exception: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure.
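Going only by the log above, the trainer seems to look for a non-empty images/ directory, a split.csv and a schema.json directly under the dataset path it downloads (/microservice/dataset/ in the log). The snippet below is a minimal sketch of that check, meant to be run against a local copy of the folder selected for the pipeline. It is not UiPath's validation code: the function name `check_dataset_layout` is hypothetical, and the list of required entries is an assumption inferred from the error message and the `split` and `schema` paths in the log.

```python
from pathlib import Path


def check_dataset_layout(dataset_root):
    """Return a list of problems found in a local copy of the dataset folder.

    Assumption (inferred from the log, not from UiPath documentation):
    the folder must contain a non-empty images/ directory plus
    split.csv and schema.json at its top level.
    """
    root = Path(dataset_root)
    problems = []

    # Mirrors the "images/ directory does not exist / is empty" error.
    images = root / "images"
    if not images.is_dir():
        problems.append("images/ directory does not exist")
    elif not any(images.iterdir()):
        problems.append("images/ directory is empty")

    # The log's options reference these two files under the dataset path.
    for name in ("split.csv", "schema.json"):
        if not (root / name).is_file():
            problems.append(f"{name} is missing")

    return problems
```

Calling `check_dataset_layout("path/to/local/copy")` on the dataset root and then on the fine-tune subfolder would show which of the two, if either, has the layout the error message describes. An empty list means none of these three problems was found, not that the dataset is valid for training.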