I have looked at the failure log and am pasting it here for reference.
2022-07-06 00:58:49,757 - uipath_core.trainer_run:main:73 - INFO: Starting training job…
2022-07-06 00:58:53,178 - matplotlib:_get_config_or_cache_dir:484 - WARNING: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-_2c0rldt because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2022-07-06 00:58:54,001 - matplotlib.font_manager:_load_fontmanager:1443 - INFO: generated new fontManager
2022-07-06 00:58:56,783 - uipath_core.storage.azure_storage_client:download:112 - INFO: Dataset from bucket folder training-49aa85ea-2b71-479f-aae1-db1d6c2f3371/9877f90d-d8f6-4f8b-95b6-872d6e8ef9b7/ca150876-0c82-455d-81d1-9e082800ce2b/export/invoice_healthcare_1_22-07-05T212300 with size 44 downloaded successfully
2022-07-06 00:58:56,783 - uipath_core.training_plugin:train_model:114 - INFO: Start model training…
2022-07-06 00:58:56,783 - uipath_core.training_plugin:initialize_model:108 - INFO: Start model initialization…
2022-07-06 00:58:56,785 - root:initialize_package:145 - INFO: Using package type provided by runtime argument with value: invoices
2022-07-06 00:58:56,785 - root:initialize_package:154 - INFO: Initializing invoices package options …
2022-07-06 00:58:56,787 - root:configure_options:158 - INFO: Document type invoices language: en
2022-07-06 00:58:56,787 - root:initialize_tokenizer:49 - INFO: Loading cached BERT tokenizer at /workspace/model/microservice/bert-base-multilingual-uncased_tokenizer
2022-07-06 00:58:56,872 - root:initialize_package:159 - INFO: System-Level Configuration:
2022-07-06 00:58:56,872 - root:initialize_package:160 - INFO: ATen/Parallel:
at::get_num_threads() : 3
at::get_num_interop_threads() : 2
OpenMP 201511 (a.k.a. OpenMP 4.5)
omp_get_max_threads() : 3
Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
mkl_get_max_threads() : 3
Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
std::hardware_concurrency() : 4
Environment variables:
OMP_NUM_THREADS : 3
MKL_NUM_THREADS : [not set]
ATen parallel backend: OpenMP
2022-07-06 00:58:56,873 - root:configure_options:158 - INFO: Document type invoices language: en
2022-07-06 00:58:56,873 - uipath_core.training_plugin:initialize_model:111 - INFO: Model initialized successfully
2022-07-06 00:58:56,873 - root:log_data_version_info:13 - INFO: =========Data version information=========
2022-07-06 00:58:56,886 - root:log_data_version_info:17 - WARNING: Unknown data version:
2022-07-06 00:58:56,887 - root:log_data_version_info:17 - INFO: ==========================================
2022-07-06 00:58:56,888 - root:preprocess_data:603 - INFO: Creating dataset for document type invoices…
2022-07-06 00:58:57,146 - root:preprocess_data:605 - INFO: Doctype invoices Statistics:
2022-07-06 00:58:57,146 - root:preprocess_data:608 - INFO:
Extraction fields:
tag = 9129
tag[billing-name] = 50
tag[invoice-no] = 10
tag[total] = 10
tag[due-date] = 10
Subsets:
subset[TEST] = 10
2022-07-06 00:59:12,167 - root:preprocess_data:676 - INFO: train: (0, 15) pages
2022-07-06 00:59:12,167 - root:preprocess_data:677 - INFO: test: (0, 15) pages
2022-07-06 00:59:12,168 - root:preprocess_dataset:49 - ERROR: Dataset preprocess Failed
Traceback (most recent call last):
File "", line 48, in preprocess_dataset
File "", line 143, in init
File "", line 31, in init
File "", line 678, in preprocess_data
AssertionError: Training and / or validation set is empty, verify that training / validation split is correctly set
2022-07-06 00:59:12,171 - uipath_core.training_plugin:model_run:150 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Dataset preprocess Failed
2022-07-06 00:59:12,176 - uipath_core.trainer_run:main:90 - ERROR: Training Job failed, error: Dataset preprocess Failed
Traceback (most recent call last):
File "", line 48, in preprocess_dataset
File "", line 143, in init
File "", line 31, in init
File "", line 678, in preprocess_data
AssertionError: Training and / or validation set is empty, verify that training / validation split is correctly set
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/model/bin/uipath_core/trainer_run.py", line 85, in main
wrapper.run()
File "/workspace/model/microservice/training_wrapper.py", line 64, in run
return self.training_plugin.model_run()
File "/model/bin/uipath_core/training_plugin.py", line 151, in model_run
raise e
File "/model/bin/uipath_core/training_plugin.py", line 143, in model_run
self.run_train_only()
File "/model/bin/uipath_core/training_plugin.py", line 212, in run_train_only
self.train_model(self.local_dataset_directory)
File "/model/bin/uipath_core/training_plugin.py", line 116, in train_model
self.model.train(directory)
File "/workspace/model/microservice/train.py", line 36, in train
self.process_data()
File "/workspace/model/microservice/train.py", line 69, in process_data
self.trainer.preprocess_dataset()
File "", line 49, in preprocess_dataset
Exception: Dataset preprocess Failed
2022-07-06 00:59:47,362 - uipath_core.trainer_run:main:73 - INFO: Starting training job…
2022-07-06 00:59:50,793 - matplotlib:_get_config_or_cache_dir:484 - WARNING: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-2d6j2p4z because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2022-07-06 00:59:51,569 - matplotlib.font_manager:_load_fontmanager:1443 - INFO: generated new fontManager
2022-07-06 00:59:52,830 - uipath_core.logs.upload_log_service:upload_logs_file:87 - INFO: Retry Training Triggered:
2022-07-06 00:59:53,952 - uipath_core.storage.azure_storage_client:download:112 - INFO: Dataset from bucket folder training-49aa85ea-2b71-479f-aae1-db1d6c2f3371/9877f90d-d8f6-4f8b-95b6-872d6e8ef9b7/ca150876-0c82-455d-81d1-9e082800ce2b/export/invoice_healthcare_1_22-07-05T212300 with size 44 downloaded successfully
2022-07-06 00:59:53,953 - uipath_core.training_plugin:train_model:114 - INFO: Start model training…
2022-07-06 00:59:53,953 - uipath_core.training_plugin:initialize_model:108 - INFO: Start model initialization…
2022-07-06 00:59:53,954 - root:initialize_package:145 - INFO: Using package type provided by runtime argument with value: invoices
2022-07-06 00:59:53,954 - root:initialize_package:154 - INFO: Initializing invoices package options …
2022-07-06 00:59:53,955 - root:configure_options:158 - INFO: Document type invoices language: en
2022-07-06 00:59:53,955 - root:initialize_tokenizer:49 - INFO: Loading cached BERT tokenizer at /workspace/model/microservice/bert-base-multilingual-uncased_tokenizer
2022-07-06 00:59:54,028 - root:initialize_package:159 - INFO: System-Level Configuration:
2022-07-06 00:59:54,029 - root:initialize_package:160 - INFO: ATen/Parallel:
at::get_num_threads() : 3
at::get_num_interop_threads() : 2
OpenMP 201511 (a.k.a. OpenMP 4.5)
omp_get_max_threads() : 3
Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
mkl_get_max_threads() : 3
Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
std::hardware_concurrency() : 4
Environment variables:
OMP_NUM_THREADS : 3
MKL_NUM_THREADS : [not set]
ATen parallel backend: OpenMP
2022-07-06 00:59:54,030 - root:configure_options:158 - INFO: Document type invoices language: en
2022-07-06 00:59:54,030 - uipath_core.training_plugin:initialize_model:111 - INFO: Model initialized successfully
2022-07-06 00:59:54,030 - root:log_data_version_info:13 - INFO: =========Data version information=========
2022-07-06 00:59:54,044 - root:log_data_version_info:17 - WARNING: Unknown data version:
2022-07-06 00:59:54,044 - root:log_data_version_info:17 - INFO: ==========================================
2022-07-06 00:59:54,045 - root:preprocess_data:603 - INFO: Creating dataset for document type invoices…
2022-07-06 00:59:54,277 - root:preprocess_data:605 - INFO: Doctype invoices Statistics:
2022-07-06 00:59:54,277 - root:preprocess_data:608 - INFO:
Extraction fields:
tag = 9129
tag[billing-name] = 50
tag[invoice-no] = 10
tag[total] = 10
tag[due-date] = 10
Subsets:
subset[TEST] = 10
2022-07-06 01:00:09,418 - root:preprocess_data:676 - INFO: train: (0, 15) pages
2022-07-06 01:00:09,418 - root:preprocess_data:677 - INFO: test: (0, 15) pages
2022-07-06 01:00:09,418 - root:preprocess_dataset:49 - ERROR: Dataset preprocess Failed
Traceback (most recent call last):
File "", line 48, in preprocess_dataset
File "", line 143, in init
File "", line 31, in init
File "", line 678, in preprocess_data
AssertionError: Training and / or validation set is empty, verify that training / validation split is correctly set
2022-07-06 01:00:09,422 - uipath_core.training_plugin:model_run:150 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Dataset preprocess Failed
2022-07-06 01:00:09,427 - uipath_core.trainer_run:main:90 - ERROR: Training Job failed, error: Dataset preprocess Failed
Traceback (most recent call last):
File "", line 48, in preprocess_dataset
File "", line 143, in init
File "", line 31, in init
File "", line 678, in preprocess_data
AssertionError: Training and / or validation set is empty, verify that training / validation split is correctly set
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/model/bin/uipath_core/trainer_run.py", line 85, in main
wrapper.run()
File "/workspace/model/microservice/training_wrapper.py", line 64, in run
return self.training_plugin.model_run()
File "/model/bin/uipath_core/training_plugin.py", line 151, in model_run
raise e
File "/model/bin/uipath_core/training_plugin.py", line 143, in model_run
self.run_train_only()
File "/model/bin/uipath_core/training_plugin.py", line 212, in run_train_only
self.train_model(self.local_dataset_directory)
File "/model/bin/uipath_core/training_plugin.py", line 116, in train_model
self.model.train(directory)
File "/workspace/model/microservice/train.py", line 36, in train
self.process_data()
File "/workspace/model/microservice/train.py", line 69, in process_data
self.trainer.preprocess_dataset()
File "", line 49, in preprocess_dataset
Exception: Dataset preprocess Failed
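To make the relevant part easier to see, here is a small Python sketch that pulls the subset counts and the train/test page lines out of the log. The function name `summarize_split` and the regular expressions are my own, written only to match the line formats shown above, and I do not know what the two numbers in `(0, 15)` mean, so the script only prints them. What stands out to me is that the only subset listed is `TEST` with 10 documents, which seems consistent with the "Training and / or validation set is empty" assertion.

```python
import re

# Excerpt copied from the log above; only the lines related to the data split.
LOG_EXCERPT = """\
Subsets:
subset[TEST] = 10
2022-07-06 00:59:12,167 - root:preprocess_data:676 - INFO: train: (0, 15) pages
2022-07-06 00:59:12,167 - root:preprocess_data:677 - INFO: test: (0, 15) pages
"""


def summarize_split(log_text):
    """Return (subset counts, page tuples) found in the log text.

    Hypothetical helper: the patterns are based only on the line formats
    visible in this log, not on any documented log schema.
    """
    subsets = {
        name: int(count)
        for name, count in re.findall(r"subset\[(\w+)\] = (\d+)", log_text)
    }
    pages = {
        name: (int(first), int(second))
        for name, first, second in re.findall(
            r"INFO: (train|test): \((\d+), (\d+)\) pages", log_text
        )
    }
    return subsets, pages


subsets, pages = summarize_split(LOG_EXCERPT)
print("subsets:", subsets)
print("pages:", pages)

# Flag the case where no subset other than TEST appears in the statistics.
only_test = set(subsets) == {"TEST"}
print("only TEST subset present:", only_test)
```
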
Please let me know how to fix this.