Problem occurs when deploying Multilingual Pipeline with Labelled Datasets from AI Center

Hello everyone!

I get an error when I try to publish a Multilingual Pipeline in AI Center.
I am using datasets that were labelled (classification labels) in AI Center.
I have followed the instructions in the UiPath documentation.
This is a sample of a labelled JSON file:

And here are my environment variables (they are left at their defaults):

Thank you for your help in advance!

Cheers
-Oliver

Hi @Melczer_Oliver ,

Could you share the error message or the logs of the failed pipeline?

Yes (I have run it again):

2023-08-24 11:21:17,873 - UiPath_core.trainer_run:main:83 - INFO: Starting training job…
2023-08-24 11:21:18,005 - matplotlib:_get_config_or_cache_dir:484 - WARNING: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-0dpkjx2v because the default path (/home/aicenter/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2023-08-24 11:21:18,307 - matplotlib.font_manager:_load_fontmanager:1443 - INFO: generated new fontManager
2023-08-24 11:21:20,720 - UiPath_core.storage.azure_storage_client:download:118 - INFO: Dataset from bucket folder training-5a246154-f399-4945-9bff-611065894f25/8eb24a0c-e8a6-4804-9e03-bb6acb0d37dc/2d955f7b-4fb3-448d-8084-bcd9d6f3a00f/datalabelling_exportedFiles/1692864123 with size 15 downloaded successfully
2023-08-24 11:21:20,720 - UiPath_core.training_plugin:train_model:130 - INFO: Start model training…
2023-08-24 11:21:20,720 - UiPath_core.training_plugin:initialize_model:124 - INFO: Start model initialization…
2023-08-24 11:21:20,758 - UiPath_core.training_plugin:initialize_model:127 - INFO: Model initialized successfully
2023-08-24 11:21:20,759 - root:read_all_json:20 - INFO: Reading data from /microservice/dataset
2023-08-24 11:21:20,761 - root:read_all_json:20 - INFO: Reading data from /microservice/dataset/test
2023-08-24 11:21:20,762 - root:read_data:55 - INFO: label
Üzemorvos 15
Name: label, dtype: int64
2023-08-24 11:21:20,763 - root:read_data:60 - INFO: Train: (15, 2), Test: (0, 2)
2023-08-24 11:21:20,765 - root:init:68 - INFO: Removing folder /microservice/models/test
2023-08-24 11:21:20,834 - root:prepare_df_for_bert:161 - INFO: Bert Tokenizer dataframe
2023-08-24 11:21:20,842 - root:prepare_df_for_bert:161 - INFO: Bert Tokenizer dataframe
2023-08-24 11:21:24,970 - root:post_create_network:136 - INFO: Enabling multi_gpu setting
2023-08-24 11:21:24,975 - root:create_optimizer:129 - INFO: Creating AdamW optimizer
2023-08-24 11:21:24,976 - root:save:206 - INFO: Saving model…
2023-08-24 11:21:26,159 - root:train:392 - INFO: Training for 100 epochs
2023-08-24 11:21:29,384 - UiPath_core.training_plugin:model_run:179 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Weights sum to zero, can't be normalized
2023-08-24 11:21:29,390 - UiPath_core.trainer_run:main:100 - ERROR: Training Job failed, error: Weights sum to zero, can't be normalized
Traceback (most recent call last):
  File "/home/aicenter/.local/lib/python3.8/site-packages/UiPath_core/trainer_run.py", line 95, in main
    wrapper.run()
  File "/microservice/training_wrapper.py", line 58, in run
    return self.training_plugin.model_run()
  File "/home/aicenter/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py", line 195, in model_run
    raise ex
  File "/home/aicenter/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py", line 171, in model_run
    self.run_train_only()
  File "/home/aicenter/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py", line 255, in run_train_only
    self.train_model(self.local_dataset_directory)
  File "/home/aicenter/.local/lib/python3.8/site-packages/UiPath_core/training_plugin.py", line 132, in train_model
    response = self.model.train(directory)
  File "/microservice/train.py", line 53, in train
    train.train(self.opt, self.df_train, self.df_test)
  File "", line 66, in train
  File "", line 30, in train_bert
  File "", line 397, in train
  File "", line 342, in run_epoch
  File "<array_function internals>", line 180, in average
  File "/home/aicenter/.local/lib/python3.8/site-packages/numpy/lib/function_base.py", line 524, in average
    raise ZeroDivisionError(
ZeroDivisionError: Weights sum to zero, can't be normalized
2023-08-24 11:21:29,391 - UiPath_core.trainer_run:main:107 - INFO: Job run stopped.

@Melczer_Oliver ,

This looks like a problem with the dataset itself. Could you share a sample of the dataset or the JSON file used for training?
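
For reference, the last frame of the traceback is numpy.average, which raises this exact ZeroDivisionError whenever the weights it receives add up to zero. The snippet below is only a minimal sketch of that behaviour. The helper name weighted_mean_or_error and the idea that the pipeline averages over a split with no batches are my assumptions; the pipeline source is not visible in the logs (the log does show "Test: (0, 2)", which is why an empty split is my guess).

```python
import numpy as np


def weighted_mean_or_error(values, weights):
    """Return the weighted mean, or NumPy's error message if the weights sum to zero.

    Hypothetical helper for illustration only; it is not part of the UiPath package.
    """
    try:
        return float(np.average(values, weights=weights))
    except ZeroDivisionError as exc:
        return str(exc)


# Two batches of equal size: an ordinary weighted mean.
ok = weighted_mean_or_error([0.5, 0.7], [8, 8])

# Assumed failure case: an empty split, so there are no values and no weights.
empty = weighted_mean_or_error([], [])

print(ok)
print(empty)
```

If this guess is right, the thing to check is why one of the splits ends up with zero rows.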

Yes, sure:

{
  "data": {
    "text": "Szia Anna! TAJ-számom: 541 136 124 Időpont: 2025.01.23 13:00 Üdv: Balázs"
  }
}

@Melczer_Oliver ,

Since you have set ai_center as the file format in the pipeline parameters, I believe the dataset has to be in a different file format from the one you posted.
The representation below should be the one:

Documentation:

If you do not want to use this dataset format, you could look at the other two available formats (CSV and JSON file), which are simpler.
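
A quick way to check the exported files before starting a pipeline is to count the labels they contain. This is only a rough sketch: it assumes each labelled file carries its label under annotations.intent.choices next to data.text, and the function names (extract_label, count_labels, load_records) are made up for this example, not part of AI Center.

```python
import json
from collections import Counter
from pathlib import Path


def extract_label(record):
    """Return the first intent choice of a record, or None if it has no label.

    Assumes the layout annotations -> intent -> choices (a list of strings).
    """
    annotations = record.get("annotations") or {}
    choices = (annotations.get("intent") or {}).get("choices") or []
    return choices[0] if choices else None


def count_labels(records):
    """Count how often each label occurs; unlabelled records are counted under None."""
    return Counter(extract_label(record) for record in records)


def load_records(folder):
    """Load every *.json file in `folder` (hypothetical local copy of the export)."""
    records = []
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            records.append(json.load(handle))
    return records


if __name__ == "__main__":
    sample = [
        {"annotations": {"intent": {"to_name": "text", "choices": ["Üzemorvos"]}},
         "data": {"text": "labelled example"}},
        {"data": {"text": "unlabelled example"}},
    ]
    print(count_labels(sample))
```

A file that shows up under None has no label in the assumed layout, and a result with a single label means the dataset contains only one class.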

@supermanPunch
Oh sorry, I misunderstood the question. :slight_smile:
This is the labelled dataset that I used:
{
  "annotations": {
    "intent": {
      "to_name": "text",
      "choices": [
        "Üzemorvos"
      ]
    }
  },
  "data": {
    "text": "Szia János! Üzemorvos időpontja: 2023.08.29 08:30 TAJ-számom: 100 201 205 Üdv: Réka"
  }
}

Any idea? @supermanPunch