Pipeline failing - AI Center

Hi Community,
I’m working on a Light Text Classification model to classify emails by priority.

I have uploaded the below dataset

Then I have created the below ML Package.

But when I create the pipeline, it fails.

Pipeline error message:
Train only of TextClassification 4.0 launched - Run 8c29afad-c9ae-4518-8390-c94350877b2d
Train only of TextClassification 4.0 started - Run 8c29afad-c9ae-4518-8390-c94350877b2d
Train only of TextClassification 4.0 scheduled - Run 8c29afad-c9ae-4518-8390-c94350877b2d
Train only of TextClassification 4.0 failed - Run 8c29afad-c9ae-4518-8390-c94350877b2d

Error Details : Pipeline failed due to ML Package Issue

2023-05-17 06:29:59,697 - uipath_core.trainer_run:main:74 - INFO: Starting training job…
2023-05-17 06:29:59,845 - matplotlib:_get_config_or_cache_dir:484 - WARNING: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-rdt4rtg5 because the default path (/home/aicenter/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2023-05-17 06:30:00,275 - matplotlib.font_manager:_load_fontmanager:1443 - INFO: generated new fontManager
2023-05-17 06:30:03,593 - uipath_core.storage.azure_storage_client:download:118 - INFO: Dataset from bucket folder training-63c51fe1-e6bd-4f00-b3da-a301ec7c9981/96714ac1-b5f0-4f46-9465-602a0944b695/f5d9a2f9-6ff1-4c21-87b0-c6c30976d8a8 with size 1 downloaded successfully
2023-05-17 06:30:03,594 - uipath_core.training_plugin:train_model:130 - INFO: Start model training…
2023-05-17 06:30:03,594 - uipath_core.training_plugin:initialize_model:124 - INFO: Start model initialization…
2023-05-17 06:30:03,596 - uipath_core.training_plugin:initialize_model:127 - INFO: Model initialized successfully
2023-05-17 06:30:03,597 - root:init:70 - INFO: Loading data from /microservice/dataset
2023-05-17 06:30:04,015 - root:read_all_csv:179 - INFO: Read [26] data points from [/microservice/dataset/newUrgency1.csv]
2023-05-17 06:30:04,016 - root:read_all_directories:269 - INFO: Reading from directory stucture [/microservice/dataset]
2023-05-17 06:30:04,041 - root:read_all_data:128 - INFO: Unable to read any valid data from *.json files in [/microservice/dataset]
2023-05-17 06:30:04,041 - root:read_all_data:133 - INFO: Unable to read any valid data from directory structure [/microservice/dataset]
2023-05-17 06:30:04,041 - root:init:84 - INFO: Done read [26] points with [4] classes
2023-05-17 06:30:04,042 - root:validate:300 - INFO: Urgency
critical 5
high 6
low 8
moderate 7
Name: Urgency, dtype: int64
2023-05-17 06:30:04,044 - root:init:97 - INFO: Split test data from train data
2023-05-17 06:30:04,045 - root:split:317 - INFO: Train: (20, 2), Test: (6, 2), Ratio: 0.23076923076923078
2023-05-17 06:30:04,046 - root:init:104 - INFO: Train: 20, Test: 6
2023-05-17 06:30:04,047 - root:train:61 - INFO: Started hyperparameter search …
2023-05-17 06:30:04,053 - root:init:48 - INFO: Using LogisticRegression
2023-05-17 06:30:04,839 - root:tokenize:109 - INFO: Tokenizing dataframe
2023-05-17 06:30:05,457 - root:tokenize:109 - INFO: Tokenizing dataframe
2023-05-17 06:30:05,513 - root:build_vocabulary:114 - INFO: Building vocabulary…
2023-05-17 06:30:05,513 - root:prepare_df_for_bow:124 - INFO: Prepare bow
2023-05-17 06:30:05,516 - root:prepare_df_for_bow:124 - INFO: Prepare bow
2023-05-17 06:30:05,517 - root:train:70 - INFO: Vectorizing
2023-05-17 06:30:05,520 - uipath_core.training_plugin:model_run:179 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: After pruning, no terms remain. Try a lower min_df or a higher max_df.
2023-05-17 06:30:05,521 - uipath_core.trainer_run:main:91 - ERROR: Training Job failed, error: After pruning, no terms remain. Try a lower min_df or a higher max_df.
Traceback (most recent call last):
  File "/home/aicenter/.local/lib/python3.8/site-packages/uipath_core/trainer_run.py", line 86, in main
    wrapper.run()
  File "/microservice/training_wrapper.py", line 57, in run
    return self.training_plugin.model_run()
  File "/home/aicenter/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 195, in model_run
    raise ex
  File "/home/aicenter/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 171, in model_run
    self.run_train_only()
  File "/home/aicenter/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 255, in run_train_only
    self.train_model(self.local_dataset_directory)
  File "/home/aicenter/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 132, in train_model
    response = self.model.train(directory)
  File "/microservice/train.py", line 53, in train
    train.train(self.opt, self.df_train, self.df_test)
  File "", line 62, in train
  File "", line 154, in hyperparam_search_bow
  File "/home/aicenter/.local/lib/python3.8/site-packages/optuna/study/study.py", line 400, in optimize
    _optimize(
  File "/home/aicenter/.local/lib/python3.8/site-packages/optuna/study/_optimize.py", line 66, in _optimize
    _optimize_sequential(
  File "/home/aicenter/.local/lib/python3.8/site-packages/optuna/study/_optimize.py", line 163, in _optimize_sequential
    trial = _run_trial(study, func, catch)
  File "/home/aicenter/.local/lib/python3.8/site-packages/optuna/study/_optimize.py", line 264, in _run_trial
    raise func_err
  File "/home/aicenter/.local/lib/python3.8/site-packages/optuna/study/_optimize.py", line 213, in run_trial
    value_or_values = func(trial)
  File "", line 139, in call
  File "", line 16, in train_bow
  File "", line 73, in train
  File "/home/aicenter/.local/lib/python3.8/site-packages/sklearn/feature_extraction/text.py", line 2078, in fit_transform
    X = super().fit_transform(raw_documents)
  File "/home/aicenter/.local/lib/python3.8/site-packages/sklearn/feature_extraction/text.py", line 1355, in fit_transform
    X, self.stop_words_ = self._limit_features(
  File "/home/aicenter/.local/lib/python3.8/site-packages/sklearn/feature_extraction/text.py", line 1187, in _limit_features
    raise ValueError(
ValueError: After pruning, no terms remain. Try a lower min_df or a higher max_df.
2023-05-17 06:30:05,522 - uipath_core.trainer_run:main:98 - INFO: Job run stopped.

Please suggest a solution.
Thanks in advance.

@siddhi.rani

  1. Light Text Classification needs input with only two columns; it does not support three.
  2. Those two columns must also be mapped correctly when creating the pipeline.

Can you please correct these? For more info:

https://docs.uipath.com/ai-center/automation-suite/2023.4/user-guide/light-text-classification#input-type

Cheers

I have also passed only two columns in the environment variables while creating the pipeline.


@siddhi.rani

Per the error, it looks like the keyword is being ignored. Please try giving these parameters a value; try 0 first.

Cheers

@Anil_G

This time I’ve assigned variables as below.

Still pipeline is failing.

@siddhi.rani

Did the error say the same still?

If so, add more body items and try again, or copy and paste the same rows again and try to train.

Cheers
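
The suggestion to repeat rows can be illustrated with a small sketch. It assumes the pruning threshold is an absolute document count (a fractional threshold would not be affected by duplication), and it uses plain whitespace tokenization rather than the model's own tokenizer. The sample rows and the helper names `document_frequency` and `surviving_terms` are hypothetical.

```python
from collections import Counter

def document_frequency(docs):
    """Count, for each lowercase whitespace token, how many documents contain it."""
    counts = Counter()
    for doc in docs:
        # set() so a term repeated inside one document is counted once
        counts.update(set(doc.lower().split()))
    return counts

def surviving_terms(docs, min_df):
    """Terms that appear in at least `min_df` documents (absolute count)."""
    return {term for term, n in document_frequency(docs).items() if n >= min_df}

# Hypothetical rows, not the dataset from this thread
rows = ["urgent server outage", "monthly newsletter", "password reset request"]

before = surviving_terms(rows, min_df=2)      # every term occurs in one row only
after = surviving_terms(rows * 2, min_df=2)   # each row pasted a second time

print(len(before), len(after))
```

Duplicating rows only helps the vocabulary survive pruning; it does not give the model any new information to learn from.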

@Anil_G

It’s the same error message.

Does the dataset file type matter?
I’m providing a *.csv (comma-delimited) file in the dataset. Should I keep this file type or use UTF-8 CSV?

Thanks.
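
One way to check the file locally before uploading is sketched below: it tries to decode the CSV as UTF-8 and reports the header and row count. The column name `Body` and the sample rows are hypothetical (only `Urgency` appears in the log above), and `inspect_csv` is an illustrative helper, not part of AI Center.

```python
import csv
import os
import tempfile

def inspect_csv(path, encoding="utf-8"):
    """Return (header, row_count).

    Raises UnicodeDecodeError if the file's bytes are not valid in `encoding`.
    """
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        header = next(reader)
        row_count = sum(1 for _ in reader)
    return header, row_count

# Write a tiny sample file so the sketch runs on its own
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    writer = csv.writer(tmp)
    writer.writerow(["Body", "Urgency"])
    writer.writerow(["Server is down, please help", "critical"])
    writer.writerow(["Lunch menu for next week", "low"])
    sample_path = tmp.name

header, row_count = inspect_csv(sample_path)
os.remove(sample_path)
print(header, row_count)
```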

@siddhi.rani

Per the error, it is failing at the excluded words and the keywords to use, so the model cannot produce good output. Please try 1 instead of 0. Let me check whether there is a way to increase the max value as well.

In the meantime, you can try adding more data and retraining.

It is able to read the file and the data.

Cheers
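
The `ValueError` in the log is raised by scikit-learn's text vectorizer when the `min_df` / `max_df` limits prune every term from the vocabulary. A minimal sketch that reproduces it outside AI Center follows; the sample emails and the `vocabulary_size` helper are hypothetical, and the names AI Center uses for these settings are not shown in this thread.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for a very small email dataset (no word is shared between rows)
emails = [
    "server down need help immediately",
    "please reset my password",
    "invoice attached for last month",
    "meeting moved to friday",
]

def vocabulary_size(docs, min_df, max_df=1.0):
    """Return the fitted vocabulary size, or None if pruning removes every term."""
    try:
        vectorizer = TfidfVectorizer(min_df=min_df, max_df=max_df)
        vectorizer.fit_transform(docs)
        return len(vectorizer.vocabulary_)
    except ValueError as err:
        print(f"min_df={min_df}: {err}")
        return None

# A term must appear in at least 3 documents: nothing survives on this data
strict = vocabulary_size(emails, min_df=3)

# A term only needs to appear in 1 document: the vocabulary is kept
relaxed = vocabulary_size(emails, min_df=1)
```

With only a handful of short rows, few words repeat across documents, so any threshold above 1 can empty the vocabulary. That is consistent with the advice above to lower the threshold or add more data.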

@Anil_G,
After assigning each variable its default value, the pipeline runs successfully.
Thanks
