Training Pipeline failed for the TPOTAutoMLRegression ML Package

pmilanraajp · April 7, 2022, 4:44am

I just started working with AI Center and am currently testing out various packages. I have a small dataset of training and testing data, but the training pipeline keeps failing with the following error message:
Error Details : Pipeline failed due to ML Package Issue

call fit() first.

Does anyone know what is causing this error?
Attached here are my training data and testing data:
Training Data:

Testing Data:

The full Error:
Train only of HomePricesPrediction 1.0 scheduled - Run 619cbf5c-b7c7-408a-b594-ebe6e6b93b87
Train only of HomePricesPrediction 1.0 launched - Run 619cbf5c-b7c7-408a-b594-ebe6e6b93b87
Train only of HomePricesPrediction 1.0 started - Run 619cbf5c-b7c7-408a-b594-ebe6e6b93b87
Train only of HomePricesPrediction 1.0 failed - Run 619cbf5c-b7c7-408a-b594-ebe6e6b93b87

Error Details : Pipeline failed due to ML Package Issue

call fit() first.
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/aicenter/.local/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 418, in _process_worker
    r = call_item()
  File "/home/aicenter/.local/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 272, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/home/aicenter/.local/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 608, in __call__
    return self.func(*args, **kwargs)
  File "/home/aicenter/.local/lib/python3.6/site-packages/joblib/parallel.py", line 256, in __call__
    for func, args, kwargs in self.items]
  File "/home/aicenter/.local/lib/python3.6/site-packages/joblib/parallel.py", line 256, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/aicenter/.local/lib/python3.6/site-packages/stopit/utils.py", line 145, in wrapper
    result = func(*args, **kwargs)
  File "/home/aicenter/.local/lib/python3.6/site-packages/tpot/gp_deap.py", line 417, in _wrapped_cross_val_score
    cv_iter = list(cv.split(features, target, groups))
  File "/home/aicenter/.local/lib/python3.6/site-packages/sklearn/model_selection/_split.py", line 333, in split
    .format(self.n_splits, n_samples))
ValueError: Cannot have number of splits n_splits=5 greater than the number of samples: n_samples=4.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/aicenter/.local/lib/python3.6/site-packages/tpot/base.py", line 711, in fit
    per_generation_function=self._check_periodic_pipeline
  File "/home/aicenter/.local/lib/python3.6/site-packages/tpot/gp_deap.py", line 227, in eaMuPlusLambda
    population[:] = toolbox.evaluate(population)
  File "/home/aicenter/.local/lib/python3.6/site-packages/tpot/base.py", line 1321, in _evaluate_individuals
    for sklearn_pipeline in sklearn_pipeline_list[chunk_idx:chunk_idx + chunk_size])
  File "/home/aicenter/.local/lib/python3.6/site-packages/joblib/parallel.py", line 1017, in __call__
    self.retrieve()
  File "/home/aicenter/.local/lib/python3.6/site-packages/joblib/parallel.py", line 909, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/aicenter/.local/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 562, in wrap_future_result
    return future.result(timeout=timeout)
  File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/local/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
ValueError: Cannot have number of splits n_splits=5 greater than the number of samples: n_samples=4.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/aicenter/.local/lib/python3.6/site-packages/uipath_core/trainer_run.py", line 85, in main
    wrapper.run()
  File "/microservice/training_wrapper.py", line 57, in run
    return self.training_plugin.model_run()
  File "/home/aicenter/.local/lib/python3.6/site-packages/uipath_core/training_plugin.py", line 147, in model_run
    raise e
  File "/home/aicenter/.local/lib/python3.6/site-packages/uipath_core/training_plugin.py", line 139, in model_run
    self.run_train_only()
  File "/home/aicenter/.local/lib/python3.6/site-packages/uipath_core/training_plugin.py", line 208, in run_train_only
    self.train_model(self.local_dataset_directory)
  File "/home/aicenter/.local/lib/python3.6/site-packages/uipath_core/training_plugin.py", line 112, in train_model
    self.model.train(directory)
  File "/microservice/train.py", line 39, in train
    self.model = self.build_model(X, y, self.artifacts_directory)
  File "/microservice/train.py", line 58, in build_model
    pipeline_optimizer.fit(X, y)
  File "/home/aicenter/.local/lib/python3.6/site-packages/tpot/base.py", line 742, in fit
    raise e
  File "/home/aicenter/.local/lib/python3.6/site-packages/tpot/base.py", line 733, in fit
    self._update_top_pipeline()
  File "/home/aicenter/.local/lib/python3.6/site-packages/tpot/base.py", line 811, in _update_top_pipeline
    raise RuntimeError('A pipeline has not yet been optimized. Please call fit() first.')
RuntimeError: A pipeline has not yet been optimized. Please call fit() first.
2022-04-07 04:40:15,881 - uipath_core.trainer_run:main:73 - INFO: Starting training job…
2022-04-07 04:40:16,842 - uipath_core.logs.upload_log_service:upload_logs_file:87 - INFO: Retry Training Triggered:
2022-04-07 04:40:16,853 - uipath_core.storage.azure_storage_client:download:106 - INFO: Dataset from bucket folder training-1fc90d1a-b983-45f7-a8ac-f316937245f5/94276d17-5ee6-4eec-a01a-1954d4343328/229c2d02-aae5-4bb5-ab52-b47d4c8af620 with size 1 downloaded successfully
2022-04-07 04:40:16,853 - uipath_core.training_plugin:train_model:110 - INFO: Start model training…
2022-04-07 04:40:16,854 - uipath_core.training_plugin:initialize_model:104 - INFO: Start model initialization…
2022-04-07 04:40:16,854 - uipath_core.training_plugin:initialize_model:107 - INFO: Model initialized successfully
2022-04-07 04:40:19,418 - uipath_core.training_plugin:model_run:146 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: A pipeline has not yet been optimized. Please call fit() first.
2022-04-07 04:40:19,419 - uipath_core.trainer_run:main:90 - ERROR: Training Job failed, error: A pipeline has not yet been optimized. Please call fit() first.

suraj.setty · April 7, 2022, 4:48am

Hi @pmilanraajp

Welcome to Community,

Could you check whether the status of the ML Package is “Deployed” or “UnDeployed”?

If it is “UnDeployed”, please create an ML Package and an ML Skill together. Once the ML Skill is deployed, the ML Package will automatically be deployed.

Thanks.

pmilanraajp · April 7, 2022, 6:46am

It says it's deployed, but the status is Failed.

suraj.setty · April 7, 2022, 6:51am

Hi @pmilanraajp

The ML Skill has failed to create. Can you please create an ML Skill along with the ML Package? The status of the ML Package should be “Deployed” and the status of the ML Skill should be “Available”.

Once you have these statuses, you can retrain the model.

Hope this works,

Thanks.

pmilanraajp · April 7, 2022, 7:14am

My ML Skill keeps failing to deploy, and my ML Package's status is still Undeployed. Do I have to wait for the ML Package to be deployed before creating the skill?

suraj.setty · April 7, 2022, 7:19am

Try creating the ML Skill on the ML Package that has now been created, and check.

Thanks.

pmilanraajp · April 7, 2022, 7:41am

These are the steps that I am currently following:

Create Project (Name: HomePricesPrediction)
Go to ML Packages Tab - Out of the box Packages - Tabular Data - TPOTAutoMLRegression
Name Package (Name: HomePricesPredictionPackage) - Status = Undeployed
Go to ML Skills - Create New ML Skill (Name: HomePricesPredictionSkill, Package: HomePricesPredictionPackage, Major Version: 1, Minor Version: 0)

At this point, after some time, the status of the Skill should change to Deployed, right?

suraj.setty · April 7, 2022, 7:42am

The status of the ML Skill should change to “Available”.

pmilanraajp · April 7, 2022, 7:54am

But the ML Skill keeps failing.

ML Logs:

Do I have to upload training data first?

supermanPunch · April 7, 2022, 8:07am

@pmilanraajp,

Basically, it depends on the ML Package being used.

Some ML Packages require datasets to be available already: you need to create a Training Pipeline, wait for it to succeed, and only then create the ML Skill.

For instance, if you try the above steps with the Invoices package or the Remittance Advices package, the ML Package should show as available without any pipeline being created.

This is probably because these packages are already trained and have a predefined schema.

pmilanraajp · April 7, 2022, 8:17am

So before creating the skill, I should upload training data and create a training pipeline, right?
That’s what I tried in the beginning: I uploaded my training data and created a training pipeline.

However, it keeps failing:

With this error message:
Error.txt (10.9 KB)

supermanPunch · April 7, 2022, 10:59am

@pmilanraajp, have you specified the target_column environment variable when creating the Training Pipeline?

pmilanraajp · April 7, 2022, 11:32am

Yes I have done so.

supermanPunch · April 7, 2022, 11:48am

@pmilanraajp Maybe the training is failing because there is too little data.

Try to get more data, then create only a Training Pipeline, not a Full Pipeline. There is no need for a Testing/Evaluation Pipeline either.

Also, let us know how much data you currently have.
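
The log posted at the top of the thread supports this guess. In a chained Python traceback, the exception printed first is the original failure, and the ones after "During handling of the above exception" are consequences of it. Below is a rough sketch that pulls the first exception line out of such a log. The function name and the regex are illustrative assumptions, not part of any UiPath or TPOT tooling, and the sample text is a shortened excerpt of the posted log.

```python
import re

# Matches lines such as "ValueError: message" or "pkg.mod.SomeError: message".
# Timestamped log lines start with a digit, so they are not matched.
EXCEPTION_LINE = re.compile(r"^([A-Za-z_][\w.]*(?:Error|Exception)): (.+)$")


def root_cause(log_text):
    """Return (exception_type, message) for the first exception line, or None."""
    for line in log_text.splitlines():
        match = EXCEPTION_LINE.match(line.strip())
        if match:
            return match.group(1), match.group(2)
    return None


# Shortened excerpt of the log from the first post.
sample = """\
ValueError: Cannot have number of splits n_splits=5 greater than the number of samples: n_samples=4.

During handling of the above exception, another exception occurred:

RuntimeError: A pipeline has not yet been optimized. Please call fit() first.
"""

print(root_cause(sample))
```

Read this way, the root cause is the ValueError about n_splits=5 versus n_samples=4, and "call fit() first" is only the follow-on error.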

pmilanraajp · April 15, 2022, 1:35am

Hi, sorry for the late reply. The problem I was facing was with my training data. There is a minimum requirement on the amount of training data and the number of data points needed to train the ML Package.

This part was shared to me by UiPath Support:

Please note that to run a Training pipeline successfully, at least 25 documents and at least 10 samples from each labeled field in your dataset are strongly recommended. Otherwise, the pipeline throws the following error: Dataset Creation Failed.

Kindly refer to the link below for detailed information about training pipelines and high-performing models:
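
For the run in this thread, the traceback gives a concrete lower bound: the cross-validation step asked for n_splits=5 but found only n_samples=4. Here is a minimal sketch of a check to run before uploading, assuming the training data is a CSV file with a header row. The function name, the sample columns, and the default of 5 folds (taken from the error message, not from UiPath documentation) are illustrative assumptions.

```python
import csv
import io


def enough_rows_for_cv(csv_file, n_splits=5):
    """Return (n_rows, ok): ok is True when the data rows can fill n_splits folds."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row (assumed to be present)
    # Count only rows that contain at least one non-empty cell.
    n_rows = sum(1 for row in reader if any(cell.strip() for cell in row))
    return n_rows, n_rows >= n_splits


# Hypothetical four-row dataset, the same size as the one that failed above.
tiny = io.StringIO("area,price\n1000,200\n1500,300\n2000,400\n2500,500\n")
print(enough_rows_for_cv(tiny))  # (4, False)
```

Passing this check only means the split itself will not raise. It says nothing about whether there is enough data for a useful model.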

system · April 18, 2022, 1:36am

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
