ML package stuck in "Deploying" status

Hi All,

In AI Center, I am training my ML model for the third time. I added a new set of documents, labelled the data, created the dataset and exported it. I then created the pipeline, which completed successfully, and I am now trying to deploy the ML model and update the ML skill. However, my ML package has been stuck in Deploying status for the past few days (since last Friday, 1st March), and the status never changes to Failed, so I cannot identify the issue. Can anyone please help me figure out what’s wrong with the ML package deployment? Should it take this long to deploy the package?

Thanks in Advance!!!

@Parmar_Snehal_Cognizant

Sometimes it might take longer…also check if you have enough licenses

If already a package is deployed this might not be…

Cheers

Yes, I have enough licenses: 6k AI units are showing under licenses.

@Parmar_Snehal_Cognizant

Are there already a few packages deployed?

Also, you can open the ML logs on the last tab and check whether they show retrying or similar.

Cheers

Yes, one deployment was already done months back. The ML logs show that MLPackage validation was successful, and there is no Retry option showing.

@Parmar_Snehal_Cognizant

Can you try to deactivate that package…maybe you can deploy only one at a time

Cheers

I checked but there is no option to disable the currently deployed package

@Parmar_Snehal_Cognizant

Can you say how many active and deployed pipelines show on the project?

Cheers

Only 1 package is active currently.


@Parmar_Snehal_Cognizant

Are you finding a log like this?

Go to the ML skill, try to delete the old one, and then deploy.

Check the ML Skills tab; maybe one is already there.

cheers

I can see logs showing that ML package validation started and succeeded, but the package is still in Deploying status. For the pipeline I can also see logs saying the pipeline run was successful. Under ML skill I don't see any logs for the latest deployment, but the ML skill has a streaming log tab, and there I can see one message:
“968357b9-fad6-41db-aa8b-a73934368ed2-21-4-6c85b9c5f9-zjzlq:Warning ==> 0/3 nodes are available: 1 Insufficient memory, 1 node(s) had untolerated taint {nvidia.com/gpu: present}, 1 node(s) had untolerated taint {task.mining/cpu: present}. preemption: 0/3 nodes are available: 1 No preemption victims found for incoming pod, 2 Preemption is not helpful for scheduling.”

I can see the log messages for the pipeline and the ML package, but for the ML skill I can see “MLSkill ICC_Billing_Invoice MLPackage v#23.10.0 Deployment Started” and “MLSkill ICC_Billing_Invoice MLPackage v#23.10.0 Deployment Failed Attempt: 1” with the error message below:

“968357b9-fad6-41db-aa8b-a73934368ed2-21-4-6c85b9c5f9-zjzlq:Warning ==> 0/3 nodes are available: 1 Insufficient memory, 1 node(s) had untolerated taint {nvidia.com/gpu: present}, 1 node(s) had untolerated taint {task.mining/cpu: present}. preemption: 0/3 nodes are available: 1 No preemption victims found for incoming pod, 2 Preemption is not helpful for scheduling.”
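
For anyone reading this message later: it has the shape of a Kubernetes scheduler warning, and it says that none of the 3 nodes could accept the skill's pod. One node is short on memory, and the other two carry taints (`nvidia.com/gpu` and `task.mining/cpu`) that the pod does not tolerate. Below is a minimal Python sketch that splits such a message into per-reason node counts. The function name `parse_scheduler_warning` is made up for illustration and is not part of AI Center or Kubernetes; the parsing assumes the message follows the same layout as the one quoted above.

```python
import re

# The warning text copied from the ML skill streaming log above.
MESSAGE = (
    "968357b9-fad6-41db-aa8b-a73934368ed2-21-4-6c85b9c5f9-zjzlq:Warning ==> "
    "0/3 nodes are available: 1 Insufficient memory, "
    "1 node(s) had untolerated taint {nvidia.com/gpu: present}, "
    "1 node(s) had untolerated taint {task.mining/cpu: present}. "
    "preemption: 0/3 nodes are available: 1 No preemption victims found "
    "for incoming pod, 2 Preemption is not helpful for scheduling."
)


def parse_scheduler_warning(message: str) -> dict:
    """Hypothetical helper: count rejected nodes per reason in a scheduler warning."""
    # Keep only the part before "preemption:", which lists why nodes were rejected.
    filter_part = message.split("preemption:")[0]

    # "0/3 nodes are available" -> available=0, total=3
    match = re.search(r"(\d+)/(\d+) nodes are available", filter_part)
    if match is None:
        raise ValueError("message does not look like a scheduler warning")
    available, total = int(match.group(1)), int(match.group(2))

    # Everything after the colon is a comma-separated list of "<count> <reason>".
    _, _, reason_text = filter_part.partition("nodes are available:")
    reason_text = reason_text.strip().rstrip(".")

    reasons = {}
    # Split only on commas that are followed by a new node count.
    for chunk in re.split(r",\s+(?=\d+ )", reason_text):
        count, reason = chunk.split(" ", 1)
        reasons[reason] = int(count)

    return {"available": available, "total": total, "reasons": reasons}


result = parse_scheduler_warning(MESSAGE)
for reason, count in result["reasons"].items():
    print(f"{count} node(s): {reason}")
```

Read this way, the only node the pod is allowed on appears to be the one reporting insufficient memory, which would fit the suggestion in this thread to free up resources by removing an older ML skill. That is an interpretation of the log, not something AI Center confirms.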

@Parmar_Snehal_Cognizant

Delete the already deployed model and then you should be able to see the next one.

cheers

OK, so under ML packages it only allows me to delete undeployed packages. If I delete the ML skill instead, will I lose all the other versions?

@Parmar_Snehal_Cognizant

Please check under ML Skills, not ML packages.

Cheers
