How to check a pipeline that has been running for a long time?
Root Causes:
- The pipeline may be running longer than expected due to:
  - A very large dataset
  - A high number of epochs
  - Training on CPU instead of GPU
- The pipeline may be stuck in the Running state due to:
  - An infrastructure issue
  - A bug in the product
Troubleshooting Steps:
A pipeline might run for a long time, or get stuck in the Running state, for various reasons. To confirm whether a pipeline is actually stuck or is still progressing, follow the steps below:
- Open the pipeline and check the logs section.
  - If the logs are recent and still streaming, the pipeline is in progress.
  - If the last log entry was generated a long time ago, the pipeline is likely stuck. Download the logs using the download button below the logs section and share them with the support engineer. If the download button is missing or disabled, copy the logs from the logs section and share them with the support engineer, along with the details listed in the section "Information to Share with UiPath Support Engineer" below.
- If the pipeline is progressing but has been running for a long time, try the following to run the pipeline again with a shorter duration:
  - Dataset - Ensure that the dataset is well curated and includes only the required data. A larger dataset does not guarantee a better-trained ML model. For dataset requirements, refer to the respective ML model page, or to Training High Performance ML models for DU ML models.
  - Epochs - The number of epochs is the number of times the training algorithm works through the entire dataset. By default, it is set to 100 for most ML models. It can be reduced to shorten the pipeline run, using the ml_model.epochs parameter of the pipeline. Reduce it carefully: too few epochs can lead to underfitting, which in turn leads to poor performance of the ML model.
  - GPU - Training on GPU is about 10 times faster than training on CPU. In fact, for training DU ML models, a GPU is required when the dataset exceeds 1000 documents. Running a pipeline on GPU consumes more AI units (in cloud AIC), so ensure that enough AI units are available. In an on-prem installation, ensure that a node with a GPU is available before starting a pipeline on GPU.
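The log-recency check and the effect of lowering the epoch count can be sketched in Python. This is a minimal sketch under stated assumptions, not an AI Center API. It assumes that each line of the downloaded log begins with an ISO 8601 timestamp (the real log format may differ), that one hour without new log output is a reasonable threshold, and that training time scales roughly linearly with the number of epochs. All function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def last_log_time(log_text: str) -> Optional[datetime]:
    """Return the timestamp of the most recent log line that has one.

    Assumption: a log line starts with an ISO 8601 timestamp, e.g.
    "2023-01-15T10:42:07+00:00 Epoch 12/100". Adapt the parsing to the
    actual format of your logs.
    """
    for line in reversed(log_text.strip().splitlines()):
        parts = line.split(maxsplit=1)
        if not parts:
            continue  # blank line
        try:
            ts = datetime.fromisoformat(parts[0])
        except ValueError:
            continue  # line does not start with a timestamp
        if ts.tzinfo is None:
            # Assumption: timestamps without an offset are in UTC.
            ts = ts.replace(tzinfo=timezone.utc)
        return ts
    return None


def looks_stuck(log_text: str, now: datetime,
                max_silence: timedelta = timedelta(hours=1)) -> bool:
    """True if the newest log entry is older than max_silence.

    `now` must be timezone-aware. The one-hour default is an arbitrary
    illustrative threshold, not a documented product value.
    """
    ts = last_log_time(log_text)
    if ts is None:
        raise ValueError("no timestamped log lines found")
    return now - ts > max_silence


def estimate_hours(observed_hours: float, epochs_done: int,
                   planned_epochs: int) -> float:
    """Rough run-time estimate, assuming time grows linearly with epochs."""
    return observed_hours * planned_epochs / epochs_done


if __name__ == "__main__":
    sample = (
        "2023-01-15T10:00:00+00:00 Pipeline started\n"
        "2023-01-15T10:42:07+00:00 Epoch 12/100\n"
    )
    now = datetime(2023, 1, 15, 13, 0, tzinfo=timezone.utc)
    print("stuck:", looks_stuck(sample, now))
    print("estimated hours at 30 epochs:", estimate_hours(5.0, 50, 30))
```

In this sample, the last log entry is more than two hours older than `now`, so the check flags the pipeline as possibly stuck. The estimate only gives a ballpark figure for a rerun with a lower ml_model.epochs value; it says nothing about the resulting model quality.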
Information to Share with UiPath Support Engineer:
Cloud AIC Details
In the Cloud Tenant where this issue is occurring, gather the following information:
- Support ID: (This can be found by navigating to Cloud -> Admin -> Settings. The Support ID is located at the top right of this screen)
- URL: (This can be found by navigating to Cloud -> Admin -> Settings)
- Account ID: (This can be found by navigating to Cloud -> AICenter -> In the top right of the screen, click the 3 vertical dots -> View Profile)
- Tenant ID: (This can be found by navigating to Cloud -> AICenter -> In the top right of the screen, click the 3 vertical dots -> View Profile)
- AI Center Project Name:
- Share a screenshot of the ML Pipeline Page
- Pipeline Details: (Click on the pipeline and share a screenshot of the top of the page)
- Pipeline Logs: (Click on the pipeline and take a screenshot of the pipeline logs before scrolling down. An error message may be listed here that is not visible in the exported full logs. After taking the screenshot, copy the logs and share them by pasting them into an email. If a message at the bottom of the logs states that partial logs are being displayed, download the full logs and share them.)
- What was the base model used for training the pipeline? (For example: ML Packages/Out of the box Packages/UiPath Document Understanding/Invoices version 22.10.1.0).
On-Prem AIC Details
- Version of AIC/AS (including the minor version, e.g. 2022.10.1)
- Whether the installation is standalone AIC or Automation Suite
- Whether the installation is single node or multi node
- Whether the installation is airgapped or online
- Support Bundle
- Diagnostic Logs
- Share a screenshot of the ML Pipeline Page
- Pipeline Details: (Click on the pipeline and share a screenshot of the top of the page)
- Pipeline Logs: (Click on the pipeline and take a screenshot of the pipeline logs before scrolling down. An error message may be listed here that is not visible in the exported full logs. After taking the screenshot, copy the logs and share them by pasting them into an email. If a message at the bottom of the logs states that partial logs are being displayed, download the full logs and share them.)
- Details of base model used for training the pipeline (For example: ML Packages/Out of the box Packages/UiPath Document Understanding/Invoices version 22.10.1.0).