Job stuck in RUNNING state

In my Orchestrator, a job has been in the Running state for the last 15 hours, but the queue item status has changed from In Progress to Abandoned. Some queue items are still waiting to be processed and are in New status. I can't work out the root cause. Any help would be much appreciated.

Thanks
Daksh

Hi @DGA

You could start by verifying the following:

  • The exit condition of the workflow is set correctly.
  • The states of the queue items are changed correctly when items are processed successfully or hit exceptions.
  • The retry values you have set for the queue items.

The workflow has executed correctly in many previous runs. Some jobs ran for more than 20 hours without any issue. The queue item status is updated correctly for both success and exception cases. Auto-retry is not enabled, so each item is processed only once.

@DGA

  1. Are you getting any logs?
  2. If not, check what the last log you got was.
  3. From your description, it looks like one of the items went into an infinite loop, which is why you have an abandoned item… or it got stuck in process.xaml, maybe on an Excel activity or something similar.

Cheers

For the job whose item was Abandoned, the logs appear up to the point where the bot was still processing. After that, there are no logs. The last log states that it was interacting with an element (it's an SAP UI-based automation).

@DGA

Are you using a loop to check for an element, or something like that?

If there is no logging inside that loop, you might not see any new logs either…

Cheers

Hi @DGA,

Are you using ReFramework or any other state machine? Have you made any changes to the state transitions?

I had a similar problem before when I modified some state transitions and they ended up either ambiguous or with a condition that was never met. If no condition is met for any transition, the process will run forever.
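To illustrate the point about unmet transition conditions, here is a minimal sketch in Python. It is not REFramework or any UiPath API; the state names, the `TABLE` layout, and the `next_state` and `run` helpers are all hypothetical. The idea is simply that a state with no matching condition should fail loudly rather than wait forever.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical transition table: state -> list of (condition, next_state).
Transition = Tuple[Callable[[dict], bool], str]

TABLE: Dict[str, List[Transition]] = {
    "Init": [
        (lambda c: c["init_ok"], "Process"),
        (lambda c: not c["init_ok"], "End"),
    ],
    # Only one exit is defined here: if items are left, nothing matches.
    "Process": [
        (lambda c: c["items_left"] == 0, "End"),
    ],
}

def next_state(state: str, ctx: dict, table: Dict[str, List[Transition]]) -> str:
    """Return the target of the first condition that holds; raise if none does."""
    for condition, target in table.get(state, []):
        if condition(ctx):
            return target
    raise RuntimeError(f"No transition condition met in state {state!r}")

def run(table, ctx, start="Init", end="End", max_steps=1000):
    """Walk the state machine and return the list of visited states."""
    state = start
    visited = [state]
    for _ in range(max_steps):
        if state == end:
            return visited
        state = next_state(state, ctx, table)
        visited.append(state)
    raise RuntimeError("Step limit reached without hitting the end state")
```

With `{"init_ok": True, "items_left": 0}` the walk reaches `End`; with `items_left` set to 3 the `Process` state has no valid exit and the sketch raises instead of hanging, which is the situation to look for in a modified state machine.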

Something else you can do, as a workaround to avoid having a process run forever, is to set some trigger-based conditions:

@DGA

In this case it might be a problem with the activity itself. I had the same problem before during the SAP login, where the activity failed to time out.

I don't think I ever found a definitive solution for it, and I never figured out why it happens. What I did was create a workaround for the activity/sequence/workflow that I knew was having the problem.

The workaround is by using a parallel activity:

Create a Boolean variable that can be accessed from both sequences. One branch holds your workflow/sequence; the other holds a sequence with a delay.

At the very end of your SAP workflow/sequence, set the Boolean to True. That way you know that all activities were executed. If something stays stuck for as long as the delay you set in the other sequence, an exception is triggered and the transaction fails instead of being stuck forever.
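The same watchdog idea can be sketched outside UiPath, here in Python as an illustration only. The helper name `run_with_watchdog` and the two demo steps are hypothetical; a `threading.Event` plays the role of the shared Boolean and the timed wait plays the role of the Delay branch.

```python
import threading
import time

def run_with_watchdog(work, timeout_s):
    """Run `work` in a side thread; raise TimeoutError if it does not finish in time."""
    done = threading.Event()   # stands in for the shared Boolean
    result = {}

    def worker():
        result["value"] = work()
        done.set()             # last step: every activity has run

    # Daemon thread so a stuck step cannot keep the program alive.
    threading.Thread(target=worker, daemon=True).start()

    # Stands in for the Delay branch: wait, then check the flag.
    if not done.wait(timeout_s):
        raise TimeoutError(f"work did not finish within {timeout_s}s")
    return result["value"]

def quick_step():
    return "done"

def stuck_step():
    time.sleep(2)  # stands in for an activity that never returns
```

Calling `run_with_watchdog(quick_step, 1.0)` returns normally, while `run_with_watchdog(stuck_step, 0.1)` raises `TimeoutError`. Note that in this sketch the stuck thread is abandoned, not stopped, and a step that raises an exception is also reported as a timeout because the flag is never set.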

Let me know if that helps.

I finally resolved it! The issue was that the bot was trying to select a value from a combo box in SAP that did not exist in the dropdown, which caused it to get stuck and stop responding. Thanks for your inputs.
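One way to guard against this kind of hang is to check the wanted value against the options actually present before attempting the selection. A minimal sketch in Python, assuming the dropdown options have already been read into a list (the `choose_item` helper and the sample values are hypothetical, not an SAP or UiPath API):

```python
def choose_item(available, wanted):
    """Return the matching option, or raise before any UI selection is attempted."""
    # Compare on stripped text so stray whitespace does not cause a false miss.
    options = {item.strip(): item for item in available}
    key = wanted.strip()
    if key not in options:
        raise ValueError(
            f"{wanted!r} is not in the dropdown; options: {sorted(options)}"
        )
    return options[key]
```

Failing fast here turns a silent hang into an exception, so the queue item can be marked as failed and the job can move on to the next item.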
