Under Jobs, every process already has an option to restart the same job manually whenever we need to.
It would be great to also have an option that makes Orchestrator restart a process automatically when a bot fails due to an exception (any exception).
That is exactly what developers do today when a bot fails: they look at the error and then restart the process.
The option could be an Enable/Disable setting chosen when the process is created, for example Auto Restart - Enable / Disable.
If the option is enabled, the job restarts automatically when it fails due to an exception; if it is disabled, the job does not restart.
This would save developers the time they now spend monitoring bots and restarting failed jobs.
I just found this thread looking for some info on enabling this, so I will give you the scenario that I ran into. We have a quite stable automation that runs morning/evening Monday-Friday to change certain client communication parameters per specific client’s requests. However, this Friday evening I missed the notification that the process failed:
RemoteException wrapping System.ArgumentException: The Computer Vision server encountered an error.
[500]
I agree that a setting as simple as "re-run on every error" could often cause infinite loops of failed jobs. But in a case like this, where an important job failed due to what was almost certainly just a momentary network issue, it would be nice to know that the job could run again.
Hey Team, has there been any update on the release of this feature?
I would like to reiterate that this is quite a useful one to have, at the Orchestrator level.
Additional control parameters for this feature would be great:
Retry/rerun only a ‘certain’ number of times
Retry/rerun only on ‘certain’ conditions, such as the job status (Failed, Terminated, etc.)
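
To make the two control parameters above concrete, here is a rough Python sketch of how such a policy could behave. It is only an illustration: `RestartPolicy`, `run_with_policy` and the field names are made up for this post and are not Orchestrator settings or API calls, and `start_job` stands in for whatever actually runs the job and reports its final status.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class RestartPolicy:
    # Hypothetical settings; these names are not real Orchestrator options.
    enabled: bool = True
    max_restarts: int = 3                               # "only a certain number of times"
    restart_on: FrozenSet[str] = frozenset({"Failed"})  # "only on certain statuses"


def run_with_policy(start_job: Callable[[], str], policy: RestartPolicy) -> List[str]:
    """Run the job once, then restart it while the policy allows.

    Returns the final status of every run, in order.
    """
    states = [start_job()]
    while (
        policy.enabled
        and states[-1] in policy.restart_on
        and len(states) - 1 < policy.max_restarts
    ):
        states.append(start_job())
    return states
```

The cap on `max_restarts` is what prevents the infinite loop of failed jobs mentioned earlier in the thread, and a status such as Terminated is left alone unless it is added to `restart_on`.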
After thinking about it, you can hack this to work by building a dispatcher to submit a queue item.
Then that queue item triggers the process you really care about to start. You can set the queue items to be retried x number of times, and if the process fails before starting the queue item, it will try again on the next half hour.
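
The retry part of this workaround can be pictured with a small in-memory simulation. This is not Orchestrator code: `QueueItem`, `drain` and the status strings are stand-ins I made up, and a plain Python `deque` plays the role of the queue. It only shows the idea that a failed item goes back on the queue until its retries are used up.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class QueueItem:
    # Stand-in for a queue item; not the real Orchestrator data model.
    reference: str
    retries_used: int = 0
    status: str = "New"


def drain(queue: Deque[QueueItem],
          process: Callable[[QueueItem], None],
          max_retries: int) -> List[QueueItem]:
    """Process items in order; re-queue a failed item until its retries run out."""
    finished = []
    while queue:
        item = queue.popleft()
        try:
            process(item)
            item.status = "Successful"
        except Exception:
            if item.retries_used < max_retries:
                # Retries remain: count this attempt and put the item back.
                item.retries_used += 1
                item.status = "Retried"
                queue.append(item)
                continue
            item.status = "Failed"
        finished.append(item)
    return finished
```

With `max_retries=2`, a process that fails twice on a momentary error and then succeeds ends up as one successful item with two retries used, which is the behaviour the dispatcher trick is relying on.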
I concur. We are still at version 21.10, so I don’t know if this has been fixed in 23.10.
Here is one use-case that I imagine everybody has experienced:
This error happens when a job starts after it was previously pending, waiting for a PC to become available. The error is ironic in itself, as Orchestrator's primary task is to start jobs. Nevertheless, it has been around since we started using Orchestrator back in 2018.
Ideally the error should be fixed, as it is probably easy to fix, but this is a current use case of a “Restart job On Condition” example.
If the job fails and the error contains a given string, in this example “0x800700AA”, the job could automatically restart after X time, on X machine. The rule could also include a maximum number of attempts for the auto-restart.
Another use case is any process that fails while initializing apps. That happens before any data has been processed or any queue has been modified, so an auto-restart would be handy there too, again with a condition determined by the end user.
For example, if Job ACME fails and the Info contains string “Something went wrong initializing apps”, it could retry the job.
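
A “Restart job On Condition” rule like the ones above could look roughly like this in Python. Everything here is hypothetical: `RestartRule`, `matching_rule` and the field names are invented for illustration, the delay and attempt numbers are arbitrary, and nothing like this exists in Orchestrator as far as I know. The two strings are the ones from my examples.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RestartRule:
    # Hypothetical rule shape; none of these fields exist in Orchestrator.
    info_contains: str   # text to look for in the failed job's Info
    max_attempts: int    # how many automatic restarts this rule allows
    delay_minutes: int   # how long to wait before restarting


def matching_rule(info: str,
                  attempts_so_far: int,
                  rules: Sequence[RestartRule]) -> Optional[RestartRule]:
    """Return the first rule that matches the Info text and still has attempts left."""
    for rule in rules:
        text_matches = rule.info_contains.lower() in info.lower()
        if text_matches and attempts_so_far < rule.max_attempts:
            return rule
    return None


# Example rules; the numbers are arbitrary illustration values.
RULES = [
    RestartRule("0x800700AA", max_attempts=3, delay_minutes=5),
    RestartRule("Something went wrong initializing apps", max_attempts=1, delay_minutes=0),
]
```

If `matching_rule` returns a rule, the job would be restarted after `delay_minutes`; if it returns `None`, the failure is left for a human to look at, so unrelated exceptions never trigger a restart.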
We have several robots building large queues over the weekend, which they will hopefully have processed before Monday morning.
Come Monday morning, quite often something has gone wrong, and the normal approach is to just restart the jobs, which is comically simple to do.