Execution-Based Trigger Disabling - What does it do?

Hi!

Problem
The problem we are trying to solve is the risk of returning to work in the morning to find that a trigger has been firing repeatedly and generating many faulted jobs. This risk exists because we have time triggers which fire as many as 24 times a day, and we use queue triggers for “on-demand” processing that can fire as often as every half-hour (48 times a day, per queue trigger).

The root cause of a given incident of repeated faulted jobs is not the concern here. It may be that a service went down or a password expired. We are confident that we can resolve the issue when we return to the office.

Goal
The goal is to improve our own internal support offering, reduce faulted jobs, and avoid as many self-inflicted support headaches as possible. Ideally, we’d like to limit faulted jobs by disabling related triggers after a certain number of consecutive faulted jobs.

We would prefer to use a native UiPath solution rather than build our own, and we wanted to use “execution-based trigger disabling” to accomplish this. However, the feature does not seem to function as advertised. Most descriptions of the feature suggest that job status drives the behavior, but testing has shown that job status has no effect on trigger disabling. So far, the setting has had no discernible effect on trigger disabling whatsoever.
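To make the goal concrete, here is a minimal sketch of the behavior we were hoping for. It is plain Python and does not call any UiPath API; the function names are hypothetical, and treating every non-faulted state as a reset of the count is a simplifying assumption.

```python
from typing import Iterable


def trailing_faulted_count(job_states: Iterable[str]) -> int:
    """Count how many of the most recent jobs faulted in a row.

    job_states is ordered oldest to newest. Any state other than
    "Faulted" ends the streak (a simplifying assumption).
    """
    count = 0
    for state in reversed(list(job_states)):
        if state != "Faulted":
            break
        count += 1
    return count


def should_disable_trigger(job_states: Iterable[str], max_consecutive_faults: int) -> bool:
    """True once the trailing streak of faulted jobs reaches the limit."""
    return trailing_faulted_count(job_states) >= max_consecutive_faults


# Made-up job history for one trigger, oldest first.
history = ["Successful", "Faulted", "Faulted", "Faulted"]
print(should_disable_trigger(history, max_consecutive_faults=3))
```

This is the rule we expected “Disable when consecutive job execution fail count” to apply to the jobs started by a trigger.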

Question
My question is, what does this feature actually do and, if it isn’t driven by job status, what other ways exist to solve our problem? We’ve attempted to answer this ourselves to no avail. For the interested, feel free to read about our testing methods and findings below.



Testing Methods

  • Method A Components:
    • Process - “FaultingProcessTest”, an unattended automation which only contains a throw activity. It is designed to fault and do nothing else.
    • Multiple triggers with various configurations.
      • Frequency - Daily, hourly, and every 10 minutes.
      • Consecutive job execution fail count - 1, 2, or 3
      • Grace period on disabling the trigger (days) - 0 or 1
      • Robot - Different robots.
    • Purpose - test the relationship between this feature and faulted jobs.
      • Expectation - While the trigger is enabled, we expected jobs to be triggered and fault immediately.
      • Result - Jobs triggered, faulted immediately, but triggers were not disabled, even after exceeding the specified maximum. Triggers were allowed to run for days, to allow for possible backend processing to occur.
  • Method B components:
    • Process - “RobotHold”, an unattended automation designed to open a message box to “hold” a robot and prevent a scheduled job from being triggered during maintenance. The duration is configured through an argument. In an attended scenario the message box can be dismissed to stop holding the robot.
    • Multiple triggers
      • Frequency - intentionally overlapped schedules. Every minute and every 15 minutes.
      • Consecutive job execution fail count - 2
      • Grace period on disabling the trigger (days) - 0
      • Duration - 5 minutes or 2 hours
      • Robot - Same robot.
    • Purpose - test the relationship between this feature and failed triggers.
      • Expectation - While the robot is held, we expected triggers to fail.
      • Result - Triggers failed, but were not disabled, even after far exceeding the specified maximum. Triggers were allowed to exceed those maximums for several hours, to allow for possible backend processing to occur.

Findings
Through testing and inspection of the UiPath database, I’ve learned a few things, but I remain confused as to what this feature actually does. I was unable to attach the SQL query I used to review the results.

  • The Orchestrator UI and the documentation suggest that triggers are to be disabled based on JOB execution.
    • From the edit trigger page:
      • “Set execution-based trigger disabling”
      • “Disable when consecutive job execution fail count”
    • From the documentation:
      • This source, towards the bottom, relates directly to the feature available in the Orchestrator UI: Orchestrator - Creating a time trigger
        • It suggests this feature is used “to control when the trigger is disabled once a job fails.” For those who support automations in a production environment, a job failure implies a faulted job.
        • “The trigger is disabled after the number of failed executions you choose for this setting.” Similarly, a failed execution implies failed execution of a job, but is admittedly more ambiguous.
        • “Stopped jobs are not counted towards this value.” This is a direct reference to the influence of job status, implying a contrast between the influence of stopped and faulted job statuses.
        • “The number of days to wait before the trigger is disabled after the first failure of a job.”
      • This source, related to the overall Orchestrator config file, offers no further clarity: Orchestrator - UiPath.Orchestrator.dll.config
        • It references the config settings Triggers.DisableWhenFailedCount and Triggers.DisableWhenFailingSinceDays, which sound similar to the “execution-based trigger disabling” that appears in the Orchestrator UI. The documentation uses slightly different language, such as “failed launches”, which suggests that it relates to failed triggers rather than failed jobs, but this remains unclear.
      • These excerpts from the documentation prove vague and potentially misleading.
      • This and other experiences contribute to the suspicion that new UiPath documentation is not being written by informed humans.
  • Despite messaging that uses the word “job” a lot to describe this feature, the following UiPath database column values (see ProcessSchedules table) are only updated in relation to TRIGGER execution.
    • TotalSuccessful - Trigger successfully fired and produced a pending or running job.
    • CurrentConsecutiveSuccessful - Consecutive count for the above.
    • LastSuccessfulTime - UTC timestamp for the above.
    • TotalFailures - Trigger failed to produce a pending job.
    • CurrentConsecutiveFailures - Consecutive count for the above.
    • LastFailureTime - UTC timestamp for the above.
  • Also within the ProcessSchedules table of the UiPath database, the following columns both relate directly to fields in the Orchestrator UI, but testing has not proven a link between these columns and the ones mentioned above.
    • ConsecutiveJobFailuresThreshold = “Disable when consecutive job execution fail count”
    • JobFailuresGracePeriodInHours = “Grace period on disabling the trigger (days)”
  • Another reason I’ve come to believe that this feature has no relation to job execution status (faulted vs successful jobs) is that the following column values remained unchanged. This is despite the similarity to the column names related to this feature, mentioned above.
    • ConsecutiveJobFailures
    • FirstFailedJobTime
  • After all these tests, no proof was found to indicate whether this feature does anything besides change 2 columns in the UiPath database when configuring the trigger.
    • No triggers were disabled.
    • No jobs were stopped from faulting.
  • One possible reason we’re not getting expected results may be the “grace period”. When a trigger has exceeded the number of consecutive failures, it is not disabled immediately. Perhaps there is a periodic check performed by the Orchestrator within the grace period, even when the number of days is zero. With such short tests, we may be missing that periodic check.
    • We’re attempting to perform a longer-term test where we create a conflicting trigger which can NEVER be successful to prove whether this feature does anything.
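A minimal sketch of the kind of check described in these findings: list triggers whose consecutive failure count has reached the configured threshold but which are still enabled. It runs against an in-memory SQLite mock, not the real Orchestrator database. The table and counter column names come from the list above; the Name and Enabled columns and the sample row are hypothetical and exist only to make the example runnable.

```python
import sqlite3

# Column names quoted in the findings above.
POST_COLUMNS = [
    "TotalSuccessful", "CurrentConsecutiveSuccessful", "LastSuccessfulTime",
    "TotalFailures", "CurrentConsecutiveFailures", "LastFailureTime",
    "ConsecutiveJobFailuresThreshold", "JobFailuresGracePeriodInHours",
    "ConsecutiveJobFailures", "FirstFailedJobTime",
]


def build_mock_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the ProcessSchedules table."""
    conn = sqlite3.connect(":memory:")
    # Name and Enabled are assumed columns, not taken from the post.
    column_sql = ", ".join(["Name", "Enabled"] + POST_COLUMNS)
    conn.execute(f"CREATE TABLE ProcessSchedules ({column_sql})")
    # Made-up row: a trigger that keeps failing to create jobs.
    mock_row = {
        "Name": "Mock trigger",
        "Enabled": 1,
        "TotalFailures": 5,
        "CurrentConsecutiveFailures": 5,
        "ConsecutiveJobFailuresThreshold": 2,
        "JobFailuresGracePeriodInHours": 0,
        "ConsecutiveJobFailures": 0,
    }
    names = ", ".join(mock_row)
    placeholders = ", ".join("?" for _ in mock_row)
    conn.execute(
        f"INSERT INTO ProcessSchedules ({names}) VALUES ({placeholders})",
        list(mock_row.values()),
    )
    return conn


def over_threshold_still_enabled(conn: sqlite3.Connection) -> list:
    """Triggers at or past their threshold that have not been disabled."""
    query = """
        SELECT Name, CurrentConsecutiveFailures, ConsecutiveJobFailuresThreshold
        FROM ProcessSchedules
        WHERE Enabled = 1
          AND ConsecutiveJobFailuresThreshold IS NOT NULL
          AND CurrentConsecutiveFailures >= ConsecutiveJobFailuresThreshold
        ORDER BY Name
    """
    return conn.execute(query).fetchall()


print(over_threshold_still_enabled(build_mock_db()))
```

In our tests, a check along these lines kept returning the test triggers, which is why the threshold appeared to have no effect.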

I finally forced the Orchestrator to disable 2 triggers. No faulted jobs took place, only triggers which failed to kick off a job. The results were muddled and a little absurd at first glance, though after reading them carefully they make a bit more sense.

To summarize, this feature will not suit our needs. After getting it to work, I’ve confirmed it only has to do with the performance of the trigger itself, not the job. For further summaries about the 2 triggers that were disabled, read on.

  • Trigger Failure Test - Robot Hold - 25 hours

    • Notes
      • Configured to kick off on the same robot, every minute, and last 25 hours.
      • First job kicked off Thursday 10/10/24 at 39 minutes past the hour.
      • Allowed 1 running job and 1 pending job to be created from this trigger before enabling the second trigger. This was to force the second trigger to never fire successfully.
      • Pending job kicked off Thursday 10/10/24 at 40 minutes past the hour.
      • The execution-based trigger disabling feature was not enabled for this trigger.
    • Results
      • This trigger was disabled first, 10/11/24 at 40 minutes past the hour. This was exactly 24 hours after the last successful trigger and roughly 24 hours after the first failed trigger.
      • A second job kicked off 25 hours after the first job, corresponding to the pending job created the previous day. At the time this pending job started running, the trigger had already been disabled for an hour.
      • Unexpectedly, this trigger was disabled, despite execution-based trigger disabling not being enabled for this trigger. I cannot confirm at this time whether this is the result of the trigger configuration or the Orchestrator config value. However, the documentation states there is a default 1 day grace period on failing triggers, which applies to the whole Orchestrator. This is the likely cause.
      • The number of consecutive trigger failures was 1440.
  • Trigger Failure Test - Robot Hold - Duplicate

    • Notes
      • Configured to kick off on the same robot, every minute, and last 5 minutes.
      • Trigger enabled and first trigger failed on 10/10/24 at 43 minutes past the hour.
      • This trigger was expected to never fire successfully because of the conflict with the first trigger.
      • Execution-based trigger disabling feature was enabled. It was configured to be disabled after 2 consecutive failures, with a zero day grace period.
    • Results
      • The trigger was disabled 3 minutes after the previous trigger, 10/11/24 at 43 minutes past the hour. This was exactly 24 hours after the first trigger failure (confirmed using audit logs).
      • As expected, the trigger did not successfully create running or pending jobs.
      • Given how the “grace period” parameter is described in the documentation and a value of zero days defined in the trigger, I would not have expected the Orchestrator to wait 24 hours before disabling it. It should have been disabled after the second failed trigger, 2 minutes after the first trigger failed. It appears the overall Orchestrator config values overrode the trigger configuration, given that the trigger was not disabled immediately.
      • The number of consecutive trigger failures was 1441.

Given that this does not solve our problem, I will not be looking into this feature further. However, since the Orchestrator config values appear to have overridden the trigger values in both cases, the next step would be to disable the feature at the Orchestrator level in our non-production environment and then test again at the trigger level. That would show whether the trigger values can ever offer the granular level of control that the Orchestrator UI suggests is possible.
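One way to reason about the two results is to simulate a hypothesised rule: a trigger is disabled at the first failed firing where the consecutive failure count has reached a threshold and the trigger has been failing for at least the grace period. This is a sketch of a hypothesis, not confirmed Orchestrator behavior. The hour of day is arbitrary (only the minutes are known), the Orchestrator-level failure count of 10 is an assumed value, and measuring "failing since" from the last success when there is one, and from the first failure otherwise, is an assumption chosen to fit the observations.

```python
from datetime import datetime, timedelta


def simulate_disable(failing_since, first_failure, every, fail_count, grace):
    """Return (disable_time, consecutive_failures) for a trigger that
    fails on every firing from first_failure onward."""
    failures = 1
    fired_at = first_failure
    # Keep failing until both the count and the grace period are met.
    while failures < fail_count or fired_at - failing_since < grace:
        fired_at += every
        failures += 1
    return fired_at, failures


MINUTE = timedelta(minutes=1)
ORCH_FAIL_COUNT = 10             # assumed Orchestrator-level value
ORCH_GRACE = timedelta(days=1)   # 1 day default mentioned in the post

# Trigger 1: last success at :40, so the first failure is at :41.
print(simulate_disable(datetime(2024, 10, 10, 9, 40), datetime(2024, 10, 10, 9, 41),
                       MINUTE, ORCH_FAIL_COUNT, ORCH_GRACE))

# Trigger 2: never succeeded, first failure at :43.
print(simulate_disable(datetime(2024, 10, 10, 9, 43), datetime(2024, 10, 10, 9, 43),
                       MINUTE, ORCH_FAIL_COUNT, ORCH_GRACE))

# Trigger 2 under its own settings (2 failures, zero-day grace period).
print(simulate_disable(datetime(2024, 10, 10, 9, 43), datetime(2024, 10, 10, 9, 43),
                       MINUTE, 2, timedelta(0)))
```

Under these assumptions the Orchestrator-level values reproduce both observed disable times and both failure counts (1440 and 1441), while the per-trigger values would have disabled the second trigger at its second failed firing. That is consistent with the Orchestrator config taking precedence, but it does not prove it.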

Overall, I would appreciate it if the documentation were updated to reflect these findings, specifically that this feature relates directly to triggers, not jobs. More information about whether the Orchestrator config really does override the trigger values would also be appreciated.

Found this forum post linked under my own that describes the execution-based trigger disabling feature better than the documentation, though it does not mention it by name: Triggers Being Deactivated By The System Administrator

  • The post itself only describes the symptoms of the feature, not the feature itself. And I believe it relates directly to the Orchestrator configuration, not the per-trigger setting.
  • It correctly describes the relationship between this feature, triggers, and jobs: “Schedule failing means a failure related to queue / create the job, not the job execution failure itself.”
  • It also answers one of my remaining questions, confirming that the Orchestrator will alert administrators after disabling a trigger: “Check the audit, besides the alert when the trigger gets disabled, lot of alerts about jobs that did not run and a more verbose message with the reason will be available.” In other words, information about this automatic action will appear in alerts and in the audit log.