Execution-Based Trigger Disabling - What does it do?

I finally forced the Orchestrator to disable 2 triggers. No jobs faulted; the only failures were triggers that could not kick off a job. The results looked muddled and a little absurd at first glance, but they make more sense after a careful read.

To summarize, this feature will not suit our needs. After getting it to work, I’ve confirmed that it only tracks whether the trigger itself manages to kick off a job, not whether the job it starts succeeds or faults. For details on the 2 triggers that were disabled, read on.

  • Trigger Failure Test - Robot Hold - 25 hours

    • Notes
      • Configured to kick off a job on the same robot every minute, with each job lasting 25 hours.
      • First job kicked off Thursday 10/10/24 at 39 minutes past the hour.
      • Allowed 1 running job and 1 pending job to be created from this trigger before enabling the second trigger. This was to force the second trigger to never fire successfully.
      • Pending job kicked off Thursday 10/10/24 at 40 minutes past the hour.
      • The execution-based trigger disabling feature was not enabled for this trigger.
    • Results
      • This trigger was disabled first, on 10/11/24 at 40 minutes past the hour. That was exactly 24 hours after the last successful trigger and roughly 24 hours after the first failed trigger.
      • A second job kicked off 25 hours after the first job. It was the pending job created the previous day. By the time this pending job started running, the trigger had already been disabled for about an hour.
      • Unexpectedly, this trigger was disabled even though execution-based trigger disabling was not enabled for it. I cannot yet confirm whether this came from the trigger configuration or from the Orchestrator config value. However, the documentation states that a default 1-day grace period on failing triggers applies to the whole Orchestrator, which is the likely cause.
      • The number of consecutive trigger failures was 1440.
  • Trigger Failure Test - Robot Hold - Duplicate

    • Notes
      • Configured to kick off a job on the same robot every minute, with each job lasting 5 minutes.
      • Trigger enabled and first trigger failed on 10/10/24 at 43 minutes past the hour.
      • This trigger was expected to never fire successfully because of the conflict with the first trigger.
      • The execution-based trigger disabling feature was enabled for this trigger. It was configured to disable the trigger after 2 consecutive failures, with a zero-day grace period.
    • Results
      • This trigger was disabled 3 minutes after the first trigger (Robot Hold - 25 hours) was disabled, on 10/11/24 at 43 minutes past the hour. That was exactly 24 hours after its first trigger failure (confirmed using audit logs).
      • As expected, the trigger never successfully created a running or pending job.
      • Given how the documentation describes the “grace period” parameter, and given the zero-day value defined on the trigger, I would not have expected the Orchestrator to wait 24 hours before disabling it. It should have been disabled at the second failed trigger, within minutes of the first failure. Since that did not happen, the Orchestrator-wide config values appear to have overridden the trigger configuration.
      • The number of consecutive trigger failures was 1441.
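
The failure counts and timings above can be cross-checked with a little date arithmetic. The following is only a sketch under stated assumptions: the hour of day is a placeholder (the notes record only minutes past the hour), the first trigger's first failure is assumed to be at 41 minutes past (one minute after its last successful run), and each count includes the attempt at which the trigger was disabled. The helper names are made up for illustration.

```python
from datetime import datetime, timedelta

HOUR = 9  # placeholder: the notes only record minutes past the hour


def at(day, minute, hour=HOUR):
    """Timestamp in October 2024 at the given day and minute."""
    return datetime(2024, 10, day, hour, minute)


def failed_attempts(first_failure, disabled_at, interval=timedelta(minutes=1)):
    """Count attempts from first_failure through disabled_at, inclusive."""
    return (disabled_at - first_failure) // interval + 1


# Robot Hold - 25 hours: assumed first failure at :41, disabled at :40 the next day.
hold_failures = failed_attempts(at(10, 41), at(11, 40))

# Robot Hold - Duplicate: first failure at :43, disabled at :43 the next day.
duplicate_failures = failed_attempts(at(10, 43), at(11, 43))

# The pending job starts once the 25-hour job that began at :39 has finished.
pending_start = at(10, 39) + timedelta(hours=25)
disabled_before_pending = pending_start - at(11, 40)

print(hold_failures, duplicate_failures, disabled_before_pending)
```

Under these assumptions the counts come out to 1440 and 1441, and the first trigger had been disabled for 59 minutes when the pending job started, which is consistent with the figures reported above.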

Given that this does not solve our problem, I will not be looking into this feature further. However, since the Orchestrator config values appear to have overridden the trigger values in both cases, the next step would be to disable the feature at the Orchestrator level in our non-production environment and then test again at the trigger level. That would show whether the trigger values can ever offer the granular level of control that the Orchestrator UI suggests is possible.

Overall, I would appreciate it if the documentation were updated to reflect these findings, specifically that this feature relates directly to triggers, not jobs. It would also help to state whether the Orchestrator config really does override the trigger values.
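
To make the override hypothesis concrete, here is a toy model of a "disable after N consecutive failures and a grace period" rule. It is purely illustrative: `DisablePolicy` and `failures_until_disabled` are hypothetical names, not an Orchestrator API, and the rule (both thresholds must be met, with the grace period measured from the first failure) is my assumption about how such a feature might behave.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DisablePolicy:
    """Hypothetical model of a disabling rule; not an Orchestrator API."""
    min_consecutive_failures: int
    grace_period: timedelta


def failures_until_disabled(policy, interval=timedelta(minutes=1)):
    """Consecutive failed attempts needed before both thresholds are met.

    Attempt n (1-based) happens (n - 1) * interval after the first failure.
    """
    # Smallest n whose elapsed time since the first failure covers the grace period.
    attempts_for_grace = -(-policy.grace_period // interval) + 1
    return max(policy.min_consecutive_failures, attempts_for_grace)


# Values configured on the Duplicate trigger: 2 failures, zero-day grace period.
trigger_level = DisablePolicy(min_consecutive_failures=2, grace_period=timedelta(days=0))

# Assumed Orchestrator-wide values: the 1-day grace period is the documented
# default mentioned above; the failure count here is only a placeholder.
orchestrator_level = DisablePolicy(min_consecutive_failures=2, grace_period=timedelta(days=1))

expected = failures_until_disabled(trigger_level)
with_override = failures_until_disabled(orchestrator_level)
print(expected, with_override)
```

In this model the trigger-level settings would disable the trigger at its 2nd failure, while a 1-day grace period pushes that to the 1441st failure, which matches what was observed on the Duplicate trigger. The sketch does not attempt to explain the 1440 count on the first trigger.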