Orchestrator Resiliency

We have Orchestrator running on an on-premises VM. As the number of bots increases, the question of resiliency and having a backup DR Orchestrator has been raised. Is there a way to easily accomplish this?

Yes! That is much needed. Here is an official guide covering the best practices.

Archiving data is a must-do step.
Having a backup is always a good idea for disaster management; if you have a cloud DB, that can be done easily.

Maintenance Considerations (uipath.com)

Hi @david.smart,

I think you will agree that “easily” is an oxymoron, if ever there was one, when it comes to UiPath upgrades :smiley:

I don't think any operation with an on-premises Orchestrator is easy, be it installation, maintenance, or upgrades.

As per your request, if your organisation already performs backups of its SQL instances, you will in the worst case have a fallback. That said, a failure can potentially mean lost robot work days, and your team has to evaluate whether that risk is acceptable. We do have routine backups of the SQL (Orchestrator) database but have not planned for a fallback other than that. I would like to see how other CoEs tackle this.

Agreed, “easy” might not have been the best choice of words.

You are correct in that we do have a DR solution for the SQL database; my question was more about what happens if we “lost” the VM that is currently used for the Orchestrator. I’d also be interested to know if anyone has an approach for this. My first thought was simply to have a duplicated instance, but wouldn’t that effectively just double every task relating to the Orchestrator if we were to keep them in sync?

Many firms have a NoSQL database backup: every write is simply sent to both databases, the live DB and the backup NoSQL DB.

The live database is archived every month (this depends on the transactions per day; best practices are mentioned in the link shared earlier). The NoSQL database can be used as a backup and also as a BI source for analyzing performance and accuracy.
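
To make the "send the data to both databases" idea concrete, here is a minimal sketch of a dual-write wrapper. Everything in it is an assumption for illustration: `InMemoryStore` and `dual_write` are hypothetical names, the stores are plain Python lists rather than real SQL or NoSQL clients, and nothing here is an Orchestrator feature or API. It only shows the pattern in which the live write comes first and a backup failure is reported without blocking the live write.

```python
from typing import Dict, List


class InMemoryStore:
    """Hypothetical stand-in for a database client; it only keeps rows in a list."""

    def __init__(self, name: str, fail: bool = False) -> None:
        self.name = name
        self.fail = fail  # set to True to simulate an outage
        self.rows: List[Dict] = []

    def write(self, record: Dict) -> None:
        if self.fail:
            raise ConnectionError(f"{self.name} is unavailable")
        self.rows.append(dict(record))


def dual_write(record: Dict, live: InMemoryStore, backup: InMemoryStore) -> bool:
    """Write to the live store first, then mirror the record to the backup.

    Returns True when the backup copy succeeded. A backup failure is
    reported through the return value but does not undo the live write.
    """
    live.write(record)  # a failure here propagates to the caller
    try:
        backup.write(record)
    except ConnectionError:
        return False
    return True


live_db = InMemoryStore("live-sql")
backup_db = InMemoryStore("backup-nosql")
mirrored = dual_write({"queue_item": 1, "status": "Successful"}, live_db, backup_db)
```

In a real setup you would also need a way to replay records that failed to reach the backup; that part is left out of the sketch.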

Hi David,

Please have a look at the existing documentation for Disaster Recovery and High Availability below. Setting things up is fairly straightforward, even more so if you have scripted your infrastructure in some way.

We have multiple environments with a multi-node setup to balance resources. At a high level, you want to look at:

  • Place a load balancer in front of your Orchestrator cluster
  • Spin up 2 or more Orchestrator nodes, using the output configuration of the primary node as the input configuration of the secondary node during installation
  • Configure RESP communication to keep the Orchestrator nodes in sync for High Availability. The only supported product is UiPath’s High Availability Add-On, which is a repackaged Redis Enterprise, but other solutions such as MSOpenTech Redis, Redis Server, and Redis Enterprise have been tested to work. This is required to keep your Orchestrator nodes in sync and prevent them from running into race conditions while attempting to process Jobs, Queue Items, etc.
  • Configure Always On for SQL Server
  • Perform periodic backups and snapshots
  • Leverage Maintenance Mode to gracefully take down the platform when performing upgrades, etc.
  • “High Capacity” Robots, aka RDS on Windows Server. Determine how many Robots you can afford to lose at any given time and split them across multiple machines to balance resources, management, and resilience. I would recommend creating a machine pairing (Side A / Side B) with a set of Robots per environment. If you are on an older version of Orchestrator/Robot, this allows you to perform maintenance on one side while only experiencing a degradation in capacity. If you are on the latest Orchestrator/Robot, you could probably use Modern Folders with Assigned Credentials and Machines on a Folder, which would simplify this.
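
As a rough illustration of the Side A / Side B pairing in the last point, the sketch below splits a list of Robots across two sides and works out how much capacity is left while one side is down for maintenance. The names `Machine`, `split_robots`, and `capacity_during_maintenance`, as well as the machine and Robot names, are all hypothetical; this is plain planning arithmetic and does not call any Orchestrator API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Machine:
    name: str
    side: str  # "A" or "B"
    robots: List[str] = field(default_factory=list)


def split_robots(robots: List[str], machines_per_side: int = 1) -> List[Machine]:
    """Distribute Robots round-robin over Side A and Side B machines."""
    machines = [
        Machine(name=f"side-{side.lower()}-{i + 1}", side=side)
        for i in range(machines_per_side)
        for side in ("A", "B")
    ]
    # Machines are ordered A, B, A, B, ... so consecutive Robots alternate sides.
    for index, robot in enumerate(robots):
        machines[index % len(machines)].robots.append(robot)
    return machines


def capacity_during_maintenance(machines: List[Machine], side_down: str) -> float:
    """Fraction of Robots still available while one side is offline."""
    total = sum(len(m.robots) for m in machines)
    if total == 0:
        return 0.0
    available = sum(len(m.robots) for m in machines if m.side != side_down)
    return available / total


machines = split_robots([f"robot-{n}" for n in range(10)])
remaining = capacity_during_maintenance(machines, side_down="A")
```

With ten Robots and one machine per side, each side holds five Robots, so taking Side A down leaves half of the capacity running on Side B.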