Business Continuity Plans And Disaster Recovery For Cloud-Based Services

With a standalone Orchestrator deployed to Azure, AWS, or GCP, and disaster recovery (DR) and business continuity plans (BCP) under development, does an Active/Passive or Active/Active Orchestrator deployment need to be implemented?

Issue Description: The Orchestrator setup stores its packages locally on an AWS EC2 instance. The packages must be moved off the EC2 instance to a shared file storage location, and the Orchestrator configuration updated to point to the new package location.


Resolution:
The Active/Passive and Two Active Data Center deployment models are intended for physical, on-premises services that are vulnerable to single points of failure; they provide redundancy and failover in the event of a catastrophic failure. They do not make sense in the context of PaaS (cloud-based) installations, as these services already offer geo-redundancy across one or more regions.

Each cloud platform provides its own DR and BCP guidance across a range of price points, some of which may not be suitable for a production environment. As part of business continuity planning, you are encouraged to review and select the options best suited to your organization's recovery time objective (RTO) and recovery point objective (RPO). The RTO is the maximum acceptable length of time that your application can be offline, a value usually defined as part of a larger service level agreement (SLA). The RPO is the maximum acceptable length of time during which data might be lost from the application due to a major incident.
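
As a rough illustration of how RTO and RPO can be used to screen DR options, the following Python sketch compares two hypothetical options against target values. The option names, the durations, and the DrOption type are assumptions made for illustration only; they do not come from any cloud provider's documentation, and real targets would normally be taken from your SLA.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DrOption:
    """A hypothetical DR option; all figures are illustrative assumptions."""
    name: str
    expected_recovery_time: timedelta  # time needed to restore service
    backup_interval: timedelta         # worst-case window of lost data


def meets_objectives(option, rto, rpo):
    """Return True if the option restores service within the RTO and
    loses no more data than the RPO allows."""
    return (option.expected_recovery_time <= rto
            and option.backup_interval <= rpo)


# Example targets (assumed values, normally defined in the SLA).
RTO = timedelta(hours=4)
RPO = timedelta(hours=1)

OPTIONS = [
    DrOption("daily image restore", timedelta(hours=2), timedelta(hours=24)),
    DrOption("hourly backup + standby", timedelta(minutes=30), timedelta(hours=1)),
]

suitable = [o.name for o in OPTIONS if meets_objectives(o, RTO, RPO)]
print(suitable)
```

In this sketch the daily image restore satisfies the RTO but not the RPO, because up to a day of data could be lost. An option has to satisfy both objectives to be considered suitable.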

Amazon Web Services (AWS) provides guidance on implementing DR plans as part of a more comprehensive Business Continuity Plan (BCP) at both regional and multi-regional levels, supporting more nuanced, in-depth responses that mitigate a variety of potential scenarios. Periodic server images provide a basic way to recover from a corrupted installation: an image can replace the installation on an existing EC2 server or be spun up as a new EC2 instance in short order, with failover requiring only an IP redirection in the load balancer.
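
The image-based recovery approach can be sketched in Python as follows. This is a minimal, provider-agnostic illustration rather than an AWS API example: the image identifiers, timestamps, and the pick_restore_image helper are all hypothetical. It selects the newest image taken before the installation was corrupted and reports how much data would be lost by restoring from it.

```python
from datetime import datetime, timedelta


def pick_restore_image(images, corrupted_at):
    """Return the newest (image_id, created_at) pair created before the
    installation was corrupted, or None if no such image exists."""
    candidates = [img for img in images if img[1] < corrupted_at]
    return max(candidates, key=lambda img: img[1], default=None)


# Hypothetical image IDs and timestamps, for illustration only.
IMAGES = [
    ("image-mon", datetime(2025, 1, 6, 2, 0)),
    ("image-tue", datetime(2025, 1, 7, 2, 0)),
    ("image-wed", datetime(2025, 1, 8, 2, 0)),
]
corrupted_at = datetime(2025, 1, 7, 15, 30)

chosen = pick_restore_image(IMAGES, corrupted_at)
# Data written between the image and the incident is not recoverable
# from the image alone, so this window should be compared with the RPO.
data_loss_window = corrupted_at - chosen[1]
print(chosen[0], data_loss_window)
```

The interval between images bounds the worst-case data loss, so the imaging schedule should be chosen with the RPO in mind.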

Microsoft Azure deployments offer similar options for disaster recovery and backup. Microsoft also provides guidance on implementing commonly used disaster recovery techniques to prevent loss of functionality or data for web apps during a regional disaster. Azure customers will need to actively develop their own DR plan because, beginning 31 March 2025, Microsoft will no longer place Azure App Service web applications in disaster recovery mode in the event of a disaster in an Azure region. (Action recommended: Implement disaster recovery strategies for your Azure App Service web apps by 31 March 2025)

Google Cloud Platform (GCP) provides disaster recovery guidance across a variety of DR scenarios.