Advanced Failover Architecture

Follow the instructions below to set up an Advanced Failover Architecture as described in Best Practices for using Elastic IPs and Availability Zones. This setup is a production deployment with instances running in more than one availability zone. There are two frontend load balancers using Elastic IPs and the same number of application servers in each zone. Therefore, if one of the availability zones fails, your site's available capacity will temporarily be cut in half. If redundancy is the main goal and there are no limits to your budget, you can create very complex deployments with instances spread across many zones.
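The capacity loss from a zone failure can be worked out with a few lines of Python. This is only an illustrative sketch: the function name and the count of two application servers per zone are assumptions made for the example, not values taken from any tool.

```python
def remaining_capacity(app_servers_per_zone, failed_zone):
    """Return the fraction of application servers still running after one zone fails."""
    total = sum(app_servers_per_zone.values())
    if total == 0:
        raise ValueError("no application servers defined")
    surviving = total - app_servers_per_zone.get(failed_zone, 0)
    return surviving / total


# Assumed layout: two zones with two application servers each.
zones = {"us-east-1a": 2, "us-east-1b": 2}
print(remaining_capacity(zones, "us-east-1a"))  # 0.5
```

Spreading the same servers over more zones reduces the share lost in any single failure, which is the trade-off against the extra cost mentioned above.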

Remember, if you have instances running in multiple availability zones, you will pay zone-to-zone transfer fees. Data transferred between instances in different availability zones on EC2 costs $0.01 per GB.
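As a rough way to estimate this fee, the sketch below multiplies a transfer volume by the $0.01 per GB rate quoted above. The function name and the 500 GB figure are hypothetical and only serve the example.

```python
from decimal import Decimal

RATE_PER_GB = Decimal("0.01")  # USD per GB, the rate quoted in the text above


def cross_zone_transfer_cost(gb_transferred):
    """Estimate the zone-to-zone transfer fee in USD for a given volume in GB."""
    return Decimal(str(gb_transferred)) * RATE_PER_GB


# Hypothetical volume: 500 GB replicated between zones in a month.
print(cross_zone_transfer_cost(500))  # 5.00
```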

This advanced architecture addresses the weaknesses of DNS multiple A record solutions.  Either load balancer 1 or 2 could distribute load to any available application server.  TTL delays that are inherent with any DNS scheme have been avoided.

07-advanced_setup.gif

 

Setup Instructions

Production Deployment (us-east-1a)

If you follow the diagram above as an example, your production setup will consist of two active deployments, with another deployment configured to launch in another zone. Each deployment will be dedicated to a particular zone. Notice that you're using an Elastic IP for each frontend load balancer, with the same number of application servers in each availability zone. The clone operation makes this duplication a one-click solution (with a few adjustments later).

03-advanced_1a.gif


04-advanced_1b.gif

 

The next step is to clone the Production (1b) deployment and call the new one "Production (1c)".  Remember, you do not want to clone "Production (1a)" because if the zone with the Master-DB fails, you will simply promote the Slave-DB to Master-DB and launch a new Slave-DB instance.

  • Change the availability zone for each server in Production (1c) to launch into a different availability zone (ex: us-east-1c).
  • Change the Elastic IP for the backup frontend load balancer (FrontEnd-3) to be "-none-" because you don't know which Elastic IP you'll need to assign to this instance. 
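The two adjustments above can be pictured as a small data transformation. The sketch below models a deployment as a plain Python dictionary. It does not call any real API, and the server names in Production (1b), the rename map, and the IP address (taken from a documentation-only range) are all assumptions made for illustration.

```python
import copy


def clone_deployment(source, new_name, new_zone, renames=None):
    """Copy a deployment, move every server to new_zone, and clear Elastic IPs."""
    renames = renames or {}
    clone = copy.deepcopy(source)
    clone["name"] = new_name
    for server in clone["servers"]:
        server["name"] = renames.get(server["name"], server["name"])
        server["zone"] = new_zone
        # Mirrors the "-none-" setting: the Elastic IP is chosen at failover time.
        server["elastic_ip"] = None
    return clone


# Hypothetical server names and a documentation-range IP address.
production_1b = {
    "name": "Production (1b)",
    "servers": [
        {"name": "FrontEnd-2", "zone": "us-east-1b", "elastic_ip": "203.0.113.20"},
        {"name": "app-3", "zone": "us-east-1b", "elastic_ip": None},
        {"name": "app-4", "zone": "us-east-1b", "elastic_ip": None},
        {"name": "Slave-DB", "zone": "us-east-1b", "elastic_ip": None},
    ],
}

production_1c = clone_deployment(
    production_1b,
    "Production (1c)",
    "us-east-1c",
    renames={"FrontEnd-2": "FrontEnd-3", "app-3": "app-5", "app-4": "app-6"},
)
for server in production_1c["servers"]:
    print(server["name"], server["zone"], server["elastic_ip"])
```

The deep copy leaves the source deployment untouched, which matches the idea that Production (1b) keeps running while Production (1c) waits as a backup.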

05-advanced_1c.gif

 

For advanced architectures, it's better to think at the deployment level instead of at the individual server level. Instead of creating backups for each type of server, you can simply create backup deployments that are ready to be launched in a different availability zone if one of the zones fails.

Once you start thinking of a collection of servers as a "Deployment Unit", the move from zone to zone is a simple operation.

Simply click the Clone button for a deployment and change the availability zones for all servers accordingly.

All of the servers in your account with the same network options are on the same local network, even when they are in distinct deployments.

Failover Scenarios

Failure in Availability Zone (us-east-1a)

 

06-advanced_failure1a.gif

Recipe

In this recipe, you do not want to launch the entire deployment at the same time.  Do not use the "launch-all" button because it's important that the servers in the backup deployment are launched in the correct order.

  1. Go to the backup deployment (Production 1c).
  2. If the availability zone with the Master-DB fails, promote the Slave-DB to Master-DB.  If the Master-DB is still operational, proceed to Step 3.
  3. Launch the Slave-DB into the new availability zone (us-east-1c).   Use operational scripts to attach the new Slave-DB to the current Master-DB to restart redundancy and replication.
  4. Launch the application servers (app-5, app-6) into the new availability zone (us-east-1c).
  5. Launch the frontend load balancer (FrontEnd-3) into the new availability zone (us-east-1c).
  6. Execute the LB get HA proxy config operational action on the new load balancer (FrontEnd-3) to get the configuration file from the running load balancer in order to establish communication with the application servers.
  7. Associate the unused Elastic IP to the new load balancer (FrontEnd-3).
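Because the launch order matters, it can help to write the recipe down as an ordered checklist before a failure happens. The following is a minimal sketch that only builds and prints that checklist. It does not launch anything, and the function name and its parameters are hypothetical.

```python
def failover_steps(master_db_zone_failed, new_zone="us-east-1c"):
    """Return the recipe steps in the order they should be carried out."""
    steps = ["Go to the backup deployment (Production 1c)"]
    if master_db_zone_failed:
        # Only needed when the failed zone was the one hosting the Master-DB.
        steps.append("Promote the Slave-DB to Master-DB")
    steps.extend([
        f"Launch the Slave-DB into {new_zone} and attach it to the current Master-DB",
        f"Launch the application servers (app-5, app-6) into {new_zone}",
        f"Launch the frontend load balancer (FrontEnd-3) into {new_zone}",
        "Run the LB get HA proxy config operational action on FrontEnd-3",
        "Associate the unused Elastic IP with FrontEnd-3",
    ])
    return steps


for number, step in enumerate(failover_steps(master_db_zone_failed=True), start=1):
    print(f"{number}. {step}")
```

Calling the function with master_db_zone_failed=False skips the promotion step and keeps the remaining order unchanged.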

 

Notes

If your Master-DB is large and takes a long time to start from backups, one advanced strategy would be to keep a Slave-DB "hot" by keeping one slave up and running in as many zones as you can afford.

In a failure scenario (or under heavy load), the extra backup Slave-DB will already be connected to the load balancers and application servers, and will be ready to serve a larger percentage of the traffic. Most application servers launch quickly and can be added as needed. These "spare slaves" are also a good place to test your backup restore policy and to QA new software. I tend to restore a Slave-DB once in a while and check that the data is correct, and a spare slave is a convenient place to perform these audits.

08-advanced_hot.gif
