Once you've set up your production deployment, you're probably already asking, "How do I set up my deployment to autoscale?"
In order to create a scalable deployment, you need to define an alert and escalation, and then specify the appropriate server template and array that you want to scale. An alert is a notification that a problematic condition occurred in your deployment. An escalation performs an action when a particular alert is triggered. Together, they can be used to configure your deployment to automatically take action on your behalf.
For example, if your site is experiencing a large amount of traffic, you can set up an alert and escalation to grow or "scale" your deployment, where more server resources are added to your server array in order to handle the increased bandwidth requirements. Therefore, you can set up your deployment to automatically scale based on particular alerts and escalations that you define. Currently, you can only have one array per deployment. If you need to scale multiple servers, you will need to create multiple deployments.
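The relationship between an alert, an escalation, and the action it performs can be sketched in a few lines of Python. This is only an illustrative model, not RightScale's implementation or API: the class names, the `evaluate` function, the `grow` helper, and the `cpu_idle` metric key are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Escalation:
    """A named list of actions to run when an alert fires (hypothetical model)."""
    name: str
    actions: List[Callable[[List[str]], None]] = field(default_factory=list)


@dataclass
class Alert:
    """A named condition tied to the escalation it triggers (hypothetical model)."""
    name: str
    condition: Callable[[Dict[str, float]], bool]
    escalation: Escalation


def evaluate(alert: Alert, metrics: Dict[str, float], array: List[str]) -> bool:
    """Run every action of the alert's escalation if the condition holds."""
    if alert.condition(metrics):
        for action in alert.escalation.actions:
            action(array)
        return True
    return False


def grow(array: List[str]) -> None:
    """Hypothetical action: add one application server to the array."""
    array.append(f"app-server-{len(array) + 1}")


scale_up = Escalation("scale-up", [grow])
need_to_grow = Alert("Need to Grow Array", lambda m: m["cpu_idle"] < 30.0, scale_up)

server_array: List[str] = []
evaluate(need_to_grow, {"cpu_idle": 12.0}, server_array)
print(server_array)  # ['app-server-1']
```

The point of the split is that the alert only describes the condition, while the escalation decides what to do about it, so the same escalation can be reused by several alerts.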
The goal is to set up a deployment that automatically scales based on an alert and escalation that you define. In this example, we will configure the deployment to "grow," that is, to add more application servers to satisfy an increase in bandwidth requirements.
NOTE: This tutorial only applies to Premium accounts. If you have a Developer account, you will need to contact sales@rightscale.com to upgrade.
This tutorial shows you how to change a basic 4-instance setup into a scalable deployment with a scalable application server array.
The basic setup has two FrontEnds (load balancer + app server), as well as a Master and Slave database.

When additional server resources are needed in a scalable setup, new servers are launched and added to the server array. When those resources are no longer needed, the application servers can be terminated and removed from the server array while the basic 4-instance setup stays intact. Autoscaling is especially useful for ensuring that your deployment can easily scale up whenever extra server resources are needed, whether that's tomorrow, next week, next year, or perhaps never. At least you know your deployment is set up to take advantage of one of the key benefits of cloud computing--launching additional server resources on demand.
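As a rough illustration of that behavior, the toy model below keeps the basic 4-instance setup separate from the elastic server array, so shrinking can never touch the base servers. Everything here is an assumption for the sake of the example: the `Deployment` class, the database server names, and the `array-app-N` naming scheme are hypothetical and do not come from RightScale.

```python
from typing import List


class Deployment:
    """Toy model: a fixed base setup plus an elastic server array."""

    def __init__(self) -> None:
        # Two FrontEnds plus a Master and Slave database (names are illustrative).
        self.base: List[str] = ["FrontEnd-1", "FrontEnd-2", "Master-DB", "Slave-DB"]
        self.array: List[str] = []
        self._launched = 0  # running counter used only to name new servers

    def grow(self, count: int = 1) -> None:
        """Launch `count` new application servers into the array."""
        for _ in range(count):
            self._launched += 1
            self.array.append(f"array-app-{self._launched}")

    def shrink(self, count: int = 1) -> None:
        """Terminate up to `count` array servers; the base setup is never touched."""
        del self.array[max(0, len(self.array) - count):]

    def servers(self) -> List[str]:
        return self.base + self.array


deployment = Deployment()
deployment.grow(3)
print(len(deployment.servers()))  # 7

deployment.shrink(5)  # asks for more than the array holds
print(deployment.servers() == deployment.base)  # True
```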
This tutorial is divided into 5 Steps:
Step 1: Create a Server Array
Step 2: Create Escalations
Step 3: Create Alerts
Step 4: Attach Alerts and Escalations to the FrontEnds
Step 5: Attach Alerts and Escalations to the Server Template for Arrays
Go to Manage -> Arrays and click on New to create a new array.
The next step is to configure the array and define how the array will be scaled.
Click Save.
The next step is to define the action(s) to be taken when an alert is triggered. An escalation can have one or more actions. We will create two escalations, one to scale up ("scale-up") and one to scale down ("scale-down").

To create a new escalation, go to Design -> Alerts -> Escalations and click the New button.

Enter a name and a description for the new alert escalation. You will also need to associate the escalation to a deployment. If you have followed the other tutorials, attach this escalation to your "Production" deployment and click Save.
NOTE: You will need to remember the name of the new escalation ("scale-up") when you create an alert in the next step.
Now that you've created a new escalation, you'll need to define what action should be taken when the alert is triggered. In this example, we want to add server resources to the application array, i.e., "grow the array."
Click on the Actions tab.
You'll notice that we've pre-defined several common actions that can be associated with escalations.
Select the “vote_grow_array” action from the pulldown bar and click Add.
Since we want the server array to grow immediately, leave the "Escalate after" (escalate to the next action after n minutes) field blank or type zero and click Save.
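One way to picture the "Escalate after" field is as the number of minutes an alert must stay active before the escalation moves on to its next action. The sketch below models that reading; it is an interpretation of the field description above, not RightScale's actual scheduler. The `actions_due` function is hypothetical, and `send_email` is a made-up second action name used only to show a two-step escalation.

```python
from typing import List, Optional, Tuple

# (action name, minutes to wait before escalating to the next action);
# None stands for a field that was left blank.
Step = Tuple[str, Optional[int]]


def actions_due(steps: List[Step], minutes_active: int) -> List[str]:
    """Return the actions that have fired after the alert has been active
    for `minutes_active` minutes. The first action always fires at minute 0."""
    due: List[str] = []
    start = 0  # minute at which the current step becomes due
    for name, escalate_after in steps:
        if minutes_active < start:
            break
        due.append(name)
        start += escalate_after or 0  # blank and zero both mean "no wait"
    return due


# The scale-up escalation from this tutorial: one action, field left blank.
scale_up = [("vote_grow_array", None)]
print(actions_due(scale_up, 0))  # ['vote_grow_array']

# A hypothetical two-step escalation: vote to grow, then notify after 10 minutes.
two_step = [("vote_grow_array", 10), ("send_email", None)]
print(actions_due(two_step, 5))   # ['vote_grow_array']
print(actions_due(two_step, 10))  # ['vote_grow_array', 'send_email']
```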
Now that we've created an escalation to grow the server array, we need to create another escalation to shrink the server array.
Click the New button to create another escalation.

Enter a name and a description for the new alert escalation. Associate the escalation to the same deployment as the scale-up escalation. Click Save.
Now define the action for the scale-down escalation.
Click on the Actions tab.

This time select the “vote_shrink_array” action from the pulldown bar and click Add.
Now that we've set up escalations to grow and shrink the array, let's configure the alerts.
Next we need to define the alert specifications. We will create two alerts: Need to Grow Array and Need to Shrink Array.

To create an alert specification, go to Design -> Alerts -> Specs, and click the New button.

First, give it a name and description. Let's configure this alert to trigger when the CPU Idle value is less than 30% for 3 minutes. If a server has been using more than 70% of its CPU for that long, it may be a good time to scale up and launch a couple more servers.

Click Save.
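The "less than 30% for 3 minutes" condition means the threshold must hold continuously, not just for a single reading. A minimal sketch of that check follows. It assumes one CPU Idle sample per minute, which is an assumption for illustration only; the actual sampling interval of the monitoring system is not stated here, and the `sustained` helper is hypothetical.

```python
from typing import Callable, Sequence


def sustained(samples: Sequence[float],
              predicate: Callable[[float], bool],
              duration: int) -> bool:
    """True when the most recent `duration` samples all satisfy `predicate`."""
    if duration <= 0 or len(samples) < duration:
        return False
    return all(predicate(s) for s in samples[-duration:])


# One CPU Idle reading per minute (assumed), newest last.
cpu_idle = [55.0, 41.0, 28.0, 22.0, 19.0]

# The last three readings are all below 30, so the alert condition holds.
print(sustained(cpu_idle, lambda idle: idle < 30.0, 3))      # True

# One minute earlier the window still included 41.0, so it did not hold yet.
print(sustained(cpu_idle[:4], lambda idle: idle < 30.0, 3))  # False
```

Requiring the condition to hold for the whole window keeps a brief CPU spike from launching servers that would be idle again a minute later.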
Next, we need to create the converse alert specification that will indicate when it is time to shrink the array and scale down the deployment.
Click New.
Give it a name and description.
This time, configure the alert to trigger when the CPU Idle value is more than 85% for 3 minutes. If a server is only using 15% of its CPU power, this may be a good indicator that it's time to scale down and terminate some unnecessary server resources.
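Taken together, the two thresholds leave a wide band (CPU Idle between 30% and 85%) in which neither alert fires, which helps keep the array from growing and shrinking in rapid succession. The sketch below classifies a CPU Idle reading using the two thresholds from this tutorial. The `decision` function and the constant names are hypothetical, and the 3-minute duration requirement is left out to keep the example short.

```python
GROW_BELOW_IDLE = 30.0    # grow alert: CPU Idle less than 30%
SHRINK_ABOVE_IDLE = 85.0  # shrink alert: CPU Idle more than 85%


def decision(cpu_idle: float) -> str:
    """Classify one CPU Idle percentage as 'grow', 'shrink', or 'hold'."""
    if cpu_idle < GROW_BELOW_IDLE:
        return "grow"
    if cpu_idle > SHRINK_ABOVE_IDLE:
        return "shrink"
    return "hold"


for idle in (20.0, 60.0, 90.0):
    busy = 100.0 - idle  # CPU in use is simply the complement of CPU Idle
    print(f"idle={idle:.0f}% busy={busy:.0f}% -> {decision(idle)}")
```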
Now that you've created the alerts and escalations needed to scale up and scale down, the next step is to connect them to your deployment. First, we will connect them to the two FrontEnds.

Go to the Production deployment. The alerts must be attached to each of the FrontEnds; start by selecting FrontEnd-1.

Click on the Alerts tab.

Select the Need to Grow Array alert and click Attach.
Select the Need to Shrink Array alert and click Attach.
Both alerts are now attached to FrontEnd-1. Repeat these steps and add the same alerts to FrontEnd-2.
The last step is to connect the alerts to the server template that will be used to launch new servers into the application server array. This will ensure that any server that gets added to the server array will also have the same alerts as the two FrontEnds.
NOTE: If servers have alerts, they must also have "SYS Monitoring install" installed. See Alert System.

Go to Manage -> Arrays.
Select the appropriate server template for the scalable array, for example, PHP App Server v3 (clone).
NOTE: If you choose a server template from the Premium list you will not be able to add an alert to it until you clone it.

Select the Alerts tab.

To attach an alert to the server template, select it from the list and click Attach.
Select the Need to Grow Array alert and click Attach.
Select the Need to Shrink Array alert and click Attach.
Both alerts are now attached to the server template that will be used to launch new servers. Therefore, whenever a new server is launched and added to the application server array, it will have the same alerts as the FrontEnds.
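The reason for attaching alerts at the template level can be shown with a small model: every server launched from the template starts out with the template's alerts, so nothing has to be attached by hand when the array grows. The `ServerTemplate` and `Server` classes and the `launch` function below are hypothetical stand-ins, not RightScale objects, and copying the alert list at launch time is a modeling choice rather than a statement about how the platform behaves.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServerTemplate:
    """Hypothetical stand-in for a server template with attached alerts."""
    name: str
    alerts: List[str] = field(default_factory=list)


@dataclass
class Server:
    """Hypothetical stand-in for a server launched into the array."""
    name: str
    alerts: List[str]


def launch(template: ServerTemplate, name: str) -> Server:
    """Create a server that starts with a copy of the template's alerts."""
    return Server(name, list(template.alerts))


template = ServerTemplate(
    "PHP App Server v3 (clone)",
    ["Need to Grow Array", "Need to Shrink Array"],
)
new_server = launch(template, "array-app-1")
print(new_server.alerts)  # ['Need to Grow Array', 'Need to Shrink Array']
```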
Congratulations! You just set up a scalable deployment that is configured to grow and shrink its server array when the CPU Idle value triggers an alert and escalation. You can also set up an escalation that sends an email whenever a particular action is taken in your deployment. But be careful: you don't want to become a spammer!