Day 99: Creating Highly Available Servers in AWS (Part 3)

Welcome to Day 99 of the “100 Days of DevOps with PowerShell”! For background on our goals in this series, see Announcing the “100 Days of DevOps with PowerShell” Series here at SCC.

In the previous posts in this mini-series on creating highly available servers in AWS, we primarily looked at the role of Elastic Load Balancers (ELBs): how they distribute client traffic to instances and monitor the health of the instances that ultimately handle that traffic.  In this post we are going to look at monitoring and remediating unhealthy instances, as well as monitoring resource levels on the servers and, if necessary, provisioning more instances.  All of this is part of a feature set known as Auto Scaling.

What is Auto Scaling?

Auto Scaling is the ability to increase resources, i.e. instances, for an application based on the availability or resource utilization of your existing instances.  So if you have a web application where the instances are pegged at 90% CPU for a given period of time, you can automatically deploy more instances to handle the extra load.  Similarly, when the average CPU utilization drops below a given threshold, you can automatically terminate the extra instances.

Creating the auto scaling configuration

Continuing with our example from the previous two posts, we’re going to see how the web servers we have deployed can be configured as part of an auto scaling group.  One of the key differences is that instances are no longer deployed with the New-EC2Instance cmdlet; instead you first need to create an auto scaling launch configuration.  This is similar to what was deployed previously in that we’re using the User Data to configure the server at launch, and we specify the instance type and the AMI to use, which in this case is just the default Windows Server 2012 R2 image.
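
As a rough sketch of this step (the launch configuration name, security group ID, instance type and User Data script below are illustrative placeholders rather than values carried over from the earlier posts), it might look something like this:

# User Data script run at launch, e.g. to install IIS on the new instance
$bootstrap = @'
<powershell>
Install-WindowsFeature -Name Web-Server -IncludeManagementTools
</powershell>
'@

# The Auto Scaling API expects the User Data as a base64-encoded string
$userData = [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($bootstrap))

# Look up the default Windows Server 2012 R2 base image for the current region
$ami = (Get-EC2ImageByName -Name "WINDOWS_2012R2_BASE").ImageId

# Create the launch configuration the auto scaling group will use
New-ASLaunchConfiguration -LaunchConfigurationName "WebServerLC" `
                          -ImageId $ami `
                          -InstanceType "t2.medium" `
                          -SecurityGroup "sg-xxxxxxxx" `
                          -UserData $userData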

Creating the auto scaling group

Next we create the auto scaling group using New-ASAutoScalingGroup, and here we’re saying the group should have a minimum of 2 instances (MinSize), a maximum of 6 (MaxSize), and that the initial deployment should have 2 (DesiredCapacity).  A key parameter here is HealthCheckGracePeriod, which is the period AWS will wait after an instance has been deployed before its health is checked.  The time you allow therefore has to be sufficient for the instance to be deployed and be up and running; in the example below we’ve gone with 10 minutes (600 seconds).  DefaultCooldown is the amount of time AWS will wait after a scaling activity has been performed.  If this number is too low, AWS may continue deploying more instances without waiting for the previous scaling activity to have an impact, so careful consideration should be given to this value; the default is 300 seconds, which may be too low for Windows servers.  Finally, for VPCZoneIdentifier we specify the subnet IDs where the instances should be deployed, separated by commas.
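
A sketch of the group creation, assuming the launch configuration above; the load balancer name and subnet IDs are placeholders, and the ELB health-check type shown is just one option:

# Create the auto scaling group across two subnets, attached to the ELB
New-ASAutoScalingGroup -AutoScalingGroupName "WebServerASG" `
                       -LaunchConfigurationName "WebServerLC" `
                       -MinSize 2 `
                       -MaxSize 6 `
                       -DesiredCapacity 2 `
                       -HealthCheckGracePeriod 600 `
                       -DefaultCooldown 600 `
                       -HealthCheckType "ELB" `
                       -LoadBalancerName "WebELB" `
                       -VPCZoneIdentifier "subnet-xxxxxxxx,subnet-yyyyyyyy"

Setting -HealthCheckType to ELB means the group uses the load balancer health checks covered in the previous posts rather than only the basic EC2 status checks, and the 600-second cooldown gives a Windows instance a more realistic window to come into service before the next scaling decision is made.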

Creating the scaling policy

Using Write-ASScalingPolicy we create the scaling policy, i.e. by what increment the number of instances should be increased, and then finally we can associate a CloudWatch alarm with the auto scaling group.  Here we are using AWS’s monitoring platform, CloudWatch, to measure the CPU utilization of the instances.  Using Write-CWMetricAlarm we specify that if the CPU utilization is above 85% for two intervals of 300 seconds, an alarm is raised which in turn triggers the auto scaling policy.
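
A sketch of these two steps is shown below.  The policy, alarm and group names are illustrative, the scale-out increment of 2 is just an example, and the parameter names are those of recent AWSPowerShell releases (older releases use plural forms such as -EvaluationPeriods and -AlarmActions):

# Scale-out policy: add two instances per scaling activity.
# Write-ASScalingPolicy returns the policy ARN, which the alarm uses as its action.
$scaleOutArn = Write-ASScalingPolicy -PolicyName "WebServerScaleOut" `
                                     -AutoScalingGroupName "WebServerASG" `
                                     -AdjustmentType "ChangeInCapacity" `
                                     -ScalingAdjustment 2 `
                                     -Cooldown 600

# Restrict the CPU metric to instances in our auto scaling group
$dimension = New-Object Amazon.CloudWatch.Model.Dimension
$dimension.Name = "AutoScalingGroupName"
$dimension.Value = "WebServerASG"

# Alarm: average CPU above 85% for two 300-second periods fires the scale-out policy
Write-CWMetricAlarm -AlarmName "WebServerHighCPU" `
                    -Namespace "AWS/EC2" `
                    -MetricName "CPUUtilization" `
                    -Statistic "Average" `
                    -ComparisonOperator "GreaterThanThreshold" `
                    -Threshold 85 `
                    -Period 300 `
                    -EvaluationPeriod 2 `
                    -Dimension $dimension `
                    -AlarmAction $scaleOutArn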

The script

While we appear to have successfully handled an increase in load on our environment, we don’t necessarily want those extra instances running once the demand has dropped.  We therefore create a ‘scale in’ policy: if the CPU drops below a threshold (25%) for a specified period of time (300 seconds), two instances will be terminated.  This is specified by -ScalingAdjustment; make sure this is a negative number or else you’ll be deploying more instances!
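
A matching sketch of the scale-in side, mirroring the scale-out example above (names again illustrative):

# Scale-in policy: note the negative adjustment, which removes two instances
$scaleInArn = Write-ASScalingPolicy -PolicyName "WebServerScaleIn" `
                                    -AutoScalingGroupName "WebServerASG" `
                                    -AdjustmentType "ChangeInCapacity" `
                                    -ScalingAdjustment -2 `
                                    -Cooldown 600

# Same dimension object as in the scale-out example
$dimension = New-Object Amazon.CloudWatch.Model.Dimension
$dimension.Name = "AutoScalingGroupName"
$dimension.Value = "WebServerASG"

# Alarm: average CPU below 25% for a 300-second period fires the scale-in policy
Write-CWMetricAlarm -AlarmName "WebServerLowCPU" `
                    -Namespace "AWS/EC2" `
                    -MetricName "CPUUtilization" `
                    -Statistic "Average" `
                    -ComparisonOperator "LessThanThreshold" `
                    -Threshold 25 `
                    -Period 300 `
                    -EvaluationPeriod 1 `
                    -Dimension $dimension `
                    -AlarmAction $scaleInArn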

Conclusion

There are many facets to auto scaling, and getting it to work with your own application will require extensive testing and an intimate understanding of how your application performs, beyond the simple example provided here.  Auto scaling is one of the key features of any cloud service and is one of the major reasons why you would choose to run applications in the public cloud: the ability to easily consume more resources to meet demand and to provide seamless high availability.  For more information on auto scaling I would recommend reading through the AWS documentation at What is Auto Scaling?

Previous Installments

To see the previous installments in this series, visit “100 Days of DevOps with PowerShell”.
