Autoscaling, I'm working. Don't kill me!
Problem:
When an Auto Scaling group (ASG) in AWS scales in (reduces capacity), some of its instances get terminated. Which ones go first depends on the group's termination policy; the default balances across Availability Zones and then favors instances on the oldest launch template or launch configuration.
The problem is that sometimes we don't want an EC2 instance terminated, because it is in the middle of a job. Our application architecture lets another instance pick up an interrupted job and process it again, but that increases the total processing time. Undesirable.
Solution:
AWS offers a feature for this called instance scale-in protection. To automate it and use it efficiently: when a server picks up a job, it enables instance protection on itself from the CLI on the instance. When the job is finished, it disables instance protection again.
Make sure the IAM role attached to the instance allows it to look up its ASG and set instance protection (the autoscaling:DescribeAutoScalingInstances and autoscaling:SetInstanceProtection actions).
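As a rough sketch of that permission setup, the two functions below write a minimal policy document and attach it to a role as an inline policy. The function names, the policy name and the role name are hypothetical; `Resource` is left as `*` for simplicity and can probably be narrowed to the ASG for the protection action. This is meant to be run from an admin workstation, not from the instance itself.

```shell
# write_protection_policy <file>: write a minimal policy document (sketch).
write_protection_policy() {
  cat > "$1" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:SetInstanceProtection"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# attach_protection_policy <role-name>: add the document to the role as an
# inline policy. The policy name is an illustrative choice.
attach_protection_policy() {
  local doc
  doc="$(mktemp)" || return 1
  write_protection_policy "$doc"
  aws iam put-role-policy --role-name "$1" \
    --policy-name asg-instance-protection \
    --policy-document "file://$doc"
}
```

Usage would look like `attach_protection_policy my-worker-role`, where `my-worker-role` stands in for whatever role your instance profile uses.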
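- Get the Auto Scaling group name and instance ID:
The following command returns the ASG name (the AutoScalingGroupName field in its output); the curl inside it fetches the instance ID from the instance metadata service. If the instance enforces IMDSv2, the curl also needs a session token header.
aws autoscaling describe-auto-scaling-instances --instance-ids "$(curl --silent http://169.254.169.254/latest/meta-data/instance-id)" --region us-east-1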
- To enable protection:
aws autoscaling set-instance-protection --instance-ids i-xxxx --auto-scaling-group-name <asg name from above> --protected-from-scale-in
- To disable protection:
aws autoscaling set-instance-protection --instance-ids i-xxxx --auto-scaling-group-name <asg name from above> --no-protected-from-scale-in
- Wrap these commands in a job watcher or in the codebase itself, so that whenever an instance picks up a background job, it is protected from ASG scale-in until the job finishes.
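The steps above could be wrapped roughly like this. It is a minimal sketch, not a definitive implementation: the function names (`instance_id`, `asg_name`, `set_protection`, `run_protected`) and the `REGION` default are my own assumptions, and the metadata lookup uses an IMDSv2 session token so it also works on instances that enforce IMDSv2.

```shell
# Sketch: hold scale-in protection for the duration of one job.
REGION="${AWS_REGION:-us-east-1}"
IMDS="http://169.254.169.254/latest"

# Print this instance's ID, using an IMDSv2 session token.
instance_id() {
  local token
  token="$(curl --silent --fail -X PUT "$IMDS/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60")" || return 1
  curl --silent --fail -H "X-aws-ec2-metadata-token: $token" \
    "$IMDS/meta-data/instance-id"
}

# asg_name <instance-id>: print the name of the ASG that owns the instance.
asg_name() {
  aws autoscaling describe-auto-scaling-instances \
    --instance-ids "$1" --region "$REGION" \
    --query 'AutoScalingInstances[0].AutoScalingGroupName' --output text
}

# set_protection <instance-id> <asg-name> <on|off>
set_protection() {
  local flag="--no-protected-from-scale-in"
  [ "$3" = "on" ] && flag="--protected-from-scale-in"
  aws autoscaling set-instance-protection \
    --instance-ids "$1" --auto-scaling-group-name "$2" \
    --region "$REGION" "$flag"
}

# run_protected <command...>: protect, run the job, then always unprotect.
# Returns the job's own exit status.
run_protected() {
  local id asg status
  id="$(instance_id)" || return 1
  asg="$(asg_name "$id")" || return 1
  # With --output text the CLI prints "None" when the instance is not in an ASG.
  [ -n "$asg" ] && [ "$asg" != "None" ] || return 1
  set_protection "$id" "$asg" on || return 1
  "$@"
  status=$?
  set_protection "$id" "$asg" off
  return "$status"
}
```

A worker would then call something like `run_protected ./process_job.sh`, where `process_job.sh` is a placeholder for your job command. Protection is removed whether the job succeeds or fails, so a failed job does not leave the instance permanently protected. If the process is killed outright, though, protection stays on until something else clears it.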
However, scale-in protection only stops the ASG from picking the instance during scale-in. If the instance is a Spot Instance and EC2 reclaims it, instance protection can't save it.
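On Spot, the most a job can do is notice the interruption and wind down. As a hedged sketch, the hypothetical helper below polls the instance metadata path `spot/instance-action`, which returns 404 until an interruption is scheduled. The function name is my own and the metadata call uses an IMDSv2 session token.

```shell
IMDS="http://169.254.169.254/latest"

# Exits 0 and prints the notice (JSON) if a Spot interruption is scheduled.
# Exits non-zero otherwise, including when the metadata service is unreachable.
spot_interruption_notice() {
  local token
  token="$(curl --silent --fail -X PUT "$IMDS/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60")" || return 1
  # IMDS answers 404 while no interruption is pending, so --fail exits non-zero.
  curl --silent --fail -H "X-aws-ec2-metadata-token: $token" \
    "$IMDS/meta-data/spot/instance-action"
}
```

A job watcher could call `spot_interruption_notice >/dev/null` every few seconds and, when it succeeds, checkpoint or requeue the job. The notice normally arrives about two minutes before the instance is reclaimed.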