How to make AWS ASG do nothing when instance count is between min and max

amazon-ec2 amazon-web-services terraform

We're using Terraform to launch ASGs for most of our AWS EC2 instances. The problem is that once in a while we want to do some extra work before terminating an instance, for example decommissioning a node from a cluster before the EC2 instance it runs on is terminated. If we simply lower min and max together (by default we set min == max), the ASG terminates an instance and we can't run a graceful decommission.

Instead, I've tried lowering min to the new desired value (example: 6) while keeping max at the old value (example: 10). In this case the desired value stays at 10 (the max), so terminating an EC2 instance causes the ASG to launch a new one. NOTE: we are not setting the Terraform desired_capacity setting at all.

If I set desired_capacity manually, I risk the ASG terminating a node that has not been gracefully decommissioned, so I don't think that's an option for me.

What I'd ideally like is for the ASG to do nothing when the current EC2 Instance count for that ASG is between min and max and let me manually terminate instances. Obviously if the count goes below min I'd still like the ASG to launch a new EC2 Instance.

Is there any way to achieve this?

Best Answer

Possible solutions:

Option 1: Your ASG should be created with instance protection on (protect_from_scale_in = true on the aws_autoscaling_group resource; see the Terraform docs).

In this case, the sequence of operations for decommissioning an instance is:

  1. Decommission the particular instance(s) (remove the data from them)
  2. Reduce the ASG's size by the desired number of instances
  3. Apply the Terraform change
  4. Remove the protection flag from the decommissioned instances: aws autoscaling set-instance-protection --instance-ids <instances_ids> --auto-scaling-group-name <asg_name> --no-protected-from-scale-in
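
Step 4 can be wrapped in a small shell function. This is only a sketch: the function name unprotect_instances is made up for illustration, and it assumes the AWS CLI is installed and configured with credentials that can modify the ASG.

```shell
# Hypothetical helper (name is illustrative, not part of any tool).
# Removes scale-in protection from instances that were already decommissioned,
# so the ASG is free to terminate them when it shrinks.
# Usage: unprotect_instances <asg_name> <instance_id> [<instance_id> ...]
unprotect_instances() {
  local asg_name="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "usage: unprotect_instances <asg_name> <instance_id>..." >&2
    return 1
  fi
  aws autoscaling set-instance-protection \
    --auto-scaling-group-name "$asg_name" \
    --instance-ids "$@" \
    --no-protected-from-scale-in
}
```

For example, unprotect_instances <asg_name> <instance_id_1> <instance_id_2> would unprotect two nodes once their data has been moved off.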

Option 2: Your ASG was not created with instance protection.

In this case, the sequence of operations for decommissioning an instance is:

  1. Add the protection flag to all instances of the ASG: aws autoscaling set-instance-protection --instance-ids <instances_ids> --auto-scaling-group-name <asg_name> --protected-from-scale-in
  2. Decommission the particular instance(s) (remove the data from them)
  3. Reduce the ASG's size by the desired number of instances
  4. Apply the Terraform change
  5. Remove the protection flag from the decommissioned instances: aws autoscaling set-instance-protection --instance-ids <instances_ids> --auto-scaling-group-name <asg_name> --no-protected-from-scale-in
  6. (Optional) Wait until the ASG shrinks to the desired size, then remove the protection flag from the other instances
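
Step 1 needs the IDs of every instance currently in the ASG. One way to collect them is sketched below. The function names asg_instance_ids and protect_all_instances are made up for illustration, and the sketch assumes a configured AWS CLI and an ASG small enough that all IDs fit in a single call.

```shell
# Hypothetical helpers (names are illustrative, not part of any tool).

# Print the IDs of all instances currently in the ASG, whitespace-separated.
asg_instance_ids() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].Instances[].InstanceId' \
    --output text
}

# Turn on scale-in protection for every instance in the ASG.
# Usage: protect_all_instances <asg_name>
protect_all_instances() {
  local asg_name="$1"
  local ids
  ids="$(asg_instance_ids "$asg_name")"
  if [ -z "$ids" ]; then
    echo "no instances found in $asg_name" >&2
    return 1
  fi
  # $ids is left unquoted on purpose: each ID must be its own argument.
  aws autoscaling set-instance-protection \
    --auto-scaling-group-name "$asg_name" \
    --instance-ids $ids \
    --protected-from-scale-in
}
```

After protect_all_instances <asg_name> has run, shrinking the ASG should not terminate any instance until protection is removed from the ones you have decommissioned.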