AWS Auto Scaling and CodeDeploy never leaves Pending:Wait

amazon ec2amazon-web-servicesautoscalingload balancing

I am having an issue with Auto Scaling on AWS. It seems a lot of people have had similar issues, but I haven't been able to find anything that resolves my specific problem. When I manually increase the number of servers in my Auto Scaling Group (from 1 to 2), it creates the instance fine from the AMI. Then CodeDeploy detects it needs to be updated, and start deploying fine. The problem is that when the CodeDeploy deployment completes successfully, the instance instantly starts terminating. I get the following error:

Launching a new EC2 instance: SOME_INSTANCE. Status Reason: Instance failed to complete user's Lifecycle Action: Lifecycle Action with token SOME_TOKEN was abandoned: Heartbeat Timeout

The instance never leaves the Pending:Wait Lifecycle, and the whole process starts over again. I have tried extending the health check grace period. Everything seems to be healthy, and I can ssh into the server before it gets killed, and can go directly to the server by IP address and everything works. It looks like it just never gets put in an InService Lifecycle, and then never gets added to the Load Balancer.

Any thoughts on what might be causing this would be awesome. One thought I had was that maybe the CodeDeploy doesn't have access to add the server once the deploy has completed. I have two IAM Roles made, 1 for EC2, and one for CodeDeploy, both with the following policy, but with their own seperate trusted relationships:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:*",
                "codedeploy:*",
                "ec2:*",
                "elasticloadbalancing:*",
                "iam:AddRoleToInstanceProfile",
                "iam:CreateInstanceProfile",
                "iam:CreateRole",
                "iam:DeleteInstanceProfile",
                "iam:DeleteRole",
                "iam:DeleteRolePolicy",
                "iam:GetInstanceProfile",
                "iam:GetRole",
                "iam:GetRolePolicy",
                "iam:ListInstanceProfilesForRole",
                "iam:ListRolePolicies",
                "iam:ListRoles",
                "iam:PassRole",
                "iam:PutRolePolicy",
                "iam:RemoveRoleFromInstanceProfile",
                "s3:*"
            ],
            "Resource": "*"
        }
    ]
}

Best Answer

Okay, I was able to fix the issue. I had to delete and recreate my Auto Scaling group. I might have missed a setting in the advanced area when I initially made it, but not sure. I did notice that on the first auto scale out, the codedeploy didn't automatically fire on the new instance, but I edited it (without changing anything), clicked on deploy to current servers, and the next time I scaled out, it worked like a charm. This time it moved to InService. :)

Related Topic