Amazon ECS Task fails with STOPPED (CannotPullContainerError: Error response from daem)

amazon-ecr amazon-vpc aws-fargate permissions

I have set up an AWS VPC and am trying to deploy a working container in ECS using the Fargate launch type, but the task always fails with:

STOPPED (CannotPullContainerError: Error response from daem)

Task role context:

ecsTaskExecutionRole

Which has the following IAM permissions:

[Screenshot of the role's IAM permissions; image not preserved]

The repo permissions are as follows:

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "AllowPull",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::aws_account_id:role/ecsTaskExecutionRole"
      },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:DescribeImages",
        "ecr:DescribeRepositories",
        "ecr:GetAuthorizationToken",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetRepositoryPolicy",
        "ecr:ListImages"
      ]
    }
  ]
}

For security, the actual account ID is replaced with aws_account_id.
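
The repository policy above can also be checked mechanically. What follows is a minimal Python sketch, not an AWS tool: the names PULL_ACTIONS and missing_pull_actions are made up for this example. It assumes the three repository-level actions listed are what a pull needs, and that ecr:GetAuthorizationToken is granted through the role's IAM policy rather than the repository policy. It ignores Deny statements, conditions, and wildcards other than ecr:*.

```python
import json

# Repository-level actions an image pull relies on
# (assumption of this sketch: pull only, no push).
PULL_ACTIONS = {
    "ecr:BatchCheckLayerAvailability",
    "ecr:BatchGetImage",
    "ecr:GetDownloadUrlForLayer",
}


def missing_pull_actions(policy_text, role_arn):
    """Return the pull actions that no Allow statement grants to role_arn."""
    policy = json.loads(policy_text)
    granted = set()
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if principal == "*":
            principals = ["*"]
        else:
            aws = principal.get("AWS", [])
            # "AWS" may be a single ARN or a list of ARNs.
            principals = [aws] if isinstance(aws, str) else aws
        if role_arn not in principals and "*" not in principals:
            continue
        actions = stmt.get("Action", [])
        # "Action" may likewise be a string or a list.
        granted.update([actions] if isinstance(actions, str) else actions)
    if "ecr:*" in granted:
        return set()
    return PULL_ACTIONS - granted
```

Run against the policy above with the ecsTaskExecutionRole ARN, this returns an empty set, which is consistent with the repository policy not being the blocker.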

I have followed this troubleshooting guide, which states:

You can receive this error due to one of the following issues:

  • Your launch type doesn't have access to the Amazon ECR endpoint

    I believe Fargate has access to ECR.

  • Your Amazon ECR repository policy restricts access to repository images

    I believe it permits pull access for the role used; see the repo permissions above.

  • Your AWS Identity and Access Management (IAM) role doesn't have the right permissions to pull or push images

    I believe it does have the necessary permissions; see the task role context above.

  • The image can't be found

    The image is in ECR, and the permissions are shown above.

  • Amazon Simple Storage Service (Amazon S3) access is denied by your Amazon Virtual Private Cloud (Amazon VPC) gateway endpoint policy

    I believe S3 access is allowed. The IAM permissions above include S3 read access, and no explicit endpoint policy has been put in place, which according to the docs means full access by default.

To pull images, Amazon ECS must communicate with the Amazon ECR endpoint.

Routing table defined in the VPC:

[Screenshot of the VPC route table; image not preserved]

with all of the VPC's subnets associated. So the VPC and anything running in it should be able to reach the internet. The security policy used for the task currently allows all ports (temporary, while troubleshooting the ECR issue).

What am I missing that I am still getting this error?

This works with the EC2 launch type. If I create a task that uses an EC2 instance, with all other things being equal (where applicable), the only difference is the network mode:

EC2:  Network Mode = Bridge 
Fargate: Network Mode = awsvpc

On EC2, the container provisions and runs, and the web app in the container runs normally. But in Fargate, the network mode must be awsvpc:

Fargate only supports network mode ‘awsvpc’.

I think this is where the problem resides, but I do not know how to remedy it.
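
One difference between the two network modes that may matter here: in awsvpc mode the task gets its own network interface, so a subnet route to an internet gateway only helps if the task is also given a public IP. Otherwise the pull needs a NAT route or VPC endpoints for ECR (together with the S3 gateway endpoint for the image layers). The Python sketch below only encodes that decision as a rough checklist. The TaskNetwork class, its field names, and pull_path are all hypothetical, and the flags have to be filled in by hand from the console; nothing here queries AWS.

```python
from dataclasses import dataclass


@dataclass
class TaskNetwork:
    # All field names are hypothetical, chosen for this sketch.
    subnet_routes_to_internet_gateway: bool
    assign_public_ip: bool
    subnet_routes_to_nat: bool = False
    # Assumed to mean every endpoint the pull needs is present.
    has_ecr_vpc_endpoints: bool = False


def pull_path(net):
    """Return the route an awsvpc task could use to reach ECR, or None."""
    if net.has_ecr_vpc_endpoints:
        return "vpc-endpoints"
    if net.subnet_routes_to_nat:
        return "nat"
    # An internet gateway route is usable only when the task has a public IP.
    if net.subnet_routes_to_internet_gateway and net.assign_public_ip:
        return "public-ip"
    return None
```

If the route table does point at an internet gateway, the open question for this setup would be whether the task was launched with a public IP assigned. The post does not say, so this is a possibility to rule out rather than a diagnosis.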

The task definition is:

{
  "ipcMode": null,
  "executionRoleArn": "arn:aws:iam::aws_account_id:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "dnsSearchDomains": null,
      "logConfiguration": {
        "logDriver": "awslogs",
        "secretOptions": null,
        "options": {
          "awslogs-group": "/ecs/deploy-test-web",
          "awslogs-region": "us-west-2",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "entryPoint": [],
      "portMappings": [
        {
          "hostPort": 8080,
          "protocol": "tcp",
          "containerPort": 8080
        }
      ],
      "command": null,
      "linuxParameters": null,
      "cpu": 1,
      "environment": [],
      "resourceRequirements": null,
      "ulimits": null,
      "dnsServers": null,
      "mountPoints": [],
      "workingDirectory": null,
      "secrets": null,
      "dockerSecurityOptions": null,
      "memory": null,
      "memoryReservation": null,
      "volumesFrom": [],
      "stopTimeout": null,
      "image": "csrepo/test-web-v4.0.6",
      "startTimeout": null,
      "dependsOn": null,
      "disableNetworking": null,
      "interactive": null,
      "healthCheck": null,
      "essential": true,
      "links": null,
      "hostname": null,
      "extraHosts": null,
      "pseudoTerminal": null,
      "user": null,
      "readonlyRootFilesystem": null,
      "dockerLabels": null,
      "systemControls": null,
      "privileged": null,
      "name": "test-web-six"
    }
  ],
  "placementConstraints": [],
  "memory": "2048",
  "taskRoleArn": "arn:aws:iam::aws_account_id:role/ecsTaskExecutionRole",
  "compatibilities": [
    "EC2",
    "FARGATE"
  ],
  "taskDefinitionArn": "arn:aws:ecs:us-west-2:aws_account_id:task-definition/deploy-test-web3:4",
  "family": "deploy-test-web3",
  "requiresAttributes": [
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.execution-role-awslogs"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.private-registry-authentication.secretsmanager"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.task-iam-role"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.task-eni"
    }
  ],
  "pidMode": null,
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "1024",
  "revision": 4,
  "status": "ACTIVE",
  "inferenceAccelerators": null,
  "proxyConfiguration": null,
  "volumes": []
}
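
One detail in the task definition that may be worth ruling out: the image field holds a bare name (csrepo/test-web-v4.0.6) rather than a full ECR URI. A reference with no registry host is normally resolved against Docker Hub, not ECR. The value may simply have been shortened for posting, so treat this as a possibility only. The Python sketch below checks whether a string has the usual ECR URI shape. The pattern and the function name is_ecr_image are written for this example and assume the standard amazonaws.com domain; the account ID in the usage note is a placeholder.

```python
import re

# Usual shape of an ECR image URI (assumption: standard AWS partition):
#   <12-digit account>.dkr.ecr.<region>.amazonaws.com/<repo>[:tag | @sha256:digest]
ECR_URI = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repo>[a-z0-9._/-]+)"
    r"(?::(?P<tag>[\w.-]+)|@(?P<digest>sha256:[0-9a-f]{64}))?$"
)


def is_ecr_image(image):
    """Return True if the image reference looks like a full ECR URI."""
    return ECR_URI.match(image) is not None
```

With this check, is_ecr_image("csrepo/test-web-v4.0.6") is False, while a placeholder such as "123456789012.dkr.ecr.us-west-2.amazonaws.com/csrepo/test-web:v4.0.6" passes.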

Best Answer

I solved this problem by deleting the ECR repository and creating it again.