How to set up AWS Glue Using Terraform

amazon-web-services terraform

How can I set up AWS Glue using Terraform? Specifically, I want it to be able to spider my S3 buckets and look at table structures. A quick Google search came up dry for that particular service. The S3 bucket I want to interact with is already defined, and I don't want to give Glue full access to all of my buckets.

I've submitted my solution Q & A style, but I'm interested to see if there are any thoughts on how I could have done it better.

Best Answer

If you have recommendations on how to do this better, then please submit an answer so I can do better next time.

My example here will closely reflect the situation I was in. In particular, the S3 bucket I wanted to interact with was already defined and I didn’t want to give Glue full access to all of my buckets.

The first component is the role itself. Amazon recommends a name that starts with AWSGlueServiceRole, as the one in this section does, so that console users can pass the role to the service. If that name isn’t acceptable, check the IAM Role section of the Glue manual in the References section. The only other difference from a boilerplate “Assume Role” policy is the “Principal”, whose “Service” must be glue.amazonaws.com.

resource "aws_iam_role" "glue" {
  name = "AWSGlueServiceRoleDefault"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "glue.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}
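As an alternative to the inline heredoc, the same trust policy can probably be expressed with the AWS provider's aws_iam_policy_document data source, which lets Terraform validate the structure instead of treating it as an opaque string. This is only a sketch: the data source name glue_assume_role is a placeholder that does not appear elsewhere in this answer.

```hcl
# Hypothetical name "glue_assume_role"; renders the same trust policy as the heredoc above.
data "aws_iam_policy_document" "glue_assume_role" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    # Only the Glue service is allowed to assume the role.
    principals {
      type        = "Service"
      identifiers = ["glue.amazonaws.com"]
    }
  }
}

# In the aws_iam_role resource, the heredoc would then be replaced with:
#   assume_role_policy = "${data.aws_iam_policy_document.glue_assume_role.json}"
```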

The next component was to attach the AWSGlueServiceRole managed policy to the role. Amazon pre-defines this so that the role has almost all of the permissions it needs in order to work out of the gate.

resource "aws_iam_role_policy_attachment" "glue_service" {
  role       = "${aws_iam_role.glue.id}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole"
}

If you don’t have a policy already defined for your S3 bucket, then you can define your policy and attach it to this glue role all in the same block, like this:

resource "aws_iam_role_policy" "my_s3_policy" {
  name = "my_s3_policy"
  role = "${aws_iam_role.glue.id}"
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:*"
      ],
      "Resource": [
        "arn:aws:s3:::my_bucket",
        "arn:aws:s3:::my_bucket/*"
      ]
    }
  ]
}
EOF
}
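Since the goal is to avoid giving Glue more access than it needs, "s3:*" could likely be narrowed. The following is a minimal sketch, assuming the crawler only has to list and read objects in the bucket; the resource name my_s3_read_policy is a placeholder, and a job that writes results back to S3 would need additional actions such as s3:PutObject.

```hcl
# Hypothetical read-only variant of the bucket policy (assumes Glue only reads from my_bucket).
resource "aws_iam_role_policy" "my_s3_read_policy" {
  name = "my_s3_read_policy"
  role = "${aws_iam_role.glue.id}"
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::my_bucket"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::my_bucket/*"]
    }
  ]
}
EOF
}
```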

If, like me, you had already defined my_s3_policy and attached it to another role, then you can reuse its policy document and attach it to the Glue role as well, like this:

resource "aws_iam_role_policy" "glue_service_s3" {
  name   = "glue_service_s3"
  role   = "${aws_iam_role.glue.id}"
  policy = "${aws_iam_role_policy.my_s3_policy.policy}"
}

To match your configuration, change ‘my_s3_policy’ in the policy argument to the name of your existing aws_iam_role_policy resource.
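The role on its own does not crawl anything. To actually spider the bucket, something along the following lines may work with the AWS provider's aws_glue_catalog_database and aws_glue_crawler resources. Treat it as a sketch under assumptions: the resource labels, the database name, and the crawler name are placeholders, and the path assumes the whole of my_bucket should be crawled.

```hcl
# Hypothetical Data Catalog database that the crawler writes table definitions into.
resource "aws_glue_catalog_database" "example" {
  name = "my_catalog_database"
}

# Hypothetical crawler that uses the Glue role defined earlier in this answer.
resource "aws_glue_crawler" "example" {
  name          = "my_s3_crawler"
  database_name = "${aws_glue_catalog_database.example.name}"
  role          = "${aws_iam_role.glue.arn}"

  # Assumes the entire bucket should be crawled; a prefix can be appended to narrow it.
  s3_target {
    path = "s3://my_bucket"
  }
}
```

Without a schedule argument the crawler should run only on demand, for example from the Glue console.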

My answer here is replicated in part in my Medium post.
