CloudFormation Join: Use Sub Instead

Hello!

In CloudFormation, it’s common to construct strings with the !Join function, like this example from AWS’s cfn-init docs:

UserData: !Base64
  'Fn::Join':
    - ''
    - - |
        #!/bin/bash -xe
      - |
        # Install the files and packages from the metadata
      - '/opt/aws/bin/cfn-init -v '
      - '         --stack '
      - !Ref 'AWS::StackName'
      - '         --resource WebServerInstance '
      - '         --configsets InstallAndRun '
      - '         --region '
      - !Ref 'AWS::Region'
      - |+

Here’s the script this renders into:

#!/bin/bash -xe
# Install the files and packages from the metadata
/opt/aws/bin/cfn-init -v          --stack test         --resource WebServerInstance          --configsets InstallAndRun          --region us-west-2

To me, both are messy and hard to read. The template abuses multiline string declarations (|) to create single line breaks. Spaces are all over the place. There are YAML - and ' characters everywhere.

I use the !Sub function with one multi-line string instead:

UserData:
  Fn::Base64: !Sub |
    #!/bin/bash -xe
    # Install the files and packages from the metadata
    /opt/aws/bin/cfn-init -v \
      --stack ${AWS::StackName} \
      --resource WebServerInstance \
      --configsets InstallAndRun \
      --region ${AWS::Region}

Fewer lines, no YAML syntax scattered around the script. It reads like a normal shell script except we can use ${} wherever we’d have used a !Ref. It renders like this:

#!/bin/bash -xe
# Install the files and packages from the metadata
/opt/aws/bin/cfn-init -v \
  --stack test \
  --resource WebServerInstance \
  --configsets InstallAndRun \
  --region us-west-2

I think both the template and the rendered script are much easier to read.

Some details:

  • I used Fn::Base64 instead of !Base64 because YAML doesn’t allow nesting one short-form function (!Sub) directly inside another (!Base64); one of them has to use the long form.
  • If your string needs values from other functions, like !GetAtt or !ImportValue, check out !Sub’s two-argument form; a minimal sketch follows this list.
  • Every new line in the sub version is a new line in the rendered script, so, like any multiline shell command, it has to break lines between the arguments with \. The join version renders the cfn-init command onto one long line with a ton of spaces between the arguments; a side effect is that it doesn’t need the multiline command syntax.
  • The ${thing} syntax of !Sub is also a type of shell variable expansion, so make sure you only use it for CloudFormation references (if you need a literal ${thing} in your script, CloudFormation’s escape is ${!thing}). That’s no problem for me because I only use CloudFormation to render super simple scripts that basically just call cfn-init, and shell’s $thing syntax is all I need. If your script is complex enough that that isn’t true, I recommend reconsidering your approach; it’s usually an anti-pattern to use CloudFormation for those cases.
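
Here’s a minimal sketch of that two-argument form. The names are hypothetical: BucketArn is a placeholder we define in the map, and MyBucket would be an AWS::S3::Bucket defined elsewhere in the template:

Value: !Sub
  - 'The bucket ARN is ${BucketArn}'
  # The map's keys become substitutable names in the string above.
  - BucketArn: !GetAtt MyBucket.Arn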

I almost always use !Sub instead of !Join. It lets you write strings like the strings they are, rather than polluting them with a bunch of YAML syntax.

Happy automating!

Adam


CloudFormation: cfn-lint Pre-Commit Git Hook

Hello!

There are a few ways to validate CloudFormation templates, but I mostly use cfn-lint. I run it in my build pipelines, but I also like to run it locally just so I don’t have to wait for a build to find out if I made a typo. I work on a lot of templates, though, and it’s easy to forget. I still end up finding out about simple errors after builds fail. Good news! Git can remember to do it for me.

I keep all my templates in git repos, and git has hooks to run scripts on actions like commits. They’re covered in the core docs and Atlassian also has a pretty good guide.

My goal is to keep bad code out of the git history, so I want commits to fail if there are linting errors. For that, we need a .git/hooks/pre-commit file:

#!/usr/bin/env bash
 
find . -type f -name "*.yaml" | xargs cfn-lint

It must be executable: chmod u+x .git/hooks/pre-commit

I bodged together a repo with some bad templates in it so you can see the output:

.
└── templates
    ├── app1
    │   └── working.yaml
    └── app2
        └── broken.yaml

There’s a deliberate syntax error in broken.yaml. If I try to commit:

git commit -m "Test cfn-lint pre-commit hook."
E0000 Template needs to be an object.
./templates/app2/broken.yaml:1:1

… and there’s no commit. My changes are still just staged:

git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)
 
    new file:   templates/app1/working.yaml
    new file:   templates/app2/broken.yaml

A few notes:

  • You can ignore hook errors with the --no-verify flag. I use this when I need to break down my work into iterations of partially-working commits, then I go back and squash them in an interactive rebase.
  • These hooks are unique to your clone; they don’t automatically copy to others. That’s fine for my cases because this is just my personal development process, and the build jobs are the real gateway.
  • cfn-lint doesn’t automatically descend into subdirectories, which is why I do a find.
  • This only works if the hook script exits non-zero when there are errors. My find/xargs pattern does, but there are lots of patterns that don’t. If you use a for/do/done loop, for example, you can end up masking the return codes and the hook won’t block commits; see the sketch after this list. If you modify the pattern for searching, make sure you test the failure cases.
  • I always use the .yaml extension for my templates, and typically they’re the only YAML files in my repo. If your case is more complex than that, you’ll need to modify the pattern for searching.
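
To illustrate the return-code masking, here’s a sketch; both loops are hypothetical alternatives to the find/xargs hook above:

#!/usr/bin/env bash

# Masks failures: a for loop's exit status is the status of its LAST
# iteration, so a lint error on an earlier file won't block the commit.
for template in $(find . -type f -name "*.yaml"); do
    cfn-lint "$template"
done

# Blocks commits correctly: record any failure, then exit non-zero.
status=0
for template in $(find . -type f -name "*.yaml"); do
    cfn-lint "$template" || status=1
done
exit $status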

Happy automating!

Adam


CodePipeline Lambda Function: Complete Example

Hello!

It takes a few pieces to assemble a working lambda action for CodePipeline. I like to start from a simple example and build up to what I need. Here’s the code I use as a starting point.

First, a few notes:

  • My pipeline lambda functions are usually small, often only a few dozen lines (more than that is usually a signal that I’m implementing an anti-pattern). Because my functions are small, I just drop the code into CloudFormation’s ZipFile. That saves me from building a package. In more complex cases you may want to expand your lambda function resource.
  • One of the most common problems I see in lambda action development is unhandled exceptions. Read more in my article here.
  • This example focuses on the minimum resources, permissions, and code for a healthy lambda action. I skipped some of the usual good practices like template descriptions and parameterized config.
  • I put the S3 Bucket and CloudWatch Logs Log Group in the same template as the function so it was easy to see for this example. Usually I put them in a separate template because they don’t share the same lifecycle. I don’t want rollbacks or reprovisions to delete my artifacts or logs.
  • My demo function doesn’t do anything with the pipeline artifact, it just logs the user parameter string passed to it. When I’m using custom actions like these it’s often for non-artifact tasks like passing notifications to outside systems and this is all I need.
  • You’ll have to upload a file as my_artifact to the bucket this creates so the pipeline’s source action has something to pull; an example command follows this list. The bucket will be named for your account ID and region to prevent collisions with other people’s buckets (S3’s namespace is global to all AWS customers).
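
For example, the upload might look like this; the account ID and region are placeholders for your own values, and the local file name is hypothetical:

aws s3 cp ./my_artifact.zip s3://123456789012-us-west-2-pipeline/my_artifact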

Now, the code:

---
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  PipelineBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketEncryption:
        ServerSideEncryptionConfiguration:
        - ServerSideEncryptionByDefault:
            SSEAlgorithm: AES256
      BucketName: !Sub '${AWS::AccountId}-${AWS::Region}-pipeline'
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
      VersioningConfiguration:
        Status: Enabled
 
  LambdaLogs:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: /aws/lambda/log-user-parameters
      RetentionInDays: 30
 
  LambdaRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Path: '/'
      Policies:
      - PolicyName: execution-role
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - logs:CreateLogStream
            - logs:DescribeLogGroup
            - logs:PutLogEvents
            Resource: !GetAtt LambdaLogs.Arn
          - Effect: Allow
            Action:
            - codepipeline:PutJobFailureResult
            - codepipeline:PutJobSuccessResult
            # When this was written, CP's IAM policies required '*' for job results permissions.
            # https://docs.aws.amazon.com/IAM/latest/UserGuide/list_awscodepipeline.html#awscodepipeline-actions-as-permissions
            Resource: '*'
 
  PipelineRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - codepipeline.amazonaws.com
          Action:
          - sts:AssumeRole
      Path: '/'
      Policies:
      - PolicyName: actions
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - s3:Get*
            - s3:Put*
            - s3:ListBucket
            Resource:
            - !Sub
              - ${BucketArn}/*
              - BucketArn: !GetAtt PipelineBucket.Arn
            - !GetAtt PipelineBucket.Arn
          - Effect: Allow
            Action:
            - lambda:InvokeFunction
            # ARN manually constructed to avoid circular dependencies in CloudFormation.
            Resource: !Sub 'arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:log-user-parameters'
 
  Function:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          # https://operatingops.com/2019/08/03/codepipeline-python-aws-lambda-functions-without-timeouts/
          import json
          import logging
          import boto3
 
          def lambda_handler(event, context):
              logger = logging.getLogger()
              logger.setLevel(logging.INFO)
              logger.debug(json.dumps(event))
 
              codepipeline = boto3.client('codepipeline')
              s3 = boto3.client('s3')
              job_id = event['CodePipeline.job']['id']
 
              try:
                  user_parameters = event['CodePipeline.job']['data']['actionConfiguration']['configuration']['UserParameters']
                  logger.info(f'User parameters: {user_parameters}')
                  response = codepipeline.put_job_success_result(jobId=job_id)
                  logger.debug(response)
              except Exception as error:
                  logger.exception(error)
                  response = codepipeline.put_job_failure_result(
                      jobId=job_id,
                      failureDetails={
                        'type': 'JobFailed',
                        'message': f'{error.__class__.__name__}: {str(error)}'
                      }
                  )
                  logger.debug(response)
      FunctionName: log-user-parameters
      Handler: index.lambda_handler
      Role: !GetAtt LambdaRole.Arn
      Runtime: python3.7
      Timeout: 30
 
  Pipeline:
    Type: AWS::CodePipeline::Pipeline
    Properties:
      ArtifactStore:
        Location: !Ref PipelineBucket
        Type: 'S3'
      Name: log-user-parameters
      RoleArn: !GetAtt PipelineRole.Arn
      Stages:
      - Name: Source
        Actions:
        - Name: Source
          ActionTypeId:
            Category: Source
            Owner: AWS
            Provider: 'S3'
            Version: '1'
          # Docs say 'Configuration' has to be JSON but you can use YAML.
          # CloudFormation will convert it to JSON.
          Configuration:
            S3Bucket: !Ref PipelineBucket
            S3ObjectKey: my_artifact
            PollForSourceChanges: false
          InputArtifacts: []
          OutputArtifacts:
          - Name: Artifact
          Region: !Ref 'AWS::Region'
      - Name: LogUserData
        Actions:
        - Name: LogUserData
          ActionTypeId:
            Category: Invoke
            Owner: AWS
            Provider: Lambda
            Version: '1'
          # Docs say 'Configuration' has to be JSON but you can use YAML.
          # CloudFormation will convert it to JSON.
          Configuration:
            FunctionName: !Ref Function
            UserParameters: Hello!
          InputArtifacts:
          - Name: Artifact
          Region: !Ref 'AWS::Region'
          RunOrder: 1

This creates a pipeline:

[Image: the new pipeline in the CodePipeline console]

With an action that logs our user parameters string:

[Image: the CloudWatch Logs entry showing the user parameters string]

With a few CloudFormation parameters and a little extra code in my function, this pattern almost always solves my problem. Hope it helps!

Happy automating,

Adam


CloudFormation: Conditional List Items

Hello!

Many CloudFormation resources take properties whose values are lists, like tags on a Load Balancer:

Lb:
  Type: AWS::ElasticLoadBalancingV2::LoadBalancer
  Properties:
    Name: MyLb
    Scheme: internal
    Subnets: !Ref Subnets
    Tags:
      - Key: TagA
        Value: Unconditionally present

I often find that I want to add an item to the list, but only if a condition is true. That’s possible, but the syntax is a little funky.

In addition to conditions, we need AWS::NoValue. It’s a CloudFormation “Pseudo Parameter” that removes properties from templates. It allows you to dynamically include or remove properties based on conditions; you can pass AWS::NoValue to either the true or false case of the condition and it’s as if you never wrote that property into the template. The docs demonstrate it for top-level properties like Name and Scheme, but it also works for list items:

Conditions:
  AlwaysTrue: !Equals [true, true]
  AlwaysFalse: !Equals [true, false]
 
Resources:
  Lb:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Name: MyLb
      Scheme: internal
      Subnets: !Ref Subnets
      Tags:
        - Key: TagA
          Value: Unconditionally present
        - !If
          - AlwaysTrue
          - Key: TagB
            Value: Always present
          - !Ref 'AWS::NoValue'
        - !If
          - AlwaysFalse
          - Key: TagC
            Value: Never present
          - !Ref 'AWS::NoValue'

These conditions (which of course are hacks just to make the demo easy) will only create TagA and TagB:

[Image: the load balancer’s tags in the console, showing only TagA and TagB]

That’s it!

Before we wrap up, though, it’s worth highlighting a detail. There are two lists here: the list of tags and the list of arguments to the !If function:

- !If
  - AlwaysFalse
  - Key: TagC
    Value: Never present
  - !Ref 'AWS::NoValue'

The second element of the arguments list (line 3) is a map that will become the value of an element of the tags list (line 1). In long templates with complicated conditions I find this detail turns into a tripping point. If you’re getting errors, make sure your indentation is clean and you’re not mixing up the two lists.

Happy automating!

Adam


CloudFormation: Multiline Strings

Hello!

If you’re asking how to do multiline strings in CloudFormation’s JSON syntax: I recommend switching to YAML.

There are a bunch of ways to do multiline strings in YAML, so there are a bunch of ways to do them in CloudFormation. For inline code, write them as Literal Block Scalars with a vertical pipe (|), like in this lambda function:

Function:
  Type: AWS::Lambda::Function
  Properties:
    Code:
      ZipFile: |
        def handler(event, context):
            print('Huzzah!')
    Handler: index.handler
    Role: !GetAtt ExecutionRole.Arn
    Runtime: python3.7

This preserves newlines, so it’ll create a function whose code looks the same as it does in the CloudFormation template:

def handler(event, context):
    print('Huzzah!')

For parameter descriptions, any of the formats work. My stacks are usually managed by a CI/CD pipeline, so users read the descriptions from the actual template code when they’re choosing parameter values to pass to the pipeline. Any non-erroring syntax is fine; it’s just a code style choice. Even if you use the web console you’re still good; they all render the same:

Parameters:
  FlowScalar:
    Type: String
    Description:
      This is a parameter and I want to write a long description. That will
      be easier to read if I split it onto several lines.
  FoldedBlockScalar:
    Type: String
    Description: >
      This is a parameter and I want to write a long description. That will
      be easier to read if I split it onto several lines.
  LiteralBlockScalar:
    Type: String
    Description: |
      This is a parameter and I want to write a long description. That will
      be easier to read if I split it onto several lines.

[Image: all three parameter descriptions rendering identically in the console]

I often use Literal Block Scalars with a vertical pipe (|) for these just so I don’t have to remember which pattern to use in which case.
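
For reference, here’s how the two block scalar styles differ. This is plain YAML behavior, nothing CloudFormation-specific:

# Folded (>) turns the single newlines into spaces:
folded: >
  line one
  line two
# parses to "line one line two\n"

# Literal (|) keeps them:
literal: |
  line one
  line two
# parses to "line one\nline two\n"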

Happy automating!

Adam


CloudFormation Custom Resource: Complete Example

Hello!

It takes a few pieces to assemble a working CloudFormation Custom Resource. I like to start from a simple example and build up to what I need. Here’s the code I use as a starting point.

First, a few notes:

  • My custom resources are usually small, often only a few dozen lines (more than that is usually a signal that I’m implementing an anti-pattern). Because my resources are small, I just drop the code into CloudFormation’s ZipFile. That saves me from building a package and from porting my own version of cfn-response. In more complex cases you may want to expand your lambda function resource.
  • When I tested the code in this article, the current version of AWS’s cfn-response module threw this warning into logs: DeprecationWarning: You are using the put() function from 'botocore.vendored.requests'. The vendored version of requests in botocore was never really meant for use outside of botocore itself, and it was recently deprecated. AWS knows about this, but they haven’t updated their cfn-response code yet.
  • One of the most common problems I see in custom resource development is unhandled exceptions. Read more in my article here.
  • This example focuses on the minimum resources, permissions, and code for a healthy custom resource. I skipped some of the usual good practices like template descriptions and parameterized config.
  • I put the CloudWatch Logs Log Group in the same template as the custom resource so it was easy to see for this example. Usually I put them in a separate template because they don’t share the same lifecycle. I don’t want rollbacks or reprovisions to delete my logs.
  • The custom resource Type can have several different values. Check out the details here; a quick sketch follows this list.
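
Here’s the sketch. The Custom::Function name used in the template below is just a descriptive choice:

CustomResource:
  # Either the generic type name or a descriptive Custom::<AnyName> works.
  # Type: AWS::CloudFormation::CustomResource
  Type: Custom::Function
  Properties:
    ServiceToken: !GetAtt Function.Arn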

Now, the code:

---
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  Logs:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: /aws/lambda/custom-resource
      RetentionInDays: 30
 
  ExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Path: '/'
      Policies:
      - PolicyName: custom-resource-execution-role
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - logs:CreateLogStream
            - logs:DescribeLogGroup
            - logs:PutLogEvents
            Resource: !GetAtt Logs.Arn
 
  Function:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          # https://operatingops.com/2018/10/13/cloudformation-custom-resources-avoiding-the-two-hour-exception-timeout/
          import logging
          import cfnresponse
 
          def handler(event, context):
              logger = logging.getLogger()
              logger.setLevel(logging.INFO)
              try:
                  if event['RequestType'] == 'Delete':
                      logger.info('Deleted!')
                      cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
                      return
 
                  logger.info('It worked!')
                  cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
              except Exception:
                  logger.exception('Signaling failure to CloudFormation.')
                  cfnresponse.send(event, context, cfnresponse.FAILED, {})
      FunctionName: custom-resource
      Handler: index.handler
      Role: !GetAtt ExecutionRole.Arn
      Runtime: python3.7
      Timeout: 30
 
  CustomResource:
    Type: Custom::Function
    Properties:
      ServiceToken: !GetAtt Function.Arn

With a few parameters and a little extra code, this pattern almost always solves my problem. Hope it helps!

Happy automating,

Adam


Route 53: How To Alias Application Load Balancers

Hello!

This is a simple one but I kept getting stuck trying to figure it out. My brain was blocked on it. I’m sharing the pattern here in case you had the same problem.

All I needed was a Route 53 Hosted Zone with an alias record for an Application Load Balancer. I needed these defined in a CloudFormation template. Here’s how to do it:

---
AWSTemplateFormatVersion: '2010-09-09'
 
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
  Subnets:
    Type: List<AWS::EC2::Subnet::Id>
 
Resources:
  HostedZone:
    Type: AWS::Route53::HostedZone
    Properties:
      Name: demo-zone.internal
      VPCs:
        - VPCId: !Ref VpcId
          VPCRegion: !Ref 'AWS::Region'
 
  LoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      IpAddressType: ipv4
      Name: demo-lb
      Scheme: internal
      Subnets: !Ref Subnets
 
  LoadBalancerAlias:
    Type: AWS::Route53::RecordSet
    Properties:
      AliasTarget:
        DNSName: !GetAtt LoadBalancer.DNSName
        HostedZoneId: !GetAtt LoadBalancer.CanonicalHostedZoneID
      HostedZoneId: !Ref HostedZone
      Name: friendly-name.demo-zone.internal
      Type: A

There were two details that got me.

First, you need a different HostedZoneId in each of two places:

  • In the AliasTarget of the record. This is not the ID of the zone where you’re creating the record. All ALBs automatically get a DNS name, like internal-demo-lb-XXXXXXXXXX.us-west-2.elb.amazonaws.com. As far as I understand, you need the ID of the zone where that automatic record lives. AWS manages that zone, so it won’t appear anywhere in your infrastructure. You get its ID from a property on the ALB resource: !GetAtt LoadBalancer.CanonicalHostedZoneID.
  • In the root Properties of the record. This is the ID of the zone where you’re creating the record.

Second, you need an A record (type), not a CNAME record.

Route 53 alias records are an AWS-specific technology, but they’re still aliases. CNAMEs are the native DNS aliases, so I expected Route 53 aliases to be an extension of that type. Nope! Aliases of ALBs are A records.

I think the detail is that aliases point directly to the IP addresses of the load balancer; there’s no chained DNS resolution like there is with CNAMEs. That makes them effectively magic A records. The magic is that AWS keeps them up to date with the dynamically changing IPs of load balancers.

Happy automating!

Adam


CloudWatch Logs: Preventing Orphaned Log Groups

Hello!

When you need to publish logs to CloudWatch (e.g. from a lambda function), you need an IAM role with access to CloudWatch Logs. It’s tempting to use a simple policy like the one in the AWS docs. You might write a CloudFormation template like this:

# Don't use this!
 
AWSTemplateFormatVersion: '2010-09-09'
 
Resources:
  DemoRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Path: '/'
      Policies:
      - PolicyName: lambda-logs
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - logs:CreateLogGroup
            - logs:CreateLogStream
            - logs:DescribeLogStreams
            - logs:PutLogEvents
            Resource: arn:aws:logs:*:*:*
 
  DemoFunction:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          def handler(event, context):
              print('Demo!')
      FunctionName: demo-function
      Handler: index.handler
      Role: !GetAtt DemoRole.Arn
      Runtime: python3.7

Obviously, the role is too permissive: arn:aws:logs:*:*:*

But, there’s another problem: it grants logs:CreateLogGroup.

Here’s what happens:

  1. Launch a stack from this template
  2. Run demo-function
  3. Because we granted it permission, demo-function automatically creates /aws/lambda/demo-function log group in CloudWatch Logs
  4. Delete the stack
  5. CloudFormation doesn’t delete the /aws/lambda/demo-function log group

CloudFormation doesn’t know about the function’s log group because it didn’t create that group, so it doesn’t know anything needs to be deleted. Unless an operator deletes it manually, it’ll live in the account forever.

It seems like we can fix that by having CloudFormation create the log group:

DemoLogGroup:
  Type: AWS::Logs::LogGroup
  Properties:
    LogGroupName: /aws/lambda/demo-function
    RetentionInDays: 30

But, if the function still has logs:CreateLogGroup, I’ve seen race conditions where stack deletion removes the group before the lambda function, and the still-running function recreates the group before it gets deleted itself.

Plus, there aren’t any errors if you forget to define the group in CF. The stack launches. The lambda function runs. We even get logs; they’ll just be orphaned if we ever delete the stack.

That’s why it’s a problem to grant logs:CreateLogGroup. It allows lambda (or EC2 or whatever else is logging) to log into unmanaged groups.

All resources in AWS should be managed by CloudFormation (or terraform or whatever resource manager you use). Including log groups. So, you should never grant logs:CreateLogGroup except to your resource manager. Nothing else should need that permission.

And that’s the other reason: lambda doesn’t need logs:CreateLogGroup because it should be logging to groups that already exist. You shouldn’t grant permissions that aren’t needed.

Here’s the best practice: always manage your CloudWatch Logs groups and never grant permission to create those groups except to your resource manager.
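
Here’s a minimal sketch of the tightened statement, assuming a managed DemoLogGroup resource like the one above lives in the same template. It drops logs:CreateLogGroup and scopes the rest to the one managed group:

- Effect: Allow
  Action:
  - logs:CreateLogStream
  - logs:PutLogEvents
  Resource: !GetAtt DemoLogGroup.Arn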

Happy automating!

Adam


CloudFormation: Limited-Privilege IAM Policies With cfn-nag

Hello!

This article is about security testing in CloudFormation; if you’re looking for functional testing, check out this article.

When you write IAM policies, you should grant the smallest set of permissions that work. So, looking at this policy defined in a CloudFormation resource:

DemoPolicy:
  Type: AWS::IAM::Policy
  Properties:
    PolicyDocument:
      Version: '2012-10-17'
      Statement:
      - Effect: Allow
        Action:
          - ec2:DescribeInstances
        Resource: '*'
    PolicyName: star-required

The Resource: '*' looks wrong. It grants permission to call DescribeInstances on all resources. It would be better if that were limited to specific resources. It’s easy to see in a code snippet like this one, but in a full template that defines a dozen policies it’s easy to miss.

Enter cfn-nag, a Ruby gem designed to detect insecure patterns in CloudFormation templates; gem install cfn-nag should get you its cfn_nag_scan CLI. If we run it on the template that contains the resource above, we get a warning:

cfn_nag_scan --input-path ./demo_failing.yaml
------------------------------------------------------------
./demo_failing.yaml
------------------------------------------------------------------------------------------------------------------------
| WARN W12
|
| Resources: ["DemoPolicy"]
| Line Numbers: [5]
|
| IAM policy should not allow * resource
 
Failures count: 0
Warnings count: 1

Great! It auto-detects the '*' (star).

Here’s the problem: in this case star is required. Not all IAM permissions support resource-level restrictions. Some of them, like ec2:DescribeInstances, require star as their resource. You can confirm this in the IAM documentation. Like most AWS services, EC2 has an Actions, Resources, and Condition Keys for Amazon EC2 page. Its Actions Defined by Amazon EC2 section describes when star is required.
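
For contrast, here’s a hypothetical statement for an action that does support resource-level permissions; scoping it like this should satisfy cfn-nag without any suppression:

- Effect: Allow
  Action:
  - ec2:StartInstances
  Resource: !Sub 'arn:aws:ec2:${AWS::Region}:${AWS::AccountId}:instance/*'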

So, we need to silence the warning for this resource. Fortunately, cfn-nag supports this. Here’s the modified CF:

DemoPolicy:
  Type: AWS::IAM::Policy
  Metadata:
    cfn_nag:
      rules_to_suppress:
        - id: W12
          reason: |
            At time of writing, ec2:DescribeInstances required '*' as its resource.
            https://docs.aws.amazon.com/IAM/latest/UserGuide/list_amazonec2.html#amazonec2-actions-as-permissions
  Properties:
    PolicyDocument:
      Version: '2012-10-17'
      Statement:
      - Effect: Allow
        Action:
          - ec2:DescribeInstances
        Resource: '*'
    PolicyName: star-required

Now, cfn-nag runs without warning:

cfn_nag_scan --input-path ./demo_working.yaml 
------------------------------------------------------------
./demo_working.yaml
------------------------------------------------------------
Failures count: 0
Warnings count: 0

I recommend always including an explanation and a link to AWS documentation when you suppress a failure on a star resource. That way it’s clear to future readers (and auditors) why you granted all those permissions.

There’s a way to globally suppress warnings about star resources, but I recommend suppressing only on specific resources. This way, as you add new policies you’ll get warnings about cases where it is possible to limit the resources permissions apply to.

Happy securing!

Adam


CloudFormation Gotcha: Numbers Are Strings

Good day!

CloudFormation templates take parameters. There are several types. One of those types is Number. Here’s a super-simple demo template:

---
AWSTemplateFormatVersion: "2010-09-09"
 
Parameters:
  CoolNumber:
    Type: Number
 
Resources:
  CoolResource:
    Type: AWS::SSM::Parameter
    Properties:
      Name: CoolNumber
      Type: String
      Value: !Ref CoolNumber

It just passes the CF param into an SSM param. It’s a little funky that we can create an SSM param with type String from a CF param with type Number, but in CF all numbers become strings when you Ref them (see the link above). That’s weird but it doesn’t break anything. Things break when we try to create a stack from the template.

Here’s a params file. We set our number to an integer:

[
  {
    "ParameterKey": "CoolNumber",
    "ParameterValue": 1
  }
]

If we try to launch the template, it fails:

aws cloudformation create-stack \
  --template-body file://./cool_stack.yaml \
  --stack-name CoolStack \
  --parameters file://./params.json
 
Parameter validation failed:
Invalid type for parameter Parameters[0].ParameterValue, value: 1, type: <class 'int'>, valid types: <class 'str'>

But we declared a number! We gave it a number! Why does it want a string?

In CloudFormation, parameter types aren’t really types. They’re validators. CF expects to receive everything as a string, then it checks to see if that string looks like the parameter’s “type”. In our example, it wants a string that happens to contain characters that represent a number.

This works (the only difference is the new quote marks):

[
  {
    "ParameterKey": "CoolNumber",
    "ParameterValue": "1"
  }
]

(╯°□°)╯︵ ┻━┻

I’ve also seen this behavior from CI/CD pipelines that use SDK-driven plugins to launch stacks with params files.
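
Here’s a sketch of the same gotcha at the SDK level with boto3, which is what those plugins use under the hood (the validation error above is actually botocore’s). The stack and file names are hypothetical:

import boto3

cloudformation = boto3.client('cloudformation')
template_body = open('cool_stack.yaml').read()

# Raises botocore's ParamValidationError: ParameterValue must be a
# string, even though the template declares the param as Type: Number.
cloudformation.create_stack(
    StackName='CoolStack',
    TemplateBody=template_body,
    Parameters=[{'ParameterKey': 'CoolNumber', 'ParameterValue': 1}],
)

# Works: the number is passed as a string.
cloudformation.create_stack(
    StackName='CoolStack',
    TemplateBody=template_body,
    Parameters=[{'ParameterKey': 'CoolNumber', 'ParameterValue': '1'}],
)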

There are cases where you won’t get this error. I think some input mechanisms end up converting values to strings for you.

Text box in the console works:
[Image: the CoolNumber parameter entered as 1 in the console’s text box]

Inline params in CF’s CLI commands in bash work:

aws cloudformation deploy \
  --template-file cool_stack.yaml \
  --stack-name CoolStack \
  --parameter-overrides CoolNumber=1

aws cloudformation create-stack \
  --template-body file://./cool_stack.yaml \
  --stack-name CoolStack \
  --parameters ParameterKey=CoolNumber,ParameterValue=1

Specifically in CloudFormation, here’s what you should do: declare whatever parameter types you need, but always pass the values in as strings. Anywhere else I can think of, that would cause an error, so only do this for CloudFormation.

Keep your automation weird!

Adam
