CloudFormation: functions like ImportValue and GetAtt inside a Sub

Hello!

In CloudFormation, I think !Sub is the best way to generate strings that contain dynamic values. It’s better to interpolate, like this:

!Sub 'This is security group ${SG} in account ${AWS::AccountId}!'

Than to join, like this:

!Join ['', ['This is security group ', !Ref SG, ' in account ', !Ref 'AWS::AccountId', '!']]

Both are common solutions, and ${SG} resolves to the same value as !Ref SG, but I think interpolation is the right tool here. Join is better for other situations (like creating a comma-delimited list from an array of strings).

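It gets tricky if you need to render in your security group’s ID. You can’t !Ref that; you have to use !GetAtt, and the plain ${SG} interpolation just does a !Ref behind the scenes. (Sub does accept ${SG.GroupId} for resource attributes, but it has no inline form for other functions like !ImportValue.) You could still join: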

!Join ['', ['This is security group ', !GetAtt SG.GroupId, ' in account ', !Ref 'AWS::AccountId', '!']]

But that’s just as ugly as before. And, it doesn’t scale to more complex cases. I inherited a template, once, that used join to assemble a JSON string with half a dozen key/value pairs and it was unreadable. I’ll omit an example like that because it would be unreadable, but, if you’re struggling with join, read on! There is a solution.

!Sub has a second syntax that accepts a mapping, and with that and a little YAML sugar you can use it to render beautifully formatted, multi-line strings using values from any intrinsic function:

Parameter:
  Type: AWS::SSM::Parameter
  Properties:
    Name: /Example/SecurityGroupsJson
    Type: String
    Value: !Sub
      - |
        {
          "SecurityGroup1Id": "${SecurityGroup1Id}",
          "SecurityGroup2Id": "${SecurityGroup2Id}"
        }
      - SecurityGroup1Id: !ImportValue Stack1:SecurityGroupId
        SecurityGroup2Id: !GetAtt SecurityGroup.GroupId

That renders a JSON-formatted string:

{
  "SecurityGroup1Id": "sg-xxxxxxxx",
  "SecurityGroup2Id": "sg-yyyyyyyy"
}
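If you want to sanity-check the template text before deploying, here’s a minimal sketch that renders it locally and confirms the result parses as JSON. It uses Python’s string.Template as a stand-in because it happens to share the ${Name} syntax. It is not Fn::Sub: it only handles simple names, so it won’t cope with things like ${AWS::AccountId}. The render helper and the sample values are placeholders I made up for illustration.

```python
import json
from string import Template

# The same text that is passed as the first item of the !Sub list.
TEMPLATE_TEXT = """{
  "SecurityGroup1Id": "${SecurityGroup1Id}",
  "SecurityGroup2Id": "${SecurityGroup2Id}"
}
"""

# Placeholder values standing in for what !ImportValue and !GetAtt would return.
SAMPLE_VALUES = {
    "SecurityGroup1Id": "sg-xxxxxxxx",
    "SecurityGroup2Id": "sg-yyyyyyyy",
}


def render(template_text, values):
    """Fill in every ${Name} from the map (hypothetical helper, not a CF API)."""
    # substitute() raises KeyError if a ${Name} has no matching key in the map.
    return Template(template_text).substitute(values)


rendered = render(TEMPLATE_TEXT, SAMPLE_VALUES)
parsed = json.loads(rendered)  # raises json.JSONDecodeError if the text is not valid JSON
print(parsed)
```

A missing quote or comma in the template text shows up here as a JSONDecodeError instead of as a broken parameter value after the stack deploys.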

Some details:

  • The ${} syntax now uses values from the map you pass in and from references to other resources in the template. I don’t know what happens if your map contains a key with the same name as a resource in that template; I’ve never tried it because that would be madness.
  • The vertical pipe is one of the YAML ways to make a multi-line string. There are a couple, but there’s an awesome website that explains them all.
  • Eagle-eyed readers may notice that the CF docs use a curly-bracket syntax for the map in the second item of the array, like this:
    !Sub
      - String
      - { Var1Name: Var1Value, Var2Name: Var2Value }
    

    That works, but my way is still valid CF YAML (tested ✅) and I think it’s a little cleaner.

Before I sign off, there’s one small piece of advice I’d like to leave you with: if this solves whatever problem you were facing, make sure you were solving the right problem. I haven’t seen a ton of cases in well-written CF templates where this was needed. It’s possible you’re trying to implement an anti-pattern and that’s why things got so complicated.

Happy automating!

Adam

Need more than just this article? We’re available to consult.

You might also want to check out these related articles:

CloudFormation Custom Resources: Avoiding the Two Hour Exception Timeout

If you’re new to custom resources check out this complete example first.

There’s a gotcha when writing CloudFormation Custom Resources that’s easy to miss, and if you miss it your stack can get stuck, ignoring its timeout setting. It’ll fail on its own after an hour, but if it tries to roll back you have to wait a second hour. If the resource is defined in a nested stack, it’ll retry the rollback three times, adding even more hours to the delay. Here’s how to avoid this.

This post assumes you’re already working with Custom Resources and that yours are backed by Lambda.

Here’s an empty custom resource:

import logging
import cfnresponse

def handler(event, context):
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    if event['RequestType'] == 'Delete':
        logger.info('Deleted!')
        cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
        return

    logger.info('It worked!')
    cfnresponse.send(event, context, cfnresponse.SUCCESS, {})

It’s a successful no-op:

[Screenshot: SuccessfulNoOp]

Now let’s add an exception:

import logging
import cfnresponse

def handler(event, context):
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    if event['RequestType'] == 'Delete':
        logger.info('Deleted!')
        cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
        return

    raise Exception
    logger.info('It worked!')
    cfnresponse.send(event, context, cfnresponse.SUCCESS, {})

We can see the exception in the logs:

[Screenshot: ExceptionThreeRetries]

But, then the stack gets stuck because the cfnresponse callback never happened and CF doesn’t know there was a problem:

[Screenshot: FailureTimeouts]

It took exactly an hour to fail, which suggests CF hit some internal, fallback timeout. My stack timeout was set to five minutes. We can see it retry the lambda function once a minute for three minutes, but then it never tries again in the remaining 57 minutes. I got the same delays in reverse when it tried to roll back (which is really just another update to the previous state). And, since the rollback failed, I had to manually edit the lambda function code and remove the exception to get it to finish rolling back.

Maybe this is a bug? Either way, there’s a workaround.

You should usually only catch specific errors that you know how to handle. It’s an anti-pattern to use except Exception. But, in this case we need to guarantee that the callback always happens. In this one situation (not in general) we need to catch all exceptions:

import logging
import cfnresponse

def handler(event, context):
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    try:
        if event['RequestType'] == 'Delete':
            logger.info('Deleted!')
            cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
            return

        raise Exception
        logger.info('It worked!')
        cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
    except Exception:
        logger.exception('Signaling failure to CloudFormation.')
        cfnresponse.send(event, context, cfnresponse.FAILED, {})

(logger.exception logs the message along with the current exception and its stack trace. Even though we’re catching all errors, we shouldn’t let them pass silently.)

Now, the failure is visible to CF and it doesn’t wait:

[Screenshot: ExceptionHandled]

You should use this pattern in every Custom Resource: catch all exceptions and return a FAILED result to CF. You can still catch more specific exceptions inside the catchall try/except, ones specific to the feature you’re implementing, but you need that catchall to ensure the result returns when the unexpected happens.
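If you have several custom resources, one way to avoid repeating the catchall is to wrap it in a decorator. Here’s a rough sketch of that idea, not something from the cfnresponse module itself. The names guaranteed_response and fake_send are mine, and the handler’s contract changes slightly: it returns its data dict instead of calling send itself. In a real function you’d pass cfnresponse.send; the fake is only there so the sketch runs outside Lambda.

```python
import functools
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# The same string values that cfnresponse.SUCCESS and cfnresponse.FAILED hold.
SUCCESS = 'SUCCESS'
FAILED = 'FAILED'


def guaranteed_response(send):
    """Build a decorator that always reports a result through send()."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(event, context):
            try:
                data = func(event, context) or {}
            except Exception:
                # Catchall: CloudFormation must hear back even when the
                # handler fails in a way nobody anticipated.
                logger.exception('Signaling failure to CloudFormation.')
                send(event, context, FAILED, {})
            else:
                send(event, context, SUCCESS, data)
        return wrapper
    return decorator


# Stand-in for cfnresponse.send so this can run locally (assumption:
# it is called with the same first four positional arguments).
calls = []


def fake_send(event, context, status, data):
    calls.append(status)


@guaranteed_response(fake_send)
def handler(event, context):
    if event['RequestType'] == 'Delete':
        logger.info('Deleted!')
        return {}
    raise Exception  # simulate the unexpected failure from earlier


handler({'RequestType': 'Create'}, None)
handler({'RequestType': 'Delete'}, None)
print(calls)
```

The Create call raises, so it reports FAILED; the Delete call returns normally, so it reports SUCCESS. Either way the callback happens.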

Happy automating!

Adam

3 Tools to Validate CloudFormation

Hello!

This article is about functional testing in CloudFormation. If you’re looking for security testing, check out this.

I run three tools before applying CF templates. Here they are!

#1 AWS CLI’s validator

This is the native tool. It’s ok. It’s really only a syntax checker; there are plenty of errors you won’t see until you apply a template to a stack. Still, it’s fast and catches some things.

aws cloudformation validate-template --template-body file://./my_template.yaml

Notes:

  • The CLI has to be configured with access keys or it won’t run the validator.
  • If the template is JSON, this will ignore some requirements (e.g. it’ll allow trailing commas). However, the CF service ignores the same things.

#2 cfn-lint

cfn-lint is, like you’d expect, a linter for CloudFormation. I only started using it recently, but so far it’s pretty helpful.

cfn-lint my_template.yaml

Notes:

  • Before cfn-lint came out, I was using cfn-nag. I switched for two reasons:
    • Cfn-nag is a security testing tool, not a validator in general. Check out my article on using it to help write limited-privilege IAM policies.
    • It was a Ruby gem, so you needed a whole extra dependency chain (and ideally a tool like RVM) to install it. Cfn-lint is a Python app available on PyPI, like the AWS CLI and its validator. Less tooling to maintain.

#3 Python’s JSON library

In general you should only write CloudFormation templates in YAML, but sometimes I’m stuck with legacy JSON ones that need to be maintained.

Because the AWS CLI validator ignores some JSON requirements, I like to pass JSON templates through Python’s parser to make sure they’re valid. In the past, I’ve had to do things like load and search templates for unused parameters, etc. That’s not ideal but it’s happened a couple times while doing cleanup and refactoring of legacy code. It’s easier if the JSON is valid JSON.

It’s fiddly to run this in a shell script. I do it with a heredoc so I don’t have to write multiple scripts to the filesystem:

python - <<END
import json
with open('my_template.json') as f:
    json.load(f)
END
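To show the kind of thing the strict parser rejects, here’s a small sketch that checks two inline strings, one valid and one with a trailing comma. The check_json helper and the sample strings are mine, for illustration only; in the heredoc above you get the same error as an uncaught exception and a non-zero exit code.

```python
import json


def check_json(text):
    """Return None if text is valid JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        # lineno, colno, and msg are attributes of JSONDecodeError.
        return f'line {error.lineno}, column {error.colno}: {error.msg}'
    return None


GOOD = '{"Parameters": {"VpcId": {"Type": "String"}}}'
BAD = '{"Parameters": {"VpcId": {"Type": "String"},}}'  # trailing comma

print(check_json(GOOD))
print(check_json(BAD))
```

The first call prints None. The second prints the position of the problem; the exact message wording depends on your Python version.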

Notes:

  • I use Python for this because it’s a dependency of the AWS CLI so I know it’s already installed. You could use jq or another tool, though.
  • I don’t do the YAML equivalent of this because it errors on CF-specific syntax like !Ref.
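Once the JSON is valid, the unused-parameter search I mentioned can be sketched roughly like this. It’s a simplified illustration, not a complete tool: it assumes parameters are only referenced through Ref and through ${Name} inside Fn::Sub, and the function names and the sample template are made up for the example.

```python
import json
import re

# Matches ${Name} in Fn::Sub text but skips the ${!Literal} escape form.
SUB_VARIABLE = re.compile(r'\$\{([^!}][^}]*)\}')


def find_references(node, found):
    """Walk the template and collect every name used by Ref or Fn::Sub."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == 'Ref' and isinstance(value, str):
                found.add(value)
            elif key == 'Fn::Sub':
                # Fn::Sub is either a string or a [string, map] list.
                text = value if isinstance(value, str) else value[0]
                found.update(name.strip() for name in SUB_VARIABLE.findall(text))
                find_references(value, found)  # the map may hold more Refs
            else:
                find_references(value, found)
    elif isinstance(node, list):
        for item in node:
            find_references(item, found)
    return found


def unused_parameters(template):
    """Return the sorted names of parameters nothing refers to."""
    declared = set(template.get('Parameters', {}))
    return sorted(declared - find_references(template, set()))


SAMPLE = """
{
  "Parameters": {
    "VpcId": {"Type": "String"},
    "Env": {"Type": "String"},
    "Unused": {"Type": "String"}
  },
  "Resources": {
    "SG": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": {"Fn::Sub": "${Env} group"},
        "VpcId": {"Ref": "VpcId"}
      }
    }
  }
}
"""

print(unused_parameters(json.loads(SAMPLE)))
```

For a real template you’d swap the inline sample for json.load on the file, the same way the heredoc does.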

Happy automating!

Adam
