CloudWatch JSON Logs: aws_request_id

Hello!

In a previous article, I showed how to make AWS lambda functions log JSON objects to CloudWatch (⬅️ start there if you’re new to JSON logs). The pattern in that post had a flaw: it didn’t pass the aws_request_id. I was writing small functions to glue together bits of deployment automation and I didn’t need it. It was easy to correlate logs just with timestamps. Not every case is that simple. Sometimes you need that ID. Fortunately, Paulie Pena IV gave me some tips on how to pass it through.

We can look up the aws_request_id in the lambda function’s context object. The python-json-logger library supports custom fields, so we can add the ID to our log events. To add a field to every event, we subclass jsonlogger.JsonFormatter, save the ID into the new field, and reference it in the format string. Here’s the code from before with that change:

import logging
from pythonjsonlogger import jsonlogger
 
class CustomJsonFormatter(jsonlogger.JsonFormatter):
    def __init__(self, *args, **kwargs):
        self.aws_request_id = kwargs.pop('aws_request_id')
        super().__init__(*args, **kwargs)
    def add_fields(self, log_record, record, message_dict):
        super().add_fields(log_record, record, message_dict)
        log_record['aws_request_id'] = self.aws_request_id
 
def setup_logging(log_level, aws_request_id):
    logger = logging.getLogger()
 
    # Testing showed lambda sets up one default handler. If there are more,
    # something has changed and we want to fail so an operator can investigate.
    assert len(logger.handlers) == 1
 
    logger.setLevel(log_level)
    json_handler = logging.StreamHandler()
    formatter = CustomJsonFormatter(
        fmt='%(aws_request_id)s %(asctime)s %(levelname)s %(name)s %(message)s',
        aws_request_id=aws_request_id
    )
    json_handler.setFormatter(formatter)
    logger.addHandler(json_handler)
    logger.removeHandler(logger.handlers[0])
 
def lambda_handler(event, context):
    setup_logging(logging.DEBUG, context.aws_request_id)
    logger = logging.getLogger()
    logger.info('Huzzah!')

Now our logs contain the aws_request_id:

[Screenshot: CloudWatch log events that include the aws_request_id field]
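Each event now looks roughly like this (the request ID and timestamp are illustrative):

{
    "aws_request_id": "52fdfc07-2182-454f-963f-5f0f9a621d72",
    "asctime": "2020-08-30 14:10:22,339",
    "levelname": "INFO",
    "name": "root",
    "message": "Huzzah!"
}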

Hope that helps,

Adam


You might also want to check out these related articles:

CodePipeline lambda Function: Complete Example

Hello!

It takes a few pieces to assemble a working lambda action for CodePipeline. I like to start from a simple example and build up to what I need. Here’s the code I use as a starting point.

First, a few notes:

  • My pipeline lambda functions are usually small, often only a few dozen lines (more than that is usually a signal that I’m implementing an anti-pattern). Because my functions are small, I just drop the code into CloudFormation’s ZipFile. That saves me from building a package. In more complex cases you may want to expand your lambda function resource.
  • One of the most common problems I see in lambda action development is unhandled exceptions. Read more in my article here.
  • This example focuses on the minimum resources, permissions, and code for a healthy lambda action. I skipped some of the usual good practices like template descriptions and parameterized config.
  • I put the S3 Bucket and CloudWatch Logs Log Group in the same template as the function so it was easy to see for this example. Usually I put them in a separate template because they don’t share the same lifecycle. I don’t want rollbacks or reprovisions to delete my artifacts or logs.
  • My demo function doesn’t do anything with the pipeline artifact; it just logs the user parameter string passed to it. When I’m using custom actions like these, it’s often for non-artifact tasks like passing notifications to outside systems, and this is all I need.
  • You’ll have to upload a file as my_artifact to the bucket this creates so the pipeline’s source action has something to pull. The bucket will be named for your account ID and region to prevent collisions with other people’s buckets (S3’s namespace is global to all AWS customers).

Now, the code:

---
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  PipelineBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketEncryption:
        ServerSideEncryptionConfiguration:
        - ServerSideEncryptionByDefault:
            SSEAlgorithm: AES256
      BucketName: !Sub '${AWS::AccountId}-${AWS::Region}-pipeline'
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
      VersioningConfiguration:
        Status: Enabled
 
  LambdaLogs:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: /aws/lambda/log-user-parameters
      RetentionInDays: 30
 
  LambdaRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Path: '/'
      Policies:
      - PolicyName: execution-role
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - logs:CreateLogStream
            - logs:DescribeLogGroup
            - logs:PutLogEvents
            Resource: !GetAtt LambdaLogs.Arn
          - Effect: Allow
            Action:
            - codepipeline:PutJobFailureResult
            - codepipeline:PutJobSuccessResult
            # When this was written, CP's IAM policies required '*' for job results permissions.
            # https://docs.aws.amazon.com/IAM/latest/UserGuide/list_awscodepipeline.html#awscodepipeline-actions-as-permissions
            Resource: '*'
 
  PipelineRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - codepipeline.amazonaws.com
          Action:
          - sts:AssumeRole
      Path: '/'
      Policies:
      - PolicyName: actions
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - s3:Get*
            - s3:Put*
            - s3:ListBucket
            Resource:
            - !Sub
              - ${BucketArn}/*
              - BucketArn: !GetAtt PipelineBucket.Arn
            - !GetAtt PipelineBucket.Arn
          - Effect: Allow
            Action:
            - lambda:InvokeFunction
            # ARN manually constructed to avoid circular dependencies in CloudFormation.
            Resource: !Sub 'arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:log-user-parameters'
 
  Function:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          # https://operatingops.com/2019/08/03/codepipeline-python-aws-lambda-functions-without-timeouts/
          import json
          import logging
          import boto3
 
          def lambda_handler(event, context):
              logger = logging.getLogger()
              logger.setLevel(logging.INFO)
              logger.debug(json.dumps(event))
 
              codepipeline = boto3.client('codepipeline')
              s3 = boto3.client('s3')
              job_id = event['CodePipeline.job']['id']
 
              try:
                  user_parameters = event['CodePipeline.job']['data']['actionConfiguration']['configuration']['UserParameters']
                  logger.info(f'User parameters: {user_parameters}')
                  response = codepipeline.put_job_success_result(jobId=job_id)
                  logger.debug(response)
              except Exception as error:
                  logger.exception(error)
                  response = codepipeline.put_job_failure_result(
                      jobId=job_id,
                      failureDetails={
                        'type': 'JobFailed',
                        'message': f'{error.__class__.__name__}: {str(error)}'
                      }
                  )
                  logger.debug(response)
      FunctionName: log-user-parameters
      Handler: index.lambda_handler
      Role: !GetAtt LambdaRole.Arn
      Runtime: python3.7
      Timeout: 30
 
  Pipeline:
    Type: AWS::CodePipeline::Pipeline
    Properties:
      ArtifactStore:
        Location: !Ref PipelineBucket
        Type: 'S3'
      Name: log-user-parameters
      RoleArn: !GetAtt PipelineRole.Arn
      Stages:
      - Name: Source
        Actions:
        - Name: Source
          ActionTypeId:
            Category: Source
            Owner: AWS
            Provider: 'S3'
            Version: '1'
          # Docs say 'Configuration' has to be JSON but you can use YAML.
          # CloudFormation will convert it to JSON.
          Configuration:
            S3Bucket: !Ref PipelineBucket
            S3ObjectKey: my_artifact
            PollForSourceChanges: false
          InputArtifacts: []
          OutputArtifacts:
          - Name: Artifact
          Region: !Ref 'AWS::Region'
      - Name: LogUserData
        Actions:
        - Name: LogUserData
          ActionTypeId:
            Category: Invoke
            Owner: AWS
            Provider: Lambda
            Version: '1'
          # Docs say 'Configuration' has to be JSON but you can use YAML.
          # CloudFormation will convert it to JSON.
          Configuration:
            FunctionName: !Ref Function
            UserParameters: Hello!
          InputArtifacts:
          - Name: Artifact
          Region: !Ref 'AWS::Region'
          RunOrder: 1

This creates a pipeline:

[Screenshot: the log-user-parameters pipeline in the CodePipeline console]

With an action that logs our user parameters string:

[Screenshot: the action’s output in CloudWatch Logs]
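If you’re not running this yourself, the line to look for in those logs is the function’s INFO message (the timestamp and request ID here are illustrative):

[INFO]  2019-08-03T16:12:04.123Z    1f9e9a70-4d19-4d7f-b77a-d27ba9b1d8d2    User parameters: Hello!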

With a few CloudFormation parameters and a little extra code in my function, this pattern almost always solves my problem. Hope it helps!

Happy automating,

Adam


You might also want to check out these related articles:

Lambda Gotcha: CloudWatch Logs Group Name

Hello!

Today’s post is a little “gotcha” that sometimes still gets me when I’m developing AWS lambda functions: if you want to stream the function’s logs to CloudWatch, the log group’s name has to follow a specific convention.

Suppose I’m creating a lambda function with this CloudFormation snippet:

Function:
  Type: AWS::Lambda::Function
  Properties:
    Code:
      ZipFile: |
        # https://operatingops.com/2018/10/13/cloudformation-custom-resources-avoiding-the-two-hour-exception-timeout/
        import logging
        import cfnresponse
 
        def handler(event, context):
            logger = logging.getLogger()
            logger.setLevel(logging.INFO)
            try:
                if event['RequestType'] == 'Delete':
                    logger.info('Deleted!')
                    cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
                    return
 
                logger.info('It worked!')
                cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
            except Exception:
                logger.exception('Signaling failure to CloudFormation.')
                cfnresponse.send(event, context, cfnresponse.FAILED, {})
    FunctionName: custom-resource
    Handler: index.handler
    Role: !GetAtt ExecutionRole.Arn
    Runtime: python3.7
    Timeout: 30

The key piece is this:

FunctionName: custom-resource

When AWS lambda sends logs to CloudWatch, it assumes the target log group has a name like this:

/aws/lambda/[function name]

This isn’t configurable. If your log group’s name doesn’t follow this convention you won’t get logs from your lambda function.

So, in our case, we need a log group called /aws/lambda/custom-resource. In CloudFormation, we could create it like this:

Logs:
  Type: AWS::Logs::LogGroup
  Properties:
    LogGroupName: /aws/lambda/custom-resource
    RetentionInDays: 30

The IAM role attached to your function of course still needs permissions to send logs, and there’s another gotcha there that can lead to orphaned log groups.

Hope this helps!

Adam


You might also want to check out these related articles:

CloudFormation Custom Resource: Complete Example

Hello!

It takes a few pieces to assemble a working CloudFormation Custom Resource. I like to start from a simple example and build up to what I need. Here’s the code I use as a starting point.

First, a few notes:

  • My custom resources are usually small, often only a few dozen lines (more than that is usually a signal that I’m implementing an anti-pattern). Because my resources are small, I just drop the code into CloudFormation’s ZipFile. That saves me from building a package and from porting my own version of cfn-response. In more complex cases you may want to expand your lambda function resource.
  • When I tested the code in this article, the current version of AWS’s cfn-response module threw this warning into logs: DeprecationWarning: You are using the put() function from 'botocore.vendored.requests'. The vendored version of requests in botocore was never really meant for use outside of botocore itself, and it was recently deprecated. AWS knows about this, but they haven’t updated their cfn-response code yet.
  • One of the most common problems I see in custom resource development is unhandled exceptions. Read more in my article here.
  • This example focuses on the minimum resources, permissions, and code for a healthy custom resource. I skipped some of the usual good practices like template descriptions and parameterized config.
  • I put the CloudWatch Logs Log Group in the same template as the custom resource so it was easy to see for this example. Usually I put them in a separate template because they don’t share the same lifecycle. I don’t want rollbacks or reprovisions to delete my logs.
  • The custom resource Type can have several different values. Check out the details here.

Now, the code:

---
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  Logs:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: /aws/lambda/custom-resource
      RetentionInDays: 30
 
  ExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Path: '/'
      Policies:
      - PolicyName: custom-resource-execution-role
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - logs:CreateLogStream
            - logs:DescribeLogGroup
            - logs:PutLogEvents
            Resource: !GetAtt Logs.Arn
 
  Function:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          # https://operatingops.com/2018/10/13/cloudformation-custom-resources-avoiding-the-two-hour-exception-timeout/
          import logging
          import cfnresponse
 
          def handler(event, context):
              logger = logging.getLogger()
              logger.setLevel(logging.INFO)
              try:
                  if event['RequestType'] == 'Delete':
                      logger.info('Deleted!')
                      cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
                      return
 
                  logger.info('It worked!')
                  cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
              except Exception:
                  logger.exception('Signaling failure to CloudFormation.')
                  cfnresponse.send(event, context, cfnresponse.FAILED, {})
      FunctionName: custom-resource
      Handler: index.handler
      Role: !GetAtt ExecutionRole.Arn
      Runtime: python3.7
      Timeout: 30
 
  CustomResource:
    Type: Custom::Function
    Properties:
      ServiceToken: !GetAtt Function.Arn

With a few parameters and a little extra code, this pattern almost always solves my problem. Hope it helps!

Happy automating,

Adam


You might also want to check out these related articles:

Don’t Import requests From botocore.vendored

Hello!

I’ve seen this anti-pattern scattered around plenty of DevOps code, especially in AWS lambda functions:

from botocore.vendored import requests

Vendoring libraries like requests into other libraries like botocore is arguably an anti-pattern in general, but reaching into botocore and importing its vendored copy in your own code is definitely one. Here are some of the reasons.

The maintainers may un-vendor it. This just happened! In newer versions of botocore you can still import requests, but all that’s left are some bits of the error handling system. If you upgrade botocore, your imports will still work but you’ll get errors when you try to use requests. Like this in version 1.13.15:

>>> from botocore.vendored import requests
>>> print(requests.get('https://google.com'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'botocore.vendored.requests' has no attribute 'get'

I saw an import like this running in an AWS lambda function a few days ago and it worked but was showing deprecation warnings. When AWS upgrades it’ll break entirely.

The vendored version may be outdated and have security vulnerabilities that have been patched in newer versions. Even if the maintainers know about a vulnerability, they may not upgrade the vendored copy if they’re not using the vulnerable module themselves. Unless you check every usage to ensure you’re not using vulnerable modules at the vendored version, you should assume you are.

The vendored package may have been customized. This shouldn’t happen, but I’ve seen it in other packages plenty of times. Once the code has been copied into the repo it’s super easy for someone to tweak it for convenience. Vendored packages may no longer behave how you expect.

Instead of relying on botocore’s vendoring, add requests to your dependency chain like usual. Add a line to your requirements.txt file or update your setup.py. To package code with dependencies for AWS lambda functions, check out this. To add dependencies to your setup.py check out this.
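As a minimal sketch of that change (the version pin and file name are just illustrations), requests becomes a normal dependency and the import comes straight from it:

requirements.txt:

requests>=2.22.0

your_module.py:

import requests  # installed from requirements.txt, not botocore's vendored copy

response = requests.get('https://example.com')
print(response.status_code)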

Happy automating!

Adam


You might also want to check out these related articles:

CloudWatch Logs: Preventing Orphaned Log Groups

Hello!

When you need to publish logs to CloudWatch (e.g. from a lambda function), you need an IAM role with access to CloudWatch. It’s tempting to use a simple policy like the one in the AWS docs. You might write a CloudFormation template like this:

# Don't use this!
 
AWSTemplateFormatVersion: '2010-09-09'
 
Resources:
  DemoRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Path: '/'
      Policies:
      - PolicyName: lambda-logs
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - logs:CreateLogGroup
            - logs:CreateLogStream
            - logs:DescribeLogStreams
            - logs:PutLogEvents
            Resource: arn:aws:logs:*:*:*
 
  DemoFunction:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          def handler(event, context):
              print('Demo!')
      FunctionName: demo-function
      Handler: index.handler
      Role: !GetAtt DemoRole.Arn
      Runtime: python3.7

Obviously, the role is too permissive: arn:aws:logs:*:*:*

But, there’s another problem: it grants logs:CreateLogGroup.

Here’s what happens:

  1. Launch a stack from this template
  2. Run demo-function
  3. Because we granted it permission, demo-function automatically creates the /aws/lambda/demo-function log group in CloudWatch Logs
  4. Delete the stack
  5. CloudFormation doesn’t delete the /aws/lambda/demo-function log group

CloudFormation doesn’t know about the function’s log group because it didn’t create that group, so it doesn’t know anything needs to be deleted. Unless an operator deletes it manually, it’ll live in the account forever.

It seems like we can fix that by having CloudFormation create the log group:

DemoLogGroup:
  Type: AWS::Logs::LogGroup
  Properties:
    LogGroupName: /aws/lambda/demo-function
    RetentionInDays: 30

But, if the function still has logs:CreateLogGroup, I’ve seen race conditions during stack deletion where the log group is deleted before the lambda function and the function recreates the group before it’s deleted itself.

Plus, there aren’t any errors if you forget to define the group in CloudFormation. The stack launches. The lambda function runs. We even get logs; they’ll just be orphaned if we ever delete the stack.

That’s why it’s a problem to grant logs:CreateLogGroup. It allows lambda (or EC2 or whatever else is logging) to log into unmanaged groups.

All resources in AWS should be managed by CloudFormation (or terraform or whatever resource manager you use). Including log groups. So, you should never grant logs:CreateLogGroup except to your resource manager. Nothing else should need that permission.

And that’s the other reason: lambda doesn’t need logs:CreateLogGroup because it should be logging to groups that already exist. You shouldn’t grant permissions that aren’t needed.

Here’s the best practice: always manage your CloudWatch Logs groups and never grant permission to create those groups except to your resource manager.
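Here’s a sketch of the demo template with that fix (it mirrors the execution roles from the earlier examples in this post): CloudFormation owns the log group, and the role’s policy only grants stream and event permissions scoped to that group, with no logs:CreateLogGroup.

DemoLogGroup:
  Type: AWS::Logs::LogGroup
  Properties:
    LogGroupName: /aws/lambda/demo-function
    RetentionInDays: 30
 
DemoRole:
  Type: AWS::IAM::Role
  Properties:
    # AssumeRolePolicyDocument and Path are the same as before.
    Policies:
    - PolicyName: lambda-logs
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Action:
          - logs:CreateLogStream
          - logs:DescribeLogGroup
          - logs:PutLogEvents
          Resource: !GetAtt DemoLogGroup.Arn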

Happy automating!

Adam


You might also want to check out these related articles:

Lambda: Building Python 3 Packages

Hello!

This is a guide for building lambda zip packages with pip dependencies in Python 3. It expands the AWS guide to:

  • Build your code if it’s either a single-file Python module or a pip-installable Python package (e.g. contains a setup.py).
  • Track pip dependencies for single-file Python modules in the standard requirements.txt file.
  • Show the file tree for each case and the lambda handler path that results from it.
  • Provide a copy/paste-able script for each case.

Notes:

  • Each package.sh can be run from any directory, but will always produce a function.zip in the same directory as itself.
  • Shell tested in the Apple Mac OS X 10.14.6 terminal.
  • Lambda packages tested in Python 3.7 functions.

Single-File Modules

File tree:

.
├── function.py  (defines a 'lambda_handler' function)
├── package.sh
└── requirements.txt

Lambda handler path: function.lambda_handler

package.sh:

#!/usr/bin/env bash
 
# Builds a lambda package from a single Python 3 module with pip dependencies.
# This is a modified version of the AWS packaging instructions:
# https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html#python-package-dependencies
 
# https://stackoverflow.com/a/246128
SCRIPT_DIRECTORY="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
 
pushd $SCRIPT_DIRECTORY > /dev/null
 
rm -rf .package function.zip
mkdir .package
 
pip install --target .package --requirement requirements.txt
 
pushd .package > /dev/null
zip --recurse-paths ${SCRIPT_DIRECTORY}/function.zip .
popd > /dev/null
 
zip --grow function.zip function.py
 
popd > /dev/null

Pip-Installable Python Packages

In lambda with Python 2 this used to be harder, but in lambda with Python 3 it’s as easy as a flat module. If you’re still on Python 2, it’s time to upgrade.

File tree:

.
├── function
│   ├── __init__.py
│   └── function.py  (defines a 'lambda_handler' function)
├── package.sh
└── setup.py

Lambda handler path: function.function.lambda_handler

package.sh:

#!/usr/bin/env bash
 
# Builds a lambda package from a pip-installable Python 3 package.
# This is a modified version of the AWS packaging instructions:
# https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html#python-package-dependencies
 
# https://stackoverflow.com/a/246128
SCRIPT_DIRECTORY="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
 
pushd $SCRIPT_DIRECTORY > /dev/null
 
rm -rf .package function.zip
mkdir .package
 
pip install --target .package .
 
pushd .package > /dev/null
zip --recurse-paths ${SCRIPT_DIRECTORY}/function.zip .
popd > /dev/null
 
popd > /dev/null

That’s it! In the latest versions of lambda it’s pretty easy; you just have to do a little fiddling to make the scripts easy to use and to figure out the handler paths.

Happy automating!

Adam


You might also want to check out these related articles:

CloudWatch Logs Structured as JSON with Python Lambda Functions

Hello!

If you’re setting up JSON logging in a script instead of a lambda function, check out this instead. The pattern is a little simpler.

Update 2020-08-30: Replaced the explanation of the missing aws_request_id with a link to a new article explaining how to pass that value.

Recently, I’ve been switching to logs structured as JSON. That means output like this:

{
    "asctime": "2019-09-20 13:55:13,604",
    "levelname": "INFO",
    "name": "root",
    "message": "Huzzah!"
}

Instead of this:

2019-09-21 15:52:43,240 INFO root Huzzah!

This way, your log processor can reference keys in a JSON object instead of splitting strings and hoping the split works right. Modern log processors are pretty good with delimited strings, but I’ve still seen them break because of things like unexpected whitespace or messages that happen to contain delimiter characters.
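A toy illustration of the difference (the log lines here are made up):

import json
 
# Splitting a delimited line: the message itself gets chopped into pieces.
delimited = '2019-09-21 15:52:43,240 INFO root User logged in from 10.0.0.1'
print(delimited.split(' ')[4])  # 'User' -- only a fragment of the message
 
# With JSON, the processor reads exactly the key it wants.
structured = '{"asctime": "2019-09-21 15:52:43,240", "levelname": "INFO", "name": "root", "message": "User logged in from 10.0.0.1"}'
print(json.loads(structured)['message'])  # 'User logged in from 10.0.0.1'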

CloudWatch Logs reads JSON out of the box. To send it JSON from a Python lambda function, we need to do two things:

  • Configure a JSON handler for the root logger. I use the python-json-logger library.
  • Remove lambda’s default log handler (to prevent duplicate messages).

First, the full example. Then the details. Here’s the code:

import logging
from pythonjsonlogger import jsonlogger
 
def setup_logging(log_level):
    logger = logging.getLogger()
 
    # Testing showed lambda sets up one default handler. If there are more,
    # something has changed and we want to fail so an operator can investigate.
    assert len(logger.handlers) == 1
 
    logger.setLevel(log_level)
    json_handler = logging.StreamHandler()
    formatter = jsonlogger.JsonFormatter(
        fmt='%(asctime)s %(levelname)s %(name)s %(message)s'
    )
    json_handler.setFormatter(formatter)
    logger.addHandler(json_handler)
    logger.removeHandler(logger.handlers[0])
 
def lambda_handler(event, context):
    setup_logging(logging.DEBUG)
    logger = logging.getLogger()
    logger.info('Huzzah!')

For more details on what that assert is and how it works, check out this.

By default, CloudWatch renders this function’s JSON messages on one line:

[Screenshot: collapsed log output, one JSON object per line]

But you can expand them into pretty-printed objects:

[Screenshot: expanded log output, pretty-printed JSON objects]

The expanded form is extra nice when you have large log messages or lots of fields.

Like we saw in my previous post, lambda’s out-of-box setup logs delimited strings. I figured out from testing that it does this with a handler on the root logger. If we added a JSON handler and did nothing else, we’d get duplicate log messages, one from each handler. We have to call removeHandler() on the default one.

To see how this works, let’s add debug messages to our setup function (and switch to logging.DEBUG level, of course):

...
def setup_logging(log_level):
    logger = logging.getLogger()
 
    # Testing showed lambda sets up one default handler. If there are more,
    # something has changed and we want to fail so an operator can investigate.
    assert len(logger.handlers) == 1
 
    logger.setLevel(log_level)
    json_handler = logging.StreamHandler()
    formatter = jsonlogger.JsonFormatter(
        fmt='%(asctime)s %(levelname)s %(name)s %(message)s'
    )
    json_handler.setFormatter(formatter)
    logger.debug(f'Number of handlers: {len(logger.handlers)}')
    logger.debug('Adding JSON handler.')
    logger.addHandler(json_handler)
    logger.debug(f'Number of handlers: {len(logger.handlers)}')
    logger.debug('This message logs twice because there are two handlers.')
    logger.debug('Removing original handler.')
    logger.removeHandler(logger.handlers[0])
    logger.debug('This message logs once because we removed the original handler.')
    ...

Now our output shows what’s happening with the handlers:

[Screenshot: debug output showing the handlers being added and removed]

The default handler logs an aws_request_id that our JSON handler doesn’t. My functions didn’t need that ID. If yours do, check out this article.

Like we also saw in another previous post, one of the reasons to use Python’s logging library is that it automatically captures messages from third-party tools like boto. We still get that:

[Screenshot: boto’s log messages captured as JSON]
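For example, boto’s credential discovery message comes through our JSON handler looking roughly like this (values illustrative):

{
    "asctime": "2019-09-20 13:55:12,988",
    "levelname": "INFO",
    "name": "botocore.credentials",
    "message": "Found credentials in environment variables."
}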

I recommend switching to logs structured as JSON. It can reduce failures and complexity in log processing, and the output is cleaner overall.

Happy logging!

Adam


You might also want to check out these related articles:

CodePipeline: Python AWS Lambda Functions Without Timeouts

Hello!

If you’re new to CodePipeline lambda actions check out this complete example first.

There’s a gotcha when writing CodePipeline lambda functions that’s easy to miss, and if you miss it your pipeline can get stuck in timeout loops that you can’t cancel. Here’s how to avoid that.

This article assumes you’re familiar with CodePipeline and lambda and that you’ve granted the right IAM permissions to both. You may also want to check out lambda function logging.

This is Python 3. Python 2 is out of support.

CodePipeline uses a callback pattern for running lambda functions: it invokes the function and then waits for that function to call back with either put_job_success_result or put_job_failure_result.

Here’s an empty lambda action:

import json
import logging
import boto3
 
def lambda_handler(event, context):
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    logger.debug(json.dumps(event))
 
    codepipeline = boto3.client('codepipeline')
    job_id = event['CodePipeline.job']['id']
 
    logger.info('Doing cool stuff!')
    response = codepipeline.put_job_success_result(jobId=job_id)
    logger.debug(response)

It’s a successful no-op:

[Screenshot: the no-op action succeeding in the CodePipeline console]

Now let’s add an exception:

import json
import logging
import boto3
 
def lambda_handler(event, context):
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    logger.debug(json.dumps(event))
 
    codepipeline = boto3.client('codepipeline')
    job_id = event['CodePipeline.job']['id']
 
    logger.info('Doing cool stuff!')
    raise ValueError('Fake error for testing!')
    response = codepipeline.put_job_success_result(jobId=job_id)
    logger.debug(response)

The log shows the exception, like we’d expect:

[Screenshot: the exception in the function’s CloudWatch logs]

But, the pipeline action takes 20 minutes to time out. The CodePipeline limits doc says lambda functions time out after 1 hour, and in my earlier tests that’s what happened to functions that never sent a result. Sadly, I didn’t think to keep screenshots back then. In my latest tests it took 20 minutes:

[Screenshot: the action consistently timing out after 20 minutes]

It doesn’t matter what the lambda function’s timeout is. Mine was set to 3 seconds. We’re hitting a timeout that’s internal to CodePipeline.

At least the action’s details link gives an error saying specifically that it didn’t receive a result:

[Screenshot: error detail reporting that no result was received]

There’s a workaround. You should usually only catch specific errors that you know how to handle. It’s an anti-pattern to use except Exception. But, in this case we need to guarantee that the callback always happens. In this one situation (not in general) we need to catch all exceptions:

import json
import logging
import boto3
 
def lambda_handler(event, context):
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    logger.debug(json.dumps(event))
 
    codepipeline = boto3.client('codepipeline')
    job_id = event['CodePipeline.job']['id']
 
    try:
        raise ValueError('This message will appear in the CodePipeline UI.')
        logger.info('Doing cool stuff!')
        response = codepipeline.put_job_success_result(jobId=job_id)
        logger.debug(response)
    except Exception as error:
        logger.exception(error)
        response = codepipeline.put_job_failure_result(
            jobId=job_id,
            failureDetails={
              'type': 'JobFailed',
              'message': f'{error.__class__.__name__}: {str(error)}'
            }
        )
        logger.debug(response)

(logger.exception(error) logs the exception and its stack trace. Even though we’re catching all errors, we shouldn’t let them pass silently.)

Now the failure will be visible to CodePipeline and the action won’t get stuck waiting.

The failureDetails message will appear in the CodePipeline UI. We send the exception message so it’s visible to operators:

[Screenshot: the exception message shown as the failure reason in the CodePipeline UI]

Of course, you’ll want to remove that ValueError. It’s just to demonstrate the handling.

You should use this pattern in every lambda action: catch all exceptions and return a JobFailed result to the pipeline. You can still catch more specific exceptions inside the catchall try/except, ones specific to the feature you’re implementing, but you need that catchall to ensure the result returns when the unexpected happens.

Happy automating!

Adam


You might also want to check out these related articles:

Lambda: boto3 CloudWatch Logs

Good morning!

If you’re writing a regular script (i.e. not a lambda function), check out this article.

This pattern outputs traditional delimited strings. If you want to upgrade that into output structured as JSON objects, check out this article.

For those custom cases that don’t fit into Terraform or CloudFormation, a little bit of Python and some boto3 in a lambda function can save you. Lambda captures the output of both print() and logging.Logger calls into CloudWatch so it’s easy to log information about what your code is doing. When things go wrong, though, I often find that just the output I wrote doesn’t give me enough to diagnose the problem. In those cases, it’s helpful to see the log output both for your code and boto3. Here’s how you do that.

Use the logging library. It’s a Python core library that provides standard features like timestamped prefixes and support for levels (e.g. INFO or DEBUG). For simple deployment helpers this is usually all you need:

logger = logging.getLogger()
logger.setLevel(logging.INFO)
logger.info('Message at the INFO level.')
logger.debug('Message at the DEBUG level.')

This sets the root logger (which sees all log messages) to the INFO level. Normally you’d have to configure the root logger, but lambda does that automatically (which is actually annoying if you need to change your formatter, but that’s for another post). Now, logger.info() calls will show up in the logs and logger.debug() calls won’t. If you increase the level to DEBUG you’ll see both.

Because logging is the standard Python way to handle log output, maintainers of libraries like boto3 use it throughout their code to show what the library is doing (and they’re usually smart about choosing what to log at each level). By setting a level on the root logger, you’re choosing which of your output to capture and which of boto3’s output to capture. Powerful when you’re diagnosing a failure.

Here’s a demo function to show how the output looks. You might notice that it puts the logger setup calls inside the handler even though the AWS docs tell you to put them under the import. Function calls made directly in modules (i.e. not inside functions declared within the module) are import side effects, and import side effects are an anti-pattern. I put the calls in the handler so they only run when the handler is called. This isn’t likely to matter much in a lambda function, but I like to stick to good patterns.

import logging
 
import boto3
 
def lambda_handler(event, context):
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
 
    client = boto3.client('sts')
    account_id = client.get_caller_identity()['Account']
 
    logger.info('Getting account ID...')
    logger.debug('Account ID: {}'.format(account_id))
    return account_id

This is the full output when run at the INFO level:

START RequestId: a61471fe-c3fd-11e8-9f43-bdb22e22a203 Version: $LATEST
[INFO]  2018-09-29T15:38:01.882Z    a61471fe-c3fd-11e8-9f43-bdb22e22a203    Found credentials in environment variables.
[INFO]  2018-09-29T15:38:02.83Z a61471fe-c3fd-11e8-9f43-bdb22e22a203    Starting new HTTPS connection (1): sts.amazonaws.com
[INFO]  2018-09-29T15:38:02.531Z    a61471fe-c3fd-11e8-9f43-bdb22e22a203    Getting account ID...
END RequestId: a61471fe-c3fd-11e8-9f43-bdb22e22a203
REPORT RequestId: a61471fe-c3fd-11e8-9f43-bdb22e22a203  Duration: 734.96 ms Billed Duration: 800 ms Memory Size: 128 MB Max Memory Used: 29 MB

When run at the DEBUG level it produces a ton of lines:

START RequestId: 9ea3bbef-c3fe-11e8-8eb1-730a799b5405 Version: $LATEST
[DEBUG] 2018-09-29T15:44:58.850Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable profile from defaults.
[DEBUG] 2018-09-29T15:44:58.880Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable config_file from defaults.
[DEBUG] 2018-09-29T15:44:58.881Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable credentials_file from defaults.
[DEBUG] 2018-09-29T15:44:58.881Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable data_path from defaults.
[DEBUG] 2018-09-29T15:44:58.881Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable region from environment with value 'us-west-2'.
[DEBUG] 2018-09-29T15:44:58.900Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable profile from defaults.
[DEBUG] 2018-09-29T15:44:58.900Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable ca_bundle from defaults.
[DEBUG] 2018-09-29T15:44:58.900Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable profile from defaults.
[DEBUG] 2018-09-29T15:44:58.900Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable api_versions from defaults.
[DEBUG] 2018-09-29T15:44:58.901Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable profile from defaults.
[DEBUG] 2018-09-29T15:44:58.901Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable credentials_file from defaults.
[DEBUG] 2018-09-29T15:44:58.901Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable config_file from defaults.
[DEBUG] 2018-09-29T15:44:58.901Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable profile from defaults.
[DEBUG] 2018-09-29T15:44:58.901Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable metadata_service_timeout from defaults.
[DEBUG] 2018-09-29T15:44:58.901Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable profile from defaults.
[DEBUG] 2018-09-29T15:44:58.901Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable metadata_service_num_attempts from defaults.
[DEBUG] 2018-09-29T15:44:58.942Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable profile from defaults.
[DEBUG] 2018-09-29T15:44:58.960Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Looking for credentials via: env
[INFO]  2018-09-29T15:44:58.960Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Found credentials in environment variables.
[DEBUG] 2018-09-29T15:44:58.961Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading JSON file: /var/runtime/botocore/data/endpoints.json
[DEBUG] 2018-09-29T15:44:59.1Z  9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading variable profile from defaults.
[DEBUG] 2018-09-29T15:44:59.20Z 9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Event choose-service-name: calling handler
[DEBUG] 2018-09-29T15:44:59.60Z 9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading JSON file: /var/runtime/botocore/data/sts/2011-06-15/service-2.json
[DEBUG] 2018-09-29T15:44:59.82Z 9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Event creating-client-class.sts: calling handler
[DEBUG] 2018-09-29T15:44:59.100Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    The s3 config key is not a dictionary type, ignoring its value of: None
[DEBUG] 2018-09-29T15:44:59.103Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Setting sts timeout as (60, 60)
[DEBUG] 2018-09-29T15:44:59.141Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Loading JSON file: /var/runtime/botocore/data/_retry.json
[DEBUG] 2018-09-29T15:44:59.141Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Registering retry handlers for service: sts
[DEBUG] 2018-09-29T15:44:59.160Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Event before-parameter-build.sts.GetCallerIdentity: calling handler
[DEBUG] 2018-09-29T15:44:59.161Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Making request for OperationModel(name=GetCallerIdentity) (verify_ssl=True) with params: {'url_path': '/', 'query_string': '', 'method': 'POST', 'headers': {'Content-Type': 'application/x-www-form-urlencoded; charset=utf-8', 'User-Agent': 'Boto3/1.7.74 Python/3.6.1 Linux/4.14.62-65.117.amzn1.x86_64 exec-env/AWS_Lambda_python3.6 Botocore/1.10.74'}, 'body': {'Action': 'GetCallerIdentity', 'Version': '2011-06-15'}, 'url': 'https://sts.amazonaws.com/', 'context': {'client_region': 'us-west-2', 'client_config': , 'has_streaming_input': False, 'auth_type': None}}
[DEBUG] 2018-09-29T15:44:59.161Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Event request-created.sts.GetCallerIdentity: calling handler
[DEBUG] 2018-09-29T15:44:59.161Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Event choose-signer.sts.GetCallerIdentity: calling handler
[DEBUG] 2018-09-29T15:44:59.162Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Calculating signature using v4 auth.
[DEBUG] 2018-09-29T15:44:59.180Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    CanonicalRequest:
POST
/
 
content-type:application/x-www-form-urlencoded; charset=utf-8
host:sts.amazonaws.com
x-amz-date:20180929T154459Z
x-amz-security-token:FQoGZXIvYXdzEKn//////////wEaDOOlIItIhtRakeAyfCLrAWPZXQJFkNrDZNa4Bny102eGKJ5KWD0F+ixFqZaW+A9mgadICpLRxBG4JGUzMtPTDeqxPoLT1qnS6bI/jVmXXUxjVPPMRiXdIlP+li0eFyB/xOK+PN/DOiByee0eu6bjQmkjoC3P5MREvxeanPY7hpgXNO52jSBPo8LMIdAcjCJxyRF7GHZjtZGAMARQWng6DJa9RAiIbxOmXpSbNGpABBVg/TUt8XMUT+p9Lm2Txi10P0ueu1n5rcuxJdBV8Jr/PUF3nZY+/k7MzOPCnzZNqVgpDAQbwby+AVIQcvVwaKsXePqubCqBTHxoh/Mo0ay+3QU=
 
content-type;host;x-amz-date;x-amz-security-token
ab821ae955788b0e33ebd34c208442ccfc2d406e2edc5e7a39bd6458fbb4f843
[DEBUG] 2018-09-29T15:44:59.181Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    StringToSign:
AWS4-HMAC-SHA256
20180929T154459Z
20180929/us-east-1/sts/aws4_request
7cf0af0e8f55fb1b9c0009104aa8f141097f00fea428ddf1654321e7054a920d
[DEBUG] 2018-09-29T15:44:59.181Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Signature:
c00de0a12c9ee0fce348df452f2833749b854915db58f8d106e3166545a70c43
[DEBUG] 2018-09-29T15:44:59.183Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Sending http request:
[INFO]  2018-09-29T15:44:59.201Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Starting new HTTPS connection (1): sts.amazonaws.com
[DEBUG] 2018-09-29T15:44:59.628Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    "POST / HTTP/1.1" 200 461
[DEBUG] 2018-09-29T15:44:59.628Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Response headers: {'x-amzn-requestid': '9f421e56-c3fe-11e8-b622-2d5da14a8dc9', 'content-type': 'text/xml', 'content-length': '461', 'date': 'Sat, 29 Sep 2018 15:44:58 GMT'}
[DEBUG] 2018-09-29T15:44:59.640Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Response body:
b'\n \n arn:aws:sts::268133297303:assumed-role/demo-boto3-logging/demo-boto3-logging\n AROAITTVSA67NGZPH2QZI:demo-boto3-logging\n 268133297303\n \n \n 9f421e56-c3fe-11e8-b622-2d5da14a8dc9\n \n\n'
[DEBUG] 2018-09-29T15:44:59.640Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Event needs-retry.sts.GetCallerIdentity: calling handler
[DEBUG] 2018-09-29T15:44:59.641Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    No retry needed.
[INFO]  2018-09-29T15:44:59.641Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Getting account ID...
[DEBUG] 2018-09-29T15:44:59.641Z    9ea3bbef-c3fe-11e8-8eb1-730a799b5405    Account ID: 268133297303
END RequestId: 9ea3bbef-c3fe-11e8-8eb1-730a799b5405
REPORT RequestId: 9ea3bbef-c3fe-11e8-8eb1-730a799b5405  Duration: 813.73 ms Billed Duration: 900 ms Memory Size: 128 MB Max Memory Used: 29 MB

boto3 can be very verbose in DEBUG so I recommend staying at INFO unless you’re actively troubleshooting.

Happy debugging!

Adam
