A Checklist for Submitting Pull Requests

Hello!

Reviewing code is hard, especially because reviewers tend to inherit some responsibility for problems the code causes later. That can lead to churn while they try to develop confidence that new submissions are ready to merge.

I submit a lot of code for review, so I’ve been through a lot of that churn. Over the years I’ve found a few things that help make it easier for my reviewers to develop confidence in my submissions, so I decided to write a checklist. ✔️

The code I write lives in diverse repos governed by diverse requirements. A lot of the items in my checklist are there to help make sure I don’t mix up the issues I’m working on or the requirements of the repos I’m working in.

This isn’t a guide on writing good code. You can spend a lifetime on that topic. This is a quick checklist I use to avoid common mistakes.

This is written for Pull Requests submitted in git repos hosted on GitHub, but most of its steps are portable to other platforms (e.g. Perforce). It assumes common project features, like a contributing guide. Adjust as needed.

The Checklist

Immediately before submitting:

  1. Reread the issue.
  2. Merge the latest changes from the target branch (e.g. master).
  3. Reread the diff line by line.
  4. Rerun all tests. If the project doesn’t have automated tests, you can still:
    • Run static analysis tools on every file you changed.
    • Manually exercise new functionality.
    • Manually exercise existing functionality to make sure it hasn’t changed.
  5. Check if any documentation needs to be updated to reflect your changes.
  6. Check the rendering of any markup files (e.g. README.md) in the GitHub UI.
    • There are remarkable differences in how markup files render on different platforms, so it’s important to check them in the UI where they’ll live.
  7. Reread the project’s contributing guide.
  8. Write a description that:
    1. Links to the issue it addresses.
    2. Gives a plain English summary of the change.
    3. Explains decisions you had to make. Like:
      • Why you didn’t clean up that one piece of messy code.
      • How you chose the libraries you used.
      • Why you expanded an existing module instead of writing a new one.
      • How you chose the directory and file names you did.
      • Why you put your changes in this repo, instead of that other one.
    4. Lists all the tests you ran. Include relevant output or screenshots from manual tests.
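Steps 2 and 3 above boil down to a couple of git commands. Here's a sketch in a throwaway repo so it's self-contained; in a real PR you'd fetch and merge your actual target branch (e.g. origin/master), and the branch and commit names here are made up:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"

# Stand in for the target branch ("master" here; adjust for your project).
git init -q
git -c user.name=adam -c user.email=adam@example.com \
    commit -q --allow-empty -m "upstream work"
git branch -M master

# Your feature branch with the change under review.
git checkout -q -b my-feature
git -c user.name=adam -c user.email=adam@example.com \
    commit -q --allow-empty -m "my change"

# Step 2: merge the latest target-branch changes into your branch.
git merge -q master

# Step 3: reread the diff against the target branch, line by line.
git diff master...my-feature
git log --oneline master..my-feature
```

The merge surfaces conflicts on your machine instead of in the review, and the diff against the target branch is exactly what your reviewer will read.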

There’s no perfect way to submit code for review. That’s why we still need humans to do it. The creativity and diligence of the engineer doing the work are more important than this checklist. Still, I’ve found that these reminders help me get code through review more easily.

Happy contributing!

Adam

Need more than just this article? We’re available to consult.

You might also want to check out these related articles:

How to Grep in PowerShell

Hello!

In oldschool Linux shells, you search files for a string with grep. You’re probably used to commands like this (example results from an OSS repo):

grep -r things .
./terraform.tfstate.backup:              "./count_things.py"
./count_things.py:def count_things(query):
./count_things.py:    count_things()
./terraform.tf:  program = ["python", "${path.module}/count_things.py"]

It outputs strings that concatenate the filename and the matching line. You can pipe those into awk or whatever other command to process them. Standard stuff.
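For example, here's a sketch that pulls just the filenames out of that grep-style output with awk. The sample lines are hard-coded so it's self-contained; normally you'd pipe grep -r straight in:

```shell
# Hypothetical grep-style "file:matching line" output.
matches='./count_things.py:def count_things(query):
./terraform.tf:  program = ["python", "count_things.py"]'

# Split each line on ":" and keep the filename field.
files=$(printf '%s\n' "$matches" | awk -F: '{print $1}' | sort -u)
echo "$files"
```

This kind of string surgery is exactly what PowerShell replaces with object properties.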

You can achieve the same results in PowerShell, but it’s pretty different. Here’s the basic command:

Get-ChildItem -Recurse | Select-String 'things'

count_things.py:7:def count_things(query):
count_things.py:17:    count_things()
terraform.tf:6:  program = ["python", "${path.module}/count_things.py"]
terraform.tfstate.backup:25:              "./count_things.py"

This part is similar. Get-ChildItem recurses through the filesystem and passes the results to Select-String, which searches those files for the string things. The output looks the same. File on the left, matching line on the right. That’s just friendly formatting, though. Really what you’re getting is an array of objects that each represent one match. Posh summarizes that array with formatting that’s familiar, but actually processing these results is completely different.

We could parse out details the Linux way by piping into Out-String to convert the results into strings, splitting on :, and so on, but that’s not idiomatic PowerShell. Posh is object-oriented, so instead of manipulating strings we can just process whichever properties contain the information we’re searching for.

First, we need to know what properties are available:

Get-ChildItem -Recurse | Select-String 'things' | Get-Member

   TypeName: Microsoft.PowerShell.Commands.MatchInfo

Name               MemberType Definition
----               ---------- ----------
Equals             Method     bool Equals(System.Object obj)
GetHashCode        Method     int GetHashCode()
GetType            Method     type GetType()
RelativePath       Method     string RelativePath(string directory)
ToEmphasizedString Method     string ToEmphasizedString(string directory)
ToString           Method     string ToString(), string ToString(string directory)
Context            Property   Microsoft.PowerShell.Commands.MatchInfoContext Context {get;set;}
Filename           Property   string Filename {get;}
IgnoreCase         Property   bool IgnoreCase {get;set;}
Line               Property   string Line {get;set;}
LineNumber         Property   int LineNumber {get;set;}
Matches            Property   System.Text.RegularExpressions.Match[] Matches {get;set;}
Path               Property   string Path {get;set;}
Pattern            Property   string Pattern {get;set;}

Get-Member tells us the properties of the MatchInfo objects we piped into it. Now we can process them however we need.

Select One Property

If we only want the matched lines, not all the other info, we can select just the Line property with Select-Object.

Get-ChildItem -Recurse | Select-String 'things' | Select-Object 'Line'

Line
----
def count_things(query):
    count_things()
  program = ["python", "${path.module}/count_things.py"]
              "./count_things.py"

Sort Results

We can sort results by the content of a property with Sort-Object.

Get-ChildItem -Recurse | Select-String 'things' | Sort-Object -Property 'Line'

terraform.tfstate.backup:25:              "./count_things.py"
count_things.py:17:    count_things()
terraform.tf:6:  program = ["python", "${path.module}/count_things.py"]
count_things.py:7:def count_things(query):

Add More Filters

Often, I search for a basic pattern like ‘things’ and then chain in Where-Object to filter down to more specific results. It can be easier to chain matches as I go than to write a complex match pattern at the start.

Get-ChildItem -Recurse | Select-String 'things' | Where-Object 'Line' -Match 'def'

count_things.py:7:def count_things(query):

We’re not limited to filters on the matched text, either:

Get-ChildItem -Recurse | Select-String 'things' | Where-Object 'Filename' -Match 'terraform'

terraform.tf:6:  program = ["python", "${path.module}/count_things.py"]
terraform.tfstate.backup:25:              "./count_things.py"

There are tons of things you can do. The main detail to remember is that you need Get-Member to tell you what properties are available, then you can use any Posh command to process those properties.

Enjoy freedom from strings!

Adam


Tox: Testing Multiple Python Versions with Pyenv

Hello!

I use Python’s tox to orchestrate a lot of my tests. It lets you set a list of versions in a tox.ini file (in the same directory as your setup.py), like this:

[tox]
envlist = py37, py38

[testenv]
allowlist_externals = echo
commands = echo "success"

Then you can run the tox command; it’ll create a venv for each version and run your tests in each of those environments. It’s an easy way to ensure your code works across all the versions of Python you want to support.

But if I install tox into a 3.8 environment and run the tox command in the directory containing the tox.ini above, I get this:

tox
GLOB sdist-make: /Users/adam/Local/fiddle/setup.py
py37 create: /Users/adam/Local/fiddle/.tox/py37
ERROR: InterpreterNotFound: python3.7
py38 create: /Users/adam/Local/fiddle/.tox/py38
py38 inst: /Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py38 installed: example @ file:///Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py38 run-test-pre: PYTHONHASHSEED='2325607949'
py38 run-test: commands[0] | echo success
success
___________________________________________________________________________ summary ____________________________________________________________________________
ERROR:  py37: InterpreterNotFound: python3.7
  py38: commands succeeded

It found the 3.8 interpreter I ran it with, but it couldn’t find 3.7.

pyenv can get you past this. It’s a utility for installing and switching between multiple Python versions. I use it on OS X (⬅️ instructions to get set up, if you’re not already). Here’s how it looks when I have Python 3.6, 3.7, and 3.8 installed, and I’m using 3.8:

pyenv versions
  system
  3.6.11
  3.7.9
* 3.8.5 (set by /Users/adam/.pyenv/version)

Just having those versions installed isn’t enough, though. You still get the error from tox about missing versions. You have to specifically enable each version:

pyenv local 3.8.5 3.7.9
pyenv versions
  system
  3.6.11
* 3.7.9 (set by /Users/adam/Local/fiddle/.python-version)
* 3.8.5 (set by /Users/adam/Local/fiddle/.python-version)

This will create a .python-version file in the current directory that sets your Python versions. pyenv will read that file whenever you’re in that directory. You can also set versions that’ll be picked up in any folder with the pyenv global command.
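For reference, the file is just a plain-text list of versions, one per line. After the pyenv local command above, .python-version should contain:

```
3.8.5
3.7.9
```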

Now, tox will pick up both versions:

tox
GLOB sdist-make: /Users/adam/Local/fiddle/setup.py
py37 inst-nodeps: /Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py37 installed: example @ file:///Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py37 run-test-pre: PYTHONHASHSEED='1664367937'
py37 run-test: commands[0] | echo success
success
py38 inst-nodeps: /Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py38 installed: example @ file:///Users/adam/Local/fiddle/.tox/.tmp/package/1/example-0.0.0.zip
py38 run-test-pre: PYTHONHASHSEED='1664367937'
py38 run-test: commands[0] | echo success
success
___________________________________________________________________________ summary ____________________________________________________________________________
  py37: commands succeeded
  py38: commands succeeded
  congratulations :)

That’s it! Now you can run your tests in as many versions of Python as you need.

Happy testing,

Adam


PowerShell on OS X: Setting Your Path Variable

Hello!

There are two tricky little problems when setting your path variable in PowerShell. Here’s how to get past them.

First, lots of guides show things like this:

$Env:Path += "PATH_STRING"

Which works on Windows but won’t work on OS X. The variable name has to be all-caps:

$Env:PATH += "PATH_STRING"

Next, the separator between path elements on Windows is ;, but on OS X it’s :. Swap them and you should be good to go:

# Windows-only, won't work:
# $Env:PATH += ";/Users/adam/opt/bin"

# Works on OS X:
$Env:PATH += ":/Users/adam/opt/bin"

Small details, but they were remarkably fiddly to figure out the first time I ran into them. Lots of people use Posh on Windows, so lots of guides and docs won’t work on Mac. You may find similar compatibility problems in scripts, too. Hopefully this saves you from some frustration.

Happy scripting,

Adam


PowerShell: Python venv Missing Activate.ps1

Hello!

Ran into a weird problem this week: I created a Python 3.7.9 venv, but I couldn’t activate it in PowerShell (my shell of choice). The Activate.ps1 script was missing.

The core docs for 3.7 list VENV/Scripts/Activate.ps1 as the command to activate venvs in PowerShell (which seemed odd because I’m used to VENV/bin/activate from Bash, but whatever). The Scripts directory didn’t even exist:

gci ./test_venv_379/

    Directory: /Users/adam/Local/fiddle/test_venv_379

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d----          10/22/2020  9:28 AM                bin
d----          10/22/2020  9:28 AM                include
d----          10/22/2020  9:28 AM                lib
-----          10/22/2020  9:28 AM             98 pyvenv.cfg

I recreated the venv and got the same results. I made new venvs with 3.7.8 and 3.6.11, and again the same results. When I made a 3.8.5 venv, though, it had a VENV/bin/Activate.ps1 (which works great).

gci ./test_venv_385/bin

    Directory: /Users/adam/Local/fiddle/test_venv_385/bin

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-----          10/22/2020  9:13 AM           2236 activate
-----          10/22/2020  9:13 AM           1288 activate.csh
-----          10/22/2020  9:13 AM           2440 activate.fish
-----          10/22/2020  9:13 AM           8834 Activate.ps1
-----          10/22/2020  9:13 AM            263 easy_install
...

Then I read the docs for 3.8: VENV/Scripts/Activate.ps1 is the PowerShell activation script, but VENV/bin/Activate.ps1 is the PowerShell Core activation script. The 3.7 and 3.6 docs don’t make this distinction, which I’d bet is because PowerShell Core wasn’t supported until 3.8. I’m running Posh on Mac, so of course I’m running Posh Core (only Core supports Mac and Linux).

I suspect the VENV/Scripts/Activate.ps1 file was missing from those venvs because Python detected my shell was Posh Core, which it didn’t support. That would also explain why my 3.8 venv only had a VENV/bin/Activate.ps1 file, the file needed by Posh Core.

Anyway, if you upgrade to 3.8 (I used 3.8.5) you should be good to go.

If you can’t upgrade: upgrade! But if you really, really can’t, you can still use a 3.7 venv in Posh Core. Just call the executables by path instead of activating:

./test_venv_379/bin/python --version
Python 3.7.9

Hope that gets you past the problem!

Adam


Terratest Good Practices: Table-Driven Tests

Hello!

Terratest is a common way to run integration tests against terraform modules. I use it on many of the modules I develop. If you haven’t used it before, check out its quickstart for an example of how it works.

For simple cases, the pattern in that quickstart is all you need. But, bigger modules mean more tests and pretty soon you can end up swimming in all the cases you have to define. Go has a tool to help: table-driven tests. Here’s what you need to get them set up for terratest (Dave Cheney also has a great article on them if you want to go deeper).

First, let’s look at a couple simple tests that aren’t table-driven:

package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestOutputsExample(t *testing.T) {
	terraformOptions := &terraform.Options{
		TerraformDir: ".",
	}
	defer terraform.Destroy(t, terraformOptions)
	terraform.InitAndApply(t, terraformOptions)

	one := terraform.Output(t, terraformOptions, "One")
	assert.Equal(t, "First.", one)
	two := terraform.Output(t, terraformOptions, "Two")
	assert.Equal(t, "Second.", two)
}

Easy. Just repeat the calls to terraform.Output and assert.Equal for each test and assert it’s what you expect. Not a problem, unless you have dozens or hundreds of tests. Then you end up with a lot of duplication.

You can de-duplicate the repeated calls by defining your test cases in a slice of structs (the “table”) and then looping over the cases. Like this:

package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestOutputsTableDrivenExample(t *testing.T) {
	terraformOptions := &terraform.Options{
		TerraformDir: ".",
	}
	defer terraform.Destroy(t, terraformOptions)
	terraform.InitAndApply(t, terraformOptions)

	outputTests := []struct {
		outputName    string
		expectedValue string
	}{
		{"One", "First."},
		{"Two", "Second."},
	}

	for _, testCase := range outputTests {
		outputValue := terraform.Output(t, terraformOptions, testCase.outputName)
		assert.Equal(t, testCase.expectedValue, outputValue)
	}
}

Now, there’s just one statement each for terraform.Output and assert.Equal. With only two tests it actually takes a bit more code to use a table, but once you have a lot of tests it’ll save you.

That’s it! That’s all table-driven tests are. Just a routine practice in Go that works as well in terratest as anywhere.
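To see the pattern outside terratest, here’s a minimal self-contained sketch with a made-up function and cases (no Terraform involved):

```go
package main

import "fmt"

// double is a stand-in for whatever you're actually testing.
func double(n int) int { return 2 * n }

func main() {
	// The "table": one struct per case.
	cases := []struct {
		name     string
		input    int
		expected int
	}{
		{"zero", 0, 0},
		{"positive", 3, 6},
		{"negative", -2, -4},
	}

	// One loop body exercises every case.
	for _, tc := range cases {
		if got := double(tc.input); got != tc.expected {
			fmt.Printf("%s: got %d, want %d\n", tc.name, got, tc.expected)
		} else {
			fmt.Printf("%s: ok\n", tc.name)
		}
	}
}
```

In a real test file you’d wrap the loop body in t.Run(tc.name, ...) so each case reports separately, same idea as the terratest loop above.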

Happy testing,

Adam


CloudWatch JSON Logs: aws_request_id

Hello!

In a previous article, I showed how to make AWS lambda functions log JSON objects to CloudWatch (⬅️ start there if you’re new to JSON logs). The pattern in that post had a flaw: it didn’t pass the aws_request_id. I was writing small functions to glue together bits of deployment automation and I didn’t need it. It was easy to correlate logs just with timestamps. Not every case is that simple. Sometimes you need that ID. Fortunately, Paulie Pena IV gave me some tips on how to pass it through.

We can look up the aws_request_id in the lambda function’s context object. To add it to log events, the python-json-logger library supports custom fields. To add a field for all log events, we need to subclass jsonlogger.JsonFormatter. Then we can save the ID into the new field and resolve it in the format string. Here’s the code from before with that change:

import logging
from pythonjsonlogger import jsonlogger

class CustomJsonFormatter(jsonlogger.JsonFormatter):
    def __init__(self, *args, **kwargs):
        self.aws_request_id = kwargs.pop('aws_request_id')
        super().__init__(*args, **kwargs)
    def add_fields(self, log_record, record, message_dict):
        super().add_fields(log_record, record, message_dict)
        log_record['aws_request_id'] = self.aws_request_id

def setup_logging(log_level, aws_request_id):
    logger = logging.getLogger()

    # Testing showed lambda sets up one default handler. If there are more,
    # something has changed and we want to fail so an operator can investigate.
    assert len(logger.handlers) == 1

    logger.setLevel(log_level)
    json_handler = logging.StreamHandler()
    formatter = CustomJsonFormatter(
        fmt='%(aws_request_id)s %(asctime)s %(levelname)s %(name)s %(message)s',
        aws_request_id=aws_request_id
    )
    json_handler.setFormatter(formatter)
    logger.addHandler(json_handler)
    logger.removeHandler(logger.handlers[0])

def lambda_handler(event, context):
    setup_logging(logging.DEBUG, context.aws_request_id)
    logger = logging.getLogger()
    logger.info('Huzzah!')

Now our logs contain the aws_request_id:

(Screenshot: CloudWatch log events that each include an aws_request_id field.)
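If you want to poke at the shape of those events locally, without lambda or python-json-logger, here’s a stdlib-only sketch of the same idea. The request ID is made up, and the formatter is a simplification for illustration, not the python-json-logger API:

```python
import io
import json
import logging

class RequestIdJsonFormatter(logging.Formatter):
    """Stdlib-only sketch: inject a fixed request ID into JSON log events."""
    def __init__(self, aws_request_id):
        super().__init__()
        self.aws_request_id = aws_request_id

    def format(self, record):
        # Emit one JSON object per log event, with the request ID included.
        return json.dumps({
            'aws_request_id': self.aws_request_id,
            'levelname': record.levelname,
            'name': record.name,
            'message': record.getMessage(),
        })

# Simulate one invocation with a made-up request ID.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(RequestIdJsonFormatter('8476a54b-0000-example'))
logger = logging.getLogger('demo')
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info('Huzzah!')

event = json.loads(stream.getvalue())
print(event['aws_request_id'])
```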

Hope that helps,

Adam


PowerShell: Sort Hash Table Into Ordered Dictionary

Hello!

PowerShell’s Hash Tables are unordered. The keys don’t always come back in the same order you entered them:

PS /Users/adam/Local/fiddle> $HashTable = @{                   
>>     'a' = 1
>>     'b' = 2
>>     'c' = 3
>>     'd' = 4
>> }
PS /Users/adam/Local/fiddle> $HashTable

Name                           Value
----                           -----
c                              3
b                              2
d                              4
a                              1

I created the hash in the order a, b, c, d but I got back c, b, d, a. That’s normal.

PowerShell also has Ordered Dictionaries that work like Hash Tables but preserve order:

PS /Users/adam/Local/fiddle> $OrderedDictionary = [ordered]@{
>>     'a' = 1
>>     'b' = 2
>>     'c' = 3
>>     'd' = 4
>> }
PS /Users/adam/Local/fiddle> $OrderedDictionary

Name                           Value
----                           -----
a                              1
b                              2
c                              3
d                              4

Today I had a large hash and I needed to convert it to a dictionary sorted by key name. The hash was returned by a library I didn’t control, so I couldn’t just re-define it as a dictionary. I had to convert it. There are a couple ways, but I found this was the cleanest:

$HashTable = @{
    'd' = 4
    'a' = 1
    'b' = 2
    'c' = 3
}
$OrderedDictionary = [ordered]@{}
foreach ($Item in ($HashTable.GetEnumerator() | Sort-Object -Property Key)) {
    $OrderedDictionary[$Item.Key] = $Item.Value
}
$OrderedDictionary

This outputs a dictionary that has been sorted by its keys:

PS /Users/adam/Local/fiddle> ./Ordering.ps1

Name                           Value
----                           -----
a                              1
b                              2
c                              3
d                              4

Because it’s a dictionary and not a regular hash, it’ll keep that ordering.

Happy scripting!

Adam


Azure Pipelines: If Statements

Hello!

This is about Azure YAML Pipelines, not Azure Classic Pipelines. The docs explain the differences.

Everything I show here is in the docs but the pieces are scattered and it took some work to find everything. This article collects what I found.

Pipelines support two kinds of conditions. The first isn’t an if statement, but it acts like one by letting you use expressions to choose when jobs, steps, or stages run. Like this example from the doc:

jobs:
- job: Foo
  steps:
  - script: echo Hello!
    condition: always()

There are a bunch of different functions you can use.

These conditions are handy, but they haven’t been enough on their own. It’s not enough to turn jobs on and off; I need to automate decisions about what config to pass them. Fortunately, you can also conditionally set variables and parameters using if statements. The expressions doc calls it “conditional insertion” and gives examples like this:

variables:
  ${{ if eq(variables['Build.SourceBranchName'], 'master') }}:
    stageName: prod

And this:

steps:
- task: PublishPipelineArtifact@1
  inputs:
    targetPath: '$(Pipeline.Workspace)'
    ${{ if eq(variables['Build.SourceBranchName'], 'master') }}:
      artifact: 'prod'

You can also use these in pipeline templates. The templates doc has its own section on conditional insertion that gives examples of lists of tasks:

steps:
- ${{ if eq(parameters.toolset, 'msbuild') }}:
  - task: msbuild@1
  - task: vstest@2

Last, Microsoft has a repo of example pipeline YAML, and its examples of each expressions also happen to show if statements. I’m linking the latest commit (at the time I wrote this) because it’s technically not a doc about if statements and future versions might not use the same examples. You may also want to check out the latest version.

DevOps is all about managing config, and conditional variables and parameters are essential to doing that well. Most of the pipelines I build rely on this second kind of condition, the if statement. Hopefully it helps with yours.

Happy automating!

Adam


PowerShell On Apple Mac OS X

Hello!

I love PowerShell. Its object-oriented nature is welcome relief from the endless string parsing of bash.

Good news! PowerShell Core installs on Windows, OS X, and Linux. I made it my default in Terminal on OS X. Here’s how.

I tested this on OS X Catalina (10.15) with PowerShell Core 7.

First, install from Homebrew like the docs say:

brew cask install powershell

Then open Terminal, select Terminal > Preferences in the menu, and set shells to open with /usr/local/bin/pwsh:

(Screenshot: Terminal’s preferences with shells set to open with /usr/local/bin/pwsh.)

Now quit and re-open Terminal. Boom! 💥 You’re standardized on PowerShell.

If you’re like me, though, you also need to add a bunch of stuff to your path. It’s similar to Linux-land: just update an environment variable in your profile. But there were gotchas.

PowerShell exposes an Environment Provider that works like a filesystem drive. That’s where your path is set:

PS /Users/adam> Get-Item Env:PATH

Name                           Value
----                           -----
PATH                           /usr/local/microsoft/powershell/7:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin

When I wrote this, the docs gave examples that used both Env:path and Env:Path, but neither worked on OS X. I had to use Env:PATH. It’s tricky because creating the wrong one doesn’t cause errors; it just doesn’t do what you want.

The second gotcha was easier. In Windows the separator is ; so that’s what most examples use, but in OS X it’s :. I was copy/pasting from Windows code samples before I noticed the problem.

Just like Linux, modifications to Env:PATH are specific to your session. They’re lost on exit. We can make them permanent. First, find your PowerShell profile:

PS /Users/adam> $PROFILE
/Users/adam/.config/powershell/Microsoft.PowerShell_profile.ps1

Create that path and file if they don’t exist. Put this command in the Microsoft.PowerShell_profile.ps1 script:

$Env:PATH += ":/Users/adam/opt/path"

Now quit and re-open Terminal and your path should be up to date:

PS /Users/adam> Get-Item Env:PATH

Name                           Value
----                           -----
PATH                           /usr/local/microsoft/powershell/7:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Users/adam/opt/path

That’s it!

Happy automating,

Adam
