Crafting Test Matrices
In Part 1 I described how GitHub Actions matrices work and their limitations. I also described a solution using JSON strings, but I left off with the question:
What if the JSON string was created on the fly, by a PowerShell script?
Now I'll go deeper into creating the JSON string and some of the interesting ways we can use this feature.
So let's start with a fresh GitHub Actions workflow called dynamic-test-matrix.
name: dynamic-test-matrix
on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main
jobs:
  matrix:
    name: Generate test matrix
    runs-on: ubuntu-latest
    outputs:
      matrix-json: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      - uses: actions/checkout@v2
      - id: set-matrix
        shell: pwsh
        # Use a small PowerShell script to generate the test matrix
        run: "& .github/workflows/create-test-matrix.ps1"
  run-matrix:
    needs: [matrix]
    strategy:
      fail-fast: false
      matrix:
        include: ${{ fromJson(needs.matrix.outputs.matrix-json) }}
    name: "${{ matrix.job_name }}"
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v2
      - name: Run Command
        shell: pwsh
        run: |
          Write-Host "Run '${{ matrix.command }}'"
Just like the example in Part 1, there are two jobs:
- matrix: Runs a PowerShell script to create the test matrix
- run-matrix: Pretends to run the command for each item in the matrix
The matrices creation job
Let's look at the matrix job:
name: Generate test matrix
A nice friendly name in the UI
runs-on: ubuntu-latest
The job will run on an Ubuntu-based runner. Why Ubuntu instead of Windows? Nothing in the script has to be Windows specific, so why not make it a cross-platform script?
outputs:
  matrix-json: ${{ steps.set-matrix.outputs.matrix }}
This instructs GitHub Actions that the job will output a job parameter called matrix-json, and that its value comes from the step output matrix, from the step called set-matrix.
Next come the steps for this job:
steps:
- uses: actions/checkout@v2
The steps need the PowerShell script to run, so we do need to check out the project source code first.
- id: set-matrix
  shell: pwsh
  # Use a small PowerShell script to generate the test matrix
  run: "& .github/workflows/create-test-matrix.ps1"
The set-matrix step then runs the create-test-matrix.ps1 PowerShell script (we'll get to the script soon). Note that the id is important here, as this is the name used in the job output above.
Using the matrices
Let's look at the run-matrix job:
needs: [matrix]
This job (run-matrix) needs the output of the matrix job, so this instructs GitHub Actions to wait until the matrix job completes.
strategy:
  fail-fast: false
For the purposes of this blog post, I want all of the test cells to run even if other ones have failed. The GitHub Actions documentation has more examples of the matrix configuration.
matrix:
  include: ${{ fromJson(needs.matrix.outputs.matrix-json) }}
This is where the magic happens: we take the output from the matrix job and use it to configure the matrix for this job. This is similar to the "Sharing matrix information between jobs" post, in that it deserialises the JSON string from the matrix job output parameter called matrix-json.
Note that I'm using include instead of cfg like the post above. This makes it easier to reference matrix items later in the steps. For example, instead of matrix.cfg.os we can just use matrix.os.
name: "${{ matrix.job_name }}"
runs-on: ${{ matrix.os }}
steps:
  - uses: actions/checkout@v2
  - name: Run Command
    shell: pwsh
    run: |
      Write-Host "Run '${{ matrix.command }}'"
And then finally the steps for the job:
- matrix.job_name is used to dynamically change the friendly name of the job
- matrix.os is used to change which job runner is used
- matrix.command is used to show what PowerShell command could be run
Note - This example won't actually run any PowerShell commands but you can make GitHub Actions run a PSake command, or PSBuild command, or run other PowerShell scripts in your project.
Creating matrices in PowerShell
Here's an example script that will output a two-cell matrix: one cell for Windows and one for Ubuntu.
This would be saved as .github/workflows/create-test-matrix.ps1. If you change the name of this script, make sure you also change the GitHub Actions workflow to use the new name too.
$Jobs = @()
@('ubuntu-latest', 'windows-latest') | ForEach-Object {
  $Jobs += @{
    job_name = "Run $_ jobs"
    os = $_
    command = "$_ command"
  }
}
Write-Host "::set-output name=matrix::$($Jobs | ConvertTo-JSON -Compress)"
So what's going on here:
$Jobs = @()
We store the job information in the $Jobs variable. Initially we have no jobs.
@('ubuntu-latest', 'windows-latest') | ForEach-Object {
Instead of hardcoding each cell, we can use loops and enumeration to create matrix cells.
$Jobs += @{
  job_name = "Run $_ jobs"
  os = $_
  command = "$_ command"
}
To create a matrix cell we add a HashTable to the $Jobs array. Each key in the HashTable appears as a matrix variable in the GitHub Actions workflow. In this example we are setting three keys: job_name, os and command. These are then used in the workflow as matrix.job_name, matrix.os and matrix.command respectively. Each matrix cell does not have to have the same keys; it's completely up to you what each matrix cell has.
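Because cells are free to differ, it can be handy to check that every cell still defines the keys the workflow actually reads. Here is a minimal sketch of such a check. The $RequiredKeys list and the extra_flag key are my own illustrative assumptions, not part of the original script.

```powershell
# Keys that the workflow in this post references via matrix.<key>
$RequiredKeys = @('job_name', 'os', 'command')

$Jobs = @(
    @{ job_name = 'Run ubuntu-latest jobs'; os = 'ubuntu-latest'; command = 'ubuntu-latest command' }
    # This cell carries an extra, purely illustrative key (hypothetical)
    @{ job_name = 'Run windows-latest jobs'; os = 'windows-latest'; command = 'windows-latest command'; extra_flag = 'demo' }
)

# Collect a message for every required key that a cell is missing
$Problems = foreach ($Job in $Jobs) {
    foreach ($Key in $RequiredKeys) {
        if (-not $Job.ContainsKey($Key)) {
            "Cell '$($Job['job_name'])' is missing '$Key'"
        }
    }
}

if ($Problems) {
    throw ($Problems -join [Environment]::NewLine)
}
Write-Host "All $($Jobs.Count) cells define the required keys"
```

Extra keys are harmless here; only keys that the workflow references but a cell leaves out are reported.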
Write-Host "::set-output name=matrix::$($Jobs | ConvertTo-JSON -Compress)"
And then lastly we use the set-output magic text and convert the jobs into a JSON string. Note the use of -Compress here. GitHub Actions doesn't allow line breaks in the output, so -Compress is used to create a JSON string on a single line.
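One thing worth knowing: GitHub has since deprecated the set-output workflow command in favour of appending name=value lines to the file named by the GITHUB_OUTPUT environment variable. The sketch below shows how the last line of the script could look under that newer mechanism. The fallback branch for local runs is my own assumption, added so the snippet also works off a runner.

```powershell
$Jobs = @(
    @{ job_name = 'Run ubuntu-latest jobs'; os = 'ubuntu-latest'; command = 'ubuntu-latest command' }
    @{ job_name = 'Run windows-latest jobs'; os = 'windows-latest'; command = 'windows-latest command' }
)

# -Compress keeps the JSON on a single line, which the name=value format needs
$Json = $Jobs | ConvertTo-Json -Compress

if ($env:GITHUB_OUTPUT) {
    # On a runner: append to the file GitHub Actions reads step outputs from
    "matrix=$Json" >> $env:GITHUB_OUTPUT
} else {
    # Not on a runner (for example a local run): just show the line
    Write-Host "matrix=$Json"
}
```

The workflow side stays the same, since the job output still reads steps.set-matrix.outputs.matrix.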
When you run this script you get the following output:
::set-output name=matrix::[{"job_name":"Run ubuntu-latest jobs","command":"ubuntu-latest command","os":"ubuntu-latest"},{"job_name":"Run windows-latest jobs","command":"windows-latest command","os":"windows-latest"}]
Which, let's be honest, is hard to read. Let's add a script parameter called Raw which will output the JSON in a way that is readable for humans:
param(
  [Switch]$Raw
)
$Jobs = @()
@('ubuntu-latest', 'windows-latest') | ForEach-Object {
  $Jobs += @{
    job_name = "Run $_ jobs"
    os = $_
    command = "$_ command"
  }
}
if ($Raw) {
  Write-Host ($Jobs | ConvertTo-JSON)
} else {
  # Output the result for consumption by GitHub Actions
  Write-Host "::set-output name=matrix::$($Jobs | ConvertTo-JSON -Compress)"
}
So now running .github/workflows/create-test-matrix.ps1 -Raw outputs:
[
  {
    "job_name": "Run ubuntu-latest jobs",
    "command": "ubuntu-latest command",
    "os": "ubuntu-latest"
  },
  {
    "job_name": "Run windows-latest jobs",
    "command": "windows-latest command",
    "os": "windows-latest"
  }
]
Seeing the GitHub Action in action
- Running the workflow, we first see that the Generate test matrix job is running, but there are not yet any subsequent jobs.
- Once the generation job is complete, two new jobs appear. These are the two jobs we specified in our PowerShell script: Run ubuntu-latest jobs and Run windows-latest jobs.
- When we look at the output of these jobs, we can see that the Ubuntu job has Write-Host "Run 'ubuntu-latest command'" and the Windows job has Write-Host "Run 'windows-latest command'", just like we specified in our PowerShell script.
Back to the original problem ...
Back to Puppet Editor Services ... Now I could write a PowerShell script that created a JSON string with all of the test cases I needed (12 in total). I could easily make out what each matrix cell did, and I could easily add and remove test cases in the future.
All of the code for this is in Pull Request 288 of the Puppet Editor Services project.
Going further
Now that we are using PowerShell to generate the matrix, it opens up more opportunities:
Different tests for different people
You could add some tests if the person who raised the Pull Request had the name 'glennsarti':
if ($ENV:GITHUB_ACTOR -eq 'glennsarti') {
  $Jobs += @{
    # ...
  }
}
Note - See the documentation for the full list of GitHub Actions environment variables.
Different tests based on the changed files
What if we could detect which files were being changed in a Pull Request and then change the testing? For example, let's say we had a PowerShell module which included documentation. If a Pull Request was ONLY changing the documentation files then there'd be no need to run PowerShell script tests. And vice versa.
Fortunately git can help us here. We can use git diff --name-only to list all of the files that are affected.
param(
  [Switch]$Raw,
  [String]$FromRef
)
$Jobs = @()
$TestModule = $false
$TestDocs = $false
if (![String]::IsNullOrWhiteSpace($FromRef)) {
  (& git diff --name-only $FromRef...HEAD) | ForEach-Object {
    if ($_ -like 'src/*') { $TestModule = $true }
    if ($_ -like 'docs/*') { $TestDocs = $true }
  }
}
# Make sure we test something
if (!$TestModule -and !$TestDocs) {
  $TestModule = $true
  $TestDocs = $true
}
@('ubuntu-latest', 'windows-latest') | ForEach-Object {
  if ($TestModule) {
    $Jobs += @{
      job_name = "Test PowerShell Module - $_"
      os = $_
      command = "psake test-powershell"
    }
  }
  if ($TestDocs) {
    $Jobs += @{
      job_name = "Test Documentation - $_"
      os = $_
      command = "psake test-documentation"
    }
  }
}
if ($Raw) {
  Write-Host ($Jobs | ConvertTo-JSON)
} else {
  # Output the result for consumption by GitHub Actions
  Write-Host "::set-output name=matrix::$($Jobs | ConvertTo-JSON -Compress)"
}
Let's go through this script in more detail:
[String]$FromRef
)
$Jobs = @()
$TestModule = $false
$TestDocs = $false
We add a new parameter called FromRef, which specifies where in the git history we compare from. Typically this is the branch the Pull Request is targeted against. We also add two flag variables, TestModule and TestDocs, which we'll use to track whether we should test the Module and Documentation.
if (![String]::IsNullOrWhiteSpace($FromRef)) {
  (& git diff --name-only $FromRef...HEAD) | ForEach-Object {
    if ($_ -like 'src/*') { $TestModule = $true }
    if ($_ -like 'docs/*') { $TestDocs = $true }
  }
}
If FromRef is set then we run the git diff command.
- --name-only means it only returns the filename instead of the full diff for each file
- $FromRef...HEAD means to compare the current commit (known as HEAD) against the point where it branched off from $FromRef
Then for each file that has been changed we test if it's a PowerShell Module file (src/*) or a documentation file (docs/*) and set the appropriate flag (TestModule or TestDocs).
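To try out the file classification without needing a git repository, the same wildcard checks can be pulled into a small function and fed a hand-written list of paths. This is only a sketch: the function name Get-TestFlags and the sample paths are hypothetical and not part of the original script.

```powershell
function Get-TestFlags {
    param(
        # Paths as printed by 'git diff --name-only' (forward slashes)
        [String[]]$ChangedFiles = @()
    )
    $Flags = @{ TestModule = $false; TestDocs = $false }
    foreach ($File in $ChangedFiles) {
        if ($File -like 'src/*')  { $Flags.TestModule = $true }
        if ($File -like 'docs/*') { $Flags.TestDocs = $true }
    }
    $Flags
}

# A documentation-only change should only switch on the documentation flag
$Flags = Get-TestFlags -ChangedFiles @('docs/README.md', 'docs/usage.md')
Write-Host "TestModule=$($Flags.TestModule) TestDocs=$($Flags.TestDocs)"
```

For the documentation-only list above, this prints TestModule=False TestDocs=True.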
# Make sure we test something
if (!$TestModule -and !$TestDocs) {
  $TestModule = $true
  $TestDocs = $true
}
It is possible that a Pull Request doesn't change either the Module or the Documentation, so we test everything just in case.
if ($TestModule) {
  $Jobs += @{
    job_name = "Test PowerShell Module - $_"
    os = $_
    command = "psake test-powershell"
  }
}
if ($TestDocs) {
  $Jobs += @{
    job_name = "Test Documentation - $_"
    os = $_
    command = "psake test-documentation"
  }
}
And now only add the PowerShell Module and Documentation testing if the appropriate flag is set.
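The script above always produces at least two cells (one per operating system), but once jobs are added conditionally it is worth watching out for a matrix that ends up with exactly one cell. Piping a one-element array into ConvertTo-Json serialises it as a bare object rather than a list, which is not what include expects. A small sketch of the difference, using a made-up single-cell matrix:

```powershell
# A matrix that ended up with just one cell
$Jobs = @()
$Jobs += @{ os = 'ubuntu-latest' }

# Piping enumerates the array, so a lone cell is serialised as a bare object
$Piped = $Jobs | ConvertTo-Json -Compress

# Passing the array as a single argument keeps the surrounding [ ]
$AsArray = ConvertTo-Json -InputObject $Jobs -Compress

Write-Host "Piped:        $Piped"
Write-Host "-InputObject: $AsArray"
```

If your own script can ever emit a single cell, passing the array with -InputObject is one way to keep the output a JSON list.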
The last change is to the GitHub Actions workflow.
Previously we called the PowerShell script using
run: "& .github/workflows/create-test-matrix.ps1"
and now we can instead pass through the -FromRef argument
run: "& .github/workflows/create-test-matrix.ps1 -FromRef '${{ github.base_ref }}'"
The github.base_ref variable comes from the GitHub Actions context syntax:
The base_ref or target branch of the pull request in a workflow run. This property is only available when the event that triggers a workflow run is a pull_request.
Wrapping Up
I migrated from Travis and AppVeyor CI using a custom PowerShell matrix creation script which now gives me more power to cater for different testing scenarios. And we also saw what else you could achieve with this technique; using it to change testing based on who made the change, or changing the testing requirements based on what was changed.
Resources
You can find the original version of this post on Glenn's personal blog.