Use the cache action to speed up GitHub Actions
My Norwegian site, Pappaperm.com, is built using Jekyll, and I run a simple CI workflow on every pull request using GitHub Actions. The workflow checks both my writing and technical matters, for example that no links return 404 and that every image has alternate text.
name: CI
on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]
jobs:
  build:
    runs-on: ubuntu-latest
    env:
      JEKYLL_VERSION: 3.8
    steps:
      - uses: actions/checkout@v2
      - name: Check formatting, build and test site
        run: |
          docker pull andmos/markdownlint
          docker run --rm -v $PWD:/usr/src/app/files andmos/markdownlint **/*.md -i ./_drafts -i ./_my_tags
          docker pull jekyll/jekyll:$JEKYLL_VERSION
          docker run --rm --volume="$PWD:/srv/jekyll:delegated" --volume="$PWD/tmp:/usr/local/bundle:delegated" jekyll/jekyll:$JEKYLL_VERSION /bin/bash -c "chmod a+wx . && bundle check || bundle install && rake ci"
The build first pulls a Docker image containing a linter for Markdown and uses it to check my Markdown files for style. Then it pulls a Docker image with Ruby correctly configured for Jekyll 3.8 before running tests on the actual website. The tests are run using a simple rake task inside the Jekyll container and could do with a bit of explaining.
Since Jekyll is a Ruby tool, Gemfile.lock specifies the site's dependency graph, and bundle check is run first to check whether the dependencies are already satisfied. If they're not, they'll be installed into the tmp directory using bundle install. This directory is mounted within the container instance as /usr/local/bundle, the place where Bundler stores the installed dependencies.
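The order in which the chained commands run can be seen in a small sketch. The functions check, install and ci are hypothetical stand-ins for bundle check, bundle install and rake ci, so the control flow can be observed without Ruby installed; this is an illustration of the shell's short-circuit rules, not the site's actual Rakefile.

```shell
# run_chain yes|no: "yes" simulates `bundle check` succeeding.
# Prints the (stand-in) commands that actually ran, in order.
run_chain() {
  deps_ok="$1"
  log=""
  # Hypothetical stand-ins that only record that they were called.
  check()   { log="${log:+$log }check"; [ "$deps_ok" = "yes" ]; }
  install() { log="${log:+$log }install"; }
  ci()      { log="${log:+$log }ci"; }
  # Same shape as: bundle check || bundle install && rake ci
  # The shell groups this as (check || install) && ci.
  check || install && ci
  echo "$log"
}

run_chain yes   # dependencies satisfied: install is skipped
run_chain no    # dependencies missing: install runs before ci
```

In both cases the tests run last; the install step only runs when the check fails.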
All well and good, and the build clocks in at under 2 minutes. Sadly, though, the majority of this time is spent doing unnecessary work. The site's dependency graph remains unchanged between runs more often than not, so the workflow above wastes a lot of time downloading dependencies it has already fetched before.
Enter GitHub Actions' cache action.
The top run utilizes caching and the bottom one doesn’t.
The difference is massive: caching cuts the build time nearly in half, and it's easy to implement.
name: CI
on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]
jobs:
  build:
    runs-on: ubuntu-latest
    env:
      JEKYLL_VERSION: 3.8
    steps:
      - uses: actions/checkout@v2
      - name: Check formatting
        run: |
          docker pull andmos/markdownlint
          docker run --rm -v $PWD:/usr/src/app/files andmos/markdownlint **/*.md -i ./_drafts -i ./_my_tags
      - name: Pull latest Jekyll image
        run: docker pull jekyll/jekyll:$JEKYLL_VERSION
      - uses: actions/cache@v1
        with:
          path: tmp
          key: rubygems-v2-${{ hashFiles('Gemfile.lock') }}
      - name: Build and test site
        run: docker run --rm --volume="$PWD:/srv/jekyll:delegated" --volume="$PWD/tmp:/usr/local/bundle:delegated" jekyll/jekyll:$JEKYLL_VERSION /bin/bash -c "chmod a+wx . && bundle check || bundle install && rake ci"
The only real change is this new step:
- uses: actions/cache@v1
  with:
    path: tmp
    key: rubygems-v2-${{ hashFiles('Gemfile.lock') }}
The new build step uses the cache action to cache the files under the tmp directory, using as key the SHA-256 hash of Gemfile.lock with a prefix. As long as the contents of Gemfile.lock are the same as in the previous run, the hashed value will be unchanged and the contents of tmp will be fetched from the cache. Since the contents of Gemfile.lock change whenever dependencies are added or updated, the key changes with them and the old cache entry is no longer matched, which makes this cache safe to use.
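To see why a content hash makes a good cache key, here is a minimal sketch that builds a key of the same shape locally. The helper cache_key and the lockfile contents are made up for illustration, and the digest is a plain SHA-256 of the file, which is an assumption: it is not guaranteed to match the exact value hashFiles produces, only to behave the same way when the file changes.

```shell
# cache_key FILE: prints a key of the form rubygems-v2-<sha256 of FILE>.
# Hypothetical helper; falls back to shasum where sha256sum is missing.
cache_key() {
  if command -v sha256sum >/dev/null 2>&1; then
    digest=$(sha256sum "$1" | cut -d ' ' -f 1)
  else
    digest=$(shasum -a 256 "$1" | cut -d ' ' -f 1)
  fi
  echo "rubygems-v2-$digest"
}

# Demo with a throwaway lockfile (contents are invented for the example).
workdir=$(mktemp -d)
printf 'jekyll (3.8.6)\n' > "$workdir/Gemfile.lock"
key_before=$(cache_key "$workdir/Gemfile.lock")
key_same=$(cache_key "$workdir/Gemfile.lock")    # unchanged file, same key

printf 'jekyll (3.8.7)\n' > "$workdir/Gemfile.lock"
key_after=$(cache_key "$workdir/Gemfile.lock")   # changed file, new key
rm -rf "$workdir"

echo "$key_before"
echo "$key_after"
```

Bumping the v2 part of the prefix by hand is a simple way to force a fresh cache even when Gemfile.lock has not changed.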
On a clean run, the cache action will not find any matching key and all dependencies will be downloaded as normal. At the end of the job, provided the other steps were successful, the dependencies saved under tmp will be cached using the specified key, ready to be used as-is on subsequent runs.
On the next build, provided Gemfile.lock remains unchanged, the cache action will restore the dependencies in less than a second.
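The miss-then-hit lifecycle described above can be simulated locally. Everything in this sketch is hypothetical: CACHE_STORE is a temporary directory standing in for GitHub's cache service, and restore_cache, save_cache and run_build are invented names that only mimic the behavior, assuming an existing entry is never overwritten under the same key.

```shell
CACHE_STORE=$(mktemp -d)   # stand-in for the remote cache service

# restore_cache KEY DIR: copy the saved entry into DIR; fail on a miss.
restore_cache() {
  [ -d "$CACHE_STORE/$1" ] || return 1
  mkdir -p "$2" && cp -R "$CACHE_STORE/$1/." "$2/"
}

# save_cache KEY DIR: store DIR under KEY unless that key already exists.
save_cache() {
  [ -d "$CACHE_STORE/$1" ] && return 0
  mkdir -p "$CACHE_STORE/$1" && cp -R "$2/." "$CACHE_STORE/$1/"
}

# run_build KEY: simulates one job and prints "hit" or "miss".
run_build() {
  work=$(mktemp -d)
  if restore_cache "$1" "$work/tmp"; then
    result=hit
  else
    result=miss
    # Stand-in for `bundle install` filling the tmp directory.
    mkdir -p "$work/tmp" && echo "gem files" > "$work/tmp/installed"
  fi
  save_cache "$1" "$work/tmp"   # runs at the end of a successful job
  rm -rf "$work"
  echo "$result"
}

first=$(run_build rubygems-v2-abc)    # clean run: nothing cached yet
second=$(run_build rubygems-v2-abc)   # same key: restored from the store
third=$(run_build rubygems-v2-def)    # lockfile changed: new key, miss again
rm -rf "$CACHE_STORE"
echo "$first $second $third"
```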
All in all, the cache action was easy to use, improved the build speed significantly, and did not add any unneeded complexity. A good win.