Hugo Workflow

A local-only CI/CD solution

I am a huge fan of the fail-fast philosophy as part of my CI/CD, DevOps, and even general development practices. There are some pretty wild ideas out there about when testing should be done, and I have known developers who insist that pre-commit and pre-push testing just slows down development. I have always found this a little strange; I have never had pre-commit tests take that long. But I realize inexperienced programmers may very well be running their entire test suite on every commit, or perhaps they followed some random blog post that never walked them through the considerations of setting up tests at each point of the dev process (would that be more Cargo Cult?). Since opinions on this particular point border on the religious, I am just going to stick with my basic preferences.

  1. pre-commit: A general rule I have is "don't push embarrassing pull-requests". Test everything locally that you can with minimal impact to the dev process. For me this means that super-simple tests which can run locally should run as part of pre-commit: syntax checks, lint, git commit messages, spelling, etc. Even on relatively large code-bases, a given project can run this battery of tests in a few seconds tops (most of mine finish in under 1 second). This is in large part because pre-commit selectively runs hooks which target a specific file pattern, and only against the files which changed during the commit. Seriously, pushing a pull-request and then submitting fix after fix while waiting for a GitHub pipeline to re-run takes far more time than running these tests locally. Besides, it is far better to be submitting fixes now instead of submitting bugs to fix later.

  2. pre-push: Support some mechanism for running a heavier battery of tests as part of pre-push. This is the point where I run most of my unit tests and report code-coverage metrics for the topic branch I am pushing. Well-written unit tests can take some time, and it is a bit of a nightmare to wait for them on every commit, particularly if the code you are working on is the unit tests themselves (think about it). In my experience, pre-push is the earliest point you can run these and still see a great return from the dev process. If you are processing code-coverage results then you can also enforce some wonderful requirements here, such as rejecting any change which reduces the overall code coverage (let that sink in). This is something I have been integrating into my workflows more and more.

  3. post-receive: This is the point where we get into tools such as Drone, GitHub Actions, GitLab CI/CD, Jenkins, etc. At this stage of the development workflow we start spinning up virtual resources to perform things such as integration tests. This stage can be pretty intense with regard to both the types of tests running and the resources needed to run them. It is also the point where we may have one foot in development and one foot in operations. Furthermore, it isn't something I can really do with the Hugo workflow I have been developing for this website, so I will leave this topic for a later post.

note: At every stage of my CI/CD I re-run every test defined in the previous stages. This assures me that my tests are consistent, that my product is deterministic, that we aren't stuck in the "it works on my system" mentality, and that we catch anyone who might have bypassed a test using git commit --no-verify… because… it happens.
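As a sketch of how this layering can be expressed in pre-commit terms, hooks can be pinned to stages in .pre-commit-config.yaml so cheap checks re-run at every later stage. The hook ids and file pattern below are hypothetical placeholders, and the stage names assume a recent pre-commit release:

```yaml
# cheap checks run at every stage, so later stages re-verify them
- id: quick-lint                # hypothetical fast check
  stages: [pre-commit, pre-push, manual]
  files: '\.(md|html)$'         # only run against matching changed files
# heavier tests first appear at pre-push, and re-run at manual
- id: unit-tests                # hypothetical heavy check
  stages: [pre-push, manual]
```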

My blog generally follows the above pattern, though there are some minor deviations due to the infrastructure I am working with.

  1. My Git repository is in Keybase, so I have no access to a post-receive hook with which to run integration tests (yet). Furthermore, neither Git nor pre-commit supports implementing a post-push hook. The closest we get is pre-commit's support for a manual stage which can be triggered separately, though doing so isn't much better than manually executing some scripts/ from within the repository.

  2. I am leveraging KBFS to host the website using Keybase Sites. There are some interesting aspects to how this works that allow me to leverage Blue/Green testing and A/B deploys without any more infrastructure than my laptop. note: more on this in a later post.

  3. pre-commit does not support enforcing hook dependencies such that the start of one hook depends on the success of another. To achieve this we, again, must resort to some external script work.
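Since pre-commit cannot express that ordering itself, the usual workaround is to fold the dependent steps into one wrapper script and register the wrapper as a single hook. A minimal sketch, assuming POSIX sh, with placeholder step names:

```shell
#!/bin/sh
# Sketch: emulate "hook B depends on hook A" by chaining the steps
# inside one script; pre-commit then sees a single pass/fail hook.
set -e

run_step() {
	step="$1"; shift
	if "$@"; then
		echo "ok: ${step}"
	else
		echo "failed: ${step}" >&2
		return 1
	fi
}

# `true` stands in for the real commands (e.g. a hugo build, then a
# link check); `set -e` aborts before 'check' if 'build' fails.
run_step 'build' true
run_step 'check' true
```

Registering this one script as a hook gives the "B only after A" behavior without any support from pre-commit itself.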

So .. at the end of my previous post I was still trying to find a clean way of integrating a link checker into my local pre-commit workflow. I had originally hoped there was some simple way to process all the Markdown directly, but in hindsight it should have been fairly self-evident that this sort of test isn't really possible before Hugo has rendered the site. The best that could be hoped for is a Hugo module that would test all links (and which could be toggled on/off via some sort of CLI variable).

To make matters even more interesting, pre-commit has no direct way to pull local scripts/tools defined as part of the repository into your pre-commit process. Pre-commit does let you execute scripts/commands, but it requires that said scripts/commands be defined in a separate repository which has its own .pre-commit-hooks.yaml. That seems like an unnecessary burden, as well as a huge blocker for anyone who uses a monorepo dev model. The only real alternative is wiring up tools such as Python's Nox or Tox to orchestrate running locally defined scripts, which to me seems to be stepping quickly into David Wheeler's famous quote:

All problems in computer science can be solved by another level of indirection, except for the problem of too many layers of indirection. – David J. Wheeler
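For context, the separate-repository route that pre-commit expects looks roughly like this: a dedicated repo advertising its hooks via a .pre-commit-hooks.yaml. The id, name, and entry below are illustrative:

```yaml
# .pre-commit-hooks.yaml, living in its own dedicated repository
- id: my-check              # hypothetical hook id
  name: run my check
  entry: my-check.sh        # the script ships with that repo, not yours
  language: script
```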

I have opted for an alternative approach: creating a new pre-commit hook which acts as a simple way to launch arbitrary scripts/commands from the current repository. I have named this hook pre-commit-exec (which you can find here). With this hook in place I now have a semi-sane way to do things such as run a local Hugo build which I can then process with W3C's link-checker. Alternatively I could run a script which leverages htmltest or Python's linkchecker, but the W3C tool supports a --masquerade option which allows me to locally test a statically rendered website without pushing it out to a web server somewhere. Consequently, to make all this work I need a script which performs both the Hugo build and the link check as part of the same command, and then wire that into pre-commit.

For my local orchestration I defined a maybe-overly-complicated shell script:

scripts/link-check.sh:

#!/bin/sh
set -e

die() { echo "error: $*" >&2; exit 1; }

command -v 'hugo' >/dev/null || die "command not found 'hugo'"

# Build the site into public/ with file:// links so it can be checked locally.
hugo --cleanDestinationDir --baseURL "file://${PWD}/public/"

CHECK_ALL=
for arg; do
	case "${arg}" in (--all) CHECK_ALL='true';; esac
done ; unset arg

if command -v 'checklink' >/dev/null; then
	# W3C's link-checker <https://github.com/w3c/link-checker>
	if test -n "${CHECK_ALL}"; then
		# recurse over the whole local tree when --all was given
		set -- 'checklink' '--quiet' '--recursive' '--location' "file://${PWD}/public/"
	else
		set -- 'checklink' '--quiet'
	fi
elif command -v 'linkchecker' >/dev/null; then
	# Python linkchecker <https://github.com/linkchecker/linkchecker>
	set -- 'linkchecker' ${CHECK_ALL:+"--check-extern"}
else
	die 'no supported link checker tool found'
fi

exec "$@" "${PWD}/public/index.html"

And I have added a pair of new hooks to my .pre-commit-config.yaml.

- repo: https://github.com/major0/pre-commit-exec
  rev: v0.1.2
  hooks:
    - id: command
      alias: link-check
      name: linkcheck local links
      always_run: true
      args: ["sh", "scripts/link-check.sh"]
    - id: command
      alias: link-check-all
      name: linkcheck all links
      stages: ["manual"]
      verbose: true
      always_run: true
      args: ["sh", "scripts/link-check.sh", "--all"]

note: I am using the default public/ directory as a staging location for development while leveraging a specialized release script for building/deploying the website to Keybase. This way simply typing hugo won’t accidentally result in publishing work-in-progress to the public website.

I have defined two hooks from the same repo with the same id, but added a pre-commit alias to each, which allows me to call each hook explicitly from the CLI. One of these hooks is set to the manual stage, which means it will not run on every commit, but only when I want to do a really heavy test as part of a manual audit (e.g. pre-commit run --hook-stage manual link-check-all). Alternatively I could run this test as part of a pre-push rule (which I may migrate to later).

I am relatively happy with pre-commit-exec and the options it has opened up for me. It would be easy to abuse this particular hook to enact entire local workflows from a variety of locally executed commands, though I suspect I will try hard to avoid using pre-commit-exec outside of being a generic fall-back, as it gives me too much room to build highly specialized solutions which are not readily re-usable by the overall community. That seems to go against the core model of pre-commit. Perhaps a better approach would have been to build a Docker image which contains Hugo and link-checker, and advertise that as a generic solution to this particular problem. Something for me to consider in the future.


See also