Hugo + Pre-Commit

How to get to where I want to be...

Now that I am done with my first pass of working with Hugo, I still have areas to work on. The one that currently annoys me the most is the Cache Busting situation w/in the theme I picked. As I have not been able to locate a simple+direct way of moving all assets over to Cache Busting names, I am stuck doing some heavy lifting on the theme. As this sort of work is effectively tedium extraordinaire, and will be highly prone to typos, missed work, etc., it is very much in my best interest to set up a collection of tests. AND, as I am publishing directly into KeyBase Pages, I can add a bit of magic to my Git Hooks to automate the entire website.

As a general rule I consider it a best-practice to clearly outline my goals before I get started on anything (which also means a lot of my time is spent pondering, rambling, and taking notes). The point of each goal is to define the done criteria of a task (not the whole project). I suppose I could go on a whole ramble about the benefits of defining bite sized tasks with well defined done criteria vs a minimal viable product, and how both are distinct while also being core tenets of good software design .. but .. today I am rambling about CI/CD (and I have no doubt I will rant about that and more at some later opportunity).

For the most part my goals are fairly simple:

  1. Test every commit w/ pre-commit.

  2. Deployment via git-push (will address this in my next post).

The pre-commit situation is made simpler thanks to the pre-commit project. I simply need to decide what sort of tests I need to focus on, at least initially. I can always add tests later once pre-commit is in-place.

So for now I can just kick a rough outline of:

  • Validate that all of my git commit messages conform to a consistent format. I have been relying on commit message testing for so long that going w/out it would feel a bit odd to me.. a little like making an attempt to act completely normal in public.

  • Test for spaces before tabs. On that note: Please, everyone, start highlighting spaces-before-tabs in your editors!

  • Test for spaces before newline. Again: Please, everyone, start adding highlighting to whitespace-before-newline in your editors!

  • Valid ending newline in all documents. Curiously, a large number of editors don’t seem to think that all lines in a document should be terminated with a newline. Imagine a world where we treated text-documents as if they were all C-Strings and used a \0 to delimit lines instead of \n. This is exactly what happens when your editor makes an arbitrary decision that the last line in the file doesn’t need to be terminated with \n. Curiously, the terminating newline is also part of the ANSI C and POSIX standards:

    3.206 Line A sequence of zero or more non-<newline> characters plus a terminating <newline> character.

    If your line doesn’t end in a <newline> then it isn’t actually a line w/in the scope of the definition. Some compilers/interpreters are strict enough about this that they will literally discard the last line of your code. Sure, you could argue that the compiler is lazy, but it would be hard to argue that the compiler is lazier than you when it is so easy to fix a simple editor setting. See: https://stackoverflow.com/questions/729692/why-should-text-files-end-with-a-newline

    It is insane that some editors just arbitrarily throw this out .. AHEM: VSCode, Notepad++, and pretty much anything from Microsoft.

  • YAML: I prefer YAML over TOML and so all my Hugo config’s are in YAML format. See: https://en.wikipedia.org/wiki/TOML#Criticism

  • MarkDown: I have been using Markdown since shortly after John Gruber first released it on Daring Fireball in 2004. So everything I write is in it. All my posts, docs, notes, ramblings, and maybe one day even my Last Will and Testament.

  • Spell Checker: Since we never adopted Benjamin Franklin’s Alphabet, and I find memorization of minute details, such as the arbitrary non-rules of the English Language, to be inefficient+pointless .. I am in dire need of a pre-commit spell-checker w/in the scope of my MarkDown documents. If I don’t do this then I will likely attract society’s spelling/grammar police, who don’t seem to recognize their own ism tendencies when they pass judgement on the spelling/grammar of others. note: this form of discrimination/racism is recognized as a world-wide problem. See: Linguistic Discrimination, Linguistic Racism

  • Validate all links. This should validate that any MarkDown reference links have a valid definition somewhere, and that all links point at a valid page somewhere on the net (potentially validate that no link results in a 3xx redirect). The latter is potentially a good site-test to run periodically, which can be used to let me know when a page somewhere else disappears (and thus I might need to find an archived copy).
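To make the first item a little more concrete, here is a minimal sketch of a commit message check. The format it enforces is an assumption on my part: a "<type>: <summary>" subject line, loosely modeled on Conventional Commits. The list of allowed types, the 72 character limit, and the function name check_commit_message are all hypothetical and only there for illustration.

```python
import re
from typing import List

# Assumed convention: "<type>: <summary>" with an optional "(scope)".
# The allowed types and the length limit are illustrative, not a standard.
ALLOWED_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")
SUBJECT_RE = re.compile(r"^(?P<type>[a-z]+)(\([^)]+\))?: (?P<summary>\S.*)$")
MAX_SUBJECT_LEN = 72  # hypothetical limit


def check_commit_message(message: str) -> List[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["subject line is empty"]

    problems = []
    subject = lines[0]
    match = SUBJECT_RE.match(subject)
    if match is None:
        problems.append("subject does not look like '<type>: <summary>'")
    elif match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match.group('type')}'")
    if len(subject) > MAX_SUBJECT_LEN:
        problems.append(f"subject is longer than {MAX_SUBJECT_LEN} characters")
    # A body, if present, should be separated from the subject by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    return problems
```

A function like this would be run from a commit-msg hook, which Git calls with the path of the file holding the proposed message. An empty result means the message passes.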

The daunting scope of my ability to ramble aside, the above list is really not that large in scale, or even that complicated. I suspect I spent more time rambling than it will take to actually do the work (this tends to be the norm for me).

Soo .. on to the actual work:

  1. Follow the pre-commit Install Instructions. For me this was pip3 install pre-commit.

  2. Install pre-commit into my site’s Git directory with:

    pre-commit install
    

    yes .. it is really that simple

  3. Install an example .pre-commit-config.yaml with:

    pre-commit sample-config > .pre-commit-config.yaml
    

    If it feels like I am mocking the entire dev world that is not actively using pre-commit .. then you are right, I am. It is such an easy tool to set-up, and it works with nearly all-other-testing frameworks.

    note: this last statement is absolutely mocking those people who think that they would rather have reduced options by using their language-specific testing framework over a framework that supports their framework and more.

  4. Configure .pre-commit-config.yaml to include hooks which match my above requirements. For this step most of the work is just filtering through the huge swath of existing hooks. For this project I think I will be enabling:

    • check-yaml: because I use a lot of YAML

    • check-json: I expect I will be touching these occasionally.

    • check-merge-conflict: Because .. I occasionally commit files that have merge-conflict-mangling in the file, and it is embarrassing..

    • check-toml: Not that I use TOML, but because things in Hugo do and I will end up tweaking them, and I don’t trust myself to get something I know right 100% of the time, so why would I not test something I don’t know?

    • check-xml: I personally think there is a special place, somewhere, in which the advocates of XML will eternally be tortured for inventing this monstrosity. When you can’t find two different XML generators that will produce the same output from the same input, then how can I be trusted to inspect XML for potential errors?!?

    • end-of-file-fixer: See my rant earlier in this post.

    • trailing-whitespace: See my rant earlier in this post.

    • mixed-line-ending: See my rant earlier in this post.

With the above configured my .pre-commit-config.yaml looks like:

# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.1.0
    hooks:
      - id: check-json
      - id: check-merge-conflict
      - id: check-toml
      - id: check-xml
      - id: check-yaml
      - id: end-of-file-fixer
      - id: mixed-line-ending
        args: [ --fix=no ]
      - id: pretty-format-json
        args: [ --indent, "   " ]
      - id: trailing-whitespace
        args: [ --markdown-linebreak-ext=md ]

Which produced the following output during git commit:

$ git commit --amend
check json...........................................(no files to check)Skipped
check for merge conflicts................................................Passed
check toml...........................................(no files to check)Skipped
check xml............................................(no files to check)Skipped
check yaml...........................................(no files to check)Skipped
fix end of files.........................................................Passed
mixed line ending........................................................Passed
pretty format json...................................(no files to check)Skipped
trim trailing whitespace.................................................Passed
[master 2b55668] feat: new post about setting up pre-commit for Hugo
 Date: Fri Feb 18 08:28:36 2022 -0800
 2 files changed, 219 insertions(+)
 create mode 100644 .pre-commit-config.yaml
 create mode 100644 content/post/hugo-pre-commit.md
 

Curiously, these results do not in any way imply that my existing code is correct. Due to the way pre-commit works, the above tests are only run against the files that are being changed as part of the commit, as pre-commit avoids running all tests against the entire tree (it would be a horrible tool otherwise). We have to go a bit out of our way to test everything in the tree by running pre-commit run --all-files, which produced the following output:

$ pre-commit run --all-files
check json...........................................(no files to check)Skipped
check for merge conflicts................................................Passed
check toml...........................................(no files to check)Skipped
check xml............................................(no files to check)Skipped
check yaml...............................................................Passed
fix end of files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook

Fixing archetypes/default.md

mixed line ending........................................................Passed
pretty format json...................................(no files to check)Skipped
trim trailing whitespace.................................................Passed

As you can see here, we have an end-of-file-fixer problem. Curiously, the file in question, archetypes/default.md, was automatically generated by Hugo when it initially created the site. Luckily the pre-commit hook already fixed up the file, so we can simply commit the fix and be done with it.
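For anyone curious what these whitespace hooks are roughly doing, here is a simplified sketch. It is not the implementation used by pre-commit-hooks; the function names find_whitespace_problems and fix_end_of_file are hypothetical, and unlike the trailing-whitespace hook with --markdown-linebreak-ext=md it makes no attempt to preserve MarkDown's two-space line breaks.

```python
import re
from typing import List, Tuple

SPACE_BEFORE_TAB = re.compile(r" +\t")
TRAILING_WHITESPACE = re.compile(r"[ \t]+$")


def find_whitespace_problems(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs; line 0 refers to the whole file."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if SPACE_BEFORE_TAB.search(line):
            problems.append((number, "space before tab"))
        if TRAILING_WHITESPACE.search(line):
            problems.append((number, "trailing whitespace"))
    # A non-empty file whose last line lacks "\n" has an unterminated line.
    if text and not text.endswith("\n"):
        problems.append((0, "missing final newline"))
    return problems


def fix_end_of_file(text: str) -> str:
    """Make the text end in exactly one newline; empty text stays empty."""
    stripped = text.rstrip("\n")
    return stripped + "\n" if stripped else ""
```

Splitting detection from fixing mirrors how the run above behaved: the hook reports a failure, rewrites the file, and leaves it to me to review and commit the result.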

We have now solved nearly all of my original goals. We still need to locate something to validate MarkDown and we need to locate something to fix-up my horrific spelling errors, and lastly we need something to validate links.

The pre-commit website lists a number of MarkDown checkers, so it was just a matter of playing with a few of them to figure out which ones met all my requirements. Curiously, I landed on a fairly solid solution out of the gate with markdownlint-cli. This found a number of issues with this post as well as with my first post. One item that did give me a bit of a hiccup was the Hugo Shortcodes that I used for displaying console output. I found I needed to surround those w/ HTML comments which disabled markdownlint-cli for those sections of the post.

Example:

<!-- markdownlint-disable -->
{{< highlight console >}}

...

{{< /highlight >}}
<!-- markdownlint-enable -->

As for spell-checking changes during pre-commit, after some serious effort I have found that there are a slew of semi-usable options, but not a lot of good ones. In the end I settled on codespell, which gives advice about the right word to use and has fairly clean output. Sadly, I could not configure an args param for codespell in the .pre-commit-config.yaml and was forced instead to configure it using a .codespellrc at the top level of my Hugo site.

[codespell]
skip = *.css

Lastly, link checking. I am not going to lie, it caught me off guard that my searches for any form of link checking tool that could be leveraged as part of a pre-commit turned up a single lonely result. I was also a little nervous to find that it only had 13 stars on GitHub and hadn’t been touched in a year. Still, I decided to give linkchecker-markdown a try and see if it worked. Alas, the tool itself is not actually designed to work w/ pre-commit directly, and so I was forced to fork the project and add a .pre-commit-hooks.yaml such that it was usable as a drop-in tool. At this point I found out that the tool only deals with inline links embedded in the MarkDown document and not reference links, so it looks like I will need to put this on hold while I fix-up the tool.

Still, I did include the following config into my .pre-commit-config.yaml to remind myself later to address this particular problem space:

#- repo: https://github.com/major0/linkchecker-markdown
#  rev: HEAD
#  hooks:
#    - id: linkchecker-markdown

Anyway, there you have my thought-process and final configs for adding pre-commit to Hugo. Now it is time for a drink…

