Measuring execution coverage of shell scripts

August 18, 2020•577 words

Today I will talk about measuring test coverage of shell scripts.

Testing is being honest about our flawed brains, which constantly make mistakes no matter how hard we try to avoid them. Modern programming languages make writing test code a first-class concept, with intrinsic support in the language syntax and in the first-party tooling. Alongside memory safety and concurrency safety, excellent testing support allows us to craft ever larger applications with an acceptable failure rate.

Shell scripts are as old as UNIX, and are usually devoted to glue logic. Normally, testing shell scripts is done the hard way: in production. For more critical scripts there's a tendency to test the end-to-end interaction but, as far as I'm aware, writing unit tests and measuring coverage is uncommon.

In a way that's sensible, as long as shell scripts are small, rarely changed and indeed battle-tested in production. On the other hand, nothing stays unchanged forever: environments change, code gets subtly broken, and programmers across the entire experience spectrum can easily come across a subtly misunderstood, or broken, feature of the shell.

In a way, static analysis tools have outpaced the traditional hard way of testing shell programs. The utterly excellent shellcheck program should be a mandatory tool in the arsenal of anyone who routinely works with shell programs. Today we will not look at shellcheck. Instead, we will look at how we can measure test coverage of a shell program.

I must apologize: every time I wrote shell, I really meant bash. Not because bash is the best or most featureful shell, merely because it happens to sit at the right intersection of having enough features and being commonly used enough to warrant an experiment. It's plausible, or even likely, that zsh or fish have similar capabilities that I have not explored yet.

What capabilities are those? The ability to implement an execution coverage program in bash itself. Much like when using Python, C, Java or Go, we want to see if our test code at least executes a specific portion of the program code.

Bash has two features that make writing such a tool possible. The first is most likely known to everyone: the set -x option, which enables tracing. Tracing prints commands to standard error just as they are executed. This is almost what we want; if only we could easily map each command to a location in a source file, we could construct a crude, line-oriented analysis tool. The second feature is also standard, albeit perhaps less well known. It is the PS4 variable, which defines the prefix of each line of trace output. If only we could put something as simple as $FILENAME:$LINENO there, right? Well, in bash we can, although the first variable has the bash-specific name $BASH_SOURCE. A further bash feature, which makes all this convenient, is the ability to redirect the trace to a different file descriptor. We can do that by setting BASH_XTRACEFD=... to the number of an open file descriptor.
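To make the mechanism concrete, here is a minimal sketch of tracing with file and line information. It is not taken from bashcov; the script name greet.sh, the trace.log file and the choice of descriptor 9 are all made up for illustration.

```shell
# Scratch directory for the demo files.
workdir=$(mktemp -d)

# A hypothetical script to trace. Only one branch of the "if" will run.
cat >"$workdir/greet.sh" <<'EOF'
greet() {
    if [ "$1" = "world" ]; then
        echo "hello world"
    else
        echo "hello stranger"
    fi
}
greet world
EOF

# Run the script in a child bash with tracing routed to a log file.
# PS4 must reach bash unexpanded, hence the escaped dollar signs:
# bash expands it again each time it prints a trace line.
bash -c '
    exec 9>"$1"          # open the trace log on descriptor 9 (arbitrary)
    BASH_XTRACEFD=9      # send trace output there instead of stderr
    PS4="+\${BASH_SOURCE}:\${LINENO}: "
    set -x
    . "$2"
' _ "$workdir/trace.log" "$workdir/greet.sh"

# Each trace line now starts with one or more "+" signs,
# followed by file:line: and the command that ran.
cat "$workdir/trace.log"
```

The log should mention lines 8, 2 and 3 of greet.sh, and never line 5, because the else branch is not taken. Keeping the trace on its own descriptor means the output of the script itself is not mixed into the log.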

With those features combined we can easily run a test program which sources a production program, exercises a specific function and quits. We can write unit tests. We can also run integration tests and check whether any of the production code lacks coverage, which would indicate that an important test is missing.
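As a rough illustration of that workflow, the sketch below runs a tiny unit test under tracing and then marks which lines of the production file appeared in the trace. The files prod.sh and test.sh, the PROD variable and the report format are hypothetical, and this is not how bashcov itself is implemented.

```shell
workdir=$(mktemp -d)

# Hypothetical production code: two functions, only one gets tested.
cat >"$workdir/prod.sh" <<'EOF'
add() {
    result=$(( $1 + $2 ))
}
sub() {
    result=$(( $1 - $2 ))
}
EOF

# Hypothetical unit test: source the production code, exercise add, quit.
cat >"$workdir/test.sh" <<'EOF'
. "$PROD"
add 2 3
[ "$result" = 5 ] || exit 1
EOF

# Run the test in a child bash, with the trace written to trace.log.
PROD="$workdir/prod.sh" bash -c '
    exec 9>"$1"
    BASH_XTRACEFD=9
    PS4="+\${BASH_SOURCE}:\${LINENO}: "
    set -x
    . "$2"
' _ "$workdir/trace.log" "$workdir/test.sh"

# Collect the distinct prod.sh line numbers seen in the trace.
covered=$(sed -n "s|^+*$workdir/prod.sh:\([0-9]*\): .*|\1|p" \
    "$workdir/trace.log" | sort -un | tr '\n' ' ')

# Crude line-oriented report: "+" marks lines that were executed.
awk -v hits="$covered" '
    BEGIN { n = split(hits, a, " "); for (i = 1; i <= n; i++) hit[a[i]] = 1 }
    { printf "%s %3d  %s\n", (FNR in hit ? "+" : " "), FNR, $0 }
' "$workdir/prod.sh"
```

Here the body of add is marked as executed while the body of sub is not, which points at the missing test. Lines that bash never traces, such as function headers and closing braces, also show up unmarked, so a real tool would have to decide which lines count as executable.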

I pieced together a very simple program that uses this idea. It is available at https://github.com/zyga/bashcov and is written in bash itself.
