Jussi Pekonen

On writing reusable, testable, and safe Bash code

Posted on Saturday September 5, 2020

First, a disclaimer. You might ask why Bash and not something “better” like Python? Well, as I mentioned in (a footnote of) my (fairly old yet still relevant) blog post about the technologies I use for this website, it is a pain in the ass to ensure that a) you have the right programming environment installed, b) installed the right way so that you can use it, and c) that all the dependencies you need are also installed the right way. That is why there are still cases where writing the script in Bash makes sense.

This blog post is hugely inspired by the Defensive BASH programming blog post by Kfir Lavi. If you have not read that post yet, shame on you. Go read it now! Done? Good, now we can continue!

To summarize the “best practices” from that blog post:

  1. Use functions to split the functionality into clear and coherent components.
  2. Use local variables and minimize the use of (immutable) global variables.
  3. Write the tasks in a function one per line so that the code is easier to read.

This list is a good start. However, I have found that the following practices make Bash code even more re-usable, testable, and safe.

Apply the UNIX philosophy to the functions

In other words, let a function do only one thing and do it well. This is a rephrasing of point #1 on the list above. Of course, that is not to say that you could not have functions that call a set of other functions. That is actually advisable, as it enables splitting complex tasks into smaller components that can be tested. In case you have not done any Bash (unit) testing, I recommend that you take a look at shunit2, which makes writing the tests pretty nice and easy.
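As a sketch of this idea (with hypothetical function names, not taken from Kfir’s post), a task like “count the words in a file” can be split into single-purpose functions that can each be tested in isolation:

```shell
# Do one thing: check that the given path is a readable file
fileExists() {
	local path="$1"
	[ -r "${path}" ]
}

# Do one thing: count the words in the given file
countWords() {
	local path="$1"
	# tr strips the padding that some wc implementations add
	wc -w < "${path}" | tr -d ' '
}

# Compose the small functions into the actual task
main() {
	local path="$1"
	if ! fileExists "${path}"; then
		echo "Cannot read ${path}" >&2
		exit 1
	fi
	countWords "${path}"
}
```

Each of the small functions can now be called (and tested) on its own, while main merely glues them together.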

Expect the unexpected and exit as early as possible

When processing data in a function, do not assume that the input and output are always what you expect them to be. Should something change in the input, the script can start running wild, potentially causing irreversible damage to the underlying (file) system. Similarly, never assume that the tasks the function calls run smoothly. Therefore, always check the result of the task(s) the function calls and exit as early as possible:

doAThing() {
	local output
	local result
	# Get the output of a task
	output=$(run_a_task)
	# Get the result of the call
	result="$?"
	# If the call failed, exit!
	if [ "${result}" -gt 0 ]; then
		exit 1
	fi
	# Other tasks
}

main() {
	doAThing
	# Other function calls
}

If the script creates some temporary files that you would like to clean up also when the script fails, then the functions should not exit but return a non-zero value. The calling function should then handle the return value of the function and take the necessary steps to clean up the mess the script has created:

doAThing() {
	local output
	local result
	# Get the output of a task
	output=$(run_a_task)
	# Get the result of the call
	result="$?"
	# If the call failed, return a failure
	if [ "${result}" -gt 0 ]; then
		return 1
	fi
	# Other tasks
	return 0
}

cleanUp() {
	# Do whatever is needed to clean up the mess
	# Exit after the clean-up is complete
	exit 1
}

main() {
	local result
	doAThing
	result="$?"
	if [ "${result}" -gt 0 ]; then
		cleanUp
	fi
	# Other function calls
}
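As an alternative to calling the clean-up function manually after every failing call, Bash’s trap builtin can register the clean-up to run whenever the script exits. This is a sketch of that approach (the temporary file and the tasks are hypothetical):

```shell
# Hypothetical temporary file the script works with
TMP_FILE="$(mktemp)"

cleanUp() {
	# Do whatever is needed to clean up the mess
	rm -f "${TMP_FILE}"
}
# Run cleanUp whenever the script exits, for whatever reason
trap cleanUp EXIT

main() {
	echo "working" > "${TMP_FILE}"
	# Other tasks; an early "exit 1" here would still trigger cleanUp
}

main
```

With the trap in place, the functions can exit as early as they like and the clean-up still happens, which removes the need to thread return values through every caller just for tidying up.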

Move “library” functions to a separate file and load them using source (or .)

Bash (like any other shell) has the same import/include paradigm as many higher-level programming languages. This makes it possible to keep a set of (library) functions in a separate file and then include them in the actual script you are writing. For example, the single-purpose functions can be defined in a separate file (for example, library.bash):

# Contents of library.bash

# Constants
readonly A_THING="foo"
readonly ANOTHER_THING="bar"

doAThing() {
	# Do your thing here, for example
	echo "${A_THING}"
}

doAnotherThing() {
	# Do another thing here, for example
	echo "${ANOTHER_THING}"
}

combinedFunction() {
	# Do both a thing and another thing
	doAThing
	doAnotherThing
}

Then they can be used in the main script file like this:

# Contents of script file
source "path/to/library.bash"
# "source" can be replaced with a period (.) to have the same effect

main() {
	# You can call…
	combinedFunction
	# …any function from that sourced file…
	doAThing
	# …like they would be defined in this file
	doAnotherThing
}

main

This functionality is handy when you have a set of functions that can be used as library functions in other scripts. Furthermore, having the functions in a separate file makes writing (unit) tests for them a lot easier. If all the functions of the script, including the main function, are in one file, as per the guidelines in Kfir’s blog post, testing the individual functions requires duplicating them in the test code.

But if (some of) the functions are in a file separate from the main function, you can test them directly using source:

# Load the functions from the library.bash
source "path/to/library.bash"

testDoAThing() {
	# Call the function and store its output
	local output
	output=$(doAThing)
	# Check the output
	assertEquals "${A_THING}" "${output}"
	assertNotEquals "${ANOTHER_THING}" "${output}"
}

testDoAnotherThing() {
	# Call the function and store its output
	local output
	output=$(doAnotherThing)
	# Check the output
	assertNotEquals "${A_THING}" "${output}"
	assertEquals "${ANOTHER_THING}" "${output}"	
}

# Load the shunit2 unit testing functionality
source "path/to/shunit2"

Obviously, any change in a tested function will automagically be reflected in the tests, and any breaking change will result in failing test cases, thus making regressions easier to spot.

Use shellcheck

Shellcheck is a static analysis tool (a.k.a. a linter) for shell scripts. It can point out constructs that could lead to subtle bugs. Furthermore, its suggestions tend to make the code even clearer and more readable (points 1 and 3 above).

I have learned a lot from the errors shellcheck has pointed out. Sometimes the Bash “feature” you are (ab)using does the thing you wish it to do but shellcheck complains about it. Rewriting the solution to make shellcheck happy actually makes the code safer as the “feature” might have some funky side effects if the input for that feature is not exactly what you expect it to be. That is, trust shellcheck to point out the stupid ideas you had when writing that code.
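For example, one of the most common shellcheck findings, SC2086, is an unquoted variable expansion that silently gets split into several words. Here is a small sketch (with hypothetical function names) of what shellcheck would flag, and the fix:

```shell
# A helper that simply reports how many arguments it received
printArgumentCount() {
	echo "$#"
}

demo() {
	local value="two words"
	# Unquoted: the value is split into two arguments.
	# shellcheck flags this line with SC2086.
	printArgumentCount ${value}
	# Quoted: the value stays a single argument, as intended
	printArgumentCount "${value}"
}
```

Quoting every expansion by default, as shellcheck suggests, removes a whole class of word-splitting and globbing surprises before they ever reach production.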

Summary

To write safe, re-usable, and testable Bash code, you should

  1. Use functions to split the code into small do-one-thing-well components,
  2. Use local variables and minimize the use of (immutable) global variables,
  3. Handle task return values and exit as early as possible,
  4. Have library-like utility functions in a separate file or files and source them, and
  5. Use shellcheck to point out potential issues.

Happy Bashing!

Tags: Bash, Test driven development (TDD), Code safety