Currently at my job, I am writing computer code to model a sophisticated machine at an electric power plant. In order to verify that my code is up to snuff, my boss has required me to learn a new skill – software unit testing.
I originally saw this as a boring, obnoxious chore, but I’ve come to see that it is actually a very useful tool for ensuring whatever code you write operates correctly – guaranteed.
The General Process
The general process, as I have learned to do it thus far over the past 1.5 weeks, goes like this:
- Break your program into a variety of small, modular programs.
- Write a variety of unit tests for each module.
- Write a comment in each test as to what is being tested (you will not remember 6 months from now).
- Now re-unite the modules into your main program, and then write unit tests for the master program.
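The steps above can be sketched with Python's built-in `unittest` module. This is only an illustration, not the code from my job: the functions `celsius_to_fahrenheit` and `mean_temperature_f` are hypothetical stand-ins for a small module and the re-united main program.

```python
import unittest


def celsius_to_fahrenheit(temp_c):
    """Small module: convert a temperature from Celsius to Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def mean_temperature_f(temps_c):
    """Main routine: average several Celsius readings, reported in Fahrenheit."""
    if not temps_c:
        raise ValueError("at least one reading is required")
    return celsius_to_fahrenheit(sum(temps_c) / len(temps_c))


class TestCelsiusToFahrenheit(unittest.TestCase):
    def test_freezing_point(self):
        # Tests a known fixed point: water freezes at 0 C = 32 F.
        self.assertAlmostEqual(celsius_to_fahrenheit(0.0), 32.0)

    def test_crossover_point(self):
        # Tests the point where both scales agree: -40 C = -40 F.
        self.assertAlmostEqual(celsius_to_fahrenheit(-40.0), -40.0)


class TestMeanTemperature(unittest.TestCase):
    def test_mean_of_two_readings(self):
        # Tests the re-united program: mean of 0 C and 100 C is 50 C = 122 F.
        self.assertAlmostEqual(mean_temperature_f([0.0, 100.0]), 122.0)

    def test_empty_input_is_rejected(self):
        # Tests that an empty list raises instead of dividing by zero.
        with self.assertRaises(ValueError):
            mean_temperature_f([])
```

Saved to a file, these tests can be run with `python -m unittest`. Each test carries a one-line comment saying what it checks, as suggested in the third step.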
Perhaps my logic is off, but I think the act of breaking your code into multiple code modules actually reduces your total amount of required testing. Suppose I have a program with 6 inputs, and I try each input at $k$ different values; the number of test cases should be bounded by $k^6$. On the other hand, if I have three code modules with two inputs each, the total number of tests should be bounded by $3k^2$. Readers with more experience on this issue are strongly urged to comment and share their insight.
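To put rough numbers on this argument, here is a small sketch. It assumes a full-factorial design and that every input is tried at the same number of values; the choice of three values per input (say low, typical, high) is mine, purely for illustration.

```python
def full_factorial_count(num_inputs, levels):
    """Number of test cases when every input is tried at `levels` values."""
    return levels ** num_inputs


levels = 3  # assumed: a low, a typical, and a high value for each input

# One monolithic program with 6 inputs.
monolithic = full_factorial_count(6, levels)

# Three modules with 2 inputs each, tested separately.
modular = 3 * full_factorial_count(2, levels)

print(monolithic)  # 729
print(modular)     # 27
```

The modular count leaves out the tests you still write for the re-united main program, so the real saving is somewhat smaller than these two numbers suggest.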
An interesting side effect of all this unit testing is that you end up with more testing code than actual software code. This sounds like a big waste of time, but good reasons are given here and here for writing lots of test code. Software Carpentry reports that a mere sign error in a program led to the retraction of five papers by one unfortunate researcher.
Broad Ideas on What Unit Tests to Write
From what I have read, typically you test your software inputs using a design-of-experiments approach with a variety of inputs. This gives high “coverage” of the set of possible test cases, without going batty from writing too many tests in a full-factorial design. Read more about unit testing in MATLAB here.
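As a rough illustration of the difference, the sketch below compares a full-factorial design with a much smaller Latin-hypercube-style sample, in which each value of each input is used exactly once. This is just one of several possible designs, and I am not claiming it is the best one. Apart from ambient air temperature, the input names and all the values are hypothetical.

```python
import itertools
import random


def full_factorial(levels_per_input):
    """Every combination of every value of every input."""
    return list(itertools.product(*levels_per_input))


def latin_hypercube_sample(levels_per_input, seed=0):
    """Pick n cases so that each value of each input appears exactly once.

    Assumes every input has the same number of values, n.
    """
    n = len(levels_per_input[0])
    if any(len(levels) != n for levels in levels_per_input):
        raise ValueError("all inputs must have the same number of values")
    rng = random.Random(seed)  # fixed seed so the test cases are repeatable
    columns = []
    for levels in levels_per_input:
        shuffled = list(levels)
        rng.shuffle(shuffled)
        columns.append(shuffled)
    # Row i of the design takes the i-th shuffled value of each input.
    return list(zip(*columns))


ambient_temp_f = [-20.0, 32.0, 70.0, 110.0]
pressure_psia = [13.0, 14.0, 14.7, 15.0]    # hypothetical input
humidity_pct = [10.0, 40.0, 70.0, 100.0]    # hypothetical input
levels_per_input = [ambient_temp_f, pressure_psia, humidity_pct]

print(len(full_factorial(levels_per_input)))          # 64
print(len(latin_hypercube_sample(levels_per_input)))  # 4
```

The smaller design still exercises every value of every input, but it no longer covers every combination, which is the trade-off being made.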
Some ideas I learned from the internet, for what test cases to include:
- Try an alternate method of computation for a simple case, and compare it to your code. Do the outputs match?
- Try extremely small, extremely large, exactly-zero, and negative inputs. If such inputs are not allowed in your code, dummy-proof your code to stop such inputs from entering.
- Try to think up upper and lower bounds on feasible inputs, and clamp them accordingly. For example, one of the inputs to my program is the ambient air temperature. There are many possible bounds one could pick for this value. For one, oxygen starts to liquefy at -297.3 F, so that is one extreme lower bound the program should accept. The lowest temperature ever recorded on Earth was -128.6 F, which is also a logical choice of lower bound.
- While I can’t go into details on my work, I was able to identify a suitable upper bound on air temperature as well before my program became numerically unstable. A little bit of logic was required to arrive at this upper bound.
- Physical bounds and transition points make for realistic choices of bounds, for example, the aforementioned boiling point of oxygen as a lower bound on inlet air temperature. The dissociation temperatures of various gases are also possible choices, e.g. when a temperature must be specified for a given flow stream of a gas.
- Ruthlessly comment your code. Your future self will be forever grateful.
- For those readers using MATLAB, I highly recommend the MATLAB program “xUnit” for doing unit testing.
- A painful method, for complex science and engineering projects, is to compute several quantities of interest by hand, and compare them with your program’s output.
- Be very wary when your code requires iterative solution of a subproblem. The behavior of iterative solvers is very complicated, and they are a prime spot for something to go wrong. It is best to restrict your inputs such that your code always operates in a region where convergence is guaranteed – but ensuring you’re in that region is very difficult to do.
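A few of the ideas in this list can be sketched in a handful of lines. The example below is a toy and not my actual model. The lower bound is the oxygen figure quoted above, while the upper bound `MAX_AIR_TEMP_F` is a made-up placeholder. The validator rejects out-of-range inputs rather than clamping them, which is one of two reasonable choices. The Newton square root is a stand-in for an iterative subproblem, with an iteration cap so that a failure to converge is reported instead of looping forever.

```python
import math

OXYGEN_BOILING_POINT_F = -297.3  # lower bound discussed in the list above
MAX_AIR_TEMP_F = 150.0           # hypothetical upper bound, for illustration


def validate_air_temperature_f(temp_f):
    """Reject non-finite or out-of-range ambient air temperatures."""
    if not math.isfinite(temp_f):
        raise ValueError("temperature must be a finite number")
    if not OXYGEN_BOILING_POINT_F <= temp_f <= MAX_AIR_TEMP_F:
        raise ValueError("temperature is outside the supported range")
    return temp_f


def newton_sqrt(value, tol=1e-12, max_iter=50):
    """Square root by Newton's method, with a cap on the iteration count."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return 0.0
    x = value if value >= 1 else 1.0  # starting guess at or above the root
    for _ in range(max_iter):
        x_next = 0.5 * (x + value / x)
        if abs(x_next - x) <= tol * x_next:
            return x_next
        x = x_next
    raise RuntimeError("Newton iteration did not converge")


# Alternate method of computation: compare against the library square root
# for an extremely small, an ordinary, and an extremely large input.
for value in (1e-12, 2.0, 1e12):
    assert math.isclose(newton_sqrt(value), math.sqrt(value), rel_tol=1e-9)

# Exactly-zero and negative inputs.
assert newton_sqrt(0.0) == 0.0
try:
    newton_sqrt(-1.0)
except ValueError:
    pass  # expected: negative inputs are stopped at the door
else:
    raise AssertionError("negative input was not rejected")
```

In a real project these checks would live in unit tests rather than as bare assertions at module level.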