Why Testing code should be Laissez-faire

I’ve been working on the HFA Library some more, and got to the point now were I really want to start testing it deeply. Since HFA is in Haskell, the weapon of choice for testing pure code is QuickCheck. The interesting bit about QC is that test data is fundamental random. This is great for two reasons.

1) It forces you to actually test for _general_ properties of objects/functions, rather than specific properties of certain instances of those structures, like a unit test does
2) By Generating Random Data, you don’t have to worry about being too nice or too mean to your program, as with unit testing.

I’ll state it now, for the record, Unit Testing is great. I wouldn’t test my Java (or soon enough, Scala, ^_^) programs with (mostly) anything else. However, when you have referential transparency on your side, QuickCheck is a superior system.

The real brilliance of QuickCheck though, is that it is extensible enough that you can define new “checkable” items, that is, instead of having to use the standard checkable types when checking a function, you can define how QuickCheck can generate random data for a particular type by making it an instance of the Arbitrary class, which is defined by QuickCheck. This means that, as long as you can define a couple of methods for your datatype, it is damn near trivial to have QC generate examples of your datatype and test them quickly.

Why is this good? Well, consider you’re writing unit tests for your code. You’ve been intimately involved with this mangled peice of imperatively worded text for days and weeks. You know every inch of it, and you have in your mind the next new feature you want to add. It is not uncommon (at least for me) to begin writing and toying with the code in your mind, figuring out where potential bugs might be. As a human being, you are typically not looking to make things harder for yourself than needbe. So maybe, when you’re writing those unit-tests that will guide your programming to ensure that the code is written correctly, you — maybe subconciously, maybe not — decide to make those few unit tests a little bit easier on the part of the code that you think is more fragile. I’m guilty of this, certainly, and I’m sure if you’re honest with yourself and you’ve developed a good bit of code with the test-first methodology (which I like only mildly less than the test-shortly-after-you-realize-you’re-supposed-to-write-tests-first methodology), that you would find that you’ve done it too. QuickCheck fixes that wagon pretty quickly, you don’t get to reason about how hard some tests will be on the code, QuickCheck enforces a “Hands Off” or “Laissez-faire” (I’m french, and I like history, sue me.) form of testing. You don’t get to decide what is tested, just what the result should be, which is how it _should_ be done. I _shouldn’t_ be thinking about what data I want to test, I shouldn’t have to write all the test-data, ideally, I should only have to say, “minimizing a DFA twice is equivalent to minimizing a DFA once” or “if the regular expression foo is generated by DFA f, and the expression foo’ is generated by the minimized version of f, then foo == foo’ for all f::DFA.” I guess the point is, the computer is unbiased, it won’t be nice to the code, it won’t be mean to it, it’ll be fair. Is that to say that coders will be biased towards or against their code? Yes, it is, we spend alot of time with these projects, we develop a vested interest in seeing them work, finding out that you did something wrong can be discouraging. Granted, small things may not devastate you, like using the wrong function name, or misplacing a variable. But if you’re unit test catches a major flaw in the structure of a program you designed, that represents alot of work that just got blown to peices. At the very least, if not for your pride, testing systems like QC are good for your productivity, they allow you to test a (potentially) huge number of arbitrary cases every time you run your program. Heck, you could even have QC running live against your code in the background, telling you in real time what tests your failing, what cases gave you failures, etc. All of that is Automatic and Vital data, its 24/7/365 testing of your code, for free. Slowly building assurance that your code is, in fact, doing the right thing on all inputs.

By the way, Yes, I know many, many people have written about QuickCheck before, and about how wonderful it is, but I think it’s always worth saying again. Good Ideas deserve to be talked about, QuickCheck is a _very_ good idea.

Published in: on August 10, 2007 at 6:35 am  Comments (5)