Yesterday, I described a leaky abstraction in the Fixie test framework. I introduced a complex and viral solution to a simple and localized problem. Eventually, it caused a bug: consumers who customized Fixie in just the right way could experience apparently-passing tests even though exceptions were thrown during test setup. That unacceptable behavior had to be fixed. To make matters worse, my test coverage wasn’t enough to catch the bug; all of Fixie’s self-tests were green. Today, I’ll share the approach I took to safely mend Fixie’s flat tire.
We use test frameworks so that we can make changes with confidence. I’ve used Fixie to test Fixie, in order to give me the confidence to make changes. A fundamental bug amidst all-green tests was a slap to the face. I immediately lost confidence in my ability to make changes, yet I had to make a significant change.
The solution came in several small steps over the course of a week:
- Add legitimate test coverage, revealing current behavior including the bug. The build fails.
- Apply a local fix to the immediate bug. The build passes.
- Find and clean up the smallest part of the ugly pattern that could be phased out on its own, while keeping all the tests green.
- GOTO 3
Legitimate Test Coverage
Because my current test coverage was no longer giving me confidence in Fixie’s most fundamental behaviors, I needed to start by shoring up my tests. I needed test coverage which would confirm the current correct behavior while also exposing the bug.
The features in question all had to do with treatment of a test class’s lifecycle: end users can customize test class construction; customize frequency of construction; and wrap behaviors around each test method, around each test class instance, and around each test class.
The customization aspect led to my insufficient test coverage. I offer several degrees of freedom, but I need to promise that we’ll do something reasonable when end users exercise that freedom. There are a lot of combinations you can produce with those building blocks. “Create an instance once per test, wrapping each test in a setup/teardown,” “Create an instance once per test class, using a custom factory method, wrapping each instance in a fixture setup/teardown and each test in a transaction,”…
Testing these lifecycles gets extra interesting when you want to prove what happens when an end user’s own test code fails. What happens when a test class constructor fails? When a custom factory fails? When a setup fails? When test teardown fails and class teardown fails and disposal fails?
The number of combinations grows quickly, and I wanted to test them all. Normally, testing everything has diminishing returns, but these combinations are the whole point of this project, so they deserve it. Also, the problem I wanted to solve deals with my exception handling scheme, which directly affects the test class lifecycle.
I mitigated the combinatorial explosion by recognizing that if I covered some features (like construction and custom factories) with a few targeted tests, I wouldn’t have to bother re-testing those in combination with all the other scenarios. Still, I was left with many scenarios to test.
My test classes work with a nested hypothetical end user test class, containing a passing test and a failing test of its own:
The WhereAmI() helper method simply writes the member to the console. The constructor writes “.ctor”, the Pass() method writes “Pass”, etc. From the outside, I can force any of these members to deliberately throw some generic exception, in order to prove what happens when end users’ test classes throw exceptions at any step.
When Fixie tests itself, it executes the full lifecycle of this sample test class, given an example customization of the lifecycle. All console output is captured so that we can assert on the exact order the members were hit. The overall results of the mini test run are also collected, so we can assert on which tests passed, which tests failed, and which exceptions were the cause of each expected test failure. It’s a little mind-bendy. Fixie’s own tests fire up test runs of this hypothetical mini test class. This way, I can make positive assertions about what happens when your test classes go negative.
Finally, I had meaningful coverage of the test class lifecycle. I had confidence that I could dramatically change my exception handling scheme without breaking all of these user-facing promises.
The Fix: Preserving Stack Traces
Yesterday’s mess resulted from the desire to rethrow the InnerException within a TargetInvocationException:
Alas, rethrowing like this destroys the stack trace we want to report to the user, so I had done this:
Returning the ExceptionList to the caller started the complexity virus I described yesterday. The real fix is to wrap-and-rethrow within an exception type devoted to this purpose, PreservedException:
When I later catch exceptions, I do have to take care to unwrap the OriginalException before reporting to the user. Why not just let the TargetInvocationException throw to that same catch block? If my end-of-the-line catch block sees a TargetInvocationException, I don’t know today and forevermore that it came from this call to MethodInfo.Invoke(). If I catch a PreservedException, though, I know that it is mine and that I should unwrap it.
This approach is structurally similar to NUnit. xUnit appears to poke a private field on the Exception type instead. Still, we can do even better. In yesterday’s post, super-helpful commenter jdogg13 alerted me to the new ExceptionDispatchInfo type, which can rethrow an exception while preserving its stack trace. Fortunately all my work here doesn’t go to waste. My solution will simply get shorter and simpler: the innermost catch will use ExceptionDispatchInfo, and the end-of-the-line catch block won’t have to look for PreservedException as a special case.
At this point, I had meaningful coverage and a fix to apply at the deepest part of the mess. Still, I took several small steps to apply it. I identified the smallest place where I could partially apply the fix: exception handling around the attempt to Dispose() an instance of a test class. All the tests passed, so I grabbed the next smallest place, and repeated until I was finished:
- Fix test class disposal. Tests pass. Commit
- Fix test class construction. Tests pass. Commit
- Fix test method invocation. Tests pass. Commit
- Simplify the original buggy class. Tests pass. Commit
- Simplify. Tests pass. Commit
Each commit dramatically simplified yesterday’s mess. Ugly helper classes in support of the mess melted away. The confusing, paranoid “Did we fail yet?” checks are gone. I dramatically “rewrote” the innermost secrets of the system, but at no time did I take a large step. The bug is fixed, and I have coverage I can actually rely on going forward. Now that I’m back to writing simple idiomatic C#, there’s room for even more simplification. Phew!