Tiny Steps: Creating Fixie 2.0


    With the recent release of .NET Core, it’s time to upgrade the Fixie test framework. Fixie needs to support the new project structure, tooling, and cross-platform behavior introduced by .NET Core: not only should developers of this project benefit from all the new things, but more importantly end users should also be able to use Fixie to test their own .NET Core projects, and even do so while developing them outside of Windows. That’s no small feat. Right now, I’m about halfway through that effort, which you can track on Fixie’s GitHub issue #145. For the bigger picture, see the new Roadmap.

    Recently, I gave a talk about .NET Core and the challenges it poses for test frameworks like this one.

    — Well, that’s a lie. —

    It’s not about that at all. It’s really about how to recognize a big family of architectural problems in software projects, and how to go about solving those problems in tiny steps:

    Tiny Steps – Creating Fixie 2 from Patrick Lioi on Vimeo.

    Powerful Integration Testing


    The Fixie test framework has been in production use for over a year now, and I’ve had a chance to work with it on a number of real world projects as well as a large project developed for training purposes. In the last few months, I’ve refined my recommended integration testing strategy in light of what I’ve learned. Today, we’ll see how to structure your tests so that they can adhere to a few driving principles:

    • Matching production as closely as possible.
    • Helping our teammates fall into the pit of success.
    • Brevity.

    We’ll integrate Fixie with AutoMapper, Respawn, StructureMap, FluentValidation, Mediatr, AutoFixture, and Entity Framework to see how it all fits together for a realistic testing experience. Fair warning: by being realistic, this example will be fairly long, but seeing it all really work together is kind of the point.

    The Sample Application

    Imagine a typical ASP.NET MVC application for maintaining a contact list.

    I will assume some familiarity with Mediatr, which keeps our controller tiny. If you haven’t seen it before, think of Mediatr as a bit of reflection that allows controller actions to be short: “Please, somebody, handle this request!” Mediatr finds the corresponding “Handler” class and invokes it to do the work.

    Why talk about Mediatr in an article on testing? By letting us pull the meat of a feature out of the controller and into separate classes, we wind up with classes that are easier to construct and invoke from a test than an ASP.NET MVC controller would be. No HttpContext, no insane mocking. In other words, Mediatr enables the meat of an action to be pulled out to the side to be tested in isolation, and lets the controller focus entirely on routing concerns.

    Here’s our sample domain:
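
    A minimal sketch of what that domain might look like, assuming an Entity base class with a Guid Id; the specific properties are placeholders:

        using System;

        public abstract class Entity
        {
            public Guid Id { get; set; }
        }

        public class Contact : Entity
        {
            public string Name { get; set; }
            public string Email { get; set; }
            public string PhoneNumber { get; set; }
        }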

    …and our controller:
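
    A sketch of the kind of controller Mediatr makes possible, assuming the synchronous IMediator of the Mediatr versions of that era and the ContactEdit wrapper types introduced just below:

        using System.Web.Mvc;
        using MediatR;

        public class ContactsController : Controller
        {
            private readonly IMediator _mediator;

            public ContactsController(IMediator mediator)
            {
                _mediator = mediator;
            }

            public ActionResult Edit(ContactEdit.Query query)
            {
                var model = _mediator.Send(query);

                return View(model);
            }

            [HttpPost]
            [ValidateAntiForgeryToken]
            public ActionResult Edit(ContactEdit.Command command)
            {
                _mediator.Send(command);

                return RedirectToAction("Index");
            }
        }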

    We’ll also pull a funny trick so that most of the “Contact Edit” feature can go into a single file. Instead of having many similarly named files like ContactEditQuery, ContactEditQueryHandler, ContactEditCommand, ContactEditCommandHandler… we’ll introduce one wrapper class named after the feature, ContactEdit, and place short-named items within it, each named after their role:
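
    A sketch of that wrapper, assuming Mediatr's synchronous handler interfaces of the era; the properties and validation rules are illustrative, and an AutoMapper Profile mapping Contact to Command and back is assumed to exist elsewhere:

        using System;
        using AutoMapper;
        using FluentValidation;
        using MediatR;

        public class ContactEdit
        {
            public class Query : IRequest<Command>
            {
                public Guid Id { get; set; }
            }

            public class QueryHandler : IRequestHandler<Query, Command>
            {
                private readonly ContactsContext _database;

                public QueryHandler(ContactsContext database)
                {
                    _database = database;
                }

                public Command Handle(Query message)
                {
                    var contact = _database.Contacts.Find(message.Id);

                    return Mapper.Map<Contact, Command>(contact);
                }
            }

            public class Command : IRequest<Unit>
            {
                public Guid Id { get; set; }
                public string Name { get; set; }
                public string Email { get; set; }
                public string PhoneNumber { get; set; }
            }

            public class Validator : AbstractValidator<Command>
            {
                public Validator()
                {
                    RuleFor(x => x.Name).NotEmpty().WithMessage("Please provide a name.");
                    RuleFor(x => x.Email).NotEmpty().WithMessage("Please provide an email address.");
                }
            }

            public class CommandHandler : IRequestHandler<Command, Unit>
            {
                private readonly ContactsContext _database;

                public CommandHandler(ContactsContext database)
                {
                    _database = database;
                }

                public Unit Handle(Command message)
                {
                    var contact = _database.Contacts.Find(message.Id);

                    Mapper.Map(message, contact);

                    // SaveChanges() happens in CloseTransaction() at the end of the request.
                    return Unit.Value;
                }
            }
        }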

    Where does our Entity Framework DbContext subclass, ContactsContext, get saved? So that we never have to think about it again, we’ll establish a Unit of Work per web request with a globally-applied filter attribute:
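
    A sketch of such a filter, assuming the per-request ContactsContext is resolved through MVC's DependencyResolver and the filter is registered globally at startup:

        using System.Web.Mvc;

        // Registered globally, e.g. GlobalFilters.Filters.Add(new UnitOfWork());
        public class UnitOfWork : ActionFilterAttribute
        {
            public override void OnActionExecuting(ActionExecutingContext filterContext)
            {
                Database().BeginTransaction();
            }

            public override void OnActionExecuted(ActionExecutedContext filterContext)
            {
                Database().CloseTransaction(filterContext.Exception);
            }

            private static ContactsContext Database()
            {
                return DependencyResolver.Current.GetService<ContactsContext>();
            }
        }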

    DbContext doesn’t provide these convenient BeginTransaction() / CloseTransaction(Exception) methods: they’re custom. We need to handle the web request throwing an exception before the end of the request, as well as the case where the request succeeds up to that point but then fails during SaveChanges(), committing only when all of that actually works:
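
    Here is a sketch of what those custom methods might look like on top of EF6's Database.BeginTransaction(); the exact error handling is illustrative:

        using System;
        using System.Data.Entity;

        public class ContactsContext : DbContext
        {
            private DbContextTransaction _currentTransaction;

            public DbSet<Contact> Contacts { get; set; }

            public void BeginTransaction()
            {
                _currentTransaction = Database.BeginTransaction();
            }

            public void CloseTransaction(Exception exception = null)
            {
                try
                {
                    if (exception == null)
                    {
                        // Only commit when the request and SaveChanges() both succeed.
                        SaveChanges();
                        _currentTransaction.Commit();
                    }
                    else
                    {
                        _currentTransaction.Rollback();
                    }
                }
                finally
                {
                    _currentTransaction.Dispose();
                    _currentTransaction = null;
                }
            }
        }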

    Assume, as well, that we’re using StructureMap as our IoC container, and that in order to get one nested container per web request, we leverage a package like StructureMap.MVC5 to handle that challenging setup for us.

    To sum up, the application has one nested IoC container per web request, one transaction per web request, and one DbContext per web request. We see code defining our form’s model, validation rules for that model, and handlers that actually do the work of fetching and saving a contact. Now, we’re ready to test the Contact Edit feature.

    The Testing Convention

    We customize Fixie by adding a Convention subclass to our project.
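
    A sketch of that convention, written against the Fixie 1.x-style convention API; treat the exact hook names (ClassExecution.Wrap, CaseExecution.Wrap, Parameters.Add) as assumptions:

        using System.Threading.Tasks;
        using Fixie;

        public class TestingConvention : Convention
        {
            public TestingConvention()
            {
                Classes
                    .NameEndsWith("Tests");

                Methods
                    .Where(method => method.IsPublic &&
                                     (method.ReturnType == typeof(void) ||
                                      method.ReturnType == typeof(Task)));

                ClassExecution
                    .CreateInstancePerCase()
                    .Wrap<InitializeAutoMapper>();

                CaseExecution
                    .Wrap<ResetDatabase>()
                    .Wrap<NestedContainerPerCase>();

                Parameters
                    .Add<AutoFixtureParameterSource>();
            }
        }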

    Each of the classes this references, InitializeAutoMapper, ResetDatabase, NestedContainerPerCase, and AutoFixtureParameterSource, is a custom class included in the test project. We’ll see each of them in detail later.

    At a glance though, we can describe our testing style to a new team member by scanning this class:

    A test class is a class whose name ends with “Tests”. A test method is any public void or async method within such a class. Whenever a test class runs, we’ll ensure AutoMapper has already been initialized. Whenever a test case runs, we’ll first reset the contents of the database, and we’ll wrap the whole test case in a nested IoC container. When test methods have parameters, they’ll be created and filled by AutoFixture.

    AutoMapper

    When using AutoMapper, to make your property mapping code error- and future-proof, you want to ensure that it is initialized once and that it will automatically enlist any AutoMapper profile classes. Here’s a wrapper for AutoMapper’s own initialization code. I’d include this in any web application and invoke it during application startup:
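
    A sketch of that wrapper, assuming the static Mapper.Initialize API of the AutoMapper versions of that era; the Lazy<T> is what guarantees one-time initialization:

        using System;
        using AutoMapper;

        public static class AutoMapperBootstrapper
        {
            private static readonly Lazy<object> Initializer =
                new Lazy<object>(InitializeOnce, isThreadSafe: true);

            public static void Initialize()
            {
                // Touching the Lazy runs the real initialization exactly once.
                GC.KeepAlive(Initializer.Value);
            }

            private static object InitializeOnce()
            {
                // Scan this assembly for Profile classes so new profiles enlist automatically.
                Mapper.Initialize(cfg => cfg.AddProfiles(typeof(AutoMapperBootstrapper).Assembly));

                return new object();
            }
        }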

    Here, we’re playing a few tricks to ensure that AutoMapper definitely only gets initialized once. Our production code needs to use this to initialize AutoMapper early in its life.

    In order to match production as closely as possible during our tests, we ought to execute the same code at test time, very early in the life of any particular test class execution. The first thing our testing convention needs, then, is a definition for how to ensure AutoMapper gets initialized before any test class runs. We saw the rule mentioned earlier, and here is its implementation, which we drop into the test project near the TestingConvention:
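
    A sketch of that behavior, assuming a Fixie 1.x class-level wrapping hook whose Execute method receives the running test class and a continuation; the signature shown is an assumption:

        using System;
        using Fixie;

        public class InitializeAutoMapper : ClassBehavior
        {
            public void Execute(Class context, Action next)
            {
                AutoMapperBootstrapper.Initialize();

                next();
            }
        }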

    In other words,

    Whenever we’re about to run a test class, call AutoMapperBootstrapper.Initialize() first, and then proceed with actually running the tests in that test class.

    Since we already ensured that AutoMapperBootstrapper.Initialize() will only ever really initialize things once, there’s no real cost to invoking it once per test class. We just needed to invoke the bootstrapper at least once very early in the life of each test.

    Additionally, we’ll add a test that will fail if any of our custom AutoMapper rules don’t make sense:
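
    That test can lean on AutoMapper's own AssertConfigurationIsValid(), which throws when any mapped destination member goes unmatched:

        using AutoMapper;

        public class AutoMapperTests
        {
            public void ShouldHaveValidConfiguration()
            {
                Mapper.AssertConfigurationIsValid();
            }
        }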

    Respawn

    It’s important that integration tests be independent, and when your tests hit a database, that means starting each test from a well-known state. The simplest well-known state is empty, and that’s where Respawn comes into the picture. Respawn can nuke every row from our database prior to each test. The only records that exist are the ones our test puts there.

    Our convention claims to reset the database with every test case. We saw the rule mentioned earlier, and here is its implementation:
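
    A sketch of that behavior, assuming Fixie's case-level wrapping hook and the synchronous Checkpoint.Reset(…) of the Respawn versions of that era; the connection string name is a placeholder:

        using System;
        using System.Configuration;
        using Fixie;
        using Respawn;

        public class ResetDatabase : CaseBehavior
        {
            private static readonly Checkpoint Checkpoint = new Checkpoint();

            public void Execute(Case context, Action next)
            {
                var connectionString =
                    ConfigurationManager.ConnectionStrings["ContactsContext"].ConnectionString;

                // Wipe every row so the test starts from an empty, well-known state.
                Checkpoint.Reset(connectionString);

                next();
            }
        }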

    In other words,

    Whenever we’re about to run a test case, call Respawn’s Reset(…) first, and then proceed with actually running the test case.

    StructureMap

    Earlier, I said that I apply the package StructureMap.MVC5 to a web application to help get the ball rolling on integrating StructureMap with MVC. I like to customize the code that package places in my system before proceeding. As with AutoMapper, I want the production code to include a class that simply initializes StructureMap exactly once at application startup:
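
    A sketch of that class; the StructureMap scanning conventions shown, and the decision to connect handler and validator implementations here, are illustrative:

        using System;
        using FluentValidation;
        using MediatR;
        using StructureMap;

        public static class IoC
        {
            private static readonly Lazy<IContainer> Bootstrapper =
                new Lazy<IContainer>(Initialize, isThreadSafe: true);

            public static IContainer Container => Bootstrapper.Value;

            private static IContainer Initialize()
            {
                return new Container(cfg =>
                {
                    cfg.Scan(scan =>
                    {
                        scan.AssemblyContainingType<ContactsContext>();
                        scan.WithDefaultConventions();
                        scan.ConnectImplementationsToTypesClosing(typeof(IRequestHandler<,>));
                        scan.ConnectImplementationsToTypesClosing(typeof(IValidator<>));
                    });

                    // Mediatr's IMediator and its factory delegates would also be
                    // registered here; omitted for brevity.
                });
            }
        }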

    I claimed that installing StructureMap.MVC5 will set up “one nested IoC container per web request” in our production code. Each web request will get its own little bubble of dependency creation. For instance, this gives me exactly one DbContext per web request, which satisfies our Unit of Work pattern.

    I want my tests to mimic production as much as possible, so I similarly want one nested IoC container per test case. A test case mimics one user interaction, so it better run in the same kind of environment as one actual user interaction!

    Our convention claims to set up one nested container per test case. We saw the rule mentioned earlier, and here is its implementation:
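
    A sketch of that behavior; the CaseBehavior signature and the static CurrentContainer property (which the test DSL below will lean on) are assumptions:

        using System;
        using Fixie;
        using StructureMap;

        public class NestedContainerPerCase : CaseBehavior
        {
            // The nested container for the currently executing test case.
            public static IContainer CurrentContainer { get; private set; }

            public void Execute(Case context, Action next)
            {
                using (var nested = IoC.Container.GetNestedContainer())
                {
                    CurrentContainer = nested;

                    try
                    {
                        next();
                    }
                    finally
                    {
                        CurrentContainer = null;
                    }
                }
            }
        }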

    In other words,

    Wrap each test case in a new nested IoC container dedicated to that test case, just like the one that each web request has in production.

    Note that this works by invoking the IoC class. In other words, every single test case exercises the same IoC setup that we’re applying in production. Again, the use of Lazy saves us from actually being wasteful about it.

    AutoFixture

    Within a test, we often need to construct a sample object and fill in its properties with fake data. Doing so explicitly is annoying and fails to be future-proof, so we can defer to AutoFixture to construct-and-fill our dummy test objects for us.

    We could invoke this tool explicitly within our tests, but we can do better. Our convention claims that test method parameters come from AutoFixture. We saw the rule mentioned earlier, and here is its implementation:
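
    A sketch of that parameter source, assuming Fixie 1.x's ParameterSource interface and AutoFixture's SpecimenContext for building arbitrary parameter types; the [AllEntities] branch is elided here and described later:

        using System;
        using System.Collections.Generic;
        using System.Linq;
        using System.Reflection;
        using Fixie;
        using Ploeh.AutoFixture;
        using Ploeh.AutoFixture.Kernel;

        public class AutoFixtureParameterSource : ParameterSource
        {
            public IEnumerable<object[]> GetParameters(MethodInfo method)
            {
                var parameters = method.GetParameters();

                if (!parameters.Any())
                    yield break;

                var fixture = new Fixture();

                // Enlist any AutoFixture customizations found in the test assembly.
                var customizations = method.DeclaringType.Assembly
                    .GetTypes()
                    .Where(type => typeof(ICustomization).IsAssignableFrom(type) &&
                                   !type.IsAbstract &&
                                   type.GetConstructor(Type.EmptyTypes) != null)
                    .Select(type => (ICustomization)Activator.CreateInstance(type));

                foreach (var customization in customizations)
                    fixture.Customize(customization);

                var context = new SpecimenContext(fixture);

                // One test case, with each parameter constructed and filled by AutoFixture.
                yield return parameters
                    .Select(parameter => context.Resolve(parameter.ParameterType))
                    .ToArray();
            }
        }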

    We’ll see this AllEntities attribute come into the picture a bit later. At this point, we can focus on the primary purpose of the AutoFixtureParameterSource. In other words,

    Whenever a test case has input parameters, the parameters will be constructed and filled with fake values using AutoFixture, including any AutoFixture customizations found in the test project.

    FluentValidation

    We saw a FluentValidation validator class earlier as a part of our Edit feature. For tests, I like to include a few extension methods so that we can make expressive assertions about validation rules:
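
    A sketch of the kind of extension methods I mean, assuming Shouldly-style assertions; the method names are illustrative:

        using System.Linq;
        using FluentValidation;
        using Shouldly;

        public static class ValidationAssertions
        {
            // Assert that the message passes validation outright.
            public static void ShouldBeValid<T>(this IValidator<T> validator, T message)
            {
                var result = validator.Validate(message);

                result.Errors.Select(failure => failure.ErrorMessage).ShouldBeEmpty();
            }

            // Assert that validation fails with exactly these messages, no more and no less.
            public static void ShouldHaveFailures<T>(this IValidator<T> validator, T message,
                                                     params string[] expectedErrorMessages)
            {
                var result = validator.Validate(message);

                result.IsValid.ShouldBe(false);
                result.Errors.Select(failure => failure.ErrorMessage)
                      .ShouldBe(expectedErrorMessages);
            }
        }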

    FluentValidation has a few built-in assertion helpers, but in my experience they make it far too easy to write a test that is green even while the validation rule under test is wildly wrong. Our own assertion helpers make it clear: with this sample form submission, we get exactly these expected failure messages.

    Create a Testing DSL

    C# 6 includes a feature in which the static methods of a class can be imported to a code file. A using directive can now take on the form using static Some.Static.Class.Name;

    When you use such a using directive, your code file gets to call the static members of that class without having to prefix them with the class name. For our tests, we’ll take advantage of this new syntax to define a little Domain Specific Language. In our test project, near the TestingConvention, we’ll add a static class of helper methods. Note how these greatly leverage the infrastructure we’ve already set up:
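
    A compact sketch of such a helper class, leaning on the NestedContainerPerCase and ContactsContext sketches above; the class name Testing is a placeholder, the Send call assumes the synchronous IMediator of the era, and the production-style validation check inside Send is only hinted at:

        using System;
        using FluentValidation;
        using MediatR;
        using StructureMap;

        public static class Testing
        {
            private static IContainer Container => NestedContainerPerCase.CurrentContainer;

            public static T Resolve<T>() => Container.GetInstance<T>();

            public static void Inject<T>(T instance) where T : class => Container.Inject(instance);

            public static void LogSql() => Resolve<ContactsContext>().Database.Log = Console.Write;

            // Run some work against the database in its own context and transaction.
            public static void Transaction(Action<ContactsContext> action)
            {
                using (var database = new ContactsContext())
                {
                    try
                    {
                        database.BeginTransaction();
                        action(database);
                        database.CloseTransaction();
                    }
                    catch (Exception exception)
                    {
                        database.CloseTransaction(exception);
                        throw;
                    }
                }
            }

            // Seed the freshly-respawned database, saving all sample entities with a single DbContext.
            public static void Save(params object[] entities)
            {
                Transaction(database =>
                {
                    foreach (var entity in entities)
                        database.Set(entity.GetType()).Add(entity);
                });
            }

            // Inspect the database during assertions, in a dedicated context and transaction.
            public static TResult Query<TResult>(Func<ContactsContext, TResult> query)
            {
                var result = default(TResult);

                Transaction(database => result = query(database));

                return result;
            }

            // Get the same validator instance the production pipeline would use.
            public static IValidator<T> Validator<T>() => Resolve<IValidator<T>>();

            // Send a message through Mediatr, within the test's nested container.
            // (The real helper also fails the test when the message wouldn't pass validation.)
            public static TResponse Send<TResponse>(IRequest<TResponse> message)
            {
                return Resolve<IMediator>().Send(message);
            }
        }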

    We’ve got a lot going on here.

    First, we can interact with the one nested IoC container per test case, in order to resolve types in our tests. If one of our tests needs to swap in a fake implementation of some interface, it can call Inject(…).

    Second, if we ever want a test to temporarily dump the generated Entity Framework SQL to the console, we can call LogSql() inside that test.

    Third, we have some helpers for working against the database. Transaction(…) lets you interact with the database in a dedicated transaction. Save(…) lets you initialize your recently-respawned database with a few entities. Save(…) also protects your fellow teammates from mistakenly abusing the DbContext involved by requiring that you save all your sample entities with a single DbContext. Query(…) lets you inspect your database during assertions after exercising the system under test.

    Fourth, Validator(…) lets us get a handle on the same validator instance that would be used prior to entering a controller action. We want a test case to mimic a controller action. If our validation rule tests construct a validator explicitly, they would be missing out on the chance to catch poor validator declarations that reference the wrong model. This helper method lets us exercise the same validator lookup that would happen right before a controller action gets invoked in production.

    Lastly, we integrate with Mediatr in our tests with the Send(…) helper. Our tests can send a message through Mediatr just like our controller actions can, and when our tests do so they operate in their own Unit of Work just like production. Additionally, our test will only be allowed to Send(…) when the sample form would have passed validation, which keeps us from writing misleading tests for scenarios that would never happen in production.

    Actually Write A Test, Darnit!

    Enough infrastructure, let’s write some tests! First, I want to be confident that when the user goes to edit a Contact, they’ll see the form with the right values populated for that selected Contact:
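
    A sketch of that first test, written against the DSL helpers above (via using static) and Shouldly assertions; AutoFixture supplies the two Contact parameters:

        using Shouldly;
        using static Testing;

        public class ContactEditTests
        {
            public void ShouldGetSelectedContactForEditing(Contact contactToEdit, Contact anotherContact)
            {
                Save(contactToEdit, anotherContact);

                var result = Send(new ContactEdit.Query { Id = contactToEdit.Id });

                result.Id.ShouldBe(contactToEdit.Id);
                result.Name.ShouldBe(contactToEdit.Name);
                result.Email.ShouldBe(contactToEdit.Email);
                result.PhoneNumber.ShouldBe(contactToEdit.PhoneNumber);
            }
        }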

    Recall what all is actually going on here.

    Respawn steps in to nuke any existing database records. We’re starting this test from a clean slate every time it runs.

    AutoFixture steps in to fully populate our incoming Contact instances with essentially random data. I don’t care what the property values are. I only care that they’re filled in with something.

    We set up the well-known state of our database with the Save(…) helper. This works in its own DbContext and its own transaction, since this is just setup code rather than the system under test.

    We execute the system under test by calling Send(…), passing in the same request object that would be used in production to select a Contact for editing. This operates within the nested container mimicking production, in a database transaction mimicking production. We’re not just calling the handler’s Execute(…) method. We’re exercising the whole pipeline that our controller action would execute in production.

    Finally, we just assert that the view model we got back is fully populated with the right Contact’s information. Since we saved two Contacts and fetched one, we have confidence that our query actually works.

    Testing Validation Rules

    Most validation rule tests out there are horrifically useless. They say things like, “With this sample form, the such and such property should report some error of some kind.” Such a test seems to be testing something, but it’s so vague that you wind up being able to get a passing test even when everything is buggy.

    Instead, let’s actually assert that the validation rule fails for the reason we think it is set up to fail, by asserting on the error message too!
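
    A sketch of such a test, living in the same ContactEditTests class as above and asserting on the exact failure message declared by the validator sketch earlier:

        public void ShouldRequireName(ContactEdit.Command form)
        {
            form.Name = null;

            Validator<ContactEdit.Command>()
                .ShouldHaveFailures(form, "Please provide a name.");
        }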

    “Oh, but that’s brittle!” you say? Without it, your validation rule tests are such a misleading time bomb that I’d rather you not write them at all, thankyouverymuch.

    Yet More Testing

    We still need to test that our user can save changes when submitting their form:
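
    A sketch of that test, again in the same ContactEditTests class; the replacement values are arbitrary:

        public void ShouldSaveChangesToSelectedContact(Contact contactToEdit, Contact anotherContact)
        {
            Save(contactToEdit, anotherContact);

            Send(new ContactEdit.Command
            {
                Id = contactToEdit.Id,
                Name = "Patrick",
                Email = "patrick@example.com",
                PhoneNumber = "555-123-4567"
            });

            var actual = Query(database => database.Contacts.Find(contactToEdit.Id));

            actual.Name.ShouldBe("Patrick");
            actual.Email.ShouldBe("patrick@example.com");
            actual.PhoneNumber.ShouldBe("555-123-4567");

            var untouched = Query(database => database.Contacts.Find(anotherContact.Id));

            untouched.Name.ShouldBe(anotherContact.Name);
        }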

    Again, we’re not just testing the handler’s Execute(…) method. We’re working in a fresh database, with automatically populated sample records, within a production-like nested IoC container and a production-like Unit of Work. Send(…) will fail the test if the sample form wouldn’t really have passed validation in production, so we have confidence that the scenario actually makes sense. Our assertion rightly uses its own transaction so that we don’t fool ourselves by misusing Entity Framework change tracking: we assert on the reality of the operation’s effects on the world. Lastly, we’ve demonstrated that we’re affecting the right record. It would be very difficult for this test to pass incorrectly.

    Automatic Persistence Testing

    One last thing. We saw a strange attribute, AllEntities, referenced within our AutoFixture parameter customization earlier. I use it in one special test:
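
    A sketch of that test; [AllEntities] is the parameter attribute our AutoFixture parameter source watches for, turning this one method into a separate test per Entity subclass (that branch of the parameter source is elided in the earlier sketch):

        using System;
        using static Testing;

        [AttributeUsage(AttributeTargets.Parameter)]
        public class AllEntitiesAttribute : Attribute { }

        public class EntityPersistenceTests
        {
            public void ShouldPersistAllEntities([AllEntities] Entity entity)
            {
                Save(entity);

                var loaded = Query(database => database.Set(entity.GetType()).Find(entity.Id));

                loaded.ShouldMatch(entity);
            }
        }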

    Despite its size, a lot is happening here.

    This test method is called once for every single Entity subclass in the system. Every entity gets its own individual pass or fail.

    For each entity, we get a test that attempts to fully populate, save, and reload a record, asserting that every single property on it “round tripped” without loss to and from the database.

    The ShouldMatch assertion helper just takes two objects of the same type, and asserts that they have the same JSON representation, giving us a quick way to deeply compare all the properties for equality:
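
    A sketch of that helper, assuming Json.NET for serialization and a Shouldly string assertion:

        using Newtonsoft.Json;
        using Shouldly;

        public static class AssertionExtensions
        {
            // Deep-compare two objects by comparing their JSON representations.
            public static void ShouldMatch<T>(this T actual, T expected)
            {
                Json(actual).ShouldBe(Json(expected));
            }

            private static string Json<T>(T value)
            {
                return JsonConvert.SerializeObject(value, Formatting.Indented);
            }
        }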

    Imagine the effect of having this test in your project. You embark on a new feature that needs a new table. You add an Entity subclass and run your build. This test fails, telling you the table doesn’t exist yet. You add a migration script to create the table and run your build. This test fails, telling you that you have a typo in a property name. You fix it and run your build. This test passes. You can reliably save and load the new entity. Then you start to write your actual feature with its own tests. You and your teammates cannot forget to do the right thing at each step.

    Expressive Testing

    All in all, this approach to integration testing leaves me with a great deal of confidence in the system under test, with very little code having to appear in each test. A new team member cannot forget to start their tests from a clean slate; they cannot forget to avoid misusing the DbContext when setting up sample records; they don’t have to come up with silly random values for properties; they cannot forget to exercise the full IoC container/transaction/validation/Mediatr pipeline; they cannot forget to test that their entities actually persist; and their resulting tests are clear and concise, telling a story about how the feature should behave.

    Fixie Turns 1!


    Almost two years ago, I announced the start of development for a new .NET testing framework. This week, I’ve published Fixie 1.0. Looking back on that announcement, it’s refreshing to see that the original vision held true throughout the whole process. I knew what I wanted from a testing framework, and now I have it. It’s a happy coincidence (and a boost to my ego) that others have found it useful, too.

    What Does 1.0 Really Mean?

    Fixie is ready for production use. Before now, early adopters have had to face two major issues: a “moving target” API, and missing features compared to similar frameworks.

    While the developer-facing convention API was under development, each new release meant breaking changes. Types, properties and methods would change their names as I fleshed out the underlying model and learned from my missteps. Overloads would disappear and return again in later versions. The command line argument syntax has been through a few revisions. Early adopters tend to be open to a bit of a moving target early in a tool’s development, but at some point things have to settle down and become reliable in order for the tool to really take off. 1.0 means that breaking changes will no longer be taken so lightly. Once you’re on 1.0, subsequent upgrades should go smoothly.

    During that early adopter phase, people naturally faced a lack of familiar and important features. In this blog series, I’ve documented the project from “File > New Project”, to a comedically minimal minimum viable product, all the way to the current mature state. One of the earliest steps was to produce a NuGet package, so people have been able to try it out since before it had most of its current features. The first version was only powerful and expressive enough to run its own suite of tests, no more. You wanted to mark a test as skipped? You had to wait. You wanted parameterized tests? You had to wait. You wanted to reuse your conventions across the test projects of your solution? You had to wait.

    Now, all of the must-have features are in place, and the patterns of the implementation will give guidance around extending the system without having to introduce breaking changes.

    The biggest missing feature leading up to this release was that there was no visually-friendly test runner. From very early on, it had a console runner and a TestDriven.NET runner. This happened to fit my own workflow, so the fact that they were text-only runners didn’t bother me. However, anyone who uses a visual runner like Visual Studio Test Explorer or ReSharper would immediately feel the loss upon switching to Fixie. Since NUnit and xUnit are Fixie’s most direct “competitors”, people should be able to make the switch from those tools without completely upending their own workflow. At last, Fixie has support for the Visual Studio Test Explorer. Work on this feature in December pushed me to greatly improve the API that all test runners can use, so the third-party ReSharper runner will be brought up to speed with 1.0 as well, in the not-too-distant future.

    GitHub Organization

    The official repo has been moved from my personal GitHub account to a new GitHub Organization. The organization contains the main repo (the core library, console runner, Visual Studio runner, and TestDriven.NET runner), the documentation site, and a sandbox repo useful when testing out the various runners.

    Thank You!

    This was definitely not a one-man show.

    First, I have to thank the people who made NUnit, xUnit, NSpec, and Machine.Specifications. Any .NET testing framework is going to run into a lot of the same challenges. Whenever I was stumped on a tough problem, I found myself studying the solutions in these other projects. One example is their treatment of AppDomains, which proved to be the most challenging concept throughout the entire project. I could see how these other projects dealt with AppDomains, and in comparing them I was able to learn what aspects of AppDomain interactions were essential versus optional. It was similarly enlightening to see how the other frameworks dealt with integrating with the Visual Studio Test Explorer, which has a number of infuriating quirks to work around.

    Next, people around the world have submitted GitHub issues, pull requests, and awesome analyses of bugs.

    I love this one. It’s the weirdest bug report I’ve ever seen, and it revealed a real problem: TargetParameterCountException during test execution.

    My own coworkers used early versions of Fixie on their projects, once it had enough features to be useful. It’s been great to see what happens when your project bumps up against the real world, and it’s been great to see it used in ways I hadn’t even anticipated. In addition to being my guinea pigs, everyone at work has been very supportive of the project, all the way up to the president of the company who provided positive reinforcement of my self-imposed deadline and helped to guarantee a solid block of personal days during an otherwise busy time of the year.

    I would not have gotten this far without all your help. Thanks!

    Cleaning Up Test Failure Noise


    The Fixie test framework has no built-in assertion library, since assertions are an orthogonal concern to the nuts and bolts of test discovery and test execution. Although I stand by the decision to avoid including a built-in assertion library, there is one neat thing NUnit does with its own assertions in order to simplify output of test failures. Today, we’ll see Fixie’s answer to that feature.

    NUnit simplifies its own output when exceptions are thrown by its own assertion library infrastructure. For instance, when NUnit’s Assert.AreEqual(int, int) fails by throwing an exception, the output deliberately excludes stack trace lines pointing within the implementation of AreEqual, and deliberately excludes the name of the exception type. This filtering allows typical test failure output to remain as simple and direct as possible, pointing the developer to the line where their own test failed.

    Since Fixie has no assertion library of its own, you may instruct it which types make up your assertion library’s implementation details. Consider some hypothetical assertion library:
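
    Something along these lines, where a failed assertion throws a dedicated exception type:

        using System;

        public class AssertionException : Exception
        {
            public AssertionException(string message)
                : base(message) { }
        }

        public static class Assert
        {
            public static void AreEqual(object expected, object actual)
            {
                if (!Equals(expected, actual))
                    throw new AssertionException($"Expected {expected}, but was {actual}.");
            }
        }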

    Out of the box, Fixie doesn’t distinguish AssertionException from any other Exception (How could it?), so all of the exception details appear in the output. Consider a test class with some tests that will surely fail, and the corresponding default verbose output:

    The implementation details of assertion libraries are rarely interesting to the developer. A custom convention can be instructed to simplify failure output by listing the types that make up the assertion library:
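
    A sketch of what that might look like; treat the exact hook name shown here as an assumption of the 1.x-era convention API:

        using Fixie;

        public class CustomConvention : Convention
        {
            public CustomConvention()
            {
                Classes
                    .NameEndsWith("Tests");

                Methods
                    .Where(method => method.IsPublic && method.ReturnType == typeof(void));

                // List the types that make up the assertion library, so their stack frames
                // and exception type names are omitted from failure output.
                HideExceptionDetails
                    .For<AssertionException>()
                    .For(typeof(Assert));
            }
        }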

    Rerunning the failing tests, Fixie simplifies the output, directing the developer to the actual failing line of test code:

    In addition to identifying the types which make up your assertion library of choice, your custom convention may also list assertion extension classes defined in your own projects, further simplifying your output during failures.

    Listening to Leaves


    In Fixie XML Reports, I described a new Fixie feature that enables integration with build servers. When you need to output test results in the XML formats made popular by NUnit and xUnit, you can enable that extra output at the command line. This feature initially proved more difficult than expected, and all the trouble originated from trying to work within the confines of the wrong abstraction. As soon as we switched to the right abstraction, the implementation became simple.

    The Wrong Abstraction

    Test frameworks like NUnit, xUnit, and Fixie can each be split into two main functional areas: the framework part, which knows what tests are and how to execute them, and then various runners. Runners include things like the console exe you call from a build script, but also runners for third-party systems like TestDriven.NET, ReSharper, and TeamCity.

    From early on, Fixie has used the following abstraction for runners to implement (NUnit and xUnit have similar abstractions):
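
    A sketch of the shape of that abstraction; the member list is reconstructed from the events discussed below, and the parameter types are assumptions (Case is Fixie's representation of a single test case):

        using System;
        using System.Reflection;
        using Fixie;

        public interface Listener
        {
            void AssemblyStarted(Assembly assembly);

            void CasePassed(Case @case);
            void CaseFailed(Case @case, Exception reason);
            void CaseSkipped(Case @case);

            void AssemblyCompleted(Assembly assembly);
        }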

    Fixie calls each method as the corresponding action takes place. The ConsoleListener reacts by echoing things to standard out, the TestDrivenListener reacts by echoing things to TestDriven.NET’s own listener abstraction, etc.

    When it became clear that being able to output an NUnit-style XML file would help with build server integration, and in turn adoption of Fixie, I thought, “Gee, that’s just another Listener. It’ll be a breeze.”

    I even got a pull request that did exactly what I’d been picturing. It included a Listener which reacted to each event (AssemblyStarted, CasePassed…) by producing XML nodes of the NUnit style. The original developer realized there were two main problems with using the Listener abstraction:

    1. Listener has the wrong lifetime for reporting on the run. Each test assembly must be executed within its own AppDomain, and the Listener itself lives within that AppDomain. A single Listener can only ever know about the results of a single test assembly, but the XML report needs to include results from all assemblies in the run. The initial implementation of the feature could only work if your solution had one test project.
    2. Your goal is to build a tree, but all you get is leaves. A run of your tests, conceptually, produces a tree of results: tests within classes within assemblies. The Listener abstraction, though, gives you that information by only reporting the leaves of that tree: a passed test case, a failed test case, or a skipped test case. The NUnit XML Listener had to infer the class associated with each case and build up an internal dictionary mapping classes to cases, so that it could finally output the tree structure as an XML document (otherwise, you might accidentally report the same class several times over, once for each method in that class).

    The initial implementation gave nearly the results I wanted, but the implementation details seemed needlessly complex.

    The Right Abstraction

    With help from Sharon Cichelli at a meeting of the Polyglot Programmers of Austin, we realized the problems stemmed from using the wrong abstraction. We needed to acknowledge that we wanted to do tree-processing instead of leaf-event-listening. The fix was to have each assembly run return a tree of results to the console runner, so the console runner could build the One Complete Tree representing the entire run of N assemblies, and then do a trivial traversal of that tree to spit out the corresponding XML. It turns out this is exactly what NUnit does to solve the same problem.

    The resulting tree processing class (no longer a Listener), is NUnitXmlReport. It is clean and straightforward: the shape of the code mimics the shape of the resulting XML document. It works no matter how many test projects your solution has. Supporting the similar xUnit XML format was likewise simple. No need to infer the tree structure given only leaves; instead we simply turn one tree into another.

    When you find yourself having to jump through hoops to make something simple work, take a step back and evaluate whether you’re using an abstraction that helps you or hinders you.

    Fixie XML Reports


    With help from Pete Johanson and Jonas Samuelsson we’ve implemented NUnit-style and xUnit-style XML reporting for the Fixie test framework. Today we’ll cover what the feature is and how to use it. Next time, we’ll see why it was initially challenging to implement the feature cleanly.

    Why Bother?

    Before this feature, Fixie could already report to the console, TestDriven.NET, and TeamCity, so why bother replicating the XML reports of other frameworks? Who would willingly elect to work with XML anymore? If you’re already running a .NET-friendly build server other than TeamCity, that build server almost certainly knows how to read and display results in these XML formats, but has no idea what Fixie is. By allowing Fixie users to opt into these formats, their build server can treat Fixie as a first-class citizen.

    Usage

    A typical command for running Fixie as part of your build script is to specify a test assembly and optional custom name/value pairs of the form “--name value --othername othervalue...”:

    This command will write results to the console in a format that is meant for human consumption. It also makes the given key/value pair available to your conventions. Fixie doesn’t know or care what a category is, but it will gladly hand that information to your convention.

    To produce an NUnit- or xUnit-style XML report, you need to include one of the Fixie-specific reporting arguments:

    It looks like the xUnit file format can only describe a single test assembly, while the NUnit file format supports any number of assemblies.

    I’m using the “fixie:” qualifier for built-in arguments to avoid ever conflicting with an end user’s own custom arguments.

    Each of these commands will produce a file named TestResult.xml of the desired format, which your build server can be instructed to read and interpret. Here are some samples produced when running Fixie’s own self-tests: NUnit format, xUnit format.

    Nailing Down Generics


    Calling Generic Methods

    You write a class containing a harmless generic method:
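
    For example (Customer here is just an arbitrary class):

        using System;

        public class Customer { }

        public class Sample
        {
            public void GenericMethod<T>(T input)
            {
                Console.WriteLine("{0}: {1}", typeof(T), input);
            }
        }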

    Actually, for the purposes of this discussion, this is not one method declaration. Rather, it is an infinite number of method declarations:

    • void GenericMethod<int>(int input)
    • void GenericMethod<string>(string input)
    • void GenericMethod<Customer>(Customer input)

    Each time you call something named “GenericMethod”, the compiler decides which of the infinite methods you really meant. It makes this decision based on the compile-time types at each call site:
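
    For instance, wrapped in a small demo method so it stands alone:

        public class CallSites
        {
            public static void Demonstrate()
            {
                var sample = new Sample();

                sample.GenericMethod(123);             // the compiler picks GenericMethod<int>
                sample.GenericMethod("hello");         // the compiler picks GenericMethod<string>
                sample.GenericMethod(new Customer());  // the compiler picks GenericMethod<Customer>
            }
        }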

    The compiler knows the compile-time type of each argument, and compares this to the generic type definition in order to pick the single winning “specific” method.

    Easy, right? C# 101 stuff.

    Calling Generic Methods Via Reflection

    Let’s try to call the same method, with the same inputs, via reflection:
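
    The reflection-based equivalent of the three calls above:

        using System.Reflection;

        public class ReflectionCallSites
        {
            public static void Demonstrate()
            {
                var sample = new Sample();

                MethodInfo definition = typeof(Sample).GetMethod("GenericMethod");

                // Each of these throws: the MethodInfo still has an open generic parameter T.
                definition.Invoke(sample, new object[] { 123 });
                definition.Invoke(sample, new object[] { "hello" });
                definition.Invoke(sample, new object[] { new Customer() });
            }
        }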

    MethodInfo.Invoke(…) wants you to throw an object[] at the method. It wants to take the first item of that array for the first parameter of the method, the second item for the second parameter, etc.

    Easy, right? Wrong. Each call to Invoke above would throw a System.InvalidOperationException with the message:

    Late bound operations cannot be performed on types or methods for which ContainsGenericParameters is true.

    This exception message achieves the high honor of being both 100% accurate and 10% helpful. “Late bound operation” is compiler-speak for “something that is figured out dynamically at runtime instead of statically at compile time.” The figuring out happens later than compilation. The late bound operation, the thing we’re trying to do dynamically at runtime, is the method invocation itself.

    Translation: the exception message is saying, “You cannot invoke the method because it still has generic parameters.” In order to invoke it, we’re going to have to first nail down the generic parameters to specific concrete types so that they won’t be generic anymore. We asked .NET to do the impossible:

    You: “Please call this method.”

    .NET: “No. That is not a method. It is an infinite number of methods. Try again, and tell me which one you meant.”

    Once More, With Feeling

    Let’s narrow things down, each time we want to call the method via reflection. Just like the compiler did for us in the original compile-time example:
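
    This time we use MakeGenericMethod(…) to nail down T before invoking:

        using System.Reflection;

        public class NailedDownCallSites
        {
            public static void Demonstrate()
            {
                var sample = new Sample();

                MethodInfo definition = typeof(Sample).GetMethod("GenericMethod");

                definition.MakeGenericMethod(typeof(int))
                          .Invoke(sample, new object[] { 123 });

                definition.MakeGenericMethod(typeof(string))
                          .Invoke(sample, new object[] { "hello" });

                definition.MakeGenericMethod(typeof(Customer))
                          .Invoke(sample, new object[] { new Customer() });
            }
        }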

    This time, thankfully, we get the same output as the original plain calls to GenericMethod<T>(T). As with the original example, we know exactly which 3 of the infinite methods are being called.

    Generic Test Methods

    Let’s say you’re using the Fixie test framework and you have defined an [Input] attribute with an associated Fixie Convention in order to have parameterized tests. If one of your test methods is generic, Fixie faces the same problem we faced above:
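
    For instance, a parameterized test like this one; the [Input] attribute is the user-defined attribute described above, not something built into Fixie:

        using System;

        [AttributeUsage(AttributeTargets.Method, AllowMultiple = true)]
        public class InputAttribute : Attribute
        {
            public InputAttribute(params object[] parameters)
            {
                Parameters = parameters;
            }

            public object[] Parameters { get; private set; }
        }

        public class GenericInputTests
        {
            [Input(123)]
            [Input("hello")]
            public void ShouldAcceptAnyInput<T>(T input)
            {
                // Before invoking this method, Fixie must nail down T
                // (first as int, then as string) from each incoming argument.
            }
        }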

    Fixie calls test methods via reflection, using an object[] of inputs. In this case, the object[] has length 1, and the values come from the [Input] attributes. In order to successfully invoke the MethodInfo, Fixie must also call MethodInfo.MakeGenericMethod(…), passing in the right concrete Type, in order to get a handle on the specific, concrete version of the MethodInfo. Finally, Fixie can invoke that MethodInfo.

    Thanks to Anders Forsgren, Fixie can handle tests like this one. If the generic type parameter T can be nailed down to something unambiguously specific, it will be. If a generic test is declared with multiple generic type parameters like T1, T2, etc, it’ll try to nail them all down. When there’s any ambiguity for a T, though, Fixie has to assume object as the most specific type possible.

    In short, Fixie will do the most reasonable thing possible without you ever needing to know that any of the above is happening.

    The Feature I Almost Skipped


    Until recently, the Fixie test framework had no notion of skipped tests for two reasons.

    First, skipped tests are like support beams for technical debt. You know there’s a problem, and you’ve decided to either ignore it or delay solving it, contrary to the whole point of automated test coverage. It’s unclear to the next developer whether it should be brought back into the fold, or when. The problem being ignored lurks there, waiting to bite you sometime in the future. Yellow warnings in the console runner’s output go unnoticed and become the norm. People start skipping over them mentally as much as the test runner does, and it teaches other developers on the project that it’s not a big deal to skip other tests once they start failing.

    Second, if you really wanted to skip tests in Fixie, you could already effectively do that with a modified convention, like so:
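
    Something like this, assuming the 1.x-style Methods.Where filter, where the convention simply refuses to treat [Skip] methods as tests:

        using System;
        using System.Linq;
        using System.Reflection;
        using Fixie;

        [AttributeUsage(AttributeTargets.Method)]
        public class SkipAttribute : Attribute { }

        public class CustomConvention : Convention
        {
            public CustomConvention()
            {
                Classes
                    .NameEndsWith("Tests");

                // A method is a test method if it isn't marked as skipped.
                Methods
                    .Where(method => !method.GetCustomAttributes<SkipAttribute>().Any());
            }
        }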

    In other words, “a method is a test method if it isn’t marked as skipped.”

    I don’t like skipped tests, and I basically had poor-man’s skipped tests anyway, so why bother to implement true support for them in Fixie? Well, the convention trick above is even worse than normal, because the skipped tests would not even show up in result counts and would not come with warnings in the output. They’d be even easier to forget and let rot. Also, Fixie’s convention-based approach to test discovery and execution opens the door to creatively mitigating the risks of skipping tests, as we’ll see in our example below.

    My conclusion was to let Fixie know what it means for a test to be skipped, so that it can count them and warn the user like other test frameworks, but to deliberately not include any way to mark tests as skipped in the DefaultConvention. Out of the box, tests can’t get skipped. If you want skips, you’re going to have to ask for them. We can change the earlier poor-man’s skipping convention so that you can alert the user to their pending disaster:

    Since the hook is a Func<Case, bool>, you can include custom logic aside from the mere presence of an attribute or naming convention. One way to mitigate the risk of skipping tests is to place an expiration date on them:
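
    For instance, a skip attribute that expires; the CaseExecution.Skip hook name in the comment is an assumption, since all we know is that the hook accepts a Func<Case, bool>:

        using System;

        [AttributeUsage(AttributeTargets.Method)]
        public class SkipUntilAttribute : Attribute
        {
            public SkipUntilAttribute(int year, int month, int day)
            {
                Expiration = new DateTime(year, month, day);
            }

            public DateTime Expiration { get; private set; }

            public bool StillSkipped
            {
                get { return DateTime.UtcNow < Expiration; }
            }
        }

        // Wired into the convention's skip hook, something like:
        //     CaseExecution.Skip(@case => @case.Method
        //         .GetCustomAttributes<SkipUntilAttribute>()
        //         .Any(skip => skip.StillSkipped));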

    You might define a skip attribute that will skip a test until a specific GitHub issue gets closed, for instance (though that one might be a tad overkill, checking GitHub on every test run). Fixie only cares whether or not you want a given test to be skipped, and sets it aside for counting and reporting.

    The Implementation -or- The Greatest Pull Request Ever

    Other test frameworks treat skippedness as a kind of execution result. “Run this test.” “No. I’ll pretend I did and then tell you that I pretended I did.” This is weird, and I was about to mimic that weirdness when Max Malook (aka mexx on GitHub) found a much clearer way to implement it.

    From a user’s point of view, 3 things can happen to a given test: it can run and fail; it can run and pass; or it can be set aside, counted and warned as never having been run in the first place. mexx’s implementation does exactly that: after determining that a method is a test, a separate decision is made about whether it should be skipped. If it should be skipped, it is counted and reported but we never bother running it and we never bother pretending to run it. Simple, and a perfect match for the end-user’s expectations. We didn’t have to muddy the waters of what it means for a test to be executed. mexx’s pull request was fantastic: a clear implementation, with test coverage that matched the existing test style I was using for similar features.

    Time, Black Holes, and Console Capture


    In Listen Up!, we covered the implementation of Fixie/TeamCity integration, focusing on the main abstraction that feature introduced. Integrating with TeamCity forced me to improve the project in other ways, affecting more than just TeamCity: tracking the execution time of each test, addressing a console redirection bug, and capturing console output for report generation.

    Test Timing

    The first improvement was simple. TeamCity wants to display the time each test takes to run, and the messages we output to TeamCity therefore contain a duration value in milliseconds. This was the first time I needed to know the duration of a test, so I had to implement the timing functionality first. The main test execution loop gained a Stopwatch for each test case:
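
    The essential idea, reduced to a sketch; the real loop hands the resulting TimeSpan to the Listener along with the rest of the case result:

        using System;
        using System.Diagnostics;

        public static class CaseTiming
        {
            // Measure how long a single test case takes to execute.
            public static TimeSpan Execute(Action runCase)
            {
                var stopwatch = Stopwatch.StartNew();

                runCase();

                stopwatch.Stop();

                return stopwatch.Elapsed;
            }
        }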

    The collected duration, a TimeSpan, is one of the things passed along to the Listener interface upon each test completion. TeamCityListener, for instance, receives that TimeSpan and outputs the duration in the units expected by TeamCity.

    The Black Hole Bug

    Next, TeamCity integration motivated the discovery and fix of a bug which could allow a “rude” test to trample Fixie’s own output.

    TeamCity, and potentially any other test reporting tool, wants to be handed the string equivalent of any output that a test wrote to the console. In general, tests should rely on assertions as their main form of “output”, but sometimes console output can help with diagnosing tricky problems. Maybe stepping through a test in the debugger locally works fine, but then the same test fails on the CI machine. Resorting to “low tech” debugging with Console.WriteLines may be the quickest path to a diagnosis.

    As I started to work my way through the console-capture feature, I realized that a related bug was in play. Consider a test method that redirects the standard output stream:
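
    Something like this, where the test redirects Console.Out and never restores it:

        using System;
        using System.IO;

        public class EvilTests
        {
            public void EvilTest()
            {
                Console.WriteLine("All is well.");

                // Redirect standard out... and never put it back.
                var blackHole = new StringWriter();
                Console.SetOut(blackHole);
            }
        }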

    Fixie starts up a test run, outputting results to the console as it goes, either via ConsoleListener or TeamCityListener. It reaches this evil test, outputs “All is well.”, and then the developer sees nothing else. All subsequent attempts by Fixie to write anything to the console, such as other tests’ failures, fall into the blackHole StringWriter instead of the real console.

    Imagine you’re using Fixie to test your project, and your project happens to redirect console output for legitimate reasons but fails to gracefully restore the original output TextWriter. Suddenly, Fixie stops telling you what’s going on. Imagine having to diagnose that issue. Bad news.

    Thankfully, implementing the console output capture feature fixes this bug at the same time.

    Console Capture

    The RedirectedConsole class below allows you to temporarily redirect all console output to a StringWriter for the duration of a using block:
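
    A sketch of such a class:

        using System;
        using System.IO;

        public class RedirectedConsole : IDisposable
        {
            private readonly TextWriter original;
            private readonly StringWriter redirected;

            public RedirectedConsole()
            {
                redirected = new StringWriter();
                original = Console.Out;
                Console.SetOut(redirected);
            }

            // Everything written to the console during the using block.
            public string Output
            {
                get { return redirected.ToString(); }
            }

            public void Dispose()
            {
                // Always restore the real console, even if the test itself redirected it.
                Console.SetOut(original);
            }
        }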

    We wrap the execution of each test in a RedirectedConsole for two reasons. First, we need to obtain the string equivalent of anything the test wrote to the console. Second, we need to ensure that we redirect console output back to the right place at the end of each test, so that our EvilTest can no longer interfere with other tests. Fixie captures the output of each test, but can still confidently write output of its own without fear of it falling into a black hole:
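
    A sketch of that wrapping, reduced to a small helper so it stands alone:

        using System;

        public static class CaseRunner
        {
            // Run a single test case, capturing anything it writes to the console.
            public static string ExecuteWithCapturedConsole(Action runCase)
            {
                using (var console = new RedirectedConsole())
                {
                    runCase();

                    return console.Output;
                }
            }
        }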

    Listen Up!


    I recently wrapped up TeamCity integration for the Fixie test framework. Today we’ll cover the feature itself. Next time, we’ll see how this feature drove a number of other improvements throughout the project.

    ##teamcity Messaging

    TeamCity captures all console output during a build, mainly so that it can echo that output back to the user under the “Build Log” tab. While capturing the output, though, TeamCity also looks for any line of output that looks like this:
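
    For example (the test name value here is just a placeholder):

        ##teamcity[testStarted name='Namespace.ClassName.TestMethodName']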

    Like an XML tag, these specially-formatted lines have a name and potentially-many key/value pairs. They differ from XML, though, to avoid ambiguity. Some part of your build script may just happen to write some XML to the console, and we wouldn’t want TeamCity to mistakenly think we want it to treat that markup as special. By looking for this ##teamcity prefix, it’s easy to be clear when we do want TeamCity to react to our output.

    When you want to tell TeamCity something during the execution of your build, your build process should output these messages. There are many message types available, but today we’re only concerned with the messages used by test runners. A typical sequence may look like this:
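
    A sketch of such a sequence; the test names, durations, and output values are placeholders:

        ##teamcity[testStarted name='PassingTest']
        ##teamcity[testStdOut name='PassingTest' out='Hello from the passing test.|r|n']
        ##teamcity[testFinished name='PassingTest' duration='16']
        ##teamcity[testStarted name='FailingTest']
        ##teamcity[testStdOut name='FailingTest' out='Hello from the failing test.|r|n']
        ##teamcity[testFailed name='FailingTest' message='Assertion failed.' details='...stack trace...']
        ##teamcity[testFinished name='FailingTest' duration='8']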

    Here we have a passing test and a failing test. Both tests happen to write a little bit to the console themselves, called out by the “testStdOut” messages. Both tests have a “testStarted” and “testFinished” message. The second test fails, so it also has a “testFailed” message with the complete stack trace. TeamCity now has all the information it needs to display the test results.

    Note that TeamCity has its own rules about how special characters should be escaped. For instance, the Windows line ending marker (“\r\n”) needs to appear as “|r|n”, and slashes in file paths don’t need to be escaped.

    Implementation Goals

    My goals with the implementation were to:

    1. Decouple knowledge of output formatting from the test runner logic.
    2. Cleanly bottle up the ##teamcity message format and string escaping rules, so that writing each message would feel as simple as using String.Format(...).
    3. Automatically detect whether TeamCity kicked off the test run, switching the output style accordingly.

    Decoupling Formatting from the Test Runner

    We’ve got at least two output formats: regular console output and TeamCity-style output. It’s likely that Fixie will need to deal with other output formats, such as the XML format other CI tools expect from NUnit. I’d hate for the test runner logic to get all mixed up with output format details, and I’d like it to be easy to swap in other formats later, so I have a “Listener” abstraction:

    The two main implementations are the regular ConsoleListener and the TeamCityListener, and both are fairly small. The test runner calls each Listener method at key moments of the run, but doesn’t care about what happens as a result.

    Why “Listener” and not “Formatter”? I’ve actually got another implementation, one for integrating with TestDriven.NET, and its implementation has nothing to do with formatting output. My TestDrivenListener‘s job is to echo each of the test runner’s events over to TestDriven.NET’s own similar interface. TestDriven.NET then cares about output formatting all on its own. Therefore, the abstraction really is just about listening to key events; only some implementations will react by formatting output.

    The other test frameworks I’ve inspected have similar “listener” interfaces. If there’s a better pattern to follow, though, I’d like to hear about it. It feels a little weird to have an interface where some implementations end up leaving methods completely empty, but the number of methods on the interface is small enough so far that it doesn’t hurt.

    Encapsulating ##teamcity Formatting

    In TeamCityListener, the Listener methods focus on the specific message and key/value pairs applicable in each situation, deferring to a couple private helper methods:
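
    A sketch of those helpers; the escaping rules are TeamCity's (escape ‘|’ first, then quotes, brackets, and line endings), while the class and member names here are illustrative:

        using System;
        using System.Linq;

        public static class TeamCityFormatting
        {
            // Write one ##teamcity service message, escaping each argument first.
            public static void Message(string format, params string[] args)
            {
                var encodedArgs = args.Select(Encode).Cast<object>().ToArray();

                Console.WriteLine("##teamcity[" + string.Format(format, encodedArgs) + "]");
            }

            private static string Encode(string value)
            {
                return value
                    .Replace("|", "||")   // must come first, since '|' is the escape character
                    .Replace("'", "|'")
                    .Replace("[", "|[")
                    .Replace("]", "|]")
                    .Replace("\r", "|r")
                    .Replace("\n", "|n");
            }
        }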

    The public methods can use the Message(…) method as easily as String.Format(…) without being concerned about special characters.

    Detecting TeamCity

    When running under TeamCity, we want to replace the normal ConsoleListener with our TeamCityListener. Thankfully, it’s as simple as checking for the existence of a special environment variable, which TeamCity sets for the duration of the build:
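
    A sketch of the check, using TEAMCITY_PROJECT_NAME, one of the variables TeamCity defines for the duration of a build:

        using System;

        public static class TeamCityDetection
        {
            // True when the test run was kicked off by a TeamCity build.
            public static bool RunningUnderTeamCity()
            {
                return Environment.GetEnvironmentVariable("TEAMCITY_PROJECT_NAME") != null;
            }
        }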

    Feature-Driven Development

    This feature motivated several design decisions. As we’ve seen, it motivated the Listener abstraction, which immediately proved useful for TestDriven.NET integration as well. Additionally, working on this feature forced me to discover new requirements and to avoid a pretty gnarly bug. We’ll cover those other improvements in my next post.
