Shifting testing strategies away from mocks

Over the past couple of years, I’ve come to the opinion that reliance on mocks for driving out design can seriously hamper medium- to large-scale refactoring efforts.  Small changes like adding method parameters or renaming aren’t affected.  But many of the techniques in Refactoring and Refactoring to Patterns become quite difficult when our tests are overly concerned with the implementation details of the system under test.

Before I get to the punch line, it’s worth setting up the context of how I got here.  Up until a couple of years ago, I wrote my unit tests in a very top-down, one-layer-at-a-time approach.  I’d do some whiteboarding, figure out what my top-level component looked like through tests, then build interfaces and mock out interactions with lower-level components.  At no point was I testing the top-level component against the actual low-level components.

Two things started to push my testing strategy away from that notion.  One was the experience of trying to do large refactorings, where I was pulling cross-cutting concerns out of an entire layer of the system.  At that point, all my unit tests with all their mocks were a large barrier to those design changes.  The tests were concerned with low-level interactions between components, and at this level of refactoring those interactions were about to get deleted.  Yet there were still interesting parts of those tests I wanted to keep: namely, the direct inputs and outputs of the component.  I could give a flying [bleep] about the indirect inputs and outputs.

At some point, I wanted a test that only concerned itself with observable results.
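
To make the contrast concrete, here is a minimal sketch in Python (the post itself is .NET-oriented; all class and method names below are invented for illustration). The first test is coupled to the interaction with the repository; the second uses a real in-memory collaborator and asserts only on observable state:

```python
from unittest import mock

class OrderProcessor:
    """Toy component: computes a total and persists it via a repository."""
    def __init__(self, repository):
        self._repository = repository

    def process(self, prices):
        total = sum(prices)
        self._repository.save(total)
        return total

# Interaction-based test: coupled to the fact that `save` is called exactly
# once with this value.  Renaming, splitting, or inlining `save` breaks this
# test even when the observable behavior is unchanged.
def test_with_mock():
    repo = mock.Mock()
    assert OrderProcessor(repo).process([2, 3]) == 5
    repo.save.assert_called_once_with(5)

# State-based test: a real in-memory repository, asserting only on the
# observable result -- what ended up stored.
class InMemoryRepository:
    def __init__(self):
        self.saved = []
    def save(self, value):
        self.saved.append(value)

def test_with_real_component():
    repo = InMemoryRepository()
    OrderProcessor(repo).process([2, 3])
    assert repo.saved == [5]

test_with_mock()
test_with_real_component()
```

Both tests pass today, but only the second survives a refactoring that reshapes how the component talks to its repository.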

Shift towards BDD

Following the talks and guidance of Scott Bellware, I started writing more tests in the Context-Specification style: combining testcase-class-per-fixture with a naming style that describes the behavior of the system from a user’s point of view (though who that user is might change depending on the perspective).
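
A rough Python rendering of testcase-class-per-fixture (the original style comes from the .NET world; the `Cart` domain and every name here are made up for illustration) — one class per context, one shared arrange step, and test names that read as behavior:

```python
import unittest

class Cart:
    """Trivial domain object for the example."""
    def __init__(self):
        self._items = []
    def add(self, price):
        self._items.append(price)
    @property
    def total(self):
        return sum(self._items)

# Testcase-class-per-fixture: the class IS the context, setUp is the single
# shared Arrange, and each test method names one observable behavior from
# the user's point of view -- no implementation details in the names.
class When_a_customer_adds_two_items_to_the_cart(unittest.TestCase):
    def setUp(self):
        self.cart = Cart()
        self.cart.add(10)
        self.cart.add(5)

    def test_the_cart_total_reflects_both_items(self):
        self.assertEqual(self.cart.total, 15)
```

Read aloud, the class plus method name form the specification: "when a customer adds two items to the cart, the cart total reflects both items."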

At that point, talking about the interactions between dependencies became less interesting.  Specifically, I found that except for unobservable indirect outputs (sending an email, calling a web service and the like), the language centered on things I could observe.

Once implementation-detail language was removed from the test names, I found myself far more concerned with how the real system behaved.  And when test names and test code used the real component and its real dependencies, I was far more amenable to making big, sweeping changes in responsibilities, because at every level I was describing direct, observable behavior.

Observing indirect outputs

Indirect outputs (a void method) are usually tested with a mocking framework either by setting an expectation that the method will be called, or by asserting afterwards that the method WAS called.  But what if the real component were used?  The method was called in order to effect a change in something; that’s the whole point of a command.  Commands are intended to induce a side effect, so it’s now just up to us to observe it.
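
As a sketch (Python for illustration; the `Order`/`ShippingService` names are hypothetical): rather than asserting that the void method was called, use the real collaborator and assert on the state change the command exists to cause.

```python
class Order:
    def __init__(self):
        self.shipped = False
    def ship(self):
        self.shipped = True

class ShippingService:
    """Command handler: `dispatch` is void -- an indirect output."""
    def dispatch(self, order):
        order.ship()

# Not `mock_order.ship.assert_called_once()` -- instead, use the real Order
# and observe the side effect the command was written to produce.
def test_dispatch_marks_the_order_as_shipped():
    order = Order()
    ShippingService().dispatch(order)
    assert order.shipped

test_dispatch_marks_the_order_as_shipped()
```

If `dispatch` is later refactored to reach the same outcome through different calls, this test keeps passing; a call-verification test would not.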

For unit tests, I draw the line where the request crosses a process boundary (database, queue, etc.).  However, I’ll still build full-system tests that perform a command and directly observe everything there is to observe (granted that I actually own the observable systems).  If I don’t own the system, that’s usually when our partners complain that I’m calling their test systems too often and would I please stop.

When building out components, I still do things like program to interfaces, as that still helps me shape a component without worrying about its implementation yet.  In the test, I’ll use a container to instantiate the components, filling in only the process-crossing components with mocks.
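
A minimal sketch of that wiring, assuming a hand-rolled factory in place of a real IoC container (Python for illustration; `EmailGateway`, `GreetingFormatter`, and `WelcomeService` are all invented names). Real components are used everywhere except where a call would leave the process:

```python
from unittest import mock

class EmailGateway:
    """Crosses a process boundary (SMTP) -> the one thing we mock."""
    def send(self, to, body): ...

class GreetingFormatter:
    """In-process collaborator -> use the real one."""
    def format(self, name):
        return f"Hello, {name}!"

class WelcomeService:
    def __init__(self, formatter, gateway):
        self._formatter = formatter
        self._gateway = gateway
    def welcome(self, name, address):
        self._gateway.send(address, self._formatter.format(name))

# Stand-in for container registration: real components by default,
# fakes only at the process boundary.
def build_service():
    gateway = mock.Mock(spec=EmailGateway)
    return WelcomeService(GreetingFormatter(), gateway), gateway

service, gateway = build_service()
service.welcome("Ada", "ada@example.com")
# The formatter really ran; only the out-of-process send is verified on a mock.
gateway.send.assert_called_once_with("ada@example.com", "Hello, Ada!")
```

A real container would let you register the mock once and resolve the whole graph, but the principle is the same: the object graph in the test matches production everywhere in-process.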

Supplying indirect inputs

Indirect inputs are the values returned by methods called on dependencies.  The same rule applies as for indirect outputs: if the component crosses a process boundary, I’ll supply a fake or a stub.  In integration tests, I’ll supply the real component if it’s a system I own; otherwise I’ll keep the fake or stub.
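
A stub for an indirect input might look like this (Python sketch; the exchange-rate scenario and all names are hypothetical). The stub stands in for a dependency whose real implementation would cross a process boundary:

```python
class ExchangeRateClient:
    """Real implementation would call an external web service."""
    def rate(self, currency):
        raise NotImplementedError

class StubRateClient:
    """Hand-rolled stub supplying the indirect input from canned data."""
    def __init__(self, rates):
        self._rates = rates
    def rate(self, currency):
        return self._rates[currency]

class PriceConverter:
    def __init__(self, client):
        self._client = client
    def convert(self, amount, currency):
        return amount * self._client.rate(currency)

def test_conversion_uses_the_supplied_rate():
    converter = PriceConverter(StubRateClient({"EUR": 1.25}))
    assert converter.convert(100, "EUR") == 125.0

test_conversion_uses_the_supplied_rate()
```

In an integration test the `StubRateClient` would be swapped for the real client if you own that service; otherwise the stub stays.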

Side note – I’m not a fan of committing always-running tests that hit systems I don’t own.  I don’t want a failed build because someone else’s test system isn’t up.

If it becomes difficult to set up these inputs, there are patterns for that.  Whether it’s an object mother, fixtures, or the test data builder pattern, I can keep my tests concise and understandable, and push the setup logic down into helper components.
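
For instance, a test data builder might be sketched like this (Python for illustration; the `Customer` model and defaults are made up). Defaults cover the boring fields, so each test states only the details it cares about:

```python
class Customer:
    def __init__(self, name, country, is_preferred):
        self.name = name
        self.country = country
        self.is_preferred = is_preferred

# Test data builder: sensible defaults plus fluent overrides keep the
# construction noise out of the test body.
class CustomerBuilder:
    def __init__(self):
        self._name = "Anonymous"
        self._country = "US"
        self._preferred = False
    def named(self, name):
        self._name = name
        return self
    def preferred(self):
        self._preferred = True
        return self
    def build(self):
        return Customer(self._name, self._country, self._preferred)

# The test reads as intent: "a preferred customer named Ada".
customer = CustomerBuilder().named("Ada").preferred().build()
assert customer.is_preferred and customer.country == "US"
```

An object mother is the same idea with named factory methods (`preferred_customer()`) instead of a fluent chain.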

The end result

I’m still doing top-down design, but now I supply the real components as much as possible.  I test at the level of detail dictated by the design of the component I’m building, but I let the underlying components do their work.  This cuts down on mock setups that don’t actually match what can happen in production: if you use the real deal, you can’t fake an impossible situation.

There are downsides to this approach of which I’m well aware, defect isolation being a big one.  But the tradeoff of an actual safety net instead of a noose makes up for it.

About Jimmy Bogard

I'm a technical architect with Headspring in Austin, TX. I focus on DDD, distributed systems, and any other acronym-centric design/architecture/methodology. I created AutoMapper and am a co-author of the ASP.NET MVC in Action books.
This entry was posted in BDD.
  • I think I’ve come to the same conclusion as you have, but I’m not sure if it’s for the same reason. I have trouble understanding the vocabulary you’re using past the section titled “Shift towards BDD”.

    Is it because I’ve never read a book on BDD (I have read multiple on testing and TDD)? I’ve used mocks many times.

  • Absolutely agree 100% here. And when going to a BDD style of testing wherein you have a fixture per system under test, A – your tests are more readable (aka a spec) and B – less likely to have side effects. I’ve started to look at m-spec – but I’m not a big fan of it. Quite frankly, it is easy to roll your own framework.

    Nice post Jimmy!!

  • Interesting comments and a good point of reference for anyone experiencing the problems you mention.

    I have this feeling this isn’t a post that will inspire the TDD purists.

    A few examples would be nice, Jimmy.

  • So are you saying you supply hand rolled fakes, stubs or doubles?

  • James Morcom

    @Paul Cowan I think he’s saying that you should “bootstrap” all the services you need before running the test. i.e. Set it up to use a real database (be it SQLite or whatever), like you would with an integration test.

    I’ll be intrigued (in a good way!) if that’s not the case :)

  • @Paul

    Fakes/stubs for things I can’t control/or are slow (DB, web service etc.). i.e., using an in-memory DB or collection-backed repositories, for example.

    Integration tests that go end-to-end as the first test I write, and usually the last test that passes.

  • James Morcom

    Using the collection-backed repository as an example, could you explain how that overcomes the problem of exposing implementation?

    I ask because in my mind that collection-backed repository is essentially doing the same thing as a mock/stub, except you’ve spent more time implementing it.

    Sure the repository class is usually situated somewhere other than the unit tests so you’re less aware of it, but you could just as easily do the same with fakes I think?

    I still feel like I’m missing the point :)

  • While I agree that over-mocking makes tests pointless, I am still struggling to see how you test that a void method was called.

    I use doubles a lot but I don’t want to create extra classes just to return fake data.

    I might be missing the point also :-)

  • Confused

    I think a simple example is in order here. Something that demonstrates the process that you go through using “Integration tests that go end-to-end as the first test I write, and usually the last test that passes” as a theme.

    Thanks in advance.

  • Nick

    I think in your case the answer may be in his post. If you are testing that a method has been called, it is a barrier to refactoring if you wanted to break that method down into two etc.

  • jrnail23

    Hi Jimmy,
    I’ve asked some of these same questions myself, and I think it might come down to setting up some sort of mechanism in your testing architecture that can swap out a fake/mock/stub with the real implementation (likely via IoC container) according to some test category/attribute, but keeping the same test logic. I haven’t really chased it much myself, but it seems like it might be worth the effort.

    Also, with respect to BDD, I gave it a good try a couple months back, but I kept getting frustrated with the tooling (specifically that the libraries I tried either involved too much setup effort, or they never seemed to match my abstract mental model for what I was trying to accomplish).

    That said, there also didn’t seem to be a lot of practical guidance out there for successfully applying BDD.