Cost vs. Risk in Testing

There was a bit of interesting discussion on Twitter this morning concerning the cost of test-first development vs. risk. Here’s the visual version of what I’m saying:

[Chart: cost of test-first development vs. risk, plotted over time; curves only, no specific numbers]

The premise behind the value of test-first is that we will wash out (or reduce) the cost of a system over time by having a system that is easier to maintain, etc. I believe in the value of test-first development and I have seen the benefit it creates when done right. I’ve also seen the downside of it when done poorly or in the wrong circumstances. Notice that I left any specific numbers off the chart: the context of every application, every team, every customer and everything else involved in a system is going to change the numbers and modify the curves.

There is a cost associated with a test-first approach. If you put in 100% unit test coverage, 100% integration test coverage and 100% end-to-end functional test coverage, then you end up with 300% coverage of your system. Is the cost of maintaining 300% coverage worth it in your system? Can you get away with a grand total of 100% coverage while limiting your system to an acceptable amount of risk? Do you even need 20% coverage to reduce the risk to an acceptable level? Does it make sense to take a test-first approach for a given feature, system, bug fix, etc.? What’s the risk vs. cost? Is there a mixed approach of getting it done now and writing a test for it later that will provide both risk mitigation and cost effectiveness?
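
As a minimal sketch of how that overlap plays out (Jest-style TypeScript; the checkout function is hypothetical, just for illustration): the same production line counts toward unit coverage and again toward any integration or end-to-end suite that exercises the flow, and each suite is one more thing to maintain.

```typescript
// A hypothetical checkout function, used only to illustrate overlapping coverage.
export function calculateTotal(prices: number[], taxRate: number): number {
  const subtotal = prices.reduce((sum, p) => sum + p, 0);
  return subtotal * (1 + taxRate);
}

// Unit test: exercises calculateTotal directly.
it("applies the tax rate to the subtotal", () => {
  expect(calculateTotal([10, 20], 0.1)).toBeCloseTo(33);
});

// An integration or end-to-end test of the checkout flow would walk through
// these same lines again (request -> controller -> calculateTotal), so the
// function is "covered" two or three times over, each suite adding its own
// maintenance cost.
```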

I can’t answer these questions for you. Your answers will vary depending on a large number of factors. For example, a startup on its first iteration of a product may not find any value in a test-first approach, while a well-established business creating critical systems for itself or a client will likely find the risk of not using test-first development too high. You have to consider the cost vs. risk in your specific circumstances. Find the sweet spot for your circumstances and don’t assume that the sweet spot for two different projects will ever be the same. Don’t be surprised when you realize that the sweet spot changes within the same project, either. Sometimes the sweet spot will be test-first. Sometimes it will be test-after. Sometimes it will be not to bother with an automated test at all.

Be pragmatic, not dogmatic, about your approach to testing and development.



About Derick Bailey

Derick Bailey is an entrepreneur, problem solver (and creator? :P ), software developer, screencaster, writer, blogger, speaker and technology leader in central Texas (north of Austin). He runs SignalLeaf.com - the amazingly awesome podcast audio hosting service that everyone should be using, and WatchMeCode.net where he throws down the JavaScript gauntlets to get you up to speed. He has been a professional software developer since the late 90's, and has been writing code since the late 80's. Find me on Twitter: @derickbailey, @mutedsolutions, @backbonejsclass Find me on the web: SignalLeaf, WatchMeCode, Kendo UI blog, MarionetteJS, My Github profile, On Google+.
  • http://www.blogcoward.com jdn

    +1

  • Jon Kruger

    Amen. Exactly correct.

  • John Teague

    Interesting. Seems like I remember some older posts or conversations with you about wanting 100% coverage. Has your position changed?

  • http://www.lostechies.com/members/derick.bailey/default.aspx derick.bailey

    @John,

    yes, and i reserve the right to continue to be wrong and change my mind again. :)

  • John Teague

    Hey, it’s a journey right! I remember when I liked datagrid_itemdatabind ;)

  • http://kyle.baley.org Kyle Baley

    In the future, I demand you make posts like this *before* I learn the lesson the hard way.

  • http://postrequest.blogspot.com Tom Langston

    I get your point and I am not saying it was intentional, but the percentages are misleading.

    Each type of testing covers a different class of bugs; combined, they only add up to 100% of the error cases coverable by testing. Yes, there is some overlap, but that brings me to my next point.

    Different types of testing also have drastically different costs associated with them. If we could afford true 100% end-to-end testing, maybe unit testing wouldn’t be required. In reality, we rarely can. 100% end-to-end testing is too expensive in terms of time and money.

    We do unit testing and integration testing because each is much cheaper than end-to-end testing. This is in no small part due to the fact that it is agile. We get a cheaper bang for our buck, but we also get it during early development.

    So why not skip end-to-end testing altogether? Well, there are diminishing returns: at some point, your first end-to-end test is going to be more productive than another unit test. It doesn’t matter if all your code is unit tested if you have a show-stopper that an end-to-end test would have caught.
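
    (To make the point above concrete, here’s a minimal Jest-style TypeScript sketch; the pricing functions are hypothetical. Two units can each pass their own tests in isolation while the wiring between them hides a show-stopper that only an end-to-end check would expose.)

    ```typescript
    // Hypothetical: each function is correct in isolation and passes its unit tests.
    function priceInCents(sku: string): number {
      return 1999; // price lookup stubbed for the sketch
    }
    function formatPrice(dollars: number): string {
      return `$${dollars.toFixed(2)}`;
    }

    // The wiring bug: cents passed where dollars are expected.
    const label = formatPrice(priceInCents("ABC-1")); // "$1999.00", not "$19.99"

    // 100% unit coverage of each function never catches this; one
    // end-to-end style assertion on the rendered output does.
    it("shows the price on the product page", () => {
      expect(label).toBe("$19.99"); // fails, exposing the show-stopper
    });
    ```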

  • http://lostechies.com/blogs/derekgreer Derek Greer

    Interesting timing, as I’ve been contemplating my next few articles being a series on testing.

    I’ve been on the TDD journey for a few years now, but unfortunately I didn’t have the fortune of coming by my current practices through immersion, which I feel would have greatly shortened my learning curve. Over the past year, I’ve been mentoring a team using TDD and have come to the conclusion that there’s still a lot I have to learn. Starting the practice of red/green/refactor and achieving high test coverage is one thing, but learning to write clean tests which clearly communicate intent, have good failure feedback, low fragility and low overlap can take time.

    I think I agree with the “sweet spot” sentiment, but I also think test-writing skill is an important dimension to factor into the equation. I think the cost/risk benefit sweet spot begins to move closer to 100% coverage the better you get at writing tests. A novice TDD practitioner might start out writing a Web Service that has 1500 specs which may take more time to write than the actual code and may actually make the service harder to maintain due to the low quality of the specs, but a year later they may be able to achieve better quality with the same coverage with only half or a third that number of specs.

    For those who’ve been doing TDD since its inception, 100% coverage (or near) may actually be their “sweet spot”. This may not yet be mine, but that is still my goal. For now, I consider anything that holds me back from this to be a deficiency in my test-writing skills rather than there being a flaw in my goal. That said, I too reserve the right to change my mind later :)

  • http://www.lostechies.com/members/derick.bailey/default.aspx derick.bailey

    @Kyle – i’ll get right on that. :)

    @Tom – overlap is potential waste – having to maintain multiple tests that cover the same code adds cost that may not be necessary. but again, “it depends” :) i can certainly see situations where the value is higher than the cost – especially in mission critical and highly varying parts of the system where you need to isolate something small, but still want to test the larger.

    @Derek – great insight into the circumstances! i agree that the 100% coverage might be the sweet spot for some, depending on the skill level, business and market needs, etc. great insight into learning, too. i made all those mistakes as well, including the over-specifying / over-testing of various parts of the system. i once had 27 tests around a presenter that only had 2 text boxes and a drop down list. :)
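
    (A concrete sketch of that over-testing mistake, with a hypothetical presenter and view: instead of dozens of tests pinning each widget individually, one behavior-focused test can assert the outcome that actually matters.)

    ```typescript
    // Hypothetical presenter and hand-rolled test double, for illustration only.
    interface CustomerView {
      show(fields: Record<string, string>): void;
    }

    class CustomerPresenter {
      constructor(private view: CustomerView) {}
      load(customer: { firstName: string; lastName: string; country: string }) {
        this.view.show({ ...customer });
      }
    }

    class FakeView implements CustomerView {
      rendered: Record<string, string> = {};
      show(fields: Record<string, string>) {
        this.rendered = fields;
      }
    }

    // One test asserting observable behavior, instead of 27 tests that each
    // pin a single text box or drop down list.
    it("displays the customer's details when loaded", () => {
      const view = new FakeView();
      new CustomerPresenter(view).load({ firstName: "Ada", lastName: "Lovelace", country: "UK" });
      expect(view.rendered).toEqual({ firstName: "Ada", lastName: "Lovelace", country: "UK" });
    });
    ```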

  • http://www.cauthon.com Darren

    Hi Derick,

    I agree with much of what you wrote here, but I have to disagree with the overall sentiment and the “sweet spot” idea.

    From what I’ve seen from myself and others who have adopted TDD, the main effect that it has on the developer is that he or she accepts the responsibility for writing an application that works. Now, most non-TDD developers will say they accept that responsibility, too, but they can’t be compared to someone who considers the tests before the production code. Test-first is more than a process, it’s an attitude. My code will do what I say, and it will continue to do so.

    The “TDD attitude” goes a long way, even when the testing skills don’t exist. For example, my first TDD’d project went live, got hammered by thousands of users in the first couple days, and there were no errors. It just worked, unlike every single one of my previous non-TDD applications. Sounds good, right? Well, recently I’ve had to go back to that application to make changes, and the tests look disgusting now. I see SRP breaks, I see situations like yours where I’d have 27 tests around something that did three things, etc. However bad the tests look now, though, they

    1.) Did the job.
    2.) Gave me the experience necessary to write better tests later.

    I won’t go into a full description of my current testing approach, but obviously: I don’t get 100% coverage on most of my projects, and I’m leaning more towards Cucumber/SpecFlow tests instead of unit testing every single class. However, I’m still not satisfied with my own tests. Next year I should be doing it even better and faster, and the year after that even better and faster than before, and so on. I’m just as dogmatic about TDD as ever, but I’m the first to admit that I’m still learning how to do it better.

    I know you are a tester, but I think most people who talk about when it’s good or bad to test, or mention something like a “sweet spot,” are not testers. They don’t want to test, let alone test-first, and their feelings are factored in when they measure their own “sweet spot.” They might produce a similar number of tests to, say, what I would, but without the “TDD attitude” the tests can’t be successful. How can they be? While TDD’ers are thinking about how to test, non-TDD’ers are thinking about how not to test. While TDD’ers look at the “sweet spot” as the first place to improve their testing, non-TDD’ers look at the “sweet spot” as the place where they can get back to coding how they want. “Don’t be so dogmatic about testing,” they say.

    Another thing to keep in mind: all principles, programming or not, are meant to be applied in a context. Life is complex, and there are situations where some values will override others. For example, I can say that, on principle, I show up for work on time and put in a full day’s work, and I do so. But if there’s a family emergency, I will not go into work. Does that mean the principle of showing up for work is something that I should be “pragmatic” about? If my kid asks me if it’s important to be on time, do I answer “It depends”? Of course not. I’m going to teach him the principle. The exceptions will work themselves out over time.

    TDD is the same. I hold it as a principle, but the situation sometimes demands that I don’t do it. But heck if I’m going to elevate the exceptions to the same level as the principle itself.

  • http://www.lostechies.com/members/louissalin/default.aspx Louis Salin

    Any test that is hard to write and/or maintain needs to yield enough information to be worthwhile. A test suite that is a pain in the butt to maintain but always passes yields absolutely no information and is a waste of time. If the same test suite always breaks, it yields no information. The sweet spot in this case is a test suite that fails 50% of the time.

    Now, unit tests and the TDD approach are so painless that what I wrote above doesn’t really hold. But as the complexity grows, you have to make sure the tests give you better information to compensate for the extra effort. And that should probably be the topic of an entire book!
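
    (One way to make the 50% intuition precise is Shannon entropy: a pass/fail outcome carries the most information when it is least predictable. A minimal sketch; the framing in bits is an added assumption, not part of the comment above.)

    ```typescript
    // Information (in bits) carried by one pass/fail run of a suite
    // whose probability of failing is p (Shannon entropy of the outcome).
    function bitsPerRun(p: number): number {
      if (p === 0 || p === 1) return 0; // outcome is certain: zero information
      return -(p * Math.log2(p) + (1 - p) * Math.log2(1 - p));
    }

    bitsPerRun(0.5);  // 1.0:   maximum information per run
    bitsPerRun(0.01); // ~0.08: a suite that almost always passes tells you little
    bitsPerRun(0.99); // ~0.08: and so does one that almost always fails
    ```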

  • Ed Guiness

    Nice picture.

    Regarding coverage percentages, I think it’s sometimes easy to forget that code coverage is not the same thing as comprehensive and good testing. In other words, it’s quite possible to have a poorly tested code base with 100% coverage.
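
    (Ed’s point in a minimal Jest-style TypeScript sketch, with a hypothetical function: a test can execute every line, earning 100% coverage, without verifying anything at all.)

    ```typescript
    function discount(price: number, percent: number): number {
      return price - price * (percent / 100); // any bug here goes unnoticed below
    }

    // This "test" executes every line of discount(), earning 100% coverage,
    // but asserts nothing, so it passes no matter what the code does.
    it("covers discount() without actually testing it", () => {
      discount(100, 20); // no expect(): coverage tools still count these lines
    });

    // A real test pins behavior, not just execution:
    it("takes 20% off the price", () => {
      expect(discount(100, 20)).toBe(80);
    });
    ```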