Fighting technical debt with the wall of pain

Technical debt, even on the agile-ist of agilista teams, still accumulates.  Debt is inevitable, as initial design is always based on assumptions, not all of which pan out.  I never view that as a failure, as all we can do is make the best decision given the limited information we have at any given time.  Experience with DDD has shown me that design and architecture often don’t improve linearly.  Rather, we reach a critical mass of information and knowledge where it becomes plainly and painfully obvious that our current design will no longer suffice.

In other situations, technical debt arises out of creating hidden forms of duplication, where only large-scale, architectural refactorings can take the codebase to another level.  This duplication comes not necessarily from domain concepts, but from structural concepts, or what I like to call the “application domain”.  We often need to create a model of our application, with things like the Command pattern, Strategy pattern and so on.  Breakthroughs in this area happen just as they do in our model of the problem domain, but they arise out of use cases and the structure of our application, rather than from a deeper understanding of our problem domain.

Finally, we have technical debt arising out of finding better ways of accomplishing the same result.  It could be a layer supertype, switching to a jQuery plugin, or just plain removing subtle copy-paste duplication.  Often, we see this last kind of technical debt on larger teams, where each individual introduces duplication rarely, but project-wide duplication is far more pervasive.  For example, I might introduce subtle duplication around an area once every three weeks.  Because it is so sporadic, I don’t necessarily perceive the problem.  But if I have five other team members doing the same, the team-wide frequency is about twice per week, a rate any one developer would notice if it were all their own work.

To tackle the inevitable growth of technical debt, we need some sort of strategy in place to address that risk.

Option #1 – Do nothing

Yes, the team might say a lot, complaining and the like, but never do anything about it.  Or, they are simply not aware.  Whether the cause is ignorance, apathy or hopelessness, I’m quite sure this approach is not viable for applications that need to change.

Or, the team has made management or the product team aware of a problem, but was unable to explain or justify the work required to fix it, so it merely lingers.  I’ve been in this situation far too often in my career, and it’s never a fun feeling.

Option #2 – Cowboy style

In the absence of any form of technical story artifact, it’s up to the development team to prioritize technical debt payments, in the form of refactoring stories.  Outside of any strategy, prioritization and planning are often cowboy-style: fixes are done immediately when found, without regard to impact or other more important issues.  Refactorings are chosen based on how interesting they might be, the level of immediate frustration of the developer finding the duplication, or how easy the fix might seem.

Most of the time, the fix is made exactly when it is found, without much effort put into analyzing its impact, cost or benefit.  Additionally, the fix is usually quite local, and not propagated to the rest of the system.  We might make it easier going forward to work with a standard component, but all too often this new standard is not retrofitted back to the rest of the application.  This leads to multiple “right” ways of doing things, where the design of the next new component starts from the last component built, as long as we happen to remember what the last component was.

This happened on a recent project, where we created a new layer supertype, and all new controllers needed to inherit from this new base class.  Unfortunately, this design was refined several times, until we had a half-dozen versions of the “right” way to design a controller.

It’s quite enticing to cowboy-code technical debt fixes.  As developers, we get to play the hero, announcing to the team “I fixed the Floogle problem!”.  It’s a good feeling, unless you’re met with blank stares and unspoken (or vocal) questions of “why did you waste time on that?”

Option #3 – Wall of pain

Instead of fixing technical debt issues at the exact moment we encounter them, our team instead keeps a prioritized queue of technical debt items on our whiteboard.  On it, we track two sets of items:

  • Problems with a proposed solution
  • Problems with no solution, and a plea for help

As a contract to ourselves, we took an oath to not go forward with a new solution without retrofitting the application to use the new concept.  This ensured that we didn’t have pockets of bad code intermixed with good code, or worse, seven different designs of the same concept in our system.  The retrofitting was key for us to continue to innovate and improve, and let us find broader concepts.  The more examples we had of a concept, the more commonality we could find.  If there are seven ways to do the same thing in our system, there is just as much duplication, but it is duplication that is much, much harder to both recognize and fix.

Kevin and Jeffrey teamed up to imbue in me a sense of pain-driven development.  The more something hurts, the more it needs to be fixed.  If local builds take 10 minutes, that’s a lot of pain throughout the day.  This concept led us to our current “wall of pain” that includes these items:

  • The problem
  • The solution
  • The level of pain (1-5)
  • The estimate to retrofit, in points

If something is very painful, but easy to fix, it is moved up on the list.  However, the effort to retrofit has to be a consideration, as it often indicates the risk that a large change introduces into our system.  Since a defined solution is part of this list, the point estimate is defined in terms of relative complexity, and therefore risk.

For items without a solution, we just leave out the solution and the estimate to retrofit.  These items are analyzed and given a solution as needed.  Many designs need vetting before being established as the “new standard”, so our team will wait to pick a solution until we’re quite satisfied with the result.
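
The wall itself lives on a whiteboard, but its structure is simple enough to sketch in code.  The Python below is only an illustration: the names `PainItem` and `prioritize` are made up, the sample pain levels, solutions and estimates are hypothetical, and ranking by "pain per point" is one assumed way to express "very painful but easy to fix moves up", not a formula our team actually used.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PainItem:
    """One entry on the wall of pain (hypothetical model)."""
    problem: str
    pain: int                               # level of pain, 1 (mild) to 5 (severe)
    solution: Optional[str] = None          # left out when nobody has an answer yet
    retrofit_points: Optional[int] = None   # estimate to retrofit, in points

    def __post_init__(self) -> None:
        if not 1 <= self.pain <= 5:
            raise ValueError("pain must be between 1 and 5")
        if self.retrofit_points is not None and self.retrofit_points <= 0:
            raise ValueError("retrofit_points must be positive")

    @property
    def has_solution(self) -> bool:
        return self.solution is not None and self.retrofit_points is not None


def prioritize(items: List[PainItem]) -> Tuple[List[PainItem], List[PainItem]]:
    """Split the wall into (problems with a solution, pleas for help)."""
    ready = [item for item in items if item.has_solution]
    needs_help = [item for item in items if not item.has_solution]
    # Assumed ranking: most pain relieved per point first,
    # with the smaller (less risky) retrofit winning ties.
    ready.sort(key=lambda item: (-item.pain / item.retrofit_points, item.retrofit_points))
    # Unsolved items have no estimate, so order them by pain alone.
    needs_help.sort(key=lambda item: -item.pain)
    return ready, needs_help


# Hypothetical wall contents, loosely based on problems mentioned in this post.
wall = [
    PainItem("Six versions of the controller base class", pain=3,
             solution="Retrofit every controller to one layer supertype",
             retrofit_points=8),
    PainItem("Local builds take 10 minutes", pain=5,
             solution="Trim the local build", retrofit_points=3),
    PainItem("Subtle copy-paste duplication across the team", pain=4),
]

ready, needs_help = prioritize(wall)
for item in ready:
    print(f"{item.problem}: pain {item.pain}, {item.retrofit_points} points")
for item in needs_help:
    print(f"HELP WANTED: {item.problem} (pain {item.pain})")
```

A ratio is the simplest thing that could work here.  A team that wants the size of the retrofit to count more heavily as risk could swap in a different sort key without touching the rest.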

All of this assumes that technical debt items aren’t tracked in your normal taskboard/kanban/story wall.  Otherwise, you’ll likely have an established manner of prioritization.

We’ve had our wall of pain up for a few months, and at the very least, it provided visibility and a way for us to communicate the day-to-day development pain we might face.  We still introduce minor fixes and technical debt payments during our iterations, without using the wall of pain, but these are of the small and completely obvious variety.  Still key is the concept of retrofitting, which we find greatly helpful in maintaining a consistent codebase.

About Jimmy Bogard

I'm a technical architect with Headspring in Austin, TX. I focus on DDD, distributed systems, and any other acronym-centric design/architecture/methodology. I created AutoMapper and am a co-author of the ASP.NET MVC in Action books.
This entry was posted in Agile, Refactoring.
  • http://www.agilification.com Jeff Doolittle

    We don’t have a wall of pain for technical debt, but we recently introduced a wall of shame for defects. We also put a team member’s picture on that wall if he/she breaks the build. The build hardly ever breaks anymore. :)

  • http://www.lostechies.com/members/rssvihla/default.aspx Ryan Svihla

    I love Jeffrey’s posts and discussions on Pain Driven Development and have found it very valuable to how I approach solving problems. Unfortunately, I think I’ve gone cowboy with it a few times. Will try option #3 more now.

  • http://www.lostechies.com/members/gnschenker/default.aspx Gabriel N. Schenker

    very enlightening post!

  • http://www.lostechies.com/members/bogardj/default.aspx bogardj

    @Jeff

    Oh yes, the wall of shame. I’ve definitely been known to make that list a few times!

  • http://rhouston.lostechies.com Ray Houston

    We prefer to have developers wear the cone of shame when they introduce defects.

  • http://blog.visitanywhere.info dennis sellinger

    Interesting post. I like the idea of keeping a running record of code smells. However, I think we should be choosing “all of the above” when we are dealing with code smells. Sometimes it just doesn’t pay to “fix” working code, especially when you have other priorities. Second, if your team has competent software developers, I think when they see potential problems, they should just fix them. No need to call a meeting of the joint chiefs of staff. A competent developer usually “has a feeling” for the refactorings that can be “just fixed” and which should be escalated. We should not be killing the initiative of these people. Finally, the wall of pain is an excellent idea for managing smells and seeing that they are dealt with, especially if they may require structural changes to the application.

  • http://blog.robbowley.net Rob Bowley

    There is another way of dealing with technical debt (and my preferred way) – being aware of where it lies and dealing with it when you need to change/extend the code base. If necessary (i.e. you’re not doing continuous flow), factor this in to your estimation of the time it will take to produce the feature which requires you to touch the legacy code.

    Having cards on walls could easily just become another dumping ground for problems that can be left for another day. And why fix things that already work? It seems wasteful to me to fix problem code which may not need to be touched for a long time, ahead of improving the code as you go. It’s not technical debt unless you have to change it.

  • http://www.lostechies.com/members/bogardj/default.aspx bogardj

    @Dennis

    Agree with all of the above! Sometimes, a fix is so obvious, it’s worth it just to fix it right then and there. It’s when the duplication is widespread, we need to stop and look at the cost to fix it everywhere. That’s where we’ve been burned, someone fixing it a couple of places, seeing how widespread the issue is, and giving up.

    @Rob

    That’s what we tend to do in legacy codebases – only improving places we need to change, and slowly rolling fixes out. Fixing it everywhere does have a cost that needs to be weighed. But what we found is that when we completed a site-wide refactoring, it opened doors to even more refactoring opportunities, simply because we had more examples to work with, a wider net was cast.