Failures of aimless large-scale refactorings

At the recent Austin Code Camp, I heard a few stories after my Legacy Code talk about teams attacking their legacy code in prolonged refactoring binges that never seemed to end.  They never ended because no one had a good idea of what success looked like.

Once technical debt has built up to a certain point, some folks opt to declare bankruptcy and start over.  A place I came from recently did this fairly regularly, about once every two years.  It wasn’t that a new technology would solve new business problems (although this was how IT sold the re-work), but that the current codebase was completely unable to change at the rate the business needed.

Starting over

When we have a large legacy codebase, how should we turn it around?  How many resources do we allocate to re-sculpting the big ball of mud?  In a previous job, we did this by hijacking an entire release as an “engineering” release, rewriting all of the application’s plumbing.

Huge disaster, to say the least.

The refactoring itself was technically successful, but on the business side it was a complete failure.  We paused adding business value for months while we tinkered.  The business absolutely resented us for this, and held it against us up until I left.

I’ve personally never seen a developer-induced month-long refactoring session succeed, and the failures of these refactoring efforts do serious long-term harm to the credibility of the team.  The business never EVER likes to see what is basically a work stoppage to fix a mess that, in their eyes, is of the developers’ own doing.

Risk and angles of attack

Another angle of attack besides charging up the hill to the machine gun nest, armed with only a bayonet, is to take a more strategic aim at de-gunking your system.

One approach that worked well for us was to only refactor in areas that needed to be changed.  Change is needed for many reasons, whether it’s new features, bugs, performance problems, or others.

By only refactoring areas that needed to change, we were able to mitigate the risk of performing large-scale refactorings by making small, targeted steps towards a brighter future.  We didn’t always know what the end result would look like, but that was a good thing.

If anything, developers are terrible guessers.  My design guesses are always always always wrong the first time.  It’s a waste of time to argue relentlessly about the best way to refactor a certain area, as tools like ReSharper and the methods laid out in Kerievsky, Fowler and Feathers’ books make it fairly easy to change direction.

A pact for continuous improvement

To make sure that our codebase continuously improved, our team agreed that every change we made would improve our codebase.  Even something as small as eliminating a couple of duplicate methods was a big improvement.  In a big ball of mud, we deleted code far more than we added, as it was rampant duplication that got us into our original mess.
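
As a minimal sketch of what one of these small improvements might look like (the function names and the tax calculation are hypothetical, not taken from our codebase), two copy-pasted methods collapse into one shared implementation, and the old names survive as thin wrappers so existing callers keep working:

```python
# Hypothetical "before": two copy-pasted methods from different corners
# of a legacy codebase. They differ only in name.
def order_total_for_web(prices, tax_rate):
    subtotal = 0
    for price in prices:
        subtotal += price
    return round(subtotal + subtotal * tax_rate, 2)


def order_total_for_phone(prices, tax_rate):
    subtotal = 0
    for price in prices:
        subtotal += price
    return round(subtotal + subtotal * tax_rate, 2)


# Hypothetical "after": one implementation using the same arithmetic,
# so behavior is preserved.
def order_total(prices, tax_rate):
    subtotal = sum(prices)
    return round(subtotal + subtotal * tax_rate, 2)


# The old entry points delegate to the shared version. They can be
# deleted once the code that calls them is next changed.
def order_total_for_web_v2(prices, tax_rate):
    return order_total(prices, tax_rate)


def order_total_for_phone_v2(prices, tax_rate):
    return order_total(prices, tax_rate)
```

The point is the size of the step: nothing about the design had to be decided up front, and the change rides along with whatever feature or bug fix brought us into that file.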

Over time, small changes made larger refactorings more apparent.  Since the large refactorings were driven by actual needs to change, we had confidence that we were moving in the right direction.

The whole point of the refactorings was to eliminate duplication so that we could add value more quickly.  Stopping for months to re-work the application, with no direction on which areas are most important, is almost guaranteed to be a business failure, if not a technical one as well.

About Jimmy Bogard

I'm a technical architect with Headspring in Austin, TX. I focus on DDD, distributed systems, and any other acronym-centric design/architecture/methodology. I created AutoMapper and am a co-author of the ASP.NET MVC in Action books.
  • Thomas Eyde

    This is interesting. I have more than once favored declaring bankruptcy, but on tasks way smaller than the ones you discuss. When the total estimate is one iteration or so, and the code is a mess, written by someone not around anymore, I still believe it’s easier to start over.

  • @Thomas

    It might be easier, but the question is “does it make business sense”. Something like a re-write has to be a business decision, based on opportunity cost and liability of the system.

    I have worked on systems that required complete rewrites, but even in those cases, it wasn’t an all-at-once proposition. We rewrote modules, turning them off in the original app as we went.

  • Ben

    This is an interesting post…

    This topic is something I’m dealing with at work. We have a large C++/MFC/COM desktop app that is suffering under its own weight and bad design. We’ve pretty much made the decision to rewrite in C#/WPF, but the requests for updates and fixes never stop coming in.
    In our case, I *think* the business decision is to go through with it because it will allow us faster update times, leading to more releases, a more attractive interface that will leapfrog the competition, etc., but it’s always a gamble.

  • I’m heading up a bankruptcy right now. We started off with four full time developers and one part time. Three months of design and code got us to a skeleton for the rest of the system. But then we had to take time off to work on our other projects. It ended up being around three months. During that time only one developer remained on the original project, working on minor housekeeping tasks that had built up. Now we’re back on to two full timers.

    Management has come to accept that we can only do so much. If they limit our resources, they are not going to get a rewritten product on any set time frame. That’s an important concession in this situation. Another is that the original developer of the big ball of mud is NOT on the rewrite. He is still maintaining and adding business value to the original product (i.e., make a small change, test for a week (!), fix the issues found in the test, repeat until we just give up and release the fixes). That also gives us the breathing room we need with so few people.

  • @Ben

    As long as it’s a business decision, I think you’re safe. Changing technologies is certainly a business decision.


    One thing you might look at is a phased replacement. If there are ways to replace one subsystem at a time, it’s the less risky option. Bugs and feature requests will still come in for the old system, which makes all-at-once replacements nearly impossible to pull off successfully.

  • jlockwood

    I think that the primary reason for “declaring bankruptcy” is that too many development teams build huge systems while failing to buttress them with good code coverage. Refactoring should be constant and built into the development process.

    Large refactoring crusades are an organizational smell…some developer or group of developers are convinced that they could do better than their predecessors and launch a crusade (like putting together a code posse and setting out for a little southern justice). Such crusades often indicate a lack of cohesion and poor maintenance of the overall product vision.

    I think that the “large refactorings” that you are talking about are really reengineering efforts. To refactor your code you are supposed to be able to make structural changes while preserving ALL the current behavior. The larger efforts often fail to accomplish this and often are a waste of time and money.

    In short, if you don’t have good coverage of automated unit tests in place, you really shouldn’t be refactoring anything. If you set out to reengineer a system without a commitment to automated testing then you are committing the future development team to the same nightmare that you are currently enjoying.

    Just my opinion of course. ;)

  • Bogardj:

    Subsystems? What are subsystems?

    We looked at it and figured that the task of hooking new code into the old ball of mud would be a complete waste of time. Currently a simple bugfix takes months to test. The idea of cutting whole chunks of legacy VB code and trying to patch it back together again with C# and NOT infect the new code with the insanity of the old, plus multiply testing time by the complexity of the part replaced…. I could go on.

  • @Will

    I guess I meant “business subsystems”…we’re doing this right now. We’re replacing the “order entry” subsystem with a new one that writes to the existing database. The rest of the legacy system doesn’t know the difference.

    We then turn off this functionality in the legacy app. That way we don’t have to replace catalog management, customer management, order processing, fulfillment, etc in one fell swoop.

    Looking at a business process pipeline, we replace one section at a time, leaving downstream legacy steps intact. When one section is done (like order entry), we move to the next one, such as customer management. As long as we preserve the data going in, we’re set.
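
A rough sketch of that idea, assuming each section of the pipeline can be treated as a function that takes an order record and hands it on (the stage functions, the `PIPELINE` table, and `replace_stage` are all hypothetical names for illustration, not our actual system):

```python
# Hypothetical stand-ins for legacy pipeline sections. Each takes an
# order record (a dict) and returns it with its own fields added.
def legacy_order_entry(order):
    return {**order, "entered_by": "legacy"}


def legacy_customer_management(order):
    return {**order, "customer_checked_by": "legacy"}


def legacy_fulfillment(order):
    return {**order, "fulfilled_by": "legacy"}


def new_order_entry(order):
    # The replacement must write the same fields the downstream legacy
    # sections read, so they cannot tell the difference.
    return {**order, "entered_by": "new"}


# Ordered pipeline: each section name maps to whichever implementation
# is currently live. Dicts keep insertion order, and reassigning an
# existing key does not move it.
PIPELINE = {
    "order_entry": legacy_order_entry,
    "customer_management": legacy_customer_management,
    "fulfillment": legacy_fulfillment,
}


def replace_stage(name, implementation):
    """Make the new implementation live; the legacy one is switched off."""
    if name not in PIPELINE:
        raise KeyError(f"unknown stage: {name}")
    PIPELINE[name] = implementation


def process(order):
    """Run an order through every section in pipeline order."""
    for stage in PIPELINE.values():
        order = stage(order)
    return order


# Replace only order entry; the downstream sections stay legacy.
replace_stage("order_entry", new_order_entry)
result = process({"id": 1})
```

In a real system the shared contract is the existing database rather than an in-memory dict, but the shape is the same: swap one section, leave the rest alone, then move on to the next.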