Abstraction: The Rule Of Three

I often hear people say something like “if you need it once, build it. If you need it twice, abstract it.” People often say this in the context of the “DRY” (Don’t Repeat Yourself) principle. In theory this sounds great because you’re removing duplication in your code. In practice, though, it falls apart pretty quickly in a lot of circumstances.

The idea of DRY needs to be tempered with YAGNI: “You Ain’t Gonna Need It”. With that, we end up with The Rule Of Three, which says that code can be copied once, but the third time you need it, you should abstract it.

Patterns In Math

Think back to early school years when you were learning about patterns in math class. You were most likely introduced to problems that read something like this:

Complete the pattern:

1, 2, 3, _, _, _

The obvious answer here is that the pattern continues “4, 5, 6” because the pattern is to add 1 to the previous number.

But what happens if we remove 3 from this pattern:

Complete the pattern:

1, 2, _, _, _, _

Is it possible to complete this pattern? … I don’t think it is, at least not with certainty. You could assume that the pattern is 1, 2, 3, 4, 5, 6 but you might be very wrong. What if the pattern is not “add 1 to the previous number”? What if the pattern is “multiply the previous number by 2”? Then the completed pattern would be 1, 2, 4, 8, 16, 32.

The point is that two positions do not create a guaranteed pattern. Two positions create the potential for a pattern to emerge. We need at least a third position before the pattern can be identified in even the simplest cases, and a more complex pattern often needs more than that.
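The same ambiguity can be sketched in a few lines of JavaScript. The function names here (`addOne`, `double`, `complete`) are made up for illustration. Both rules reproduce the first two positions, and they only diverge at the third:

```javascript
// Two hypothetical rules that both reproduce the first two positions: 1, 2.
function addOne(previous){
  return previous + 1;
}

function double(previous){
  return previous * 2;
}

// Build a sequence of `length` numbers by applying `rule` to the previous one.
function complete(start, rule, length){
  var sequence = [start];
  while (sequence.length < length){
    sequence.push(rule(sequence[sequence.length - 1]));
  }
  return sequence;
}

var byAdding = complete(1, addOne, 6);
var byDoubling = complete(1, double, 6);

console.log(byAdding.join(", "));   // 1, 2, 3, 4, 5, 6
console.log(byDoubling.join(", ")); // 1, 2, 4, 8, 16, 32
```

With only `1, 2` to go on, nothing in the data says which of these two rules (or any other rule) is the right one.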

Patterns In Code

If a mathematical pattern can’t be properly identified with only two positions, then why do we say “Don’t Repeat Yourself” in code and force abstractions on ourselves with only two reference implementations? Honestly, this is often because the problems of code duplication begin the first time we copy and paste something. Most of us (myself included) have run into so many problems that started the moment we copied and pasted code that we force ourselves to abstract immediately.

The problem that we face in abstracting code is the same problem that we face in math patterns, though. If we only have two use cases for a given pattern, we don’t yet know the exact shape of the pattern or whether we even have a pattern to begin with. 

An Example in JavaScript:

var MyObj = {

  foo: "bar",

  // ...

  stuff: function(){
    if (this.foo === "bar"){
      this.doSomething();
    } else {
      this.doAnotherThing();
    }
  },

  moreStuff: function(){
    if (this.foo === "bar"){
      this.otherStuff();
    } else {
      this.moreStuff();
    }
  }

};

In this example, we have a condition that is being checked on an object. If that condition is true, we do one thing. If that condition is false, we do something else. This pattern of if-then checks is repeated between the “stuff” and “moreStuff” functions.

Now the engineer and pattern-matching machine that I am says, “Looks like we’re checking the state of this object and doing something different. Let’s build a state pattern or state machine.” But this is a premature optimization. We could spend a few hours or days building the state pattern or state machine to handle this for us, only to have the requirements for the system change, or to find a bug, and then need to update the code to look like this instead:

var MyObj = {

  foo: "bar",

  // ...

  stuff: function(){
    if (this.foo === "bar"){
      this.doSomething();
    } else {
      this.doAnotherThing();
    }
  },

  moreStuff: function(){
    var result = "some default";

    if (this.baz === "quux" && this.getSomeStuff("foo")){
      result = this.otherStuff();
    }

    return result;
  }

};

Now we don’t have the same potential pattern in place anymore, and we don’t need the state machine that we might have previously built. The abstraction that we may have spent our time on is wrong, and it is probably going to get in the way, adding overhead and cruft that we no longer need.

Instead of saying that we should abstract something the second time we need it, then, we should be saying “pay attention” the second time we need it. Then when we see this same implementation for a third time, a more appropriate abstraction can probably be made.
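As a rough sketch of what that third occurrence might look like: suppose a third method, here given the hypothetical name `evenMoreStuff`, ends up with the same `this.foo === "bar"` check as the original `stuff` and `moreStuff`. At that point a small helper may be all the abstraction that is needed. The `whenBar` helper, the `log` array and the method bodies below are assumptions made purely for illustration.

```javascript
var MyObj = {
  foo: "bar",
  log: [], // hypothetical: records which branch ran so the sketch is observable

  // Hypothetical helper: the one place that knows "bar" is handled specially.
  whenBar: function(ifBar, otherwise){
    return this.foo === "bar" ? ifBar.call(this) : otherwise.call(this);
  },

  stuff: function(){
    return this.whenBar(this.doSomething, this.doAnotherThing);
  },

  moreStuff: function(){
    return this.whenBar(this.otherStuff, this.doAnotherThing);
  },

  // The third occurrence of the same check is what justified the helper.
  evenMoreStuff: function(){
    return this.whenBar(this.doSomething, this.otherStuff);
  },

  doSomething: function(){ this.log.push("doSomething"); },
  doAnotherThing: function(){ this.log.push("doAnotherThing"); },
  otherStuff: function(){ this.log.push("otherStuff"); }
};

MyObj.stuff();          // foo is "bar", so doSomething runs
MyObj.foo = "baz";
MyObj.evenMoreStuff();  // foo is not "bar", so otherStuff runs
console.log(MyObj.log.join(", ")); // doSomething, otherStuff
```

Whether a helper like this, a state pattern, or something else entirely is the right shape depends on what the three call sites actually have in common, which is exactly what the third occurrence helps reveal.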

The Rule Of Three

The rule of three applies to both simple math patterns and code patterns equally well. In math, we need at least three positions before we even have a chance at identifying a pattern. In code, the exact same thing is still true.

If you need something once, build it. If you need something twice, pay attention. If you need it a third time, abstract it.

Intuition And Limitations

In spite of the validity that the rule of three has, it isn’t a golden law of software development that will solve all of your abstraction needs. Like every other “rule” we use in software development, it’s a heuristic – a guideline to give us an idea, a percentage match to judge against and see if we need to abstract or not.

The rule of three is not the only heuristic by which we judge, though. Intuition and experience also come into play. Given enough of both, we may see the pattern emerge after a single use. It may be dangerous to abstract at this point, though, because the details of the specific scenario are going to bleed into the abstraction. A second implementation of the same pattern may show that we truly are going to repeat this pattern throughout our code. The second implementation will also give us a better idea of how the abstraction should be shaped so that we don’t bleed too many details of the specific scenarios into it. There’s still some danger here, though. Our intuition might still be off a bit, but with experience and input from others we can probably adjust for that.

In the end, though, the goal of “Don’t Repeat Yourself” needs to be tempered with YAGNI, which leads us to The Rule of Three and the observation from math that two positions do not make a pattern. Combined with other principles, this gives us a good framework for moving forward and creating solid abstractions in our code.



About Derick Bailey

Derick Bailey is an entrepreneur, problem solver (and creator? :P ), software developer, screencaster, writer, blogger, speaker and technology leader in central Texas (north of Austin). He runs SignalLeaf.com - the amazingly awesome podcast audio hosting service that everyone should be using, and WatchMeCode.net where he throws down the JavaScript gauntlets to get you up to speed. He has been a professional software developer since the late 90's, and has been writing code since the late 80's. Find me on twitter: @derickbailey, @mutedsolutions, @backbonejsclass Find me on the web: SignalLeaf, WatchMeCode, Kendo UI blog, MarionetteJS, My Github profile, On Google+.
This entry was posted in AntiPatterns, Javascript, Pragmatism, Principles and Patterns, Risk Management.
  • http://blog.ploeh.dk Mark Seemann

    Great post. Relevant to this discussion is also the Reused Abstractions Principle: http://codemanship.co.uk/parlezuml/blog/?postid=934

  • http://profiles.google.com/bolibic Javier Gonel

    In software development when presented with 1,2 you abstract doing +1. Next day if presented with “1,2,3” you can say you were lucky because your abstraction still works. But 50% of the time it will be “1,2,4” and you will have to change it.

    You don’t know “when” you will be presented with “1,2,4” (or “1,2,3”). But right now you have 1,2 and you know that copy pasting code is a debt you want to pay as soon as possible.

    Of course YAGNI and KISS. Don’t overengineer it. Common sense, etc. But let the changes flow as they come. Don’t be scared of change, embrace it.

    Being scared of changing something because of what could come next is just the opposite of over-engineering: under-engineering. It’s like not committing your code now because you want to change another file… you can always wait and it will make more sense. But focus on the things you know now, and if it makes sense now, you’ve got it.

    Just don’t wait.

  • joshuaflanagan

    I think you are making the (common) mistake of misrepresenting the DRY principle, when suggesting it needs to be tempered with YAGNI. DRY doesn’t mean ‘do not have lines of code that look similar’. From http://c2.com/cgi/wiki?DontRepeatYourself :

    “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.”

    It’s about making sure the facts in your system are not declared in multiple places. Consider a fact like “Last name is always required”. Ideally that would be declared in one place in your system, and anything else that needs to support it (client validation, server validation, database constraint, etc) would flow from that single declaration.

    The example you provide isn’t a DRY violation. It’s just code that looks similar.

    • Mark

      The piece of knowledge that is being duplicated in the example is that the value “bar” should be handled differently than any other value.

      • joshuaflanagan

        One could certainly do some mental gymnastics to make the situation fit the definition, but it only dilutes the intent of the DRY principle.
        I think what you are looking for is Once and Only Once:
        http://c2.com/xp/OnceAndOnlyOnce.html

  • Leonardo Pisano

    For that matter: 1,2,3,5,8,13.

  • http://profiles.google.com/rkeurel Kris Garcia

    Good stuff. Definitely resonates with me. Code is knowledge, so I think the DRY principle does apply to lines of code. Plus, it seems to me that part of the problem that the Rule of Three heuristic can help address is to determine whether a line of code is JUST SIMILAR to another line of code as opposed to ACTUALLY REPEATING the same concept. As time goes by the two lines may diverge and so it makes sense to delay abstraction until pragmatism demands it (which reminds me of The Last Responsible Moment).

  • nerd

    When looking at 1,2,3… I can think of millions of valid patterns, e.g. f(n) = round(3*f(n-1)/2), which gives: 5,8,12,18,27… Or f(n) = f(n-1)+f(n-2)…
    One can show that there is an unlimited number of patterns that continue any limited sequence of numbers.

    • Alejandro Varela

      damn true

    • Guzart

      yeah, but you can think of one less

  • imnotsayinimjustsayin

    1,2,3,1,5,7,1,11,13,…

  • Dan Sutton

    There’s something missing here, and that’s the concept of what the code is going to be used for, and, more importantly, how it’s going to be called.

    Here’s an example of what I’m talking about: imagine you’ve got a class which ships packages via the US Mail, and another one which ships packages via FedEx. Conceptually, they’re going to be similar; internally they’re going to be different.

    Now, if different software is going to use each class, then it’s OK to keep them separated with occasional duplication. But if you’re going to create something which expects an object of type “shipper” then you’re going to want to make an abstract class (or an interface) which contains empty methods to perform any shipper function, and subclass the US Mail and FedEx classes from it. Of course, you’re going to find parts which contain identical code, and by all means, those parts should be defined in the abstract superclass.

    But my point is that any decision you make about how to abstract the code should be informed by the requirements of what’s going to call it. It’s all very well to throw things around like DRY and YAGNI, but in the end, what counts is common sense.

  • Andrew Young

    Two points make a line. Third point validates it.

  • UKB

    1, 2, 3 could be add the previous two numbers together so that isn’t completely accurate either