Encapsulation: You’re Doing It Wrong

Encapsulation, or Information Hiding, is one of the core principles of object oriented software development. It ranks up there with Cohesion, Polymorphism, Inheritance and all the other OO things we all hear about and learn about. Unfortunately, many of the software developers that I’ve encountered in my career don’t know what encapsulation actually is… honestly, it’s only been the last 2 to 3 years that I’ve really started to understand it, too, so how could I expect everyone else to get it right?

 

A Definition

According to Wikipedia, Encapsulation, as Information Hiding, is when:

the internal representation of an object is generally hidden from view outside of the object’s definition. Typically, only the object’s own methods can directly inspect or manipulate its fields. Some languages like Smalltalk and Ruby only allow access via object methods, but most others (e.g. C++ or Java) offer the programmer a degree of control over what is hidden, typically via keywords like public and private.

While this definition and the rest of the section on information hiding are technically correct, they are also naive or short-sighted.

 

A Common, And Misguided, Misunderstanding

I’ve watched many developers – including myself – make arguments along the lines of “properties are encapsulation”. After all, “I made the fields private, and only let my data be accessible via the public properties” … and that means my objects are encapsulated, right? It gets even better when we hide a private List<T> field behind Add and Remove methods and only expose an IEnumerable<T> property to get the entire list. Sure, this may be a part of how encapsulation is done. However, a lot of people stop here. Saying that this represents encapsulation in its entirety, or stopping at this point without even considering that it might be short-sighted, is dangerous.
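To make the pattern concrete, here is a minimal sketch of "properties as encapsulation", written in Python as a rough analogue of the private List<T> / IEnumerable<T> arrangement described above. The Order class and its line_prices property are hypothetical names invented for illustration; they do not come from any real code base.

```python
from typing import Iterator, List


class Order:
    """Hypothetical example: hides its list, but still exposes only raw data."""

    def __init__(self) -> None:
        self._line_prices: List[float] = []  # "private" by naming convention

    def add(self, price: float) -> None:
        self._line_prices.append(price)

    def remove(self, price: float) -> None:
        self._line_prices.remove(price)

    @property
    def line_prices(self) -> Iterator[float]:
        # Read-only view: callers can iterate, but cannot mutate the list.
        return iter(self._line_prices)


order = Order()
order.add(10.0)
order.add(5.0)

# The field is hidden, yet every caller still has to know how to turn
# the raw prices into something meaningful, such as a total.
total = sum(order.line_prices)
```

The list itself is protected, but the knowledge of what the prices mean still lives outside the class, in whatever code happens to call it.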

If we were to take the Wikipedia definition as all we need, the result would be a spaghetti mess of code that becomes difficult to work with, quickly. We end up with a lot of objects that are nothing more than simple data structures that we pass around to the various UI components, data access components, web service integration points, etc. As a system grows, code gets duplicated and triplicated and then some. A well defined process suddenly has a dozen or more implementations because there are that many sections of the application that need all or part of that process. We introduce complexity and maintenance nightmares with this type of code and our ability to be productive with the code base tanks, rapidly.
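A small sketch of how that duplication tends to creep in, assuming a hypothetical data-only Invoice class and two made-up consumers (a screen renderer and an export routine). None of these names come from a real system; they only illustrate the shape of the problem.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    # Hypothetical data-only class: every consumer must bring its own logic.
    subtotal: float
    tax_rate: float


def render_invoice_screen(invoice: Invoice) -> str:
    # First copy of the "total" rule, living in UI code.
    total = invoice.subtotal + invoice.subtotal * invoice.tax_rate
    return f"Total due: {total:.2f}"


def build_export_row(invoice: Invoice) -> dict:
    # Second copy of the same rule, living in integration code.
    total = invoice.subtotal * (1 + invoice.tax_rate)
    return {"total": round(total, 2)}


invoice = Invoice(subtotal=100.0, tax_rate=0.25)
screen_text = render_invoice_screen(invoice)
export_row = build_export_row(invoice)
```

Both functions agree today, but nothing forces them to stay in agreement when the rule for calculating a total changes.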

There is a better way, though, and it starts by admitting this:

Encapsulation: I’m doing it wrong.

 

A Better Definition: Information Hiding

I’ll leave the first part of the definition I showed above, alone. It fits well with the right way to look at encapsulation:

the internal representation of an object is generally hidden from view outside of the object’s definition.

Beyond this sentence fragment, though, we have to throw out the definition and re-examine two key words: information hiding.

First, “information”. What is information? Isn’t it the data that my class stores? Possibly… but that’s a very narrow view of information. I’ve heard many times in recent years that we have an abundance of data, and a lack of information. I like this perspective because it helps to draw a clear line between data and information. Data is all the bits – the facts, the figures, the snippets and quotes, and the raw pieces that can be put together and pulled apart in many different ways. Information, on the other hand, is the application of data as something meaningful and valuable.

Second is “hiding”. Ok, this doesn’t need much explanation. Hiding is making something not visible to those that don’t need to see it.

Now take the perspective of data versus information back to the simple idea of encapsulation. What are the properties of a class? What do they represent? … data. They are bits that can be put together and pulled apart, and used in various ways. The data is necessary. We often need to have properties (whether or not we have “encapsulated” the data in a private field with a public property to wrap around it) that tell us something. However, the data alone cannot be called “information”.

For the data to be meaningful and valuable, we have to put it to use. We do that with additional code – by writing routines that know how to look at the data and do useful things with it. The business process, the logic and flow, the sequence of operations; these are all the things that turn data into information. While the process is a necessary part of creating information, it is not sufficient on its own, either. In order to have information in our systems, we must have both the data and the process to transform it and apply it where it is meaningful and valuable.

 

Correctly Applying Encapsulation

We know that both data and process are necessary to create information, and we know that neither of them is sufficient on its own. We also know that encapsulation is information hiding: preventing the outside world from knowing about the internal detail and implementation. With that in mind, we come to the conclusion that encapsulation is not just private fields with public properties, but it is both the data and the process (or behavior) being hidden or wrapped up in an implementation that is not exposed to the outside world. When we create a class, we do not stop at hiding the data with private fields. We also ensure that the process of converting the data into information is hidden within the class. We encapsulate the data and the process, creating an object that has both data and behavior.
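As a rough sketch of what that can look like, assume a hypothetical Invoice class whose total is calculated from a subtotal and a tax rate. The class name, the methods and the calculation are all made up for illustration; the point is only that the data and the process live together, and callers ask for information rather than assembling it themselves.

```python
class Invoice:
    """Hypothetical example: owns its data and the process that gives it meaning."""

    def __init__(self, subtotal: float, tax_rate: float) -> None:
        self._subtotal = subtotal  # "private" by naming convention
        self._tax_rate = tax_rate

    def total_due(self) -> float:
        # The one place that knows how a total is calculated.
        return round(self._subtotal * (1 + self._tax_rate), 2)

    def summary(self) -> str:
        # Callers receive information, not the raw pieces behind it.
        return f"Total due: {self.total_due():.2f}"


invoice = Invoice(subtotal=100.0, tax_rate=0.25)
total = invoice.total_due()
text = invoice.summary()
```

If the rule for calculating the total changes, only total_due has to change. Code that displays or exports the invoice keeps calling the same methods and never needs to know how the number was produced.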

A human being, for example, is not just height, weight, eye color, hair color, and other data points. A human being is also a set of behaviors (influenced by many things, including genetics, life experience, etc.) which make each of us unique. Similarly, a car is not just a make, model, year, paint color, wheel size, etc. A car is also a set of behaviors that allow it to be driven.

 

Encapsulation: Do It Right

Re-examine your current coding practices and ask yourself if you really are encapsulating correctly. Are you truly hiding information, or are you just creating public properties that wrap private fields? Yes, there are times when simple data structures – classes that have nothing more than properties to get and set data – are necessary. However, these are not representative of encapsulation.

Encapsulation is a powerful tool. Learn it. Practice it. Apply it to your code, correctly.

 

Reaching For And Moving Beyond Encapsulation

There are many other object oriented principles, as I mentioned previously. Encapsulation is only one piece of writing good object oriented software. There are other principles that need to be applied and accounted for, as well. For example, I’m fond of the SOLID software development principles – not because I think they are the one true way (they aren’t, by the way), but because they provide a simple set of guidelines and stepping stones that help us move our code toward good object oriented design and implementation. There are dozens of other principles that are similar and related to SOLID and I encourage you to find and study all of them. Every principle you learn will help you understand the other principles that you have already encountered and the ones that you have yet to encounter.



About Derick Bailey

Derick Bailey is an entrepreneur, problem solver (and creator? :P ), software developer, screencaster, writer, blogger, speaker and technology leader in central Texas (north of Austin). He runs SignalLeaf.com - the amazingly awesome podcast audio hosting service that everyone should be using, and WatchMeCode.net where he throws down the JavaScript gauntlets to get you up to speed. He has been a professional software developer since the late 90's, and has been writing code since the late 80's. Find me on Twitter: @derickbailey, @mutedsolutions, @backbonejsclass. Find me on the web: SignalLeaf, WatchMeCode, Kendo UI blog, MarionetteJS, My Github profile, On Google+.
  • D M

    Thank you for attacking this subject. But the problem that I see here is that, while you have given a different way to “look” at the problem, I see nothing concrete here that helps us “solve” the problem. I feel like I just read something that makes sense but doesn’t actually help me “implement” a better solution. Maybe it’s just me, but a concrete example that shows a “bad” implementation and a “better” implementation following your principles would do more good than all of this text.

    • http://mutedsolutions.com Derick Bailey

      good point. I’ll work on another post to follow-up with examples.

    • Joshua Sigar

      I have to agree. We don’t need another definition that can always be interpreted differently by different people. Give concrete, practical examples, please.

  • Ed Kirwan

    Hi, Derick,

    I’d like to thank you, too, for attacking this subject.

    I’ve tried and failed:
    http://edmundkirwan.com/encap/overview/paper7.html

    Regards,

    Ed.

    • http://mutedsolutions.com Derick Bailey

      i wouldn’t say failed. there’s a lot of value in your article with the many different sources and discussion on the history of encapsulation. it’s just a different perspective on the subject

  • Liam Mclennan

    A lot of this happens naturally if you follow the ‘tell don’t ask’ guideline.

  • Tudor

    It’s easy to say “this is bad”, but it’s harder to give concrete examples..
    Let’s take an example – most of the classes that many programmers create every day need a bunch of properties (FirstName, LastName, Age, etc.) to display or edit that object on a screen.. How do you encapsulate that?

    • http://mutedsolutions.com Derick Bailey

      a data structure is probably the right choice if you just want to set the first name, last name, age, etc, for a person. there’s no context around your example, though, so there is no way to provide an example for what you’re asking.

      a simple CRUD edit screen is just a simple CRUD edit screen and doesn’t need anything other than read/write properties.

      do you have an example of a business process related to these properties? find out how the person class is going to be used, other than editing them. are they being used in an HR org chart with managers and subordinates? are they being trained and certified to operate certain types of equipment? are they interacting with others in a social network?

      until you have the context and processes in which something is being used, it’s not possible to give an example for the simple properties you’re asking about.

  • http://strangelights.com/blog Robert Pickering

    I agree encapsulation is often misunderstood, but I don’t think this article really gets to the heart of the matter. For me, to really do encapsulation right you need to stop reasoning in terms of encapsulation of state and start reasoning in terms of encapsulation of complexity. You’re doing encapsulation right when you provide a simple interface for a complex problem. The .NET Framework BCL has lots of good examples of this. Requesting a web page would be fairly tricky if you had to deal with the OS native APIs or even .NET’s socket APIs, but you don’t have to, as the BCL encapsulates this in the WebClient class, which takes a fairly complicated operation and provides a nice simple interface for it.

    • http://mutedsolutions.com Derick Bailey

      my gut reaction is that you’re describing abstraction, not encapsulation. the two concepts are very closely related, though.

  • Rob Scott

    Before coming up with your own theory of encapsulation and information hiding, you might do well to dig a little deeper than a Wikipedia entry. These terms have long histories in our field and are much better defined than your strawman suggests.

    Read Parnas to understand that Information Hiding is the practice of “hiding” design decisions that might change behind “interfaces” so that later changes to those decisions don’t impact across the system. Some of those design decisions might be the data structures used, or the “magic that turns data into information”, or it might be much larger decisions such as where in the network the work is done, or how much concurrency is used in the implementation.

    Encapsulation is generally understood to be the steps taken to ensure that the information that we intended to hide is, in fact, well hidden (i.e., encapsulated). These steps include the use of language facilities such as interfaces and access protection modifiers, as well as the utilization of design practices such as providing minimal interfaces, adhering to the Open/Closed principle, etc.

    The measure of how well a piece of information (i.e. a design decision) has been encapsulated is the extent to which changing that decision ripples through the system. For example, do we have to change the code that uses the code that encapsulated the design decision, do we just have to recompile, do we just have to ship the new assembly… The more we have to do, the less encapsulated the decision was (the less hidden the information was).

    • http://mutedsolutions.com Derick Bailey

      100% agree with everything you said… except the strawman bit. we’re saying the same thing, except i’m not quoting anyone specifically

  • Duke

    I think no discussion of how to properly implement encapsulation is complete without mentioning the Law of Demeter (http://en.wikipedia.org/wiki/Law_of_Demeter). There are many ways to leak abstractions and weaken encapsulation, and the guidelines recommended by LoD certainly help keep things under control.

    What I find most interesting is why guidelines like LoD are even necessary. I think the biggest problem is that a lot of us in the industry (certainly myself) learned OOP in languages like C++. I think it is telling that Alan Kay said that when he invented OOP, he did not have C++ in mind. C++ is literally just C with Classes.

    Languages like C++, Java or C# encourage us to think in terms of types and imperative programming, and our thinking of OOP is warped by this. Classes are not just data and operations bolted together.

    The vision Kay had with OOP was of biological cells, which are separated from one another by cell walls. Cells can only communicate with each other by simple chemical messages. With OOP, he envisioned objects as closures with process state completely hidden from the outside world, and simple messages are the only way for two objects to communicate. Erlang’s Actor Model is about as close to true OOP as you can get.

    I would say that encapsulation is more than just “information hiding”, and certainly more than private vs protected vs public vs properties. This is a reductive view which obscures the larger picture – encapsulation goes hand in hand with messaging to form a way of thinking and creating abstractions centered around communications patterns. Whenever I get into these discussions, I always bring up Uncle Bob’s Heuristics and Coffee (http://www.objectmentor.com/resources/articles/CoffeeMaker.pdf), because it is simply the best example I’ve seen illustrating this principle.

    Finally, I think it’s funny that Alan Kay regrets the name he chose for his new programming paradigm, because OOP has shifted the dialog to objects instead of messages.

  • http://www.facebook.com/profile.php?id=73000349 Joel Beckham

    I really like the distinction you make between Data and Information. While I have been wrestling with these concepts for a while, I’ve never been able to articulate it very well. So thanks for this article!

  • http://profiles.google.com/raould Raoul Duke

    i am just reading // What Every Programmer Should Know About Object-Oriented Design // by Meilir Page-Jones in which Encapsulation is actually a separate, although clearly related, concept vs. Information Hiding.

    in there, if i read it right, the idea is:

    Encapsulation is taking a bunch of stuff and putting it together under one “name” or identifier. ADTs do that. Objects do that. Functions/procedures do that.

    Information Hiding is having the outside view be different than the inside view; having the API there so you can change things under the covers if need be.

    You could have Encapsulation without Information Hiding. Records do that.

    $0.02 on the fire.
