Case study in mixing system concerns

For the vast majority of systems I’ve been involved with, transaction isolation levels were never something I had to worry about too much.  The number of reads on these systems overwhelmingly outnumbered the number of writes.  But every once in a while, you work on a system where that ratio flips, and writes come in at nearly the same rate as reads.

In these cases, things like resource contention and database locks start to matter.  You start caring about phantom reads, lost updates, dirty reads and the like.  You run into a situation where some critical piece of data needs a higher level of transaction isolation.

Suppose we have a critical set of data on which we don’t want to allow dirty reads.  In this case, we follow the normal guidelines:

  • Create a transaction with the higher isolation level (SERIALIZABLE in this case)
  • Keep the transaction window as absolutely short as possible

This prevents anyone from reading dirty data for the row being updated.  However, not all reads are the same.  The higher isolation level guarantees that there are no dirty or phantom reads, but now anyone reading any of that data can potentially time out.  In those cases, I have to do a lot more explicit configuration in the other direction, setting the isolation level lower on certain sets of reads so that “unimportant” reads can go through and don’t time out.

An alternate approach

Looking at this design, a highly centralized, normalized view of data, you can start to see the problems of a lack of Separation of Concerns at the system level.  The fact that I have certain sets of data in the same row with different isolation level concerns should be a signal that I’ve mixed concerns.  If some parts of my system care about certain pieces of data, and other parts care about others, then why should they all be mangled into one database table?

In this case, I did need elevated isolation levels for the data I was modifying often.  However, lots of other data in that table doesn’t get modified often.  If these two concerns were split into different tables, the timeout issues I was running into with the reads would have simply gone away.
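Here’s a hedged sketch of that split, again with invented names.  Since SQLite locks at the database-file level rather than the table level, two files stand in for the two separately-locked tables:

```python
import os
import sqlite3
import tempfile

# Hypothetical split: volatile order data and stable customer data live in
# separately-locked stores (two SQLite files emulate two tables here).
d = tempfile.mkdtemp()
orders_path = os.path.join(d, "orders.db")
customers_path = os.path.join(d, "customers.db")

with sqlite3.connect(orders_path) as c:
    c.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    c.execute("INSERT INTO orders VALUES (1, 'pending')")
with sqlite3.connect(customers_path) as c:
    c.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    c.execute("INSERT INTO customers VALUES (1, 'alice')")

# The writer holds an exclusive, serializable-style lock on the volatile data...
writer = sqlite3.connect(orders_path, isolation_level=None)
writer.execute("BEGIN EXCLUSIVE")
writer.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")

# ...while the "unimportant" read of stable data goes through untouched.
reader = sqlite3.connect(customers_path, timeout=0.1)
name = reader.execute("SELECT name FROM customers WHERE id = 1").fetchone()[0]

writer.commit()
print(name)  # alice
```

Once the concerns are separated, there’s nothing to configure: the mailing-list-style read never contends with the order-update write in the first place.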

Moral of the story – architecture doesn’t solve problems, it eliminates them.


About Jimmy Bogard

I'm a technical architect with Headspring in Austin, TX. I focus on DDD, distributed systems, and any other acronym-centric design/architecture/methodology. I created AutoMapper and am a co-author of the ASP.NET MVC in Action books.
This entry was posted in Architecture.
  • Roco

    “Moral of the story – architecture doesn’t solve problems, it eliminates them.”

    And it can create them as well. If you start breaking apart tables because of transactional concerns, you start adding the need for additional joins when reading data. I am also having a tough time visualizing this example. Can you provide an example of the table you broke apart?

    Also, are you using an ORM over this DB?

  • http://jason.pettys.name Jason Pettys

    Good insight, Jimmy.  Breaking out components based on the entity doesn’t always align with breaking out components based on business capability.  I’ve been working with CQRS for almost a year now, and I’m surprised at the ways it helps break apart things that seem like they should go together, and at how smoothly things work once they’re apart.

    Have you read Udi Dahan’s recent blog post, “The Known Unknowns of SOA”?  I think it correlates well with what you’re saying.

  • bubu

    Let’s say we have an order entry application.  An order has a customer and one or more order lines corresponding to the items purchased.  Instead of creating a customer table and an order line table to store the data for each of these entities separately, the database was designed such that all of this information is stored in the order table in a very non-normalized manner.

    Order detail information is constantly changing: it’s the holiday season, so exchanges and returns mean we have to add, remove or update lines for a given order whenever this happens, but that data is only read when a new receipt is printed.  Customer information never changes, because company policy requires that any changes to an order be made by the person who made the original purchase, but that same data is frequently read to populate mailing lists for special offers.  Because all of this information is in one row, the mailing list application is timing out trying to read a row that it can’t lock due to the order line changes.  A separate table for the customer information would allow the mailing application to read the customer table while the order application updates the order and/or order line tables and occasionally adds a new row to the customer table for a new customer.

    I realize I took a lot of liberties with unlikely assumptions, but hopefully that’s at least something to go on if you were looking to prove your point about joins against a common example.

  • Roco

    @bubu

    IMHO that is not a good example.  If that is how a DB is designed, the problem is not transaction management, it’s DB design in general.

  • @bubu

    @Roco

    Very well; while I try to come up with a better example, I was wondering why you asked about using an ORM in your original post.  Was this just a side question out of curiosity, or were you going to suggest that an ORM may help with this sort of problem and, if so, how?  I’ll admit my experience with this type of software is limited, but I’ve never used an ORM that would address anything like this.

  • http://www.lostechies.com/members/bogardj/default.aspx bogardj

    @Roco

    The ORM isn’t the problem here; even if I only tried to read the less volatile data, the DB locks would still block me.

    As far as the joins go, I wouldn’t join. I would just have de-normalized tables/views, and have data duplicated.

  • http://blog.scooletz.com/ Scooletz

    @bogardj,
    it’s very close to the design proposed by Greg Young.  Having the read data prepared in separate, often one-per-view, tables eliminates that problem.

  • Roco

    @Scooletz

    I was hoping this would come up.  I hear a lot about CQRS and the idea of breaking read-only data into separate tables, often on a per-view basis as you mentioned.  This sounds like a great idea in principle, but what about a large system with many views, reports, web services, external connection points, etc.?  I could see this easily getting out of hand, creating a large set of read-only tables that have to be kept in sync.  What about two views where the data is similar but just a little different?  Is that another table?

    Not to take this off topic, but this is my major frustration with CQRS.  How are these tables kept in sync?  I can see it being manageable for a system that can have its data refreshed nightly, but what about a system that needs the read data current to the hour, or even the minute?

  • http://blog.scooletz.com/ Scooletz

    @Roco,

    speaking of slightly different data, I can give you only one answer: it depends ;-)  In my latest project, the so-called read-only table that was created had almost 80% of its columns simply copied from the original one.

    In terms of keeping them in sync: if you’re using NHibernate, you’re saved.  Post-update and post-insert event handlers can simply get the ‘view’ entity from the db and alter its values, and you never have to think about syncing them up.

    Greg’s proposal was to batch all the sql commands until the first query hits the view table, but he prefers not to use NH in his CQRS-DDD design.
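Outside NHibernate, the post-update-handler idea can be sketched in plain Python with sqlite3.  The tables and the save_order helper are invented for illustration: the command-side write and the denormalized view refresh happen in the same unit of work, so the view table never needs a separate sync job:

```python
import sqlite3

# Hypothetical command-side table plus a denormalized, per-view read table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, status TEXT)")
conn.execute("CREATE TABLE order_summary_view (order_id INTEGER PRIMARY KEY, caption TEXT)")

def save_order(order_id, customer, status):
    # The command-side write...
    conn.execute("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)",
                 (order_id, customer, status))
    # ...and the post-update "handler" refreshing the read-side view table
    # in the same transaction, keeping the two in sync.
    conn.execute("INSERT OR REPLACE INTO order_summary_view VALUES (?, ?)",
                 (order_id, f"{customer}: {status}"))
    conn.commit()

save_order(1, "alice", "pending")
save_order(1, "alice", "shipped")
caption = conn.execute(
    "SELECT caption FROM order_summary_view WHERE order_id = 1").fetchone()[0]
print(caption)  # alice: shipped
```

Eventually-consistent variants would publish an event instead of writing the view table synchronously, which trades immediacy for less contention.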
