CQRS Performance Engineering: Read vs Read/Write Models

I’ve used a lot of different architectures, patterns and implementations that revolve around the core concept of command-query separation (CQS) and the more recent label of command-query responsibility segregation (CQRS). The ideas behind these principles help us create code that is targeted to a single purpose, generally side-effect free, and easier to work with and maintain. In the last few days, though, I’ve begun to see how CQRS can be used for performance engineering as well.

Performance Problems With A Common Pattern

A few weeks ago, our product owner reported a performance problem with a control that is used on two screens in our handheld / Compact Framework application. This control is not terribly complicated – it has 4 drop down lists, each one loaded based on the data selected in the previous one. I’m pretty sure every developer has created a series of drop down lists like this at some point in their career. It’s not difficult… it just takes a little time and effort to handle all the cases of no items found, auto-selecting if there’s only a single item in the list, having a “Select One” or other default option, etc.

After digging into the offending control, I found that it was doing the following for every drop down list on the control:

Data Load / Display:

  1. Load all data from the database into a DataTable
  2. Convert each row of data into the full object it represents
  3. Convert each object into a simple Name/Value Lookup object
  4. Bind the Lookup objects to the drop down list

Data Select / Use (on selected index changed):

  1. Get Value (ID) of the selected lookup item and load the full object for that ID
  2. Run the Data Load / Display for the next drop down list based on the ID of the object
  3. Publish the selected object on an event so the parent form could respond to it as needed

This is a pattern that I see a lot of – whether it’s WinForms or WebForms development. It’s especially common in a WebForms environment, though, where there is no state in the view implementation. Unfortunately, this pattern and its implementation are very problematic when it comes to performance. The actual performance of the control in question was so bad that we resorted to using asynchronous commands to retrieve the data for the drop down lists. This let us keep the UI “responsive” to the user – it prevented the screen from locking up with strange artifacts for the 3 to 5 seconds that it took to load any given drop down list.
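
To make the cost of that pattern concrete, here is a minimal sketch of it in Python against an in-memory SQLite database. This is not the code from the control (that was .NET Compact Framework); the region table, its columns, and every function name are made up for illustration.

```python
from __future__ import annotations

import sqlite3
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical full object; a real one would carry many more fields."""
    id: int
    name: str
    manager: str
    notes: str


@dataclass
class Lookup:
    """Simple name/value pair that gets bound to the drop down list."""
    name: str
    value: int


def make_demo_db() -> sqlite3.Connection:
    # Assumed schema, only here so the sketch runs on its own.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE region (id INTEGER PRIMARY KEY, name TEXT, manager TEXT, notes TEXT)"
    )
    conn.executemany(
        "INSERT INTO region VALUES (?, ?, ?, ?)",
        [(1, "North", "Ann", "long notes"), (2, "South", "Bob", "long notes")],
    )
    return conn


def load_lookups_via_full_objects(conn: sqlite3.Connection) -> list[Lookup]:
    # Step 1: pull every column of every row into memory (the "DataTable").
    rows = conn.execute("SELECT id, name, manager, notes FROM region").fetchall()
    # Step 2: convert each row into the full object it represents.
    regions = [Region(*row) for row in rows]
    # Step 3: discard most of that work to build the name/value pairs.
    return [Lookup(name=r.name, value=r.id) for r in regions]


def load_full_object(conn: sqlite3.Connection, region_id: int) -> Region:
    # On selection, the same object is read from the database a second time.
    row = conn.execute(
        "SELECT id, name, manager, notes FROM region WHERE id = ?", (region_id,)
    ).fetchone()
    return Region(*row)
```

Every column is read and every object is built even though the list only ever shows the name and keeps the id.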

 

Separation Of Concerns

Why would I want to load the entire set of data from the database and deserialize that into the full object model just so I can bind the name and id of the objects to a drop down list and then re-load the same object from the database again? That doesn’t make much sense to me – even in a web environment where I should bind nothing more than the name and id in the form. In a WinForms environment, though, I guess I can see “the easy way out” by loading up the objects with my existing data access infrastructure… but that just doesn’t make any sense other than being lazy.

Here’s the crux of a read-only or view model in this situation: if I’m only going to display the name and id of the objects, then that’s all I should load.

 

Load View Model, Lazy Load Full Object When It’s Needed

To solve the performance problems in this control, I decided to use the basic CQRS tenets of separating my view model, which is a read-only representation of my data, from the object model, which is a read/write representation. Here’s the new approach I took to solve the performance problems, with each of the drop down lists:

Data Load / Display

  1. Load the name and id only, from the database using a DataReader
  2. Populate a generic Lookup object with the name / id of each record
  3. Bind the drop down list to the Lookup objects

Data Select / Use

  1. Get the id of the selected item in the drop down list
  2. Run the Data Load / Display for the next drop down list based on the id of the selected item

Data Collection

  1. After the entire selection process has been performed, then and only then load the full object that was selected and publish it to the parent form
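
The load and select steps above can be sketched roughly as follows. Again, this is Python with an in-memory SQLite database rather than the Compact Framework code from the control, and the region and site tables, the Lookup type, and the load_lookups function are all hypothetical names chosen for the example.

```python
import sqlite3
from typing import List, NamedTuple, Optional


class Lookup(NamedTuple):
    """Generic read-only name/id pair for binding to a drop down list."""
    name: str
    value: int


def make_demo_db() -> sqlite3.Connection:
    # Assumed two-level schema, only here so the sketch runs on its own.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE region (id INTEGER PRIMARY KEY, name TEXT, notes TEXT);
        CREATE TABLE site (id INTEGER PRIMARY KEY, region_id INTEGER, name TEXT, notes TEXT);
        INSERT INTO region VALUES (1, 'North', 'n/a');
        INSERT INTO region VALUES (2, 'South', 'n/a');
        INSERT INTO site VALUES (10, 1, 'Depot A', 'n/a');
        INSERT INTO site VALUES (11, 1, 'Depot B', 'n/a');
        INSERT INTO site VALUES (12, 2, 'Depot C', 'n/a');
        """
    )
    return conn


def load_lookups(
    conn: sqlite3.Connection,
    table: str,
    parent_column: Optional[str] = None,
    parent_id: Optional[int] = None,
) -> List[Lookup]:
    """Read only the name and id, optionally filtered by the parent's id."""
    # table and parent_column are trusted constants in this sketch, never user input.
    sql = f"SELECT name, id FROM {table}"
    params: tuple = ()
    if parent_column is not None:
        sql += f" WHERE {parent_column} = ?"
        params = (parent_id,)
    sql += " ORDER BY name"
    # Iterating the cursor reads rows forward-only, much like a DataReader.
    return [Lookup(name, value) for name, value in conn.execute(sql, params)]


conn = make_demo_db()
regions = load_lookups(conn, "region")
# On selected index changed: only the selected id is needed to fill the next list.
sites = load_lookups(conn, "site", "region_id", regions[0].value)
```

No full object is built anywhere in this path; the second list is filled from nothing more than the id selected in the first.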

There are a couple of key things to note in this solution… namely, I’m only loading the name and id for the drop down lists. I only need that information for the drop down list to work, so I’m not going to bother loading anything else. And I’m not loading the full object model until I’m actually ready to use it. If the user is constantly switching the drop down lists to figure out what they need, then loading the full object model after each individual selection will just use up a bunch of time and resources for no good reason. I’m waiting until some level of confidence in the selection can be established and the code is ready to use the object model before loading the full model.
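
One way to picture the "wait until the selection is complete" part is a small tracker that remembers the selected id at each level and only calls the expensive loader once every level has a value. This is a sketch under my own assumptions, not the control's code: the CascadingSelection class, the level names, and the idea that the final list's id identifies the full object are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, TypeVar

T = TypeVar("T")


class CascadingSelection:
    """Tracks the selected id at each drop down level; loads nothing by itself."""

    def __init__(self, levels: List[str]) -> None:
        self._levels = levels
        self._selected: Dict[str, Optional[int]] = {level: None for level in levels}

    def select(self, level: str, item_id: Optional[int]) -> None:
        # Changing one list invalidates every list that depends on it.
        index = self._levels.index(level)
        self._selected[level] = item_id
        for downstream in self._levels[index + 1:]:
            self._selected[downstream] = None

    def is_complete(self) -> bool:
        return all(value is not None for value in self._selected.values())

    def collect(self, load_full_object: Callable[[int], T]) -> Optional[T]:
        # The full object is loaded only after every level has a selection.
        if not self.is_complete():
            return None
        # Assumption: the id from the last list identifies the object to load.
        final_id = self._selected[self._levels[-1]]
        return load_full_object(final_id)


calls: List[int] = []


def fake_loader(item_id: int) -> dict:
    # Stand-in for the expensive full-object query; records each call.
    calls.append(item_id)
    return {"id": item_id, "name": "full object"}


selection = CascadingSelection(["region", "site", "building", "room"])
selection.select("region", 1)
selection.select("site", 10)
early = selection.collect(fake_loader)  # incomplete, so the loader is not called
selection.select("building", 100)
selection.select("room", 1000)
result = selection.collect(fake_loader)  # complete, so the loader runs once
```

However many times the user flips through the lists, the expensive load runs once, at the end.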

 

The Performance Improvements

I don’t have any scientific performance metrics for this, yet. I’m not sure if I’ll need to do that, actually. I do have first hand experience with the existing performance and the new performance, though.

The original code tended to take anywhere from 3 to 5 seconds, on average, to load any given drop down list. The worst performance, though, was one particular query that returned nearly a thousand items for the drop down list to display. This would take closer to 6 or 8 seconds to load. Again, these numbers are all based on my experience, not actual timers. I can say with certainty, though, that I was never able to use the keypad up/down arrows to select items in the drop down list. The control was simply too slow in responding, so I would sit there and wait for it to finish loading before clicking the down arrow again.

With the new implementation in place, the control’s performance is significantly better. The average time it takes to load a drop down list has dropped to far below a second. Again, I haven’t done any real timer / performance testing with this… but I can say with certainty that I can now use the up/down arrow keys on the keypad and the control keeps up with me no matter how fast I’m able to click the keys. Furthermore, the performance is good enough that I have not yet needed to use any asynchronous processing to load or display any data. Even with the one query that returns nearly a thousand records to the drop down list, the time to load is less than a second – a barely noticeable stutter in the list being available for selection.

 

Conclusions And Other Considerations

The principles and patterns that comprise CQRS can be used for a number of different reasons – not the least of which is performance improvements in your code. Whether you are working on WinForms, WebForms, Compact Framework or another system or platform that has read vs. read/write needs, keeping CQRS in mind at all levels of the system can have a significant impact in many different ways.

Of course, this does not come free. There is an increase in the amount of code you have to maintain when you go down this path. You may end up writing two or more different types of data access code, and you will have the same data represented in multiple objects and queries in your system. These costs are not to be taken lightly. However, when used judiciously and understood by the entire team, the impact of these costs can be mitigated. Keep your data access methods simple and have a clean separation between your full object model and your read-only models. Communicate constantly with team members and work on well-named and organized code. It’s your team’s communication, collaboration and standards that will help to cut the costs, keep your system clean and maintain its performance over time.



About Derick Bailey

Derick Bailey is an entrepreneur, problem solver (and creator? :P ), software developer, screencaster, writer, blogger, speaker and technology leader in central Texas (north of Austin). He runs SignalLeaf.com - the amazingly awesome podcast audio hosting service that everyone should be using, and WatchMeCode.net where he throws down the JavaScript gauntlets to get you up to speed. He has been a professional software developer since the late 90's, and has been writing code since the late 80's. Find me on twitter: @derickbailey, @mutedsolutions, @backbonejsclass Find me on the web: SignalLeaf, WatchMeCode, Kendo UI blog, MarionetteJS, My Github profile, On Google+.
This entry was posted in Analysis and Design, CQRS, Pragmatism, Principles and Patterns.
  • http://thinkbeforecoding.com thinkbeforecoding

    I really like the term Persistent View Model (from Udi Dahan).
    Just put the data needed by this view in this table, and you’ve got it. No need to apply any logic when reading, just get the data.
    The good thing is that your data changes a lot less often than it’s read, so the logic used to transform data is applied only once per change instead of once per view.

    I noticed you said you have a readonly service for queries and a read/write service for commands. You can easily remove the read from commands, and get a writeonly service from commands.

  • http://www.adverseconditionals.com Harry M

    Here’s my problem with applying this as a globally reusable paradigm. I can see how this works in a lot of cases (particularly as the starting point of ‘do the simplest thing that can possibly work’)

    The scary bit comes when there’s a new business rule which filters the data, and which only applies to certain customers who have enabled certain settings and fulfil certain calculated criteria. At this point your SQL query breaks down. What then: copy the query into a simple DTO and add a couple of filters to the data using a service (there are a couple of screens needing this filter, you see)?

    Not only that, when you write data back to the store, you need to check the business rule from the service above in order to validate the request (we don’t want people assigning themselves data they aren’t allowed to see, right?). So do we share the service between the read and write models, running against an interface implemented by the read-model-DTO and the write-model-request-object? Sounds like we’ve created some monster anemic domain, with linked objects.

    I’m not criticising the approach taken in your post, by the way; it seems the right tool for the job. Really I’m just trying to understand how CQRS works once you’ve got a complex model with security and filtering that needs to be applied to both read and write. I’m worried one may end up trying to write everything in more and more SQL queries (joining onto the Permissions and GroupPermissions tables) which are harder and harder to maintain.

  • http://craniometrics.blogspot.com J Healy

    Good CQRS post. Depending on how it’s structured, this sort of application ‘reference’ data is often cached and filtered for successive dropdown selects

  • http://thinkbeforecoding.com thinkbeforecoding

    @harry> In a CQRS system, you always make calculations on state change. So you will have in your DB the current ‘rights’ of your user, no calculation needed.
    If your ‘rights’ set is static, this is not a problem, just make a view per right.
    If it’s dynamic, add an AccessRight column to your view, and filter using this and the user’s Rights.

    Usually, when your user rights are dynamic, they’re actually part of your domain and should be treated as any other state data.

  • Mihai Lazar

    There is always value in separating concerns.

    My interest is in what/how you define CQRS. What posts or resources did you find useful?

  • http://www.lostechies.com/members/derick.bailey/default.aspx derick.bailey

    @Mihai,

    Greg Young over at http://codebetter.com is considered the father of the term CQRS – he’s got a blog post on how this name came about from last year. and Udi Dahan has done a significant amount of work within this name as well – his post on clarifying CQRS at the top of this article is tremendous and very long.

    in general, CQ_R_S is just CQS applied outside the confines and contexts of method calls. it’s not a pattern or architecture, though. it’s a principle of separating the command and query responsibilities in a system, at all levels of that system. this ranges from having an OLTP and OLAP database that are separate (separate schema at least) all the way to methods, classes, modules, frameworks, architectures, UI patterns, etc.

    there is a set of practices that are becoming ‘standard’ in the CQRS family. Greg Young has several posts on these and lists some of the common ones in a recent blog post.