Choosing an ORM strategy

One of the mistakes I see people make (and have made myself) is assuming that you must choose exactly one ORM strategy for an entire database/application/system. It’s simply not the case. You can (and should) tailor your ORM strategy to your use case, and make sure you’re picking the right tool (or feature of the tool) for the job.

First things first – 99.9% of the time, you should not use ADO.NET classes like IDbCommand and IDataReader directly. If you’re still hand-rolling “dataReader.Read()” loops – you are writing code that you should not write, ever again. There are better options for what you’re trying to do.

Now, there are lots of folks who don’t like ORMs, and listening to how those folks chose their ORM strategies, I can see why. Fowler said in his OrmHate post:

much of the frustration with ORMs is about inflated expectations

We have to move past the idea that ORMs are bad or ugly or bloatware. ORMs are meant to solve a problem, with many different options for doing so. But before we look at options, let’s look at the problem space. What is the problem we’re trying to solve?

Bridging the gap

If you have to push/pull data to/from SQL and .NET, you have to map your .NET data types to SQL. In .NET, this means using ADO.NET to communicate SQL commands to a SQL database. At some point, mapping has to happen between SQL data types and .NET data types. SQL dates are different than .NET dates, and so on.

ADO.NET helps with this, but leaves the work of taking raw result sets and populating objects up to you. And ultimately, that’s what we want to do – work with objects and types in .NET, and let something flow our objects/data back and forth to SQL.
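To make the gap concrete, here’s a sketch of the hand-mapping that raw ADO.NET leaves to you (Customer and connectionString are hypothetical stand-ins):

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;

// Every query needs this kind of boilerplate to turn rows into objects.
var customers = new List<Customer>();
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("select Id, Name from Customer", connection))
{
    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read()) // IDataReader advances with Read()
        {
            customers.Add(new Customer
            {
                Id = reader.GetInt32(0),
                Name = reader.GetString(1)
            });
        }
    }
}
```

Every strategy below is, at bottom, a different way of not writing this block over and over.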

ORMs are meant to help solve this problem by layering abstractions of varying weight on top of ADO.NET. But there are many different strategies for doing so, each solving the problem in a different way.

Let’s look at each of these in turn and see where they might fit.

Entity-based relational mapping

In entity-based relational mapping, tables are more or less 1:1 with entities in your object design. When you add a property to an object, a column is added to the database. Usage of entity-based relational mapping centers around loading up an entity (or aggregate) by its identifier, manipulating that object and perhaps child objects, and saving that object back through the ORM.

The ORM in this case provides many features, chief among them change tracking.

If I’m working with effectively one entity or aggregate at a time, tools like NHibernate use mapping configuration to keep track of entities loaded and automatically flush changes upon committing the transaction. This is nice because we don’t have to transport a data access gateway around; it’s just managed for us.

When loading up an item by Id for manipulation purposes, this option works great. It removes quite a bit of code I would otherwise need to write to track new objects and “remember” to save changes.
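As a minimal sketch of this flow (assuming a mapped Customer entity with a ChangeAddress method, and an existing NHibernate ISessionFactory):

```csharp
// Load, manipulate, and let the ORM flush the changes.
using (var session = sessionFactory.OpenSession())
using (var transaction = session.BeginTransaction())
{
    var customer = session.Get<Customer>(customerId);
    customer.ChangeAddress(newAddress); // mutate the tracked entity

    // No explicit Save/Update call needed: the session's change tracking
    // notices the dirty entity and issues an UPDATE when the transaction commits.
    transaction.Commit();
}
```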

The downside is that the ORM does not immediately know whether you’re just reading objects or loading up an entity to manipulate it. We often see folks stumble when they don’t realize change tracking is turned on by default: when you load up entities, the ORM assumes it needs to check whether each one has changed.

If you’re looking at loading up an entity in order to manipulate it and persist changes (or creating a new entity to manipulate), this pattern allows a great deal of freedom to keep data access concerns in infrastructure layers, and lets your entity types be relatively persistence-ignorant. Persistence ignorance is not that my C# model and DB model can diverge, but that data access concerns don’t leak into my object model, which I’d rather have concerned with business rules.

Active Record is another flavor of entity-based usage, except data access concerns are embedded in my object model.
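A sketch of the Active Record shape, not tied to any particular library – the data access lives on the entity itself:

```csharp
// Hypothetical Active Record-style entity: persistence is embedded in the model.
public class Customer
{
    public int Id { get; private set; }
    public string Name { get; set; }

    // Loads a row and materializes it as a Customer.
    public static Customer Find(int id)
    {
        // e.g. select Id, Name from Customer where Id = @id
        throw new System.NotImplementedException();
    }

    // Inserts or updates this instance's row.
    public void Save()
    {
        // e.g. insert into / update the Customer table
        throw new System.NotImplementedException();
    }
}
```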

Result-set-based relational mapping

In most applications, the demand for reads greatly outstrips the demand for writes. Profiling one recent application, we saw a 100:1 ratio of SELECTs to INSERT/UPDATE/DELETEs. SQL is great at working with data in sets, and for simply returning sets of data back to the application, it often makes zero sense to try to use an entity-based approach.

But we’d still rather not work with raw ADO.NET objects like IDataReader or DataTable. These are loosely typed objects, not easily transferable to the upper layers of the application.

Instead, we often build objects that are tailor-made for the data. These are often called Data Transfer Objects, or Read Models: types we craft for individual SQL SELECT queries, not to be re-used across other queries.

Many ORMs have features optimized for these scenarios. In NHibernate, you can use projections to skip tracked entity objects and map straight to DTOs, or use raw SQL queries that map straight to DTOs without any mapping configuration. Or you can use micro-ORMs like PetaPoco.
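For example, here’s a sketch of mapping a raw SQL query straight to a hypothetical CustomerSummary DTO with NHibernate’s result transformers, skipping entities and change tracking entirely:

```csharp
using NHibernate.Transform;

// CustomerSummary is a DTO crafted for this one query.
var summaries = session
    .CreateSQLQuery(
        @"select c.Name as Name, count(o.Id) as OrderCount
          from Customer c
          left join [Order] o on o.CustomerId = c.Id
          group by c.Name")
    .SetResultTransformer(Transformers.AliasToBean<CustomerSummary>())
    .List<CustomerSummary>();
```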

These reads can also stream objects as they are read. Both NHibernate and several micro-ORMs let you map individual DTO instances as each row is read from the underlying result set, minimizing the number of objects kept in memory.
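With Dapper, for instance, passing buffered: false streams each row as you iterate instead of materializing the whole list up front (OrderRow is a hypothetical DTO, connection an open IDbConnection, Process a hypothetical consumer):

```csharp
// Unbuffered query: each row is mapped as it's read from the data reader.
var rows = connection.Query<OrderRow>(
    "select Id, Total from [Order]", buffered: false);

foreach (var row in rows)
    Process(row); // only one OrderRow needs to be held in memory at a time
```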

In our applications, we often still use NHibernate for reads, but skip any kind of entity objects and instead craft raw SQL, relying on NHibernate’s optimized mappers: we simply supply a DTO type and the results are mapped automatically.

This approach does not work well if we have to apply business rules and persist information back, as these models usually map to result sets, not database tables.

DML-based relational mapping

If you know what you want to do in SQL for CRUD operations, and would rather not have a tool figure it out for you, you’re really looking for something that abstracts DML commands one step above ADO.NET.

This is the arena of micro-ORMs. Tools like PetaPoco, Dapper, Massive and others work to solve the pain of working with raw ADO.NET. These tools typically still let us work with objects, but our interactions are greatly simplified and SQL is brought to the forefront. We only need a connection, and these tools let us perform all the CRUD operations with much simpler code than ADO.NET.
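A sketch with Dapper (Customer again being a hypothetical POCO; id and connectionString are assumed): we supply the SQL, and the micro-ORM handles parameters and materialization:

```csharp
using System.Data.SqlClient;
using System.Linq;
using Dapper;

using (var connection = new SqlConnection(connectionString))
{
    // Query: parameters in, strongly-typed objects out.
    var customer = connection
        .Query<Customer>("select Id, Name from Customer where Id = @id", new { id })
        .Single();

    customer.Name = "Updated name";

    // Execute: object properties map to the named parameters in the SQL.
    connection.Execute(
        "update Customer set Name = @Name where Id = @Id", customer);
}
```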

In cases where you don’t already have an entity-based relational mapper in the application, micro-ORMs provide a much lighter approach. Because types and mappings are not configured ahead of time, these tools rely on lazily-evaluated optimized reflection caching techniques to on-the-fly map parameters and result sets.

Many applications can start out with DML-based mappings, graduating to a full-fledged ORM when certain relationships or entities demand it.

Bulk loading tools

One that holds a dear place in my heart – sometimes you don’t want to push/pull data in an object-based manner. Instead, you’d rather push and pull data in bulk. Tools like SQL Bulk Copy allow you to pull data out in a delimited or tabular format, or push data in.
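A minimal sketch with SqlBulkCopy, assuming a populated DataTable and a hypothetical dbo.ImportedOrders destination table:

```csharp
using System.Data;
using System.Data.SqlClient;

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "dbo.ImportedOrders";
        bulkCopy.BatchSize = 5000;            // stream rows to the server in batches
        bulkCopy.WriteToServer(ordersTable);  // ordersTable: a populated DataTable
    }
}
```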

Bulk loading tools typically work as a sort of bazooka, blasting data out or blasting data in, but not providing much beyond that. You won’t be able to do updates or deletes, but for getting large amounts of data in and out of SQL, bulk loading tools are what you need.

In many integration scenarios where you provide delimited files to partners, or delimited files are supplied to you, bulk loaders allow you to treat files as tables and load them directly against the database, skipping past layers of abstractions both in mapping objects and even in the database.

These tools are tuned to be much, much quicker than traditional methods of extracting/loading data. In some of our tests, we’ve seen orders-of-magnitude differences between row-by-row loading and bulk loading. In one case, the difference was several hours versus less than a minute.

The downside is that the options are quite limited: you really only get INSERT and SELECT. Everything else requires different tools.

Right tool for the job

In one recent project, I’ve used every single one of these approaches against a single database and a single codebase. NHibernate for entity/aggregate-based mapping, result-set for reading sets of data (and building messages/exports/views from the results), DML-based mappings for simple tables and modules, and bulk loading tools for those nightly drops of million-row files from partners.

The key is to not tie yourself down to a specific tool or approach. No single ORM strategy works for every situation, nor should it. NHibernate can work for a variety of scenarios beyond entity-based mappings, but it doesn’t do everything. Complexity often arises from trying to use the same approach for every situation.

Every application written outside of SQL uses an ORM. Whether it’s hand-rolled ADO.NET code or NHibernate, you have to bridge the gap from .NET to SQL. Bridging this gap is a hard problem to solve, and nothing solves the entire problem set perfectly. Nor should it!

Choose the approach that fits the problem at hand, and don’t worry about having multiple approaches in a single solution. That’s not to say haphazard decisions are appropriate, but informed, educated decisions based on knowledge of all choices at hand are always a good idea.


About Jimmy Bogard

I'm a technical architect with Headspring in Austin, TX. I focus on DDD, distributed systems, and any other acronym-centric design/architecture/methodology. I created AutoMapper and am a co-author of the ASP.NET MVC in Action books.
  • Asher Kohn

    Great post

  • http://twitter.com/ntcoding Nick

    It’s a good post, but in my experience nothing is ever smooth running with a big ORM. Weird errors, unexpected craziness – something always pops up and catches out even the most experienced devs on the team.

    As a general rule life is easier for me without them.

    Everyone to their own opinions though.

    • Anonymous

      Life might be easier without a “big ORM”, but those are decisions we can defer. For the problems they solve, they solve them really, really well. 

    • Anonymous

      I know what you mean. You’re probably stuck with using a complicated wrapper solution that someone else wrote, and it seems like a big black box right? I think you’ll find it’s worth it in the end, especially when you have to write something new and you can tweak the wrapper (or completely rewrite it).


  • Anonymous

    I typically use Entity Framework 4 as my ORM of choice. I build ASP.NET MVC3 web applications and I find the repository pattern fits in really well with EF4.

    I’d like to give dapper a try since StackOverflow uses it, and if they are using it, there must be some merit. They are after all serving hundreds of thousands of requests each day.

  • Mike

    “these tools rely on lazily-evaluated optimized reflection caching techniques to on-the-fly map parameters and result sets”

    Yes. 
    ;-)

  • jdn

    “Right tool for the job
    In one recent project, I’ve used every single one of these approaches against a single database and a single codebase.”

    Exactly.  Ditto.  Bingo.  +1

    “you have to bridge the gap from .NET to SQL. Bridging this gap is a hard problem to solve”

    Meh.  It isn’t a trivial problem to solve (e.g., it’s not something you can slap a ‘convention over configuration’ layer on top of and forget about), but I don’t think it is really all that hard.  It ain’t no Vietnam.

  • http://www.facebook.com/profile.php?id=543192318 Michael Yoffe

    ORMs can be really useful if you know no SQL and want to make a quick POC with a framework that has an ORM built into it, tightly integrated with the rest of the framework.
    For example, if you use Django and its ORM you can have an excellent Admin site within less than 5 minutes with almost no work. And it’s also very customizable.

    But besides that, ORMs seem like over-engineering to me. If you are not afraid of SQL (and you should not be), it is enough to separate the business objects, data access and query layers, and you will have the best and easiest time.

    One type of library that I believe is extremely useful for that purpose is MyBatis. It gives you data mapping classes and session management that encapsulate _your_ SQL.

    So I’d recommend that. I would not recommend any ORM for a long term project.
    The problem with abstractions is that they hide performance issues and other subtleties behind them. The SQL should be in front of your eyes, not behind a black box. It is very important to know what’s going on in your database.

    • Anonymous

      iBatis is an ORM – it’s one of the DML-based ones. And yeah, I love those too :)

      Anything you have that maps from an Object to a Relational DB is an ORM. I was trying to highlight the different kinds/use cases.

  • http://twitter.com/vkornov Victor Kornov

    I understand this “Many applications can start out with DML-based mappings, graduating to a full-fledged ORM when certain relationships or entities demand it.”, to a point. But I’m not sure why you’d use a MicroOrm if you already have a BigOrm in place. Surely you can just create separate, simpler classes/mappings for those simpler tables and cases. Was it just so you could avoid the “separate classes/mappings” and use dynamics all the way?

    • Anonymous

      Eh, no, it was more like we didn’t want to reference the big NHibernate project. For a Web API endpoint, it was easier to have a simple PetaPoco call.

  • Mike Cole

    Good post, thanks for the info. I often hang on the decision whether to stick wholeheartedly to the ORM for retrieving data or to use DTOs. Do you have any rules or guidelines as to when to keep data retrieval within your domain and when to use a DTO? Or is it mostly a common sense decision?

  • Bob

    You totally missed EF Code First. It’s a brilliant piece of software. It’s not as complicated as your beloved NHibernate, and it’s also not a half-baked ORM like NHibernate.
    NH says I do this, I do that, but in reality most of its features are half baked, like its LINQ support. Like its support of different databases.

    • Anonymous

      ???

  • Steve

    Thanks for the article, that’s helpful. I wonder, what would you consider the sweet spot to be for Microsoft’s Entity Framework? Or do you not recommend it at all (you didn’t have a single mention of it after all) – and why?

    • Anonymous

      This wasn’t an article about specific ORMs, but capabilities within them. I tried not to mention specific ORMs, but it’s a little hard when you have to mention individual features.

      I’ve never used EF in a production app before, and for me it would be a bit hard until EF supports custom types. 

  • Benjamin Kotvis

    I noticed in one of your responses to a comment you mention custom types with regards to the Entity Framework.  I assume you mean SQL Server custom types.  They don’t have a lot of SQL Server specific functionality in there and I think they may be doing that on purpose as MySQL has nothing like that.  

    Although, when I want to transmit complex types to the database I use XML parameters because it seems to give me fantastic performance and it actually can work with the EntityFramework.  It works well for bulk operations where you want to update or delete where the conditions come from a large set of in-memory objects.  

    • Anonymous

      Nah, not SQL types, I mean .NET custom types. Like an Email object or a Money object, things I use to get rid of primitive obsession smells.

  • Greybeard

    Friends don’t let friends ORM. 

    I’ve been doing this a long time.  Don’t ORM.  Become a SQL and database expert in addition to being a code monkey.  ORMs are for the lazy and noobs.  Please learn and take advantage of the database engine/capabilities that ORMs make untouchable. Do it the hard way because in the long run it is the easy way.

    Just for fun run SQL Profiler against an ORM based application and stand in sheer terror at the amount of queries (inefficient ones at that) that pour across the screen.

    Word.

    • Anonymous

      Sure, but you’re still writing code to map from the DB to your objects. Unless you’re using datasets mapped straight to the screen, you’re either using an ORM or you wrote your own.

    • Blake H

      Stop spreading FUD. I do recommend developers start with an understanding of SQL and then realize that much smarter people have developed solutions that make our jobs less mundane and our code easier to maintain. When you don’t understand how an ORM works, then you are going to have a problem. The solution to that problem isn’t to ignore ORMs.

  • Anonymous

    Yeah right, but if you use multiple frameworks inside a single app you create a framework salad – which is in itself not something very palatable.

    OTOH, if you need database independence, a smart ORM with built-in support for many databases is indeed the cheapest solution. Also, if your database schema/object structure changes often, maintaining the changes by hand is costly.

    There’s one more thing: crafting SQL by hand tends to get you to place logic in SQL, which is IMO a very bad thing – you eventually end up writing huge stored procedures in the name of performance, and having the app’s logic split between SQL and C#, which practically makes sure nobody fully understands the app’s logic, and no change can be done by one man – DB admins quite often don’t easily let programmers touch their databases, but they don’t have the knowledge to touch source code.

    What worked best for me until now is to use stored procedures exclusively for CRUD operations plus specialized, app-specific searches, wrap them in either generated or hand-woven DAOs on top of something as low-level as ODBC/JDBC, and maintain the database by hand. Since the persisted objects are domain-related, and the data model changes a lot less often than the algorithms, the cost of doing so is moderate, performance is quite good, and there’s some degree of database independence – the logic of the SQL is simple enough in order not to require non-standard or exotic  SQL features, which makes the scripts mostly portable across databases. Also, having all your scripted tests use a database whose initialization depends on your version-controlled SQL scripts for creation/update and initial population is quite helpful – and can be done this way.

    • Anonymous

      Oh definitely agree! I should have clarified – I use multiple ORMs in a single solution, but it’s not a single application. It’s something like 20 different deployed apps. I’m all for choosing one approach and sticking with it :)

  • http://twitter.com/demisbellot Demis Bellot

    The biggest problem with heavy ORMs is the leaky abstraction. SQL is already a DSL for managing an RDBMS; sticking it behind a heavy ORM requires you to define the Relational -> OOP mapping in configuration – which is inferior, more limiting, less verifiable and less explicit than source code.

    >> The key is to not tie yourself down to a specific tool or approach.  

    Although many ORMs want to own the world, i.e. you have 1 config/dbml model for your entire database. Micro ORMs at least work off connection strings and POCOs; they are easily interchangeable, e.g. at StackOverflow Careers we use https://github.com/ServiceStack/ServiceStack.OrmLite + http://code.google.com/p/dapper-dot-net/ (the 2 fastest ORMs) off the same DB Connection with the same Models.

    Since the Micro ORM POCO models aren’t tainted with RDBMS constraints, they’re re-usable – we’re free to use them as DataContainers, ViewModels, DTOs etc. when we see fit.

    Another problem is the complexity around managing child view objects; in a perfectly normalized database these would all be different tables. In practice you get much better productivity, freedom and versionability storing value objects in schema-less text blobs. We love doing this in NoSQL DBs, but many are still reluctant to do this inside an RDBMS because of decades-old teaching.

    Here’s an example of the versionability schema-less text blobs offer: https://github.com/ServiceStack/ServiceStack.Redis/wiki/MigrationsUsingSchemalessNoSql
     
    Because of the productivity wins, OrmLite has first-class support for this, where every complex-type property (i.e. non-scalar) is stored in a schema-less text blob. Apart from letting you make structural changes without out-of-band DDL changes, it lets you re-use the same POCO in DTOs, Caching/Sessions, NoSQL DBs, Disk/Files, etc:
    https://github.com/ServiceStack/ServiceStack.OrmLite#code-first-customer–order-example-with-complex-types-on-poco-as-text-blobs 


  • http://blog.andrei.rinea.ro/ Andrei Rinea

    Fine article, Jimmy!

    I had the opportunity to exchange some tweets about ORMs with you so you might already know my opinion regarding ORMs in general. However I must ask something that is not covered in this article.

    How about batch updates/deletes? How do you handle them? I mean I might need to delete 500,000+ expired rows from the OFFERS table (for example). NHibernate, for example, offers me two alternatives: 500,000+ reads followed by 500,000+ deletes, or untyped/stringly-typed HQL.

    What’s your recommendation? I do raw SQL / sprocs for these…

    Thank you.

    • Anonymous

      Raw SQL all the way. I only use the entity models w/ NHibernate if I’m dealing with one entity.

      But, I can still do raw SQL and use NHibernate (or insert other micro-ORM here) for just adhoc DML things.

      I even have a process that does bulk insert of persistent NHibernate entities, because it’s the fastest way: a SQL bulk insert of pre-generated entities flattened out into a DataTable, since there are hundreds of thousands of records. And since we have a test on it, well, who cares, right? Just let it do what it needs to do.
