Limiting your abstractions

It’s been almost 3 years since I first wrote about moving away from the Repository abstraction. Since then, I’ve gone more or less full-bore without any concept of a repository in the systems I’ve built. I haven’t removed existing repositories, but I’m just not finding any value in them as an abstraction.

Repositories I see generally come in two flavors:

  • Abstracting the ORM framework
  • Encapsulating queries

An example of the first might be something like:

public interface IConferenceRepository
    IRavenQueryable<Conference> Query();
    Conference Load(Guid id);

While one encapsulating queries would more more along the lines of:

public interface IConferenceRepository
    IEnumerable<Conference> FindAll();
    IEnumerable<Conference> FindFuture();
    IEnumerable<Conference> FindFree();
    IEnumerable<Conference> FindPaid();

Where each method encapsulates a single query. Both cases provide value in certain cases. If I have a goal of abstracting my ORM, I would go with the first, and perhaps include the second as well.

But is an ORM something that requires abstraction? I don’t think so – abstracting something like an ORM actively prevents me from using powerful features. An ORM is already an abstraction, we really need to question whether we need to abstract an abstraction.

Digging deeper

We need to go back to why we introduced the Repository pattern in the first place. It was likely in the name of “testability”. We might start with something like this:

public ActionResult Index()
	RavenQueryStatistics stats;
	var posts = RavenSession.Query<Post>()
		.Include(x => x.AuthorId)
		.Statistics(out stats)
		.OrderByDescending(post => post.PublishAt)
		.Paging(CurrentPage, DefaultPage, PageSize)

	return ListView(stats.TotalResults, posts);

Seems complex, no? However, if the complexity grows, we’re still limiting its scope to exactly one method. Whether we pull this query out into a class, repository or extension method, the query will be in one method. In our controller action, what does it matter exactly that this code lies in our controller or another class? What about a more complex example:

public ActionResult Archive(int year, int? month, int? day)
	RavenQueryStatistics stats;
	var postsQuery = RavenSession.Query<Post>()
		.Include(x => x.AuthorId)
		.Statistics(out stats)
		.Where(post => post.PublishAt.Year == year);
	if (month != null)
		postsQuery = postsQuery.Where(post => post.PublishAt.Month == month.Value);

	if (day != null)
		postsQuery = postsQuery.Where(post => post.PublishAt.Day == day.Value);

	var posts = 
		postsQuery.OrderByDescending(post => post.PublishAt)
		.Paging(CurrentPage, DefaultPage, PageSize)

	return ListView(stats.TotalResults, posts);

Again, it’s just a set of queries. I’d still want to encapsulate this in one spot, but I really don’t see a reason to move this code out from where it is. If the query changes, I’m still just changing code in one spot. An abstraction here would only serve to misdirect.

The trick is when I have more than one concept in play. Let’s look at an action that has to do a couple of different things (queries are quite dull, it turns out):

public ActionResult Comment(CommentInput input, int id, Guid key)
	var post = RavenSession
		.Include<Post>(x => x.CommentsId)

	if (post == null || post.IsPublicPost(key) == false)
		return HttpNotFound();

	var comments = RavenSession.Load<PostComments>(post.CommentsId);
	if (comments == null)
		return HttpNotFound();

	var commenter = RavenSession.GetCommenter(input.CommenterKey);
	if (commenter == null)
		input.CommenterKey = Guid.NewGuid();

	ValidateCommentsAllowed(post, comments);
	ValidateCaptcha(input, commenter);

	if (ModelState.IsValid == false)
		return PostingCommentFailed(post, input, key);

	TaskExecutor.ExcuteLater(new AddCommentTask(input, Request.MapTo<AddCommentTask.RequestValues>(), id));

	CommenterUtil.SetCommenterCookie(Response, input.CommenterKey.MapTo<string>());

	return PostingCommentSucceeded(post, input);

In this case, I have a lot of validation, but the actual work is delegated to the “AddCommentTask” object. It’s a command object, that takes care of performing the actual work of the task outside of MVC, validation, ActionResults and the like.

We limited our abstractions to concepts (tasks) and we could do the same with queries if we started to have more than one concept at play.

Testing strategies

My testing strategy these days is:

  • Unit test isolated components (domain models and other already isolated classes)
  • Integration test the rest

I don’t do automocking containers. I only stub out components I can’t control. Otherwise, it’s a strategy of pushing behavior as far down as possible, but not by introducing abstractions.

For something like a database, my tests will be slower. I’m willing to accept that, since the tradeoff is ease of refactoring. My tests don’t break because something needs to be stubbed differently.

In these controllers, I would just make sure I have a seam to test. In the RaccoonBlog case, it’s just a matter of swapping out the underlying session with one going against an in-memory version, and my tests become much faster. But the seam exists.

But otherwise, I just don’t bother. Introducing a repository merely to stub something out is a waste of time from my experience. It introduces an unnecessary abstraction, where a concept (an encapsulated query object) would suffice. And even then, my test would look exactly the same.

Rather than focusing on abstractions, I’m instead focusing on concepts, and let the tests fall where they may. After all, my controllers above aren’t object-oriented – they’re procedural, as their needs warrant.

About Jimmy Bogard

I'm a technical architect with Headspring in Austin, TX. I focus on DDD, distributed systems, and any other acronym-centric design/architecture/methodology. I created AutoMapper and am a co-author of the ASP.NET MVC in Action books.
This entry was posted in Architecture. Bookmark the permalink. Follow any comments here with the RSS feed for this post.
  • Sean Feldman

    Good post Jimmy. Would be valuable if you’d present the test you’ve mentioned as well.

  • I’m agreeing with you far too much these days, Jimmy.

    After Ayende’s inflammatory posts a long time ago I realised what a burden the common misinterpretation of the repository pattern has become in most cases.

    My experiences of not abstracting data access have been very positive.

  • Dennis Doomen

    I do agree that any unnecessary abstraction should be removed, but we still decided to encapsulate individual queries into well-named single-purpose queries. Initially we used a thin fluent API on top of LINQ, but we discovered optimizing specific combinations of calls was practically impossible. Since then we’ve been successfully adapting the query pattern as explained in this article.

  • Bill Sorensen

    Because the Gang of Four, Kent Beck, Robert C. Martin, Michael Feathers, Mark Seemann, and all the others who have advocated loose coupling, programming to an interface, and unit testing must be wrong. After all, how much experience on different types of projects do those people have? (Sarcasm.)

    • the blind faith coder.

    • Anonymous

      Careful with argumentum ad verecundiam – these are patterns meant to be put in place when complexity warrants – not before!

      It’s why I like the refactoring books so much – I can defer decisions on to do OO when complexity presents itself, but not any sooner.

  • Jay Douglass

    Jimmy could you give a quick explanation of Request.MapTo()?

  • John

    I remember coming upon Evan’s book and thought the repository was fantastic. The main thing was that it could enforce business rules. So a car might be able to have different size tires, but the back two would have to match and so would the front. The car aggregate root handled the burden and it made sense.

    But alas, using repos in the real world has been nothing more than a new name for a data access layer. For some reason, everyone wants to implement an IRepo.

    I glanced at Bob Martin’s Clean Code book the other day after not having done so in awhile. Bob hit me between the eyes: objects are not data structures. Objects are about behavior. Data structures are about properties. What we called objects were really data structures, used mostly just for retrieving and updating data. We never needed the repository pattern.

  • I can think of two reasons to abstract away queries: (1) It simplifies the controller so it’s easier to see at a glance what actions it performs/supports, (2) it allows for query reuse without the need to refactor.

    • Anonymous

      In that case, I think having Query objects might be a better fit?

      • Paul Tiseo

        Could you explain the difference between your Repository object and your Query object?

        • In short: Repo – bunch of methods FindSomethingXXXBy, Query Object: one query use case per class. And/Or you can use extension methods :)

  • Personally I think that you can avoid abstracting the datalayer when you do data-centered applications. You need to pull some data, probably modify it, and save it back. If this is the case and you don’t have much business logic, and you have one month only, then your approach is very good. However, for bigger applications with true business logic, this would some day lead to a great mess.

  • Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1194()

  • Carlos Alberto Costa Beppler

    I understand your point, but what to do if your query (a complex one) must be used in more than one place?

    • Anonymous

      Extract into a query object? Or extension method?

      This case is rarer than I thought it would be (at least for the apps I’m building).

      My problem with a Repo is that it has ALL queries on it, when you only really need the one. It seems to violate the I in SOLID.

  • Wouldn’t code contracts, action filters and some nice AST LINQ “abstractions” work around your examples. I haven’t seen anything in your examples that couldn’t have my controller be “….FindPaid()”.

  • DavidL

    I agree with your points if Repository as just a place to put queries.

    When Repository becomes an isolation for where or how the data is stored in conjunction with UoW and IoC, it begins to have value.

    public class SomeEngine {
    private IUoW _uow;
    public SomeEngine(IUoW uow){
    _uow = uow;
    public SomeMethod( some parameters){
    IGenericRepo xRepo = _uow.Repo();
    IGenericRepo yRepo = _uow.Repo();

    var s = yRepo.GetAll().where(blah=>blah…);


    That’s a useful (and working) pattern.

    • Anonymous

      I agree! Definitely useful in that case!

  • Very good article, that certainly puts a perspective on things and why you don’t necessarily need all these Repository layers. I just wish I could find similar minded people; every company I work for seem to not even know things like this exist (let alone what an ORM is, or why you’d even want one, when you can just new up SqlConnections and SqlCommands). I actually just got fired two months ago from a development job because I was trying to get us to look beyond WebForms and massive code-behind files. The sad reality is good advice like this is usually the exception, and not the rule in most development shops.

    • Anonymous

      So um….we’re hiring :)

  • You reasonably asked “Is an ORM something that requires abstraction?”. Honestly speaking I feel more comfortable when I know that nobody can write something like this (we use EF):

    new MyDbContext().Set().Where(…)

    in our domain layer just because the project doesn’t reference EF assemblies.

    We use queries as first-class citizens in conjunction with a generic repository interface (I agree that named-query-method repositories violate SRP and OCP) and a UoW interface . If we assume we can live with a domain layer referencing ORM assemblies all that is encapsulated in the IRepository implementation (for example I have to do LINQ query rewriting in order to support enumeration classes very similar to yours) can be moved to the base Query class.
    But enforcing some principles via preventing direct (eventually wrong) usage of the ORM is only part of the story. Another very important reason to keep the IRepository interface is that it shows exactly what is needed by the class that depends on it. Having the RavenSession property in controller classes (I suppose that’s the case as I have read Ayende’s “Limit your abstractions” series – by the way they were very helpful to me) seems to me as step toward service locator – one controller (the same apply to query and command objects) could query/change every single entity in the domain model. Isn’t that violation of SRP? I think having constructor showing that this controller/query/command depends on one or several repositories is more intention revealing. Yes, sometimes I feel the pain of having too many dependencies ( as it is not always easy to limit to only one repository but I still resist to add Execute() method to my base Query class (plus gateway to the whole domain model like the RavenSession property). I would appreciate your opinion.

    • Anonymous

      So what I went back and challenged myself on was “is it wrong to have direct access to ORM in the controller”. For me, personally, I realized that I was only really wanting to prevent ADO.NET access right in the controller. Besides that, it was a choice. I could go to a query to encapsulate or whatever.

      As for dependencies, what bothers me about controllers in general is that there is very little shared behavior amongst actions. If you think about it, controllers are a construct to satisfy easy route definitions. Otherwise, I’d be cool having controller-less actions.

      Just my $0.02 :)

  • I went back and removed every repository and abstraction I could from my current project and now call the NHibernate session directly from a controller. Best decision I’ve made as regards to performance. Each query can be optimised per action rather than being absracted to a common “GetAll” or whatever and it’s made solving common issues regarding NHibernate a breeze..

  • Evgeny mz

    There are still two issues:

    1) Let’s assume you want to add an additional condition .Where(post => post.Language == Language). You have to search through all your actions instead of doing it on one file (repository class);
    2) Someone wants to get archived posts for N-date – he would repeat your Achive-action code because he wouldn’t be able to examine hundreds of actions to find that there’s already the code that could be abstracted in a query object as you suggested.

    How do we solve those?

    • Anonymous

      1) This is every single query in the system?
      2) Hundreds of actions? You have other problems then. How is many objects different than many methods? It’s the same problem, now I just have a god object.

      • Dave

        Re: 2) Doesn’t seem like the same problem to me. If I have 15 methods in a single, dedicated repository – it’s much easier to check those than to check 100′s of varied/duplicated/inconsistent queries spread through out controller action methods.

        • jbogard

          I don’t have varied/duplicated/inconsistent queries. Each query is for a specific purpose. Cross-cutting query concerns are addressed using cross-cutting solutions, not “everything in a file”.

    • Alexey Zimarev

      Write an extension method for I(Document)Session in a separate static class, one per aggregate root or use case, whatever you find better for your particular case. And you can reuse the same query thousand times.

  • John Reilly

    Hi Jimmy,

    Really good article – thank you. I’ve recently been coming to similar conclusions to you and found your post after writing a blog post about my own experiences:

    I was coming from the perspective of assuming that there must be a simple way to MOQ Entity Framework and get my unit tests up and running. It looks like I was mistaken both in the expectation that this was simple and also that it was even worthwhile!

    I’ve put a link in my blog post to this as I think you probably express the underlying issue rather better than I have – hope that’s okay?

  • Josh

    Can you please do a post expanding on your testing strategies around these ideas? You mention it briefly, and perhaps I’m misinterpreting what you mean by ‘I only stub out components I can’t control’. I read this statement as the opposite to the advice ‘don’t mock objects you don’t own.’ If you’re calling your ORM directly, there is no way to isolate your controller test, and you’re forced to integrate test it. It sounds like you advocate the approach to create a thin wrapper around the ORM which simply delegates to it to provide a seam to isolate test your controller, but then you say ‘…it’s just a matter of swapping out the underlying session with one going against an in-memory version…’. I’m unfamiliar with RavenDB, but we use Entity Framework, and running LINQ against an in-memory store (LINQ to objects) is not the same as against the database (LINQ to Entities), so swapping the ORM doesn’t create a useful test. Is it a case that you just integrate test your controller (keeping your abstractions simple) until you need to manage the complexity by introducing some abstraction for the query, at which time you change the controller test to an isolated one?

  • JustinBMichaels

    Jimmy, how would you approach integration testing with your ORM querying being done in your controller’s action? Do you integration test your controllers by asserting against your ViewResult.Model or do you go for an automated Ui test and verify against the html? I guess my confusion or over analyzing is happening when trying to test that my query returned what was expected.

  • Pingback: Favor query objects over repositories | Jimmy Bogard's Blog()

  • Rodi de Boer

    Nice post. I think I’m starting to get this. So we don’t just do a repository, but we encapsulate and introduce the right OO pattern at the momet we find the need to really do so.
    Thanks. All those posts on the internet; combining them to something I can digest and make my own is pretty hard. :)

  • Rodi de Boer

    Ok, so what’s the plan if you have several “front-ends” on the same domain intersecting; for example an MVC app that let’s users do things like management of certain things and a web service that lets other systems do things (some management and some other stuff too)?
    Would you just use the Task/TaskExecuter scenario for all the intersecting functionality?
    And is that Task scenario an implementation of the Command pattern?

    • Alexey Zimarev

      Commands and Queries aren’t going anywhere. In addition when having multiple front-ends you most probably will end up with CQRS implementation based on services+messaging where services and message handlers will use queries and commands.

  • binidas

    Nice article. Does that mean you have got rid of service layer as well and handled all the business logic within Controller or may be extracted to separate class as and when required?

  • Ben Lewies

    Hi Jimmy

    Good article, I think that over-engineering is definitely a problem and this kind of thinking can definitely help curb some of that. I do think that readers have to apply different kinds of thinking as befits their specific circumstances and the complexity of the problems / requirements they are facing.

    You pose the following question: “In our controller action, what does it matter exactly that this code lies in our controller or another class? “.

    For anything but the simplest of use cases, it matters, because:
    - maintenance becomes a real headache each time you make changes to your datamodel.
    - not everybody are full stack developers or have intimate knowledge on the inner workings of their ORM’s. Seperation of concerns plays a vital role in effectively developing a system as a team in such cases.
    - In larger systems where you have multiple databases and contexts, you are bound to repeat yourself without properly abstracting your data access code.
    - You might want to swop out your ORM or the type of database (RDBMS vs No-Sql) at a certain stage. Using the above approach would basically mean rewriting all of your controller actions and data access code (as opposed to just your data access code).

    I also found that certain traits / properties and behaviours of ORM’s can cause issues and inconsistencies during runtime, and as such it is best to isolate such behaviour behind an interface, which provides consistent and predictable behaviour during runtime.

    Three examples of such unexpected / unpredictable / inconsistent behaviour jumps to mind (let’s assume that EF is the ORM in these cases):
    1. Dynamic proxy creation. In certain cases, when trying to map an entity to (lets say) a viewmodel during runtime using Automapper, the mapping will fail -unexpectedly- because the runtime type of the object is a proxy and is not the same type as specified in your actual code.
    2. Lazy loading. Due to lazy loading being enabled, you might not know during design time which objects will be in memory and which will need to be fetched. Once again, unpredictable performance will follow in this case.
    3. Entity tracking and database updates. EF does not play well with untracked entities and graph updates. The native solution to this is managing the tracking state of your objects manually and possibly making multiple updates to the database for the updated entities. Messy. Not something you want to do in your MVC controllers.

    There are a multitude of other reasons, but I think you understand what I am driving at here. Note that when I say “unpredictable”, I do not mean to say that someone with in-depth knowledge of the ORM’s inner workings would not be able to predict this behaviour, but merely that most developers will be caught off-guard by these issues.

    In my humble opinion, I have to question how well the default abstractions of the ORM have been designed when you have leaky abstractions like this spilling all over the place. Abstracting and isolating these sorts of behaviours and complexities away is definitely warranted in many cases.

    I think that, in general, the problem is not as much with abstractions over your ORM as such, but rather:
    1. Developers/architects not applying the right level of abstraction as befitting to their specific problem / use cases. You have to take current and future team dynamics into consideration! Abstraction applied correctly could mean that only the appropriate team members (those who understand the ORM’s inner workings) are exposed to its intricacies, as opposed to the whole team.
    2. When abstraction is warranted, people often opt to apply the wrong abstractions (for instance, generic repository pattern, where CQRS might have been the better option).

    I look forward to your comments :)

  • Desy13

    I would be **ALMOST** convinced by this, if there wouldn’t be one big fat huge problem: When you use `IQueryable` in all your domain you can’t use ORM specific stuff (like .Include or .ToListAsync) and this just doesn’t make it practical to directly use IQueryable

    Do you have any solutions for that? I searched weeks for a way to abstract ToListAsync and Include out of EF Core, but no way unless I’d be willing to reimplement all the extension methods and ToListAsync just returns IAsyncEnumerable (Rx interface) or IAsyncEnumerableProvider (which is an interface defined in EF Core). And all other things like building the expression tree and passing it to a method are just plain ugly and would require to implement dozen of these methods (just as ToListAsync, SingleOrDefaultAsync) etc.