Efficient querying with LINQ, AutoMapper and Future queries

Even after all these years, I’m still a big fan of ORMs. One common complaint over the years is people using ORMs use the tool identically for both reads and writes. With writes, I want tracked entities, managed relationships, cascades and the like. With reads, I don’t need any of that gunk, just an efficient means of getting a read-only view of my data. If I need to make multiple queries to gather the data, this often results in queries that return lots of data over multiple round-trips.

We can do better!

Let’s say we have a controller action (taken from the Contoso University Entity Framework sample) that pulls in instructor, course, and enrollment information:

public ActionResult Index(int? id, int? courseID)
{
    var viewModel = new InstructorIndexData();
 
    viewModel.Instructors = db.Instructors
        .Include(i => i.OfficeAssignment)
        .Include(i => i.Courses.Select(c => c.Department))
        .OrderBy(i => i.LastName);
 
    if (id != null)
    {
        ViewBag.InstructorID = id.Value;
        viewModel.Courses = viewModel.Instructors.Where(
            i => i.ID == id.Value).Single().Courses;
    }
 
    if (courseID != null)
    {
        ViewBag.CourseID = courseID.Value;
        viewModel.Enrollments = viewModel.Courses.Where(
            x => x.CourseID == courseID).Single().Enrollments;
    }
 
    return View(viewModel);
}

This doesn’t look so bad at first glance, but what isn’t so obvious here is that this involves four round trips to the database, one for each set of data, plus some wonky lazy loading I couldn’t figure out:

image

We could alter the original query to eagerly fetch with left outer joins those other two items, but that could seriously increase the amount of data we have returned. Since I’m only interested one instructor/course at a time here, I don’t really want to pull back all courses and enrollees.

There’s a bigger issue here too – I’m passing around a live queryable around, making it possible to modify, iterate and otherwise make a general mess of things.  Additionally, I pull back live entities and all entity data – again, more inefficient the wider and larger my tables become. Since the entities could be live, tracked entities, I’d want to be careful not to modify my entities on the way to the view for reading purposes.

Ideally, I’d hit the database exactly once for only the data I need, and nothing more. This is what I often see people create stored procedures for – building up the exact resultset you need at the database level, only getting what we need. First, let’s pull in AutoMapper and create a ViewModel that represents our projected data:

public class InstructorIndexData
{
    public IEnumerable<InstructorModel> Instructors { get; set; }
    public IEnumerable<CourseModel> Courses { get; set; }
    public IEnumerable<EnrollmentModel> Enrollments { get; set; }

    public class InstructorModel
    {
        public int ID { get; set; }

        [Display(Name = "Last Name")]
        public string LastName { get; set; }
            
        [Display(Name = "First Name")]
        public string FirstMidName { get; set; }

        [DisplayFormat(DataFormatString = "{0:yyyy-MM-dd}", ApplyFormatInEditMode = true)]
        [Display(Name = "Hire Date")]
        public DateTime HireDate { get; set; }

        public string OfficeAssignmentLocation { get; set; }

        public IEnumerable<InstructorCourseModel> Courses { get; set; } 
    }

    public class InstructorCourseModel
    {
        public int CourseID { get; set; }
        public string Title { get; set; }
    }

    public class CourseModel
    {
        public int CourseID { get; set; }
        public string Title { get; set; }
        public string DepartmentName { get; set; }
    }

    public class EnrollmentModel
    {
        [DisplayFormat(NullDisplayText = "No grade")]
        public Grade? Grade { get; set; }
        public string StudentLastName { get; set; }
        public string StudentFirstMidName { get; set; }
        public string StudentFullName
        {
            get
            {
                return StudentLastName + ", " + StudentFirstMidName;
            }
        }
    }
}

We can flatten many members out (Department.Name to DepartmentName). Next, let’s modify our controller action to project with LINQ and AutoMapper:

public ActionResult Index(int? id, int? courseID)
{
    var viewModel = new InstructorIndexData();

    viewModel.Instructors = db.Instructors
        .Project().To<InstructorIndexData.InstructorModel>()
        .OrderBy(i => i.LastName)
       ;

    if (id != null)
    {
        ViewBag.InstructorID = id.Value;
        viewModel.Courses = db.Instructors
            .Where(i => i.ID == id)
            .SelectMany(i => i.Courses)
            .Project().To<InstructorIndexData.CourseModel>();
    }

    if (courseID != null)
    {
        ViewBag.CourseID = courseID.Value;
        viewModel.Enrollments = db.Enrollments
            .Where(x => x.CourseID == courseID)
            .Project().To<InstructorIndexData.EnrollmentModel>();
    }

    return View(viewModel);
}

Finally, we’ll need to configure AutoMapper to build mapping definitions for these types:

Mapper.Initialize(cfg =>
{
    cfg.CreateMap<Instructor, InstructorIndexData.InstructorModel>();
    cfg.CreateMap<Course, InstructorIndexData.InstructorCourseModel>();
    cfg.CreateMap<Course, InstructorIndexData.CourseModel>();
    cfg.CreateMap<Enrollment, InstructorIndexData.EnrollmentModel>();
});

With these changes, our SQL has improved (somewhat) in reducing the data returned to only what I have in my view models:

exec sp_executesql N'SELECT 
    [Extent1].[CourseID] AS [CourseID], 
    [Extent1].[Grade] AS [Grade], 
    [Extent2].[LastName] AS [LastName], 
    [Extent3].[FirstName] AS [FirstName]
    FROM   [dbo].[Enrollment] AS [Extent1]
    LEFT OUTER JOIN [dbo].[Person] AS [Extent2] ON ([Extent2].[Discriminator] = N''Student'') AND ([Extent1].[StudentID] = [Extent2].[ID])
    LEFT OUTER JOIN [dbo].[Person] AS [Extent3] ON ([Extent3].[Discriminator] = N''Student'') AND ([Extent1].[StudentID] = [Extent3].[ID])
    WHERE [Extent1].[CourseID] = @p__linq__0',N'@p__linq__0 int',@p__linq__0=2042

We’re now only selecting the columns back that we’re interested in. I’m not an EF expert, so this is about as good as it gets, SQL-wise. EF does however recognize we’re using navigation properties, and will alter the SQL accordingly with joins.

We’re still issuing three different queries to the server, how can we get them all back at once? We can do this with Future queries, an extension to EF that allows us to gather up multiple queries and execute them all when the first executes. Pulling in the “EntityFramework.Extended” NuGet package, we only need to add “Future” to our LINQ methods in our controller:

public ActionResult Index(int? id, int? courseID)
{
    var instructors = db.Instructors
        .OrderBy(i => i.LastName)
        .Project().To<InstructorIndexData.InstructorModel>()
        .Future();

    var courses = Enumerable.Empty<InstructorIndexData.CourseModel>();
    var enrollments = Enumerable.Empty<InstructorIndexData.EnrollmentModel>();

    if (id != null)
    {
        ViewBag.InstructorID = id.Value;
        courses = db.Instructors
            .Where(i => i.ID == id)
            .SelectMany(i => i.Courses)
            .Project().To<InstructorIndexData.CourseModel>()
            .Future();
    }

    if (courseID != null)
    {
        ViewBag.CourseID = courseID.Value;
        enrollments = db.Enrollments
            .Where(x => x.CourseID == courseID)
            .Project().To<InstructorIndexData.EnrollmentModel>()
            .Future();
    }

    var viewModel = new InstructorIndexData
    {
        Instructors = instructors.ToList(),
        Courses = courses.ToList(),
        Enrollments = enrollments.ToList()
    };

    return View(viewModel);
}

Which results in all 3 queries getting sent at once to the server in one call:

image

One other modification I made is I ensured that all projection occurred within the controller action, by calling “ToList” on all the IQueryable/FutureQuery objects. I’d rather not have the view be able to modify the query or otherwise introduce any potential problems.

Now, the SQL generated is…interesting to say the least, but that’s not something I can control here. What has improved is I’m now only returning exactly the data I want, into objects that aren’t tracked by Entity Framework (and thus can’t be accidentally modified and updated through change tracking), and all my data is transferred in exactly one database command. I intentionally left the model/mapping alone so that it was a simple conversion, but I would likely go further to make sure the manner in which I’m querying is as efficient as possible.

AutoMapper’s auto-projection of LINQ queries plus Entity Framework’s FutureQuery extensions lets me be as efficient as possible in querying with LINQ, without resorting to stored procedures.

Related Articles:

Post Footer automatically generated by Add Post Footer Plugin for wordpress.

About Jimmy Bogard

I'm a technical architect with Headspring in Austin, TX. I focus on DDD, distributed systems, and any other acronym-centric design/architecture/methodology. I created AutoMapper and am a co-author of the ASP.NET MVC in Action books.
This entry was posted in AutoMapper, Entity Framework, LINQ. Bookmark the permalink. Follow any comments here with the RSS feed for this post.
  • Rob Gibbens

    As an additional optimization, you could wrap the viewModel in a Task, and make the Action async so that you’re not blocking the thread on what could be a long database operation.

    InstructorIndexData viewModel = null;
    await Task.Run(() => {
    viewModel = new InstructorIndexData
    {
    Instructors = instructors.ToList(),
    Courses = courses.ToList(),
    Enrollments = enrollments.ToList()
    };

    });

  • Robert Friberg

    Good use of AutoMapper there, one of my favorite libs :)

    Anyone dare guess what the rountrip time for a single tds package between 2 physical servers over a 1 gbit LAN?

    I did some performance analysis for a client recently and was working under the assumption that db roundtrips are expensive. The average transaction could do over 100 :) Turned out that roundtrip time was insignificant compared to what was going on in the db, so we were barking up the wrong tree. Luckily, we did a benchmark and discovered the mistake.

    That said, I’d aim for a single roundtrip per request as well.

  • robertmclaws

    Nice use of AutoMapper. May have to use that in my project. I should point out that the EntityFramework.Extended package is not compatible with EF 6.1. But my EntityFramework61.Extended package is. The original package does not always keep up with the new releases, but my port does.

    • jbogard

      Awesome! I wish this would just be rolled into EF itself. It’s OSS so I guess that shouldn’t be too difficult.

  • Efraín Reyes

    You can use EF’s .AsNoTracking() IQueryable extension precisely for data where you don’t need tracking.

    http://msdn.microsoft.com/en-us/library/gg679352(v=vs.103).aspx

    http://stackoverflow.com/q/14527721/2563028

    • jbogard

      Yep, and AsReadOnly() for NH. Maybe it’s just me, but I still prefer the explicitness of view models for reads. Thanks for the tip!

      • Jonny Bekkum

        And you should be aware that when loading many rows/fields in a model with associations/foreign keys may be 10x faster when you use AsNoTracking() in materialization of objects.

  • jasonsmale37

    I’ve notice in my own code, as well as this example, that if you use the AutoMapper projection feature to map multiple properties of a related entity, it can cause the SQL to join to the table multiple times. In this example, to get the first and last name, it joins to the Person table twice (Extent2 and Extent3). Is there any way to use the AutoMapper projection feature and avoid this issue?

    • Matt Freiburg

      Yeah, I’ve gotta say, I’ve already got a lot of contempt from our DBAs to overcome when it comes to use of ORMs in general. I know, I know, I’ve done a lot of work to try to ally those fears by showing their incremental improvements in the ways that they construct dynamic SQL. But this multi-joins-to-the-same-entity thing would have them laughing me right out of the building…and deservedly so. Kinda makes this a non-starter, in my mind.

      • jbogard

        I agree, I don’t know what’s going on there with multi-joins – except to say I don’t see that in NHibernate. For whatever reason, NH seems to generate much saner SQL.

  • Sean Hederman

    I’m sorry, I’m confused. You said you like ORMs but then your article is basically a round trip showing why ORMs are so awful. From read/write confusion to complexities over how they abstract the queries and stuff them up, to the mapping nightmare that seems an inevitable feature of them…

    You wrote a LOT of code there for a very, very simple task.

    • jbogard

      Sorry, I don’t mean to be dense – what’s the “lot of code”? I know the alternatives – raw SQL to DTOs, a la PetaPoco. I’ve gone that route too – only to find that those micro ORMs fall short when it comes to writes. This approach lets me keep what’s optimal for writes (mapping to an behavioral domain model) and optimal for reads (mapping to a DTO).

      Unless you’re talking about going with sprocs, which, good luck buddy :)

      • João P. Bragança

        Why not use both? Micro ‘orm’ on the read side and EF on the write side?

  • http://weblogs.asp.net/ricardoperes Ricardo Peres

    Jimmy,

    What AsReadOnly() method? There isn’t one in NH! :-)

    I did write one, though.

    • jbogard

      Ah, there is in the QueryOver API. Do you have a link to yours?

  • Eugene Kolupaev

    Where the .Project().To thing is came from? Is it from AutoMapper or EF?

    • jbogard

      AutoMapper, in the Extensions namespace.

  • Gene Reddick

    Assuming navigation properties exist and you always wanted to load the additional properties (courses, enrollements) is there any reason not to load them by using ForMember on the CreateMap?

    Wouldn’t Project().To pick them up from the Instructors table without having to loop in code or use the Future extension?

    .ForMember(dest => dest.Courses, opt => opt.MapFrom(src => src.SelectMany(e => e.Courses))

    ;

  • gabrielrdz

    Jimmy, in 2012, you wrote about choosing the right ORM strategy, and it resonated with me. I feel its crazy to go through the hoops you just went through in this example just to get your data to come in a single trip, and it is still not optimal SQL. Also in this article’s comments you said something like “if you mean going with sprocs, good luck buddy”. I will push the approach of micro ORM for reads and ORM for writes in my company because I believe it makes sense. Is there a reason why you didn’t go with something simpler (sprocs, micro) this time?

    • jbogard

      I don’t like having 2 ways of doing the same thing in my app. I can do reads and writes effectively with a full-blown ORM (and support micro-ORM usages too). For example, NHibernate lets you use raw SQL straight to DTOs. I’d just rather not worry about the conceptual weight of 2 ORMs.

      As for the SQL not being optimal, I can only guess it’s because of EF. The SQL NHibernate generates looks completely different. I’m not an EF expert, so I don’t understand why, but that’s what it is.

  • Brian Beatty

    This is great. Do you have an example or best practice where automapper is used in the data tier of a n-tier application?
    Where the client/controller doesn’t have a direct access to the EF model. It’s abstracted in the data assembly using DTOs in a Domain assembly being called from a business tier.
    But the Data layer is using Automapper?
    How do would you handle using an Automapper Profile in the data assembly, when it doesn’t have an equivalent to Application_OnStart event?

    Thanks

    • jbogard

      Check out my Put Your Controllers On a Diet (redux) – I use a mediator pattern to send commands/queries to a series of handlers that return DTOs:

      http://lostechies.com/jimmybogard/2013/12/19/put-your-controllers-on-a-diet-posts-and-commands/

      • Brian Beatty

        Sorry, I don’t see where Automapper fits in with the Link provided. Your repository is DI and there is no reference to AutoMapper. I want my repository to use automapper to transform EF’s return types into my Domain Objects. I have the EF container with internal access.

        • jbogard

          Did you check out the GETs and queries article? That’s the one that goes into details about AutoMapper. I’m not sure what you mean about transforming EF return types into domain objects – my domain models ARE the EF models.

          • Brian Beatty

            i see it now. But my project is not using EF classes as our Domain model.But I want to use Automapper in my DAL to map my EF models to Domain objects.

  • Pingback: Projecting computed properties with LINQ and AutoMapper | Jimmy Bogard's Blog

  • Or

    Great post :)

    Just wondering, why not using ‘.Future()’ as the default behavior (also works in NH off course)?

    Thanks.

    • jbogard

      Hmmm, good question. If you’re only doing one query, then it’s not needed. It also is a little bit confusing too, as it’s not obvious where the query actually hits the server.

      • Or

        agree, but that’s what called “lazy”, right? :)
        (you’ll have you’re data once you need it)

  • Laurentiu Macovei

    except the extensions framework is very buggy and dead!