Introducing LazyLinq: Overview


This is the first in a series of posts on LazyLinq, a wrapper to support lazy initialization and deferred disposal of a LINQ query context, including LINQ to SQL’s DataContext:

  1. Introducing LazyLinq: Overview
  2. Introducing LazyLinq: Internals
  3. Introducing LazyLinq: Queryability
  4. Simplifying LazyLinq
  5. Introducing LazyLinq: Lazy DataContext

Motivation

I recently posted an approach to dealing with IDisposable objects and LINQ. In the comments at LosTechies, Steve Gentile mentioned that my IQueryable example didn’t actually compile:

IQueryable<MyType> MyFunc(string myValue)
{
    return from dc in new MyDataContext().Use()
           from row in dc.MyTable
           where row.MyField == myValue
           select row;
}

Steve suggested using AsQueryable() on the result of the query, which does indeed fix the build. However, the purpose of returning an IQueryable is that it would allow us to perform additional query operations using the original query provider. Since the query result isn’t IQueryable, AsQueryable() will use a query provider based on LINQ to Objects, with an additional performance penalty to compile the expression trees into IL.

Even worse, because Use() returns an IEnumerable<T>, the entire query is actually executed with LINQ to Objects. Even though dc.MyTable is IQueryable, the translated query treats it as a simple IEnumerable, essentially performing a SELECT * before executing all query operations on entity objects in memory. It should go without saying that this is less than ideal.
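To see the provider switch for yourself, here is a minimal sketch. The ProviderCheck class and its method name are mine, purely for illustration; the only library call involved is Queryable.AsQueryable(), which wraps a plain IEnumerable<T> in a LINQ to Objects query and returns the source unchanged if it is already an IQueryable<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of LazyLinq.
public static class ProviderCheck
{
    // Reports which query provider ends up behind a sequence once
    // AsQueryable() has been applied to it.
    public static string ProviderNameFor<T>(IEnumerable<T> source)
    {
        IQueryable<T> queryable = source.AsQueryable();
        return queryable.Provider.GetType().Name;
    }
}
```

Passing in something like new[] { 1, 2, 3 }.Where(n => n > 1) should report an EnumerableQuery type, the in-memory provider, rather than anything that could translate the query to SQL.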

Introducing LazyLinq

After several iterations, I think I have a better solution. In this post I’ll review the architecture of the solution, with posts to follow detailing the implementation.

LazyLinq is implemented around three interfaces. The first serves as a deferred query context provider:

public interface ILazyContext<TContext> : IDisposable
{
    TContext Context { get; }

    ILazyQueryable<TContext, TResult, TQuery>
        CreateQuery<TResult, TQuery>(Func<TContext, TQuery> queryBuilder)
        where TQuery : IQueryable<TResult>;

    ILazyOrderedQueryable<TContext, TResult, TQuery>
        CreateOrderedQuery<TResult, TQuery>(Func<TContext, TQuery> queryBuilder)
        where TQuery : IOrderedQueryable<TResult>;

    TResult Execute<TResult>(Func<TContext, TResult> action);
}

An implementer of ILazyContext has four responsibilities:

  1. Lazily expose the Context.
  2. Produce lazy wrappers to represent queries retrieved from a context by a delegate.
  3. Execute an action on the context.
  4. Ensure the context is disposed as necessary.
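As a rough illustration of responsibilities 1, 3 and 4, here is a deliberately simplified sketch. It is not the implementation covered in the later posts: SimpleLazyContext is a hypothetical name, it omits CreateQuery and CreateOrderedQuery entirely, and it assumes the context comes from a factory delegate and implements IDisposable, neither of which the interface above requires.

```csharp
using System;

// Hypothetical, simplified stand-in for an ILazyContext<TContext> implementer.
// Assumptions: the context is built by a factory and is IDisposable.
public sealed class SimpleLazyContext<TContext> : IDisposable
    where TContext : class, IDisposable
{
    private readonly Func<TContext> factory;
    private TContext context;

    public SimpleLazyContext(Func<TContext> factory)
    {
        if (factory == null) throw new ArgumentNullException("factory");
        this.factory = factory;
    }

    // Responsibility 1: the context is only created on first access.
    public TContext Context
    {
        get { return context ?? (context = factory()); }
    }

    // Responsibilities 3 and 4: run the action, then dispose the context
    // whether or not the action throws.
    public TResult Execute<TResult>(Func<TContext, TResult> action)
    {
        try { return action(Context); }
        finally { Dispose(); }
    }

    public void Dispose()
    {
        if (context != null)
        {
            context.Dispose();
            context = null;
        }
    }
}
```

Nothing touches the factory until Context or Execute is used, and Execute leaves no live context behind.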

The remaining interfaces serve as lazy query wrappers, corresponding to IQueryable<T> and IOrderedQueryable<T>:

public interface ILazyQueryable<TContext, TSource, TQuery>
    : IQueryable<TSource>
    where TQuery : IQueryable<TSource>
{
    ILazyContext<TContext> Context { get; }
    Func<TContext, TQuery> QueryBuilder { get; }
}

public interface ILazyOrderedQueryable<TContext, TSource, TQuery>
    : ILazyQueryable<TContext, TSource, TQuery>, IOrderedQueryable<TSource>
    where TQuery : IOrderedQueryable<TSource>
{ }

An implementer of ILazyQueryable has four responsibilities:

  1. Expose the Context from which it was created.
  2. Expose a delegate that represents how the deferred query is built from Context.
  3. Implement IQueryable for the deferred query.
  4. Ensure the context is disposed after the query is enumerated.
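Responsibility 4 is the subtle one, so here is one way it could be approached, sketched with a plain iterator rather than a full IQueryable implementation. DeferredEnumeration and EnumerateThenDispose are hypothetical names, and the IDisposable constraint on TContext is an assumption made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of LazyLinq.
public static class DeferredEnumeration
{
    // The query is not built until enumeration starts. The finally block
    // runs when enumeration completes, throws, or is abandoned early
    // (foreach disposes the enumerator), so the context is always released.
    public static IEnumerable<T> EnumerateThenDispose<TContext, T>(
        TContext context, Func<TContext, IEnumerable<T>> queryBuilder)
        where TContext : IDisposable
    {
        try
        {
            foreach (T item in queryBuilder(context))
                yield return item;
        }
        finally
        {
            context.Dispose();
        }
    }
}
```

Because this is an iterator method, calling it does nothing until the result is enumerated, which is exactly the deferral we want.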

If it seems like these interfaces don’t do much, you’re absolutely correct. As we’ll see later, the light footprint gives us considerable flexibility.

LINQ to ILazyContext

Defining a few interfaces is all well and good, but the real goal is to simplify working with our disposable context. What if I told you that our original use case didn’t need to change at all (other than the lazy return type)?

ILazyQueryable<MyDataContext, MyType, IQueryable<MyType>> MyFunc(string myValue)
{
    return from dc in new MyDataContext().Use()
           from row in dc.MyTable
           where row.MyField == myValue
           select row;
}

We can’t implement it yet, but our new Use() extension method will have this signature:

public static ILazyContext<TContext> Use<TContext>(this TContext @this) { ... }

This is where we really start to bend LINQ against its will. As the first step in the query translation process, the compiler will translate our from clauses into a call to SelectMany. All we need to do is provide a SelectMany method for ILazyContext that the compiler will find acceptable:

public static ILazyQueryable<TContext, TResult, IQueryable<TResult>> SelectMany<TContext, TCollection, TResult>(
    this ILazyContext<TContext> lazyContext,
    Expression<Func<TContext, IQueryable<TCollection>>> collectionSelector,
    Expression<Func<TContext, TCollection, TResult>> resultSelector)
{

The method signature is a slight variation from the corresponding overload of Queryable.SelectMany(), changed to require that collectionSelector returns an IQueryable that we can defer. If it doesn’t, the compiler will complain:

An expression of type ‘System.Collections.Generic.IEnumerable’ is not allowed in a subsequent from clause in a query expression with source type ‘Solutionizing.Linq.Test.MyDataContext’. Type inference failed in the call to ‘SelectMany’.
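If the query translation step feels abstract, this sketch shows it with LINQ to Objects standing in for our context. TranslationDemo and its data shapes are made up for illustration; the two methods are meant to be equivalent, with the second spelling out roughly what the compiler produces for two from clauses followed by where and select.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of LazyLinq.
public static class TranslationDemo
{
    // Query-expression form, the same shape as MyFunc.
    public static List<int> QuerySyntax(IEnumerable<int[]> source, int min)
    {
        return (from arr in source
                from n in arr
                where n >= min
                select n).ToList();
    }

    // Approximately what the compiler generates: the two from clauses
    // become a SelectMany whose result selector packs both range
    // variables into an anonymous type for the clauses that follow.
    public static List<int> MethodSyntax(IEnumerable<int[]> source, int min)
    {
        return source
            .SelectMany(arr => arr, (arr, n) => new { arr, n })
            .Where(t => t.n >= min)
            .Select(t => t.n)
            .ToList();
    }
}
```

The compiler only cares that a suitable SelectMany can be found on the source type, which is what lets us supply our own for ILazyContext.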

Now that we’ve hijacked the query, we can control the rest of the translation process with the returned ILazyQueryable. Recalling that our ILazyContext knows how to make an ILazyQueryable, we just need to give it a QueryBuilder delegate:

    return lazyContext.CreateQuery<TResult, IQueryable<TResult>>(context =>
    {
        Func<TContext, IQueryable<TCollection>> getQueryFromContext = collectionSelector.Compile();
        IQueryable<TCollection> query = getQueryFromContext(context);

        ParameterExpression rangeParameter = resultSelector.Parameters[1];
        InvocationExpression invoke = Expression.Invoke(resultSelector, Expression.Constant(context), rangeParameter);
        Expression<Func<TCollection, TResult>> selector = Expression.Lambda<Func<TCollection, TResult>>(invoke, rangeParameter);

        return query.Select(selector);
    });
}

This is pretty dense, so let’s walk through it:

  1. Our lambda expression’s context parameter represents the MyDataContext that will be passed in eventually.
  2. We’re going to manipulate the expression trees passed into the method, which will look something like this:
    • collectionSelector: dc => dc.MyTable
    • resultSelector: (dc, row) => new { dc = dc, row = row }
  3. Compiling collectionSelector produces a delegate we can invoke on context to get an IQueryable<TCollection> (context.MyTable, in this case).
  4. Before we can use resultSelector on MyTable, we need to wrap it in a lambda expression to eliminate its first parameter:
    1. Save the second parameter (row) to use later.
    2. Create a new invocation expression that will represent calling resultSelector with the current context and our saved row parameter.
    3. Create a new lambda expression that will accept that same row parameter and return the invocation expression.
  5. The resulting selector, of type Expression<Func<TCollection, TResult>>, can then be passed to query.Select() which happily returns the desired IQueryable<TResult>.

Essentially we’re pretending that the SelectMany call is just a Select call on the IQueryable<TCollection> generated by collectionSelector, all wrapped in a lazy query.
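The parameter-elimination trick from step 4 can be tried in isolation. The sketch below uses the same Expression.Invoke and Expression.Lambda calls as the SelectMany body, but against an in-memory IQueryable; ExpressionBinding and BindFirst are hypothetical names introduced only for this example.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper, not part of LazyLinq.
public static class ExpressionBinding
{
    // Turns a two-parameter expression into a one-parameter expression by
    // fixing the first argument to a constant value.
    public static Expression<Func<T2, TResult>> BindFirst<T1, T2, TResult>(
        Expression<Func<T1, T2, TResult>> source, T1 value)
    {
        // Reuse the second parameter as the parameter of the new lambda.
        ParameterExpression remaining = source.Parameters[1];

        // Represents calling source(value, remaining).
        InvocationExpression invoke = Expression.Invoke(
            source, Expression.Constant(value, typeof(T1)), remaining);

        return Expression.Lambda<Func<T2, TResult>>(invoke, remaining);
    }
}
```

Given a selector such as (prefix, n) => prefix + n.ToString() bound to a fixed prefix, the result can be handed straight to Select on new[] { 1, 2, 3 }.AsQueryable(), just as selector is handed to query.Select() above.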

Hopefully this overview has piqued your interest. Next time we’ll look at a base implementation of the interfaces.
