Hacking LINQ Expressions: Select With Index
First, a point of clarification: I use LINQ Expressions to mean
(Language-INtegrated) Query Expressions (the language feature) rather
than Expression Trees (the .NET 3.5 library in System.Linq.Expressions).
So what do I mean by “Hacking LINQ Expressions”? Quite simply, I’m
not content with the rather limited set of operations that query
expressions allow me to represent. By understanding how queries are
translated, we can use various techniques to broaden our expressive
reach. I have already documented one such hack for managing IDisposable objects with LINQ, so I guess we can call this the second in an unbounded series.
The Problem
In thinking over use cases for functional construction of web control trees, I paused to think through how I would express alternate row styling. My mind immediately jumped to the overload of Select() that exposes the current element’s index:
Controls.Add(
    new Table().WithControls(
        data.Select((x, i) =>
            new TableRow() {
                CssClass = i % 2 == 0 ? "" : "alt"
            }.WithControls(
                new TableCell().WithControls(x)
            )
        )
    )
);
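If you want to try the with-index overload without a web project, here is a minimal console sketch. The class name AlternateRowsDemo and the helper Describe are my own illustrative names, and plain strings stand in for the TableRow controls; only Enumerable.Select itself comes from the framework.

```csharp
using System;
using System.Linq;

public static class AlternateRowsDemo
{
    // Pairs each value with the CSS class its row would receive:
    // "" for even indexes, "alt" for odd ones.
    public static string[] Describe(string[] data)
    {
        return data
            .Select((x, i) => x + ":" + (i % 2 == 0 ? "" : "alt"))
            .ToArray();
    }

    public static void Main()
    {
        foreach (var line in Describe(new[] { "a", "b", "c", "d" }))
            Console.WriteLine(line);
        // Prints:
        // a:
        // b:alt
        // c:
        // d:alt
    }
}
```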
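This works fine for simple cases, but gets awkward for more complex queries: the query has to be wrapped in parentheses and projected into an anonymous type just so that Select() can supply the index: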
Controls.Add(
    new Table().WithControls((
        from x in Xs
        join y in Ys on x.Key equals y.Key
        select new { x, y }
        ).Select((z, i) =>
            new TableRow() {
                CssClass = i % 2 == 0 ? "" : "alt"
            }.WithControls(
                new TableCell().WithControls(z.x.ValueX, z.y.ValueY)
            )
        )
    )
);
The Goal
Instead, I propose a simple extension method to retrieve an index at arbitrary points in a query:
var res = from x in data
          from i in x.GetIndex()
          select new { x, i };
Or our control examples:
Controls.Add(
    new Table().WithControls(
        from x in data
        from i in x.GetIndex()
        select new TableRow() {
            CssClass = i % 2 == 0 ? "" : "alt"
        }.WithControls(
            new TableCell().WithControls(x)
        )
    )
);

Controls.Add(
    new Table().WithControls(
        from x in Xs
        join y in Ys on x.Key equals y.Key
        from i in y.GetIndex()
        select new TableRow() {
            CssClass = i % 2 == 0 ? "" : "alt"
        }.WithControls(
            new TableCell().WithControls(x.ValueX, y.ValueY)
        )
    )
);
Much like in the IDisposable solution, we use a from clause to act as an intermediate assignment. But in this case our hack is a bit trickier than a simple iterator.
The Hack
For this solution we’re going to take advantage of how multiple from clauses are translated:
var res = data.SelectMany(x => x.GetIndex(), (x, i) => new { x, i });
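To see that translation in isolation, here is a small sketch using only the standard SelectMany() from System.Linq. The class and method names are mine, made up for illustration. It builds the same sequence once with two from clauses and once with the explicit method call the compiler generates for them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FromClauseTranslationDemo
{
    // Two from clauses followed directly by select...
    public static List<string> QuerySyntax(int[] data)
    {
        var res = from x in data
                  from c in Enumerable.Repeat('*', x)
                  select x + ":" + c;
        return res.ToList();
    }

    // ...become a single SelectMany call with a collection selector
    // and a result selector.
    public static List<string> MethodSyntax(int[] data)
    {
        return data
            .SelectMany(x => Enumerable.Repeat('*', x), (x, c) => x + ":" + c)
            .ToList();
    }

    public static void Main()
    {
        var data = new[] { 1, 2 };
        // Both forms yield "1:*", "2:*", "2:*".
        Console.WriteLine(QuerySyntax(data).SequenceEqual(MethodSyntax(data))); // True
    }
}
```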
Looking at the parameter list, we see that our collectionSelector should return the result of x.GetIndex() and our resultSelector’s second argument needs to be an int:
public static IEnumerable<TResult> SelectMany<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, SelectIndexProvider> collectionSelector,
    Func<TSource, int, TResult> resultSelector)
The astute observer will notice that the signature of this resultSelector exactly matches the selector used by Select’s with-index overload, trivializing the method implementation:
{
    return source.Select(resultSelector);
}
Note that we’re not even using collectionSelector! We’re just using its return type as a flag to force the compiler to use this version of SelectMany(). The rest of the pieces are incredibly simple now that we know the actual SelectIndexProvider value is never used:
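public sealed class SelectIndexProvider
{
    // Private constructor: no instance is ever created. The type exists
    // only to steer overload resolution.
    private SelectIndexProvider() { }
}

// Like SelectMany() above, this extension method must be declared in a static class.
public static SelectIndexProvider GetIndex<T>(this T element)
{
    // The value is never read, so null is sufficient.
    return null;
}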
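Putting the pieces together, a complete console program might look something like this. The container class name SelectIndexExtensions and the helper Indexed are my own choices for this sketch; the SelectMany() overload, GetIndex() and SelectIndexProvider are the ones described above. In a project with nullable reference types enabled, the return null line will produce a warning.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class SelectIndexProvider
{
    private SelectIndexProvider() { }
}

public static class SelectIndexExtensions
{
    // Marker method: its return type selects the SelectMany overload below.
    public static SelectIndexProvider GetIndex<T>(this T element)
    {
        return null;
    }

    // collectionSelector is never invoked; resultSelector is handed
    // straight to Select's with-index overload.
    public static IEnumerable<TResult> SelectMany<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, SelectIndexProvider> collectionSelector,
        Func<TSource, int, TResult> resultSelector)
    {
        return source.Select(resultSelector);
    }
}

public static class Program
{
    // Appends each element's index to the element.
    public static List<string> Indexed(string[] data)
    {
        var res = from x in data
                  from i in x.GetIndex()
                  select x + i;
        return res.ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Indexed(new[] { "a", "b", "c" })));
        // Prints: a0, b1, c2
    }
}
```

The standard SelectMany() overloads drop out of consideration here because SelectIndexProvider is not an IEnumerable<>, so the compiler cannot infer their collection type and only the overload above remains applicable.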
And for good measure, an equivalent version to extend IQueryable<>:
public static IQueryable<TResult> SelectMany<TSource, TResult>(
    this IQueryable<TSource> source,
    Expression<Func<TSource, SelectIndexProvider>> collectionSelector,
    Expression<Func<TSource, int, TResult>> resultSelector)
{
    return source.Select(resultSelector);
}
Because we’re just calling Select(), the query expression isn’t even aware of the call to GetIndex():
System.Linq.Enumerable+<RangeIterator>d__b1.Select((x, i) => (x * i))
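A sketch along these lines should let you inspect the expression yourself. I’m assuming the source is Enumerable.Range(1, 5).AsQueryable(); the class names and the Products helper are mine. The exact text printed for the expression depends on the runtime version, so I’ve left it out of the comments.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public sealed class SelectIndexProvider
{
    private SelectIndexProvider() { }
}

public static class QueryableSelectIndexExtensions
{
    public static SelectIndexProvider GetIndex<T>(this T element)
    {
        return null;
    }

    // Same trick as the IEnumerable<> version, but with expression trees:
    // collectionSelector is discarded, so GetIndex() never reaches the provider.
    public static IQueryable<TResult> SelectMany<TSource, TResult>(
        this IQueryable<TSource> source,
        Expression<Func<TSource, SelectIndexProvider>> collectionSelector,
        Expression<Func<TSource, int, TResult>> resultSelector)
    {
        return source.Select(resultSelector);
    }
}

public static class Program
{
    // Multiplies each element by its index.
    public static IQueryable<int> Products(IEnumerable<int> source)
    {
        return from x in source.AsQueryable()
               from i in x.GetIndex()
               select x * i;
    }

    public static void Main()
    {
        var query = Products(Enumerable.Range(1, 5));
        Console.WriteLine(query.Expression);          // a Select call only, no GetIndex()
        Console.WriteLine(string.Join(", ", query));  // 0, 2, 6, 12, 20
    }
}
```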
We’re essentially providing our own syntactic sugar over the sugar already provided by query expressions. Pretty sweet, eh?
As a final exercise for the reader, what would this print?
var res = from x in Enumerable.Range(1, 5)
          from i in x.GetIndex()
          from y in Enumerable.Repeat(i, x)
          where y % 2 == 1
          from j in 0.GetIndex()
          select i+j;
foreach (var r in res)
    Console.WriteLine(r);