Some improved LINQ operators

I ran across a couple of scenarios the other day that were made pretty difficult given the current LINQ query operators.  First, I needed to see if an item existed in a collection.  That’s easy with the Contains method when you want to find an item that matches all the attributes you’re looking for.

Suppose I want only one attribute to match?  For example, I have a Person class:

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

What if I want to see if a collection of Persons contains someone with the last name “Smith”?  Contains only gives me two options:

public static bool Contains<TSource>(this IEnumerable<TSource> source, TSource value);
public static bool Contains<TSource>(this IEnumerable<TSource> source, TSource value, IEqualityComparer<TSource> comparer);

That doesn’t help: I’d have to implement some interface just to match against the LastName.  Typically, this is solved with one of two options:

// Inefficient Contains replacement
values
    .Where(person => person.LastName == "Smith")
    .Count()
    .ShouldBeGreaterThan(0);

// Efficient, but ugly and hard to use
values
    .Where(person => person.LastName == "Smith")
    .FirstOrDefault()
    .ShouldNotBeNull();

The first example is inefficient because Count() iterates through every value in the collection, when all I really care about is whether at least one match exists.  The second example stops at the first match, but loses the intent of what I’m trying to find out.
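To make the difference concrete, here is a minimal sketch that counts how many items each approach pulls from the source. The Track helper and the EnumerationCountDemo class are hypothetical names made up for this illustration; they are not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCountDemo
{
    // Number of items pulled from the tracked source since the last reset.
    public static int Pulled;

    // Hypothetical helper: passes items through lazily and counts each one pulled.
    public static IEnumerable<T> Track<T>(IEnumerable<T> source)
    {
        foreach (T item in source)
        {
            Pulled++;
            yield return item;
        }
    }

    // Where + Count has to walk the whole source to count the matches.
    public static int PulledByCount(string[] lastNames, string target)
    {
        Pulled = 0;
        Track(lastNames).Where(name => name == target).Count();
        return Pulled;
    }

    // Where + FirstOrDefault stops pulling as soon as one match turns up.
    public static int PulledByFirstOrDefault(string[] lastNames, string target)
    {
        Pulled = 0;
        Track(lastNames).Where(name => name == target).FirstOrDefault();
        return Pulled;
    }
}
```

With a "Smith" sitting at the front of a five-item array, the Count version still pulls all five items, while the FirstOrDefault version pulls only the first.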

I also had the same types of problems with Distinct, where I’d like to find distinct elements, but only looking at a certain value.  I had to implement the same IEqualityComparer (very annoying).
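For reference, the IEqualityComparer route looks roughly like the sketch below. LastNameComparer and ComparerUsage are hypothetical names for illustration, and the comparer assumes the Person instances themselves are never null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical comparer: two people are "equal" when their last names match.
// Assumes neither Person reference is null.
public class LastNameComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        return string.Equals(x.LastName, y.LastName);
    }

    public int GetHashCode(Person obj)
    {
        return obj.LastName == null ? 0 : obj.LastName.GetHashCode();
    }
}

public static class ComparerUsage
{
    public static bool HasLastName(IEnumerable<Person> people, string lastName)
    {
        // Contains wants a whole Person, so a throwaway probe object is needed.
        var probe = new Person { LastName = lastName };
        return people.Contains(probe, new LastNameComparer());
    }

    public static int CountDistinctLastNames(IEnumerable<Person> people)
    {
        return people.Distinct(new LastNameComparer()).Count();
    }
}
```

That is a whole class, plus a dummy Person, just to ask a question about one property.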

Better LINQ extensions

Instead of implementing some crazy interface, I’d like to just give the Contains and Distinct query operators an expression of what to look for.  I’d like this test to pass:

 [Test]
 public void Better_enumerable_extensions()
 {
     var values = new[]
                      {
                          new Person {FirstName = "Bob", LastName = "Smith"},
                          new Person {FirstName = "Don", LastName = "Allen"},
                          new Person {FirstName = "Bob", LastName = "Sacamano"},
                          new Person {FirstName = "Chris", LastName = "Smith"},
                          new Person {FirstName = "George", LastName = "Allen"}
                      };

     values
         .Distinct(person => person.LastName)
         .Count()
         .ShouldEqual(3);

     values
         .Distinct(person => person.FirstName)
         .Count()
         .ShouldEqual(4);

     values
         .Contains("Smith", person => person.LastName)
         .ShouldBeTrue();

     values
         .Contains("Nixon", person => person.LastName)
         .ShouldBeFalse();
 }

In the Distinct example, I pass in a lambda expression of the distinct attribute I’m looking for.  In the Contains example, I pass in the lambda expression, as well as the value I’m looking for.

To do this, I’ll need to create my extensions class with the new extension methods:

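public static class BetterEnumerableExtensions
{
    // Distinct by a projected value: the selector picks what to compare on.
    public static IEnumerable<TSource> Distinct<TSource, TResult>(
        this IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        return source.Distinct(new DynamicComparer<TSource, TResult>(selector));
    }

    // True as soon as any item's selected value equals the value searched for.
    public static bool Contains<TSource, TResult>(
        this IEnumerable<TSource> source, TResult value, Func<TSource, TResult> selector)
    {
        // The default comparer handles a null selected value without throwing.
        IEqualityComparer<TResult> comparer = EqualityComparer<TResult>.Default;
        foreach (TSource sourceItem in source)
        {
            TResult sourceValue = selector(sourceItem);
            if (comparer.Equals(sourceValue, value))
                return true;
        }
        return false;
    }
}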

Yeah yeah, all those angle-brackets really start to get ugly.  The new Contains method takes in the selector method now, in the form of a Func delegate.  In the body, I just loop through the source items, evaluating the selector for each item.  If the source value matches the value I’m searching for, I return “true” immediately and stop looping.  Otherwise, I return false.

The new Distinct method uses the existing Distinct, but now it’s using a new DynamicComparer class:

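// Compares two items by the value a selector extracts from each of them.
public class DynamicComparer<T, TResult> : IEqualityComparer<T>
{
    private readonly Func<T, TResult> _selector;

    public DynamicComparer(Func<T, TResult> selector)
    {
        _selector = selector;
    }

    public bool Equals(T x, T y)
    {
        TResult result1 = _selector(x);
        TResult result2 = _selector(y);
        // Caveat: throws NullReferenceException if the first selected value is null.
        return result1.Equals(result2);
    }

    public int GetHashCode(T obj)
    {
        TResult result = _selector(obj);
        // Caveat: throws NullReferenceException if the selected value is null.
        return result.GetHashCode();
    }
}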

It does much the same thing as the Contains method: it evaluates the items passed in against the selector delegate supplied to the constructor.  In any case, the existing Distinct method works the way I want it to, without me needing to re-implement its internal logic as I did with the Contains.
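One thing to watch for: DynamicComparer calls Equals and GetHashCode directly on the selected value, so a selector that returns null will throw. A minimal null-tolerant sketch, assuming the default equality comparer for TResult is the behavior you want, might look like this. NullSafeDynamicComparer is a hypothetical name, not something the extensions above reference.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical null-tolerant variant of DynamicComparer.
public class NullSafeDynamicComparer<T, TResult> : IEqualityComparer<T>
{
    private readonly Func<T, TResult> _selector;

    // The default comparer treats two nulls as equal and never dereferences them.
    private readonly IEqualityComparer<TResult> _resultComparer =
        EqualityComparer<TResult>.Default;

    public NullSafeDynamicComparer(Func<T, TResult> selector)
    {
        if (selector == null) throw new ArgumentNullException("selector");
        _selector = selector;
    }

    public bool Equals(T x, T y)
    {
        return _resultComparer.Equals(_selector(x), _selector(y));
    }

    public int GetHashCode(T obj)
    {
        TResult result = _selector(obj);
        // Map a null selected value to a fixed hash code instead of dereferencing it.
        return result == null ? 0 : _resultComparer.GetHashCode(result);
    }
}
```

This only protects against the selector returning null. A selector such as person => person.LastName would still throw if the person itself were null.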

I tried using the DynamicComparer with the Contains method, but it just worked out better re-implementing the logic.
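For what it’s worth, another way to avoid the hand-written loop is to project first and then call the built-in Contains on the projected values. This is only a sketch of that alternative, not what I ended up using; ContainsValue and ProjectionContains are hypothetical names picked to avoid clashing with the Contains extension above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProjectionContains
{
    // Projects each item with the selector, then lets the built-in
    // Contains compare the projected values with the default comparer.
    public static bool ContainsValue<TSource, TResult>(
        this IEnumerable<TSource> source, TResult value, Func<TSource, TResult> selector)
    {
        return source.Select(selector).Contains(value);
    }
}
```

Select is lazy and Contains stops at the first match, so this still stops enumerating as soon as a match is found.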

Intention-revealing interfaces == good

The Where/Count and even the Where/FirstOrDefault ways of getting a Contains are just plain ugly.  By passing in a selector method, I can describe exactly what I’m looking for.  In the case of Distinct, having to create a custom IEqualityComparer just for that is unnecessary most of the time.  When I saw that initially, it just looked like more trouble than it was worth.  But with the new and improved extensions, I get a much cleaner implementation.
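As several people point out in the comments, the built-in Any operator already covers the Contains case. A minimal sketch of that approach follows; the AnyUsage wrapper is a hypothetical name used only to make the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class AnyUsage
{
    // Any takes the predicate directly and stops at the first match.
    public static bool HasLastName(IEnumerable<Person> people, string lastName)
    {
        return people.Any(person => person.LastName == lastName);
    }
}
```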


About Jimmy Bogard

I'm a technical architect with Headspring in Austin, TX. I focus on DDD, distributed systems, and any other acronym-centric design/architecture/methodology. I created AutoMapper and am a co-author of the ASP.NET MVC in Action books.
This entry was posted in C#, LINQ.
  • http://nextension.blogspot.com/ Nextender

    There is
    public static bool Any<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate);

  • Alexey Romanov

    Why not

    values
    .Where(person => person.LastName == "Smith")
    .Any()
    ?

  • http://blog.linqexchange.com Philip

    values.Any(person => person.LastName == "Smith")

    You can skip the Where() and put the lambda directly in the Any().

  • Erik

    Yeah I was gonna suggest values.Any() too, but it’s been suggested to death. Nobody else touched on the fact that your extensions won’t play nice with other LINQ flavors, particularly LINQ to SQL, which may or may not be a problem.

  • http://jimmybogard.lostechies.com Jimmy Bogard

    @Nextender, @Philip

    Yeah, thanks for the heads-up on the Any method. Don’t need the Contains, although the Any method could have been named a little better.

  • http://silverlightme.net al pascual

    great post, we need more LINQ posts!

  • http://Bryan.ReynoldsLive.com Bryan Reynolds

    Post and the comments were educational

  • http://weblogs.asp.net/okloeten/ Omer van Kloeten

    Hey Jimmy,

    I’ve actually written about these exact things at my blog recently.
    You should take a look and you might find something to your liking:
    http://weblogs.asp.net/okloeten/archive/tags/Tools_3A00_+Linq+Extensions/default.aspx

    Cheers,
    Omer

  • http://jimmybogard.lostechies.com Jimmy Bogard

    @Omer

    Thanks for the LINQ link (ha). I’ll check out that CodePlex project, looks interesting.

  • Jero

    I quite like your DynamicComparer but I have run into a snag with it that I really can’t work out how to fix.
    If your _selector evals to null in GetHashCode() it will throw an exception.
    eg
    var spam = new string[3] { "a", "b", null };
    var results = spam.Distinct( c => c );

    This will crash.
