LINQ Efficiency – Is It More Efficient Than It Appears?

clinq

If I write something like this:

var things = mythings
    .Where(x => x.IsSomeValue)
    .Where(y => y.IsSomeOtherValue)

Is this the same as:

var results1 = new List<Thing>();
foreach(var t in mythings)
    if(t.IsSomeValue)
        results1.Add(t);

var results2 = new List<Thing>();
foreach(var t in results1)
    if(t.IsSomeOtherValue)
        results2.Add(t);

Or is there some magic under the covers that works more like this:

var results = new List<Thing>();
foreach(var t in mythings)
    if(t.IsSomeValue && t.IsSomeOtherValue)
        results.Add(t);

Or is it something completely different altogether?

Best Answer

LINQ queries are lazy. That means the code:

var things = mythings
    .Where(x => x.IsSomeValue)
    .Where(y => y.IsSomeOtherValue);

does very little. The original enumerable (mythings) is only enumerated when the resulting enumerable (things) is consumed, e.g. by a foreach loop, .ToList(), or .ToArray().

If you call things.ToList(), it is roughly equivalent to your latter code, with perhaps some (usually insignificant) overhead from the enumerators.

Likewise, if you use a foreach loop:

foreach (var t in things)
    DoSomething(t);

It is similar in performance to:

foreach (var t in mythings)
    if (t.IsSomeValue && t.IsSomeOtherValue)
        DoSomething(t);

Some of the performance advantages of the laziness approach for enumerables (as opposed to calculating all the results and storing them in a list) are that it uses very little memory (since only one result is stored at a time) and that there's no significant up-front cost.

If the enumerable is only partially enumerated, this is especially important. Consider this code:

things.First();

The way LINQ is implemented, mythings will only be enumerated up to the first element that matches your where conditions. If that element is early on in the list, this can be a huge performance boost (e.g. O(1) instead of O(n)).

Related Topic