C# – math stats with Linq

c# linq

I have a collection of person objects (an IEnumerable<Person>), and each person has an Age property.

I want to compute statistics over the collection, such as Max, Min, Average and Median, based on this Age property.

What is the most elegant way of doing this using LINQ?

Best Answer

Here is a complete, generic implementation of Median that properly handles empty collections and nullable types. It is LINQ-friendly in the style of Enumerable.Average, for example:

    double? medianAge = people.Median(p => p.Age);

This implementation returns null when there are no non-null values in the collection, but if you don't like the nullable return type, you could easily change it to throw an exception instead.
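As a rough sketch of that throwing alternative: the hypothetical MedianOrThrow below is not part of the answer's code, and it assumes plain double values to stay short. It raises InvalidOperationException on an empty sequence, which is what Enumerable.Average does for non-nullable element types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ThrowingMedianSketch
{
    // Hypothetical variant: throws on an empty sequence instead of returning null.
    public static double MedianOrThrow(this IEnumerable<double> source)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));

        // Sort once and materialize so the sequence is enumerated a single time.
        List<double> sorted = source.OrderBy(n => n).ToList();
        if (sorted.Count == 0)
            throw new InvalidOperationException("Sequence contains no elements");

        // Even count: average the two middle values; odd count: take the middle one.
        int midpoint = sorted.Count / 2;
        return sorted.Count % 2 == 0
            ? (sorted[midpoint - 1] + sorted[midpoint]) / 2.0
            : sorted[midpoint];
    }
}
```

Whether to throw or return null is mostly a question of how callers want to treat "no data". The nullable form forces a check at the call site, while the throwing form keeps the return type simple.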

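    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Extension methods must live in a static class; the class name here is arbitrary.
    public static class MedianExtensions
    {
        // Projects each element with the selector, then takes the median of the results.
        public static double? Median<TColl, TValue>(
            this IEnumerable<TColl> source,
            Func<TColl, TValue>     selector)
        {
            return source.Select(selector).Median();
        }

        // Returns the median as a double, or null if there are no non-null values.
        public static double? Median<T>(
            this IEnumerable<T> source)
        {
            // For Nullable<T> element types, ignore null entries.
            if(Nullable.GetUnderlyingType(typeof(T)) != null)
                source = source.Where(x => x != null);

            // Sort once and materialize the result, rather than re-evaluating the
            // deferred query for Count and for each ElementAt call.
            List<T> sorted = source.OrderBy(n => n).ToList();
            int count = sorted.Count;
            if(count == 0)
                return null;

            int midpoint = count / 2;
            if(count % 2 == 0)
                return (Convert.ToDouble(sorted[midpoint - 1]) + Convert.ToDouble(sorted[midpoint])) / 2.0;
            else
                return Convert.ToDouble(sorted[midpoint]);
        }
    }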
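For the other statistics in the question, LINQ already provides Min, Max and Average, so only the median needs custom code. Here is a minimal, self-contained sketch that collects all four in one place. The Person class, the AgeStatsSketch class and the Summarize method are hypothetical names used for illustration. The median is computed inline so the block does not depend on the extension methods above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type standing in for the question's person objects.
public sealed class Person
{
    public int Age { get; set; }
}

public static class AgeStatsSketch
{
    // Returns null for an empty collection, matching the nullable style used above.
    public static (int Min, int Max, double Average, double Median)? Summarize(
        IEnumerable<Person> people)
    {
        // Project and sort the ages once; the list is reused by every statistic.
        List<int> ages = people.Select(p => p.Age).OrderBy(a => a).ToList();
        if (ages.Count == 0)
            return null;

        // Even count: average the two middle values; odd count: take the middle one.
        int midpoint = ages.Count / 2;
        double median = ages.Count % 2 == 0
            ? (ages[midpoint - 1] + ages[midpoint]) / 2.0
            : ages[midpoint];

        // Min, Max and Average are the built-in Enumerable operators.
        return (ages.Min(), ages.Max(), ages.Average(), median);
    }
}
```

If only one or two statistics are needed, calling people.Min(p => p.Age) or people.Average(p => p.Age) directly is simpler. Bundling them mainly helps when the source is expensive to enumerate more than once.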