Grouping Linq Aggregates in C#

Using aggregates such as “Count”, “Average”, “Sum” etc… is pretty easy in simple queries. Lets say we have a table like this:

Product Table

So, if we want to query all of the products with the name “Bike” then our query will look like this:

var products = from p in db.Products 
        where p.Name == "Bike" select p;

Then if we want to expand on that to just get the count of those products then we will create a query that looks like this:

int count = (from p in db.Products 
    where p.Name == "Bike" select p).Count();

Or if we wanted to get the average price of those products we would do this:

decimal count = (from p in db.Products
                where p.Name == "Bike"
                select p.Price).Average();

Or if we wanted to get all of the names appended together:

string names = (from p in db.Products
                where p.Name == "Bike"
                select p.Name)
                .ToArray()
                .Aggregate((p1, p2) => p1 + " " + p2);

That last one was just thrown in for fun. It doesn’t even really apply to this post. You’ll notice though that I put a “ToArray()” in there, well, that is because I am using Linq To Sql and the Linq To Sql provider doesn’t support the Aggregate operator. So anyways, lets get on to what we are really here for.

Now all of those were probably pretty clear, but what do we do if we want to group our products by name? Well, you would do something like this:

var groups = from p in db.Products
            group p by p.Name
            into g
            select g;

Okay, that looks pretty easy, but what exactly is put into “groups” when this statement is run? Well, it is an IOrderedQueryable, just like most Linq queries, but what is actually produced when it is executed? For example, what if we put a ToArray() on this statement, like we did in the statement above that used “Aggregate”? What would each item in the array hold? Well, there is an interface in linq called IGrouping which is defined as IGrouping<TKey, TElement>. IGrouping is what the array would be filled with. In our case it would be an “IGrouping<String,Product> since we are grouping on “Name” which is a string. IGrouping also implements IEnumerable, so you can do things like this as well:

var names = from p in db.Products
            group p by p.Name
            into g
                select 
                (from c in g select c.Name)
                .Distinct().Single();

This allows you to do sub-queries on these groups that you have now formed. This query does in fact query the names out of each group and then filters out the duplicates, then it calls “Single()” to get a list of items rather than a list of IEnumerables with a single item in them. Seems kinda crazy to have to do all that to get the names of the groups that we grouped on, right? Well, it is crazy. There is a much easier way to do this:

var names = from p in db.Products 
            group p by p.Name 
            into g select g.Key;

So, as you can see our IGrouping interface has a property called Key that allows us to access the key that we are grouped on. Isn’t that just great? So, now that we have a list of names, how do we get the count in each group? Well, as I stated earlier, IGrouping implements IEnumerable so we get access to all of our wonderful extension methods:

var groups = from p in db.Products
            group p by p.Name
            into g
            select new {Name = g.Key, 
                        Count = g.Count()};

So there you have it, we now have a Linq query that is capable of grouping on a specific key and then return a count for each of our groups. Not the most straightforward thing if you are thinking in traditional SQL syntax, but you’ll get used to it! Hope you enjoyed!

Be Sociable, Share!

3 comments

  1. Linq just rocks! – Your article is another prove

  2. Instead of ToArray() use .AsEnumerable() for situations where one needs to process data using the linq to object extentions not available for linq to sql.

  3. What would be performance of that compared to straight T-SQL?

Leave a comment