codethinked (kōdthĭngked) adj. To be consumed by or obsessed with code.

A Visual Look At The LINQ SelectMany Operator

If you want to learn all about LINQ, first go check out my series on TekPub and then come back here!

So, I blogged a while ago about the LINQ SelectMany operator. The LINQ SelectMany operator is one of the most useful, misunderstood, and underused operators in your LINQ repertoire. In my previous post I gave you a decent idea of what you can do with the LINQ SelectMany operator, but I'm not quite sure that I did a very good job at really showing you how it works. In this post, I want to give you a more visual explanation of the LINQ SelectMany operator, and what it can do for you.

A Visual Explanation of SelectMany

As I said in my previous post about the SelectMany operator, MSDN describes it like this: "Projects each element of a sequence to an IEnumerable<T> and flattens the resulting sequences into one sequence." Again, once you are familiar with SelectMany, this is a great explanation, but it can still be a bit hard to visualize. So, let's see...

1) it takes a sequence:

image

2) And then it iterates over each element in this sequence:

 image

3) While it is iterating, it projects each element into an IEnumerable<T>:

 image

4) And then flattens the resulting IEnumerables back into a single sequence:

 image

Pretty easy to see  now, right? The SelectMany operator allows us to produce a single sequence with 1 to n items for each item in the original sequence. It really allows us to sort-of multiply each item in a sequence, or at least project it into multiple items. If we did something like this with Select then we would end up producing a sequence with multiple sub-sequences, like this:

 image

While this might be useful, it is not exactly what we want in this case. We want a single contiguous sequence which we can then operate on using other LINQ operators.

Different Kinds Of Projections

Remember that all you have to do is get a list from each item in a list. You can do this by accessing a list that is on each item, like this:

image

Or you could do it by generating a list, for example, what if each item was an integer, we could generate a range based on each number:

image

Pretty cool, and powerful. It also lets us chain calls to SelectMany, like this:

image

So basically, it let's you continue to generate n number of sequences, and then combine them all back together. Since each call to SelectMany generates a single sequence, then you can call SelectMany on that resulting sequence and continue to do so as many times as you want.

Summary

While most of the operators in LINQ let you get one output element for each element you have in your sequence, or they let you filter out elements, but SelectMany is the only operator that lets you produce n output elements for each element in your input sequence. This fact opens up all kinds of possibilities with LINQ that otherwise wouldn't be available to you. I hope this helps you out!

.NET 4.0 And Our Parallel Future

I want to show you an algorithm, it is a pretty simple algorithm. It is an implementation of the Damerau–Levenshtein edit distance algorithm from the pseudocode on Wikipedia:

public static int EditDistance(string string1, string string2)
{
    var s1Length = string1.Length;
    var s2Length = string2.Length;
    var matrix = new int[s1Length + 1, s2Length + 1];

    for (int i = 0; i <= s1Length; i++)
        matrix[i, 0] = i;
    for (int j = 0; j <= s2Length; j++)
        matrix[0, j] = j;

    for (int i = 1; i <= s1Length; i++)
    {
        for (int j = 1; j <= s2Length; j++)
        {
            int cost = (string2[j - 1] == string1[i - 1]) ? 0 : 1;

            matrix[i, j] = (new[] { matrix[i - 1, j] + 1,
                                    matrix[i, j - 1] + 1,
                                    matrix[i - 1, j - 1] + cost}).Min();

            if ((i > 1) && (j > 1) &&
               (string1[i - 1] == string2[j - 2]) &&
               (string1[i - 2] == string2[j - 1]))
            {
                matrix[i, j] = Math.Min(
                                matrix[i, j],
                                matrix[i - 2, j - 2] + cost);
            }
        }
    }
    return matrix[s1Length, s2Length];
}

Continue reading the rest of this post...

An Overview Of System.Collections.Generic

I recently put up a post on my blog about some of the new concurrent collections in .NET 4.0, and I noticed that a lot of people were being sent by Google to those posts who were only searching for System.Collections. I figured that maybe people could use a similar overview of the collections available to them in the System.Collections.Generic namespace, since it seems to me that no one uses anything other than List and Dictionary. So, in this post, I am going to take a look at a few of those collections, and explain exactly why you would want to use them.

Keep in mind as you read through this list that you shouldn't just start switching out collection types in code that you already have working. If something is working and performing properly for you, it is almost always better to take the easier route until you have proof that the easy approach does not work for you.

System.Collections.Generic.List<T>

The first collection that we are going to look at is List<T>. Like I said before, this collection seems to be the fallback, and with good reason. It is unsorted (but supports sorting), items can be added, removed, or inserted at a specific index. It also has indexed access, meaning that items can be accessed directly using a numeric index. You can add Ranges, perform binary searches, perform sequential searches, built in sorting, get the count of items, check for items within it, pass a delegate to it to perform actions, etc... It really is the Swiss Army knife of collections.

Continue reading the rest of this post...

BlockingCollection and IProducerConsumerCollection

In .NET 4.0 there is a new namespace on the block, and it is called System.Collections.Concurrent. In this namespace you will find a pretty decent number of goodies that will help you to more easily write application which can leverage multiple threads without having to resort to manual locking. We looked previously at the ConcurrentBag, ConcurrentStack, and ConcurrentQueue. Well, I have two more goodies to show off, and those are the BlockingCollection class and the IProducerConsumerCollection interface.

In order to get this party started, let me explain exactly what the IProducerConsumerCollection is and why we would want to use it. The IProducerConsumerCollection interface is pretty simply and really exposes only two methods that we care about: TryAdd and TryTake. You see, IProducerConsumerCollection is designed for multi-threaded producer/consumer scenarios, which means situations where we have multiple threads producing pieces of data (producers), and then multiple threads that are trying to consume those pieces of data (consumers). So, IProducerConsumer just signifies that the implementing type supports thread-safe adding and removing of data. (In .NET 4.0, the types that will support this interface are the ones that we already looked at: ConcurrentStack, ConcurrentQueue, and ConcurrentBag)

Continue reading the rest of this post...

.NET 4.0 and System.Collections.Concurrent.ConcurrentBag

Inside of .NET 4.0 there are numerous new namespaces, but the one that we are here to talk about today is called System.Collections.Concurrent. This namespace contains a handful of types which implement different types of thread-safe collections. The thread-safe collection type that we are going to take a look at today is the ConcurrentBag<T>.

The ConcurrentBag<T> is one of the most simple concurrent types which has been introduced. It is a typical bag data structure (also known as a multiset), meaning that it is an unordered collection of items that allows duplicates. It is called a bag because if you were to draw a diagram of the data structure, it would look something like this:

image

Just a jumble of items, with no order at all. It has three important methods, which are "Add", "TryTake", and "TryPeek". Add works exactly as you would expect, you simply instantiate a ConcurrentBag instance and then call Add in order to put items into it:

Continue reading the rest of this post...