Life After Loops

For loops have been our friend for so many years. I have fond memories of looping through huge lists of items, imperatively bobbing and weaving to construct my final masterpiece!

for (int i = 0; i < items.Length; i++)
{
  if (items[i].SomeValue == "Value I'm Looking For!")
  {
    result.Add(items[i]);
  }
}

Look at that beauty. I am looping through a list of items, filtering them on some value, and then BAM! I get a result list with the values in it. Magic I tell you, magic. And then foreach loops came along and I realized how ridiculously ugly that for loop was.

Just check this out:

foreach (SomeDummyClass item in items)
{
  if (item.SomeValue == "Value I'm Looking For!")
  {
    result.Add(item);
  }
}

Mmmmm. Beauty, simplicity, less room for error. But I still have to do a lot of declaring and looping and things. Ugh. So then they gave me the magical yield statement, and when used with a method, I could do this:

private static IEnumerable<SomeDummyClass> GetItems(SomeDummyClass[] items)
{
  foreach (SomeDummyClass item in items)
  {
    if (item.SomeValue == "Value I'm Looking For!")
    {
      yield return item;
    }
  }
}

Nice. Still a lot of looping, but now I don’t have to declare that stupid result list. And, if the result is never enumerated, nothing even runs! Lazy evaluation rocks your face, people! This still just feels gross though. Why am I holding the compiler’s hand so much? I just need to say "hello computer, give me crap in this list where SomeValue = some crap I’m looking for". Lo and behold, Anders Hejlsberg and his team came down from on high and delivered Linq to us. Now I say:

var result = items.Where(i => i.SomeValue == "Value I'm Looking For!");

And the compiler figures it all out for me. Better yet, I still get lazy evaluation and I get my list filtered. Best of both worlds! And since I am not telling the compiler exactly what to do, in the future (with .NET 4.0) when my list grows really really large, all I have to do is say:

var result = items.AsParallel().Where(i => i.SomeValue == "Value I'm Looking For!");

And suddenly my list is being filtered by all of the processors on the box! This is possible because at each step we told the computer less about how to perform the individual operations and more about what result we want. That lets the framework decide how best to execute the work, and in the case of Parallel Linq, we can ask for it to be executed in parallel. (In case you are wondering, there are a few reasons why it can’t just run in parallel by default.)
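The two properties described above, lazy evaluation and opting in to parallelism, can be sketched roughly like this. Treat it as an illustration under assumptions rather than anything definitive: the shape of SomeDummyClass is guessed from the snippets in this post, and MakeItems, CountPredicateCalls, and ParallelMatchCount are hypothetical helper names invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the item type used in the post (hypothetical).
public class SomeDummyClass
{
    public string SomeValue { get; set; }
}

public static class LifeAfterLoopsDemo
{
    private const string Target = "Value I'm Looking For!";

    // Hypothetical helper: builds sample items; every third one carries the target value.
    public static SomeDummyClass[] MakeItems(int count)
    {
        return Enumerable.Range(0, count)
            .Select(n => new SomeDummyClass { SomeValue = n % 3 == 0 ? Target : "Other" })
            .ToArray();
    }

    // Returns { predicate calls before enumeration, calls after enumeration, match count }.
    public static int[] CountPredicateCalls(SomeDummyClass[] items)
    {
        int calls = 0;
        IEnumerable<SomeDummyClass> query =
            items.Where(i => { calls++; return i.SomeValue == Target; });

        int before = calls;          // building the query has not run the predicate yet
        int matches = query.Count(); // enumerating runs the predicate once per item
        return new[] { before, calls, matches };
    }

    // The same filter, handed to PLINQ via AsParallel().
    public static int ParallelMatchCount(SomeDummyClass[] items)
    {
        return items.AsParallel().Where(i => i.SomeValue == Target).Count();
    }

    public static void Main()
    {
        SomeDummyClass[] items = MakeItems(9);
        int[] r = CountPredicateCalls(items);
        Console.WriteLine("before={0}, after={1}, matches={2}", r[0], r[1], r[2]);
        Console.WriteLine("parallel matches={0}", ParallelMatchCount(items));
    }
}
```

With nine sample items the predicate has run zero times before enumeration and nine times after, with three matches either way. One thing to keep in mind: a parallel query does not promise to return matches in source order unless you add AsOrdered(), which is why the sketch compares counts rather than sequences.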

As you can see, we really are moving more and more down the road of declarative development. Over time we will see more "what" and less "how" in our day to day programming adventures. And that, my friends, is life after loops.

14 comments

  1. List<Item> result5 = items.FindAll(i => i.SomeValue == "Value I’m Looking For!");
    Works in older frameworks as well (with VS2008; if you are using older versions of VS, you can use anonymous delegates)

  2. @Klaus Yep, but there are a few caveats. First, it is specific only to the List class. You can’t use it on any other type of enumerable object. Secondly, you lose lazy evaluation with the "FindAll" method on the List class; it executes as soon as you call it.

  3. The problem with LINQ is that you cannot change the content of the things you select.
    For example, in the for loop you can do:

    for (int i = 0; i < items.Length; i++)
    {
      if (items[i].SomeValue == "Value I’m Looking For!")
      {
        items[i].SomeOtherValue = "This is a new value";
      }
    }

    You cannot do the same with LINQ, unless you add a lot of complexity just to circumvent the tool itself.

  4. @Simone – You can change the content if you do an intermediate ToList():

    items.Where(i => i.SomeValue == "Value I’m Looking For!").ToList().ForEach(i => i.SomeValue = "This is a new value");

    although this causes all matches to be evaluated first, and then those results changed.

  5. @Simone Well, this isn’t a problem, this is actually very much by design. By taking a functional approach, which always favors creating new constructs instead of mutating state, you avoid many problems and allow numerous optimizations. In the example I provided "AsParallel" would never work if we were sharing any kind of state between iterations. Modifications to the item being iterated over would break this.

    In fact, this is the reason why they didn’t implement a "ForEach" method within Linq, it pushed people in the direction of mutable state too much and would cause problems for future additions to the framework.

    If you really want to mutate state in a collection, implement a custom method like in this post: http://www.codethinked.com/post/2008/05/IEnumerable-ForEach-extension-method.aspx

  6. Oh, and in the last comment I forgot to mention that almost all extension methods operate on IEnumerable, which makes it impossible to modify the collection anyways. You would need to change the extension method to operate on an IList or something in order for it to work.

  7. Anytime a sentence begins with "Hello Computer," the sentence *must* be read with a heavily Scottish accent.

    I am very much enjoying life after loops. Nice post on the evolution of looping.

  8. Excellent blog post. Very clearly shows the point of Linq and why it is such a game changer.

  9. I do wonder *why* the AsParallel() was made necessary. Is there any situation where you *don’t* want Linq to execute faster? Maybe it would have been better to go the other way and provide the NonParallel() method instead?

  10. @Dmitri Yes, the overhead of parallelism can cause many operations to be slower. For example, if I had a list with 1000 items and just did a quick scan through it to find an item, then most likely the parallel version would actually be slower. There is a lot of overhead in spinning up threads, context switching, partitioning the workload, etc…

  11. I get a lot of questions on this one. If you want to find an item with a case-insensitive match, you can use a Linq query that looks like this:

    var result = items.Where(i => i.SomeValue.Equals("Value I’m Looking For!", StringComparison.InvariantCultureIgnoreCase));

    You may want a culture specific search and for that you would use this:

    var result = items.Where(i => i.SomeValue.Equals("Value I’m Looking For!", StringComparison.CurrentCultureIgnoreCase));

    Joe Feser

  12. In addition to Justin’s answer to Dmitri, I’d add that some things should not be run in parallel. Most windowing toolkits, such as Windows Forms or WPF, support accessing controls only on the thread that created them. If you were looping over, say, listBox.Items, you’re violating the single-threaded, STA nature of those controls, leading to subtle bugs.

  13. @Judah Excellent point, and I would also add that you cannot parallelize anything that shares state between iterations. So if you had a Linq query which was incrementing some value, that would also produce race conditions.
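The shared-state point that closes the thread can be sketched as follows. This is only an illustration under assumptions: SharedStateDemo and its three methods are hypothetical names, and counting even numbers stands in for whatever "incrementing some value" would mean in a real query.

```csharp
using System;
using System.Linq;
using System.Threading;

public static class SharedStateDemo
{
    // Unsafe: several threads increment the same captured variable without
    // synchronization, so updates can be lost and the total can come out too low.
    public static int UnsafeCount(int[] values)
    {
        int counter = 0;
        values.AsParallel().ForAll(v => { if (v % 2 == 0) counter++; });
        return counter;
    }

    // Safer: Interlocked.Increment makes each update atomic.
    public static int InterlockedCount(int[] values)
    {
        int counter = 0;
        values.AsParallel().ForAll(v => { if (v % 2 == 0) Interlocked.Increment(ref counter); });
        return counter;
    }

    // Simplest: share no state at all and let the query produce the number.
    public static int DeclarativeCount(int[] values)
    {
        return values.AsParallel().Count(v => v % 2 == 0);
    }

    public static void Main()
    {
        int[] values = Enumerable.Range(0, 100000).ToArray();
        Console.WriteLine("unsafe:      {0}", UnsafeCount(values));      // may be less than 50000
        Console.WriteLine("interlocked: {0}", InterlockedCount(values)); // 50000
        Console.WriteLine("declarative: {0}", DeclarativeCount(values)); // 50000
    }
}
```

The unsafe version will often happen to print the right answer, which is what makes this kind of bug hard to spot. The declarative version is closest to the spirit of the post: it says what is wanted and leaves the bookkeeping to the framework.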
