codethinked (kōdthĭngked) adj. To be consumed by or obsessed with code.

Lambdas and Closures and Currying. Oh my! (Part 4)

View the whole series:

Functional Programming Series

I figured since we went over the "Fold" (also called Reduce) higher order function in the last post, we would go over the "Map" higher order function in this post. Since everyone these days has heard of Map-Reduce because of Google, I figured I would try and close the circle. Map is actually more simple than reduce, it just takes a set of items and a method. It then runs this method on each item in the set. We are going to just create this function as an extension method, since it will make it much easier to use. Our "Map" function would end up looking like this:

  public static IEnumerable<TResult> Map<TArg, TResult>(
    this IEnumerable<TArg> list, 
    Func<TArg, TResult> func)
  {
    if (list == null || list.Count() == 0) 
    {
      return new List<TResult>();
    }
    else
    {
      List<TResult> result = new List<TResult>();
      result.Add(func(list.First()));
      result.AddRange(
        Map(list.Skip(1).Take(list.Count() - 1).ToList(), func));
     return result;      
    }
  }

This function may look a little bit daunting at first, but trust me, it is actually really simple. The first parameter "this IEnumerable<TArg> list" is our list of data that we are going to operate on. We have added "this" on the front of it because this is an extension method that is called like "IEnumerable.Map();". You will have parameters, but we will get to that in just a second. Our second parameter "Func<TArg, TResult> func" is the method that we are going to apply against the items in our list.

The first part of the method is just the base case for our recursion. If this list is empty or null then we just return an empty list. The second part of our method declares a new List as a result and then adds the first item in the list to it after applying the specified method. Then we call Map recursively passing in the tail of the list (everything except the first item). Here we are using the Linq extension methods to do this, but you could do it however you wanted to. Then we simply return the list that we created. Easy as pie! You would call this method like this:

  var result = numbers.Map(x => x * 2);

This would return to you a list of the same size with every item in the list multiplied by 2. And there you have it! We can combine our map and reduce methods like this. This will multiply all the items in the list and then add the results up into an integer.

  var comboResult = 
    numbers.Map(x => x * 2).Fold((int x, int y) => x + y);

Pretty awesome, huh? So, now we have learned how to "map" a function to all the items in a list and then "fold" that list back into a single result. In the future I might have a post that will get into the usefulness of these a bit more, and show you how we could extend these to create more useful functions. In the meantime, get out there and come up with a creative use for it! Be sure to let me know how you use it.

Update:

Someone posted a comment below about how inefficient the Map function I had was. I did it the way that I did because I was trying to implement it in a more functional style. But since I implemented some of my other methods in a more imperative manner I figured I would add an updated version that is more efficient.

  public static IEnumerable<TResult> Map<TArg, TResult>(
        this IEnumerable<TArg> list,
        Func<TArg, TResult> func)
  {
    List<TResult> result = new List<TResult>();
    foreach (TArg item in list)
    {
      result.Add(func(item));
    }
    return result;
  }

Lambdas and Closures and Currying. Oh my! (Part 3)

View the whole series:

Functional Programming Series

In my previous post we went fairly deep into the syntax and use of lambda expressions in C# 3.0 and in this post we are going to discuss something that will make excellent use of those lambdas, higher order functions. Higher order functions are simply functions that can take functions as parameters and can also return functions as results. And I'm not talking about passing in the results of a function, or returning the results of a function. This is what we call function composition (you may remember the term from math class). Here is an example of function composition:

    //we declare a method that returns 5
    public static int iReturn5()
    {
        return 5;
    }
 
    //we declare a method that multiplies some number by 2
    public static int iMultiplyItBy2(int somenum)
    {
        return somenum * 2;
    }
 
    //now we compose these two functions to multiply 5 by 2
    public static int iMultiply5By2()
    {
        return iMultiplyItBy2(iReturn5());
    }

See, you are using these big fancy programming terms every day and you didn't even know it! And no, you don't need to stare at it any closer, I'm not tricking you or anything. I'm just showing you this so that I can compare it to higher order functions. Lets start off by showing you an example of a higher order function.

    public static TResult Fold<TArg, TResult>(IEnumerable<TArg> data, 
        Func<TArg, TResult, TResult> func, TResult result)
    {
        foreach (TArg i in data)
        {
            result = func(i, result);
        }
        return result;
    }

I know that this looks pretty ugly, so lets look at it closely and break it down. This is an example of a C# 3.0 implementation of the classic "Fold" higher order function (sometimes referred to as reduce), which iterates over something and accumulates the items. A classic example is adding up all of the integers in an integer array. So, lets start bu looking at the first parameter, which is "IEnumerable<TArg> data". This is pretty self explanatory, it is the item that you want to iterate over.

The second parameter "Func<TArg, TResult, TResult> func" is your reducing function. This function defines what you want to do with the items in the "data" argument (this will become more explanatory when I show you can example). It takes one parameter that is the type of the items in "data" and a second argument of the type you are accumulating, and then returns the accumulated type. So, if we were using int going in and string coming out (not sure why you would fold integers into a string, but just keep moving), its definition would look like this...

    public string func(int, string)

The last parameter "TResult result" is simply the seed for your reducing function, it is used to pass in what you have already accumulated up to this point. For example when starting with integers this would likely be zero, or with strings this would be String.Empty. Again, this will make more sense in a second. Here is an example of how you would call this with an array of integers:

    //declare our integer list
    var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8 };
 
    //now pass in a reducing method that adds them up
    var result = Fold(numbers, (x, y) => x + y, 0);

We pass our list of integers into fold, and then our lambda that takes two parameters (the C# compiler can infer that these are integers since we are passing a zero as our third parameter) and then simply adds them together and returns an int. The first parameter is the next item in the list and the second parameter is the accumulated item so far. Our third parameter is "0", which is to simply get it started. We could even declare an overload of this method that automatically initializes the seed to "default(TResult)", which we will do below. (this is something that was added in C# 2.0 to support generics and initializes the generic parameter to its "default" value). We could also make this an extension method of IEnumerable that would look like this:

    public static TResult Fold<TArg, TResult>(
        this IEnumerable<TArg> data,
        Func<TArg, TResult, TResult> func, TResult result)
    {
      foreach (TArg i in data)
      {
        result = func(i, result);
      }
      return result;
    }
 
    public static TResult Fold<TArg, TResult>(
        this IEnumerable<TArg> data,
        Func<TArg, TResult, TResult> func)
    {
      return Fold(data, func, default(TResult));
    }

We have two functions here so that we can call it like this:

    var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8 };
    var result = numbers.Fold((int x, int y) => x + y);

We have to add the "int" declarations back onto the lambda here because we are no longer passing a zero as the third parameter which no longer allows the C# compiler to figure out what type TResult is. We could make a more simple version of these methods that only takes one generic parameter, and that would make the type inference work, but then we would only be able to return the same type that we are passing in. Which would mean that we couldn't use it with two types like this...

    var result = Fold(numbers, (x, y) => { y.Add(x); return y; }, 
        new System.Collections.ArrayList());

Being able to dump a List of numbers into an ArrayList is someone contrived, but you can see where I am going with this. You can use the Fold method to perform almost any operation on the items in your collection. I hope you can find some good uses for this, so until next time, have some fun with it! Leave a comment if you see anything that I have skimmed over that you would like to see elaborated in my next post.

Lambdas and Closures and Currying. Oh my! (Part 2)

View the whole series:

Functional Programming Series

So, in the last part of this series we discussed functional programming languages in general and we discussed closures, and how they were simply methods that are bound to external variables. In C# 2.0 this was only implemented with anonymous methods. In C# 3.0 we are introduced to the lambda. A lambda is for the most part syntactic sugar for anonymous methods. Lets look at an example of a lambda.

Sidenote:

There are very subtle differences, and you can read Eric Lippert's post on it if you want to know more.
    //Rob Conery should like the variable 
    Func<int, int> elLambda = (int x) => x * 2;

This lambda can be called like this

    elLambda(5);

And it would return 10.

But wait, that statement works, but the syntax doesn't look *that* much better than an anonymous method, and you are correct. It is a good thing that C# 3.0 has much improved type inference. Because of this, we can remove most of the cruft from this statement. We can the rewrite it like this...

    var elLambda = x => x * 2;

Sidenote:

The "var" keyword was added in C# 3.0 and while it may look like C# 3.0 has introduced dynamic typing, this is not the case. The var just allows the type to be inferred. In this case, we can tell that elLambda is a Func<int> and therefore it will become that type and compile just as if you had written it explicity like that. For example, if we had this...
    var i = 1;
    i = "";

This would work in a dynamically typed language, the variable's implicit type would simply change. In C# 3.0 this would result in a compile time error of "Cannot implicitly convert type string to int"

The only problem is that this no longer compiles. We get an error that says "Cannot assign lambda expression to an implicitly-typed local variable". Hmm, maybe we need to add a type to x, since currently there is not enough information for the type to be inferred. So we change the call to this..

    var elLambda = (int x) => x * 2;

Close, but no cigar. Apparently we still cannot assign to this implicit local variable, why is this? Well, it has to do with expression trees. (Which I will hopefully touch on later on down the road) Basically, the compiler does not know if we want this lambda expression to be compiled into a delegate, or we want it to be turned into an expression tree that can be passed to a method. So, we could do a few things. We could remove the "var" and replace it with "Func<int,int>" or we could cast the whole lambda to a Func<int,int>. In both of these cases we could drop the "int" off of the lambda parameter. These options would end up looking like this...

    Func<int,int> elLambda = x => x * 2;
    var elLambda = (Func<int,int>)(x => x * 2);

Neither of these are particularly elegant solutions, and won't help you at all if you are using anonymous types. It is a good thing that we do have another option. We can create a method that will return our lambda as a particular type.

  public static Func<TArg, TResult> GetFunc<TArg, TResult>
    (Func<TArg, TResult> lambda)
  {
    return lambda;
  }

This also may look a bit convoluted to you, but it is generic for any single argument method. It can be changed for two, three, four, etc... argument methods. Essentially you are passing the lambda into the generic method "GetFunc". Because the type of the parameter is is "Func<TArg, TResult>" the compiler knows that the expression needs to be a delegate, and therefore compiles it normally and doesn't try to turn it into an expression tree. Then we just return our lambda, and the C# compiler no longer has any problem assigning to the implicit local variable. We can then call it like this...

    var elLambda = GetFunc((int x) => x * 2);

Keep in mind that we still need the "int" to be there because the compiler still doesn't know enough information about x to infer what type we are trying to use. The good news is that most of the time when you are using lambdas you are not going to have to worry about most of this. The short syntax of a lambda makes it easy to pass as a parameter, and most of the time you will be doing something like this...

  var numberList = new List<int>() { 5, 4, 3, 9, 15, 6, 3, 22, 55 };
  List<int> greaterThanFiveList = numberList.FindAll(n => n > 5);
 
  foreach (int num in greaterThanFiveList)
  {
    System.Console.WriteLine(num);
  }

So, as you can see we are declaring a list of a bunch of numbers (using the C# 3.0 collection initializers!) and then we are passing our lambda expression into the FindAll method on the list. This method has existed since C# 2.0 and it takes a parameter of type Predicate. Predicate is just a delegate that is declared like this...

    delegate bool Predicate<T>(T obj);

So, because we have declared the type of our List as "int", the compiler has made the "FindAll" method take a Predicate<int> and so the C# compiler can infer that n is supposed to be an int.

So, now that you know what a lambda is, let me add one or two things onto the end here. First of all a lambda can also represent a method block rather than just a single statement. This allows us to do simple expressions and then have a return statement.

  var elLambda = GetFunc((int x) => { if (x > 5) { return x * 2; }
    else { return x * 3; } });

You probably don't want to get much crazier than that, it quickly becomes difficult to read. You can break it down into multiple lines, but I think that readability still suffers when it gets more complicated than this. It would be better to assign the lambda to a variable.

So, in conclusion, you have seen a thorough explanation of the C# language syntax for lambdas. In the next post we are going to talk a little bit more about methods that take lambdas and hopefully get into some more examples of how you would want to use them.

Lambdas and Closures and Currying. Oh my! (Part 1)

View the whole series:

Functional Programming Series

This is part one in an n part series about the functional programming aspects in C# 3.0. Now you may have read other blog posts by other people describing these features, but the difference between this series of articles and those other posts is that I am *not* a functional programmer. In one or two college courses I may have had to write a Lisp program or two, but I have long since banished that from my mind. So this whole process will hopefully be a learning experience for me as well as for you.

So, lets start with the basics...Wikipedia defines functional programming as:

"a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast with the imperative programming style that emphasizes changes in state."

Snoooooooooooore. Essentially what this is saying is that in purely functional programming every variable is immutable (once it has been set, it cannot be changed) and everything is a parameter to, or result of, a function.

So, lets have an example. Here is a factorial function in Scheme, which is a command dialect of Lisp, which is actually the second oldest programming language in existence. Just in case you were thinking that this functional programming stuff was all new and fancy, it is actually quite old! (this factorial function was also borrowed from Wikipedia):

(define (fact n) 
  (if (= n 0) 1 
    (* n (fact (- n 1))))) 

Now, this probably looks pretty strange to someone who has never seen a Scheme program, so lets break it down. First the left parentheses "(" starts the beginning of a function. So, we have started with a "define" function, which is actually a function that lets us define a function (How nice of it). Then, you will see "(fact n)" and this is saying that this function is called "fact" and will take one parameter, which is "n". (Keep in mind that types are implicit in Scheme) Next we have "(if" which is the start of the "if" function. I told you, *everything* is a function. Everything.

Now, the next part is going to look really strange for those who have never seen prefix notation (also called Polish notation) which is simply when you put operators before operands. What you are probably used to see in most languages is infix notation, where the operators come between the operands. So, in Lisp "1 + 2" would look like "+ 1 2". So "(if (= n 0)" simply translates into C# as "if (n == 0)". Then the "1" following this if statement is what is returned if n is 0, and then "(* n (fact (- n 1)))))" represents the "else clause" of the if function. If we break this down we will see that it takes n and multiplies it against a recursive call to the "fact" function that is taking "n - 1" as a parameter. Now was that so hard? Yes, yes it was. :-)

So now that we have taken forever to explain one simple list function, you are probably getting ready to click the back button on your browser. Well don't! The hard stuff is over! Kinda. This next part is what separates the boys from the men (or the girls from the women, if you prefer), and you may leave a crying, sobbing baby.

Crying Sobbing Baby

So, why I am showing you all of this? Well, because it is important to get a little tiny bit of background. The factorial function above could be implemented just as easily in C# and we don't even need any special functional features to do it. I just wanted you to see what a functional language looks like before I start beating you about the head with closures.

So, what is a closure? Lets start again by looking at a definition from Wikipedia (trust me, this is all going to help you understand the C# 3.0 language features, I'm not trying to be a textbook here):

"A closure is a function that is evaluated in an environment containing one or more bound variables."

Hmmmm. That is about as clear as milk. Lets try to clear this up a bit with the stereotypical closure example...

Sidenote:

Some people have claimed that C# does not support true closures because of the fact that the variables actually bind to the variables inside of the anonymous method or lambda, but this is actually exactly why C# *does* support closures.
//declare us a little delegate
delegate void WriteMe();
 
static void Main(string[] args)
{
    //we first set s to "output1"
    string s = "output1";
 
    //assign our anonymous method
    WriteMe w = delegate()
    {
        System.Console.WriteLine(s);
    };
 
    //reset s to "output2"
    s = "output2";
 
    //voila, it prints out "output2"
    w();
}

Here we have a chunk of code that will actually compile in C# 2.0. We are declaring a delegate and then assigning an anonymous method to it. An anonymous method is just a method that does not have a name. Since it does not have a name, you cannot call it without a direct reference. While this may sound limiting, it actually has a number of uses. So, as you can see from the comments in the code, we are assigning "s" a value, then we assign the anonymous method, then we reassign "s" and then we call our delegate. The delegate prints out the second value of "s" because it is "bound" to it. Instead of it getting the value of "s" at the time it was declared it gets a reference to s.

Sidenote:

In C# 2.0 one of the most common uses is with predicates where anonymous methods provided a somewhat succinct syntax for passing a predicate to another method. As anyone who tried to use predicates in VB.net can attest to! (since VB.net does not support anonymous methods)

And that is pretty much it! A closure simply allows us to declare a method that is bound to the variables around it. It lets us do some really powerful things that we will discuss in a future post.