Lambdas and Closures and Currying. Oh my! (Part 6)

Writing

View the whole series:

Functional Programming Series

So far in this series we have gone through closures, lambdas, higher-order functions, and then we have gone over the implementation of several higher order functions. We are now going to talk about a technique called memoization, which is a much easier concept to describe than it is to implement. What memoization does is it caches the return value of a function for each unique combination of arguments. It works for deterministic functions (functions that always return the same value for a given set of inputs) but will not work with non-deterministic functions (functions that will return different values for the same arguments) for obvious reasons. To start this off, I am going to show you very simple function called MultiplyByTwo. It has one parameters, an integer, and it returns an integer. It looks like this:

  public int MultiplyByTwo(int a)
  {
    return a * 2;
  }

Pretty complex, huh? Yeah, makes your brain hurt just staring at it. So anyways, if we wanted to do some ghetto "memoization like" behavior (also called caching), we could do this:

public class CachedMultiplier  
{
  Dictionary<int, int> cache = new Dictionary<int, int>();
 
  public int MultiplyByTwo(int a)
  {
    int result;
    if (cache.TryGetValue(a, out result))
    {
      return result;
    }
    else
    {
      result = a * 2;
      cache.Add(a, result);
      return result;
    }
  }
}

Notice that we have now added a class that we have to instantiate. We need this because we have to have our cache stick around between calls to "MultiplyByTwo". If we declared it inside of the method then it would be garbage collected between calls and this wouldn’t provide much caching, now would it?

Sidenote:

Please ignore the fact that looking up a result in a Dictionary object and instantiating the CachedMultiplier takes longer than actually doing the multiplication. I am trying not to complicate the topic by introducing a more complex method into the mix.

So now if we wanted to do this without having an extra class to instantiate then we could just wrap it in another method, correct? Well, no, not really. The problem would be that even if we did something like this…

  public int CachedMultiplyByTwo(int a)
  {
    var cache = new Dictionary<int, int>();
 
    int result;
    if (cache.TryGetValue(a, out result))
    {
      return result;
    }
    else
    {
      result = MultiplyByTwo(a);
      cache.Add(a, result);
      return result;
    }
  }
 
  public int MultiplyByTwo(int a)
  {
    return a * 2;
  }

We would still have the issue where cache goes out of scope between every call to CachedMultiplyByTwo. But is there a way around that? Sure there is! The answer is returning a delegate. Since a delegate is a closure, which means that it is bound to all of the methods local variables that it references, we can just return a delegate from this method like this:

  public static Func<int,int> CachedMultiplyByTwo()
  {
    var cache = new Dictionary<int, int>();
 
    return (int arg) =>
    {
      int result;
      if (cache.TryGetValue(arg, out result))
      {
        return result;
      }
      else
      {
        result = MultiplyByTwo(arg);
        cache.Add(arg, result);
        return result;
      }
    };
  }
 
  public static int MultiplyByTwo(int a)
  {
    return a * 2;
  }

Notice though that CachedMultiplyByTwo takes no parameters! How is it supposed to work then? Well, you also have to notice that we are no longer returning an "int" type, we are returning a type of Func<int,int>. As you saw in an earlier post, this is a shortcut for a delegate that takes a single int parameter and then returns an int. So, instead of actually calling this method to get your result, you call this method and it returns a function to you that you then use to multiply. You will notice the "return (int arg) =>", well this is a lambda and you’ll see the curly braces after it that surrounds the body of the lambda. Once this lambda is returned it can be called like this:

  var cachedMultiply = CachedMultiplyByTwo();
  int result1 = cachedMultiply(5);
  int result2 = cachedMultiply(3);
  int result3 = cachedMultiply(5);

We call CachedMultiplyByTwo and put the result into the variable cachedMultiply. This is now our method that we are calling. The "cache" variable from the "CachedMultiplyByTwo" method is still bound to this method, so it is available until the local cachedMultiply variable goes out of scope. Here we call this newly returned method with 5, then 3, then 5 again. The second time we call it with 5 we are getting the cached result back out. Keep in mind that the result of each call to CachedMultiplyByTwo returns a new closure that is bound to different variables. So, if we called it twice, we would get two delegates with two different caches. In some instances this could be quite useful, but in other instances it might be unexpected. Take this for example:

  var cachedMultiply = CachedMultiplyByTwo();  
  var cachedMultiply2 = CachedMultiplyByTwo();
  int result1 = cachedMultiply(5);
  int result2 = cachedMultiply2(5);  

There would have been no cached results used in this instance. How sad.

Now all this is find and great, but what happens when we go to two variables. Lets look at this method:

  public static int Multiply(int a, int b)
  {
    return a * b;
  }

So, once we have two variables, how do we store them in our dictionary? Well, we could convert them to strings and concatenate them, but I don’t really like that solution. There are a few other ways we could do it, but I am going to borrow another concept that is often used in function languages, tuples. Now tuples have nothing really to do with functional languages, they are just generic holders for variables or objects, but they certainly come in handy. Since C# is a statically typed language, we could either make tuples that just store an object type or we could define them with generics. I am choosing to define mine with generics so that type inference will work better a bit later on. Before we get any further, let me show you what a 1-tuple, 2-tuple, and 3-tuple will look like.

  public struct Tuple<T1>
  {
    public T1 One { get; set; }
  }
 
  public struct Tuple<T1, T2>
  {
    public T1 One { get; set; }
    public T2 Two { get; set; }
  }
 
  public struct Tuple<T1, T2, T3>
  {
    public T1 One { get; set; }
    public T2 Two { get; set; }
    public T3 Three { get; set; }
  }

Sidenote:

Structs implement ValueType.Equals, instead of Object.Equals, which uses reflection to compare the fields of the types. This is fairly inefficient, and so Equals can be overridden to help this. I did not do this in the post in order to not put too much extra code into the example.

The first tuple type stores one item, the second two items, and the third three items. Now there is very little use for the 1-tuple, but I am just showing you for completeness. Since we now have a single type that can be used with our dictionary, lets implement the cached multiply method.

  public static Func<int,int,int> CachedMultiply()
  {
    var cache = new Dictionary<Tuple<int,int>, int>();
 
    return (int arg1, int arg2) =>
    {
      int result;
 
      var arguments = new Tuple<int, int>()
        { One = arg1, Two = arg2 };
 
      if (cache.TryGetValue(arguments, out result))
      {
        return result;
      }
      else
      {
        result = Multiply(arg1, arg2);
        cache.Add(arguments, result);
        return result;
      }
    };
  }

First, notice that we have changed our return type to Func<int,int,int>, this is the delegate that takes two ints and returns an int. Next you will notice that the Dictionary’s key is declared as a Tuple<int,int>. This represents our two arguments passed as one item. We are returning a lambda with two int arguments, and the first thing we do is create our tuple by using the new C# 3.0 object initializers, setting the "One" and "Two" properties to arg1 and arg2. Then we just use this Tuple with the Dictionary and the rest of the method works pretty much the same as before. Pretty sweet huh? We are getting pretty close with our implementation here, we just need to make it generic.

I’m going to have to stop there and continue this post on another day. This post has turned out much longer than I had anticipated, and so I’ll just split it into two pieces. In this piece we saw how to implement caching on methods with one and two parameters, and in the next part we are going to see how to turn this into full blown memoization by making this generic.

If you liked it, please vote for it below!

Loved the article? Hated it? Didn’t even read it?

We’d love to hear from you.

Reach Out

Comments (2)

  1. Real programmers use <<1 to multiply by 2. Guy with beards use *2 because when you look at them from very far they look like a *

Leave a comment

Leave a Reply

Your email address will not be published. Required fields are marked *

More Insights

View All