codethinked (kōdthĭngked) adj. To be consumed by or obsessed with code.

C# 4.0 New Features Part 3 - Generic Covariance

Here are the previous parts to this series:

Part 1 - Dynamic Keyword

Part 1.1 - Dynamic Keyword Second Look

Part 2 - Default And Named Parameters

Part 2.1 Default Parameter Intrigue

When generics were introduced in C# 2.0 they were one of the best features that ever came to C#. Anyone who had to create strongly typed collection classes in C# 1.0 knows exactly how much code generics saved us from having to write. The problem though is that generics don't seem to follow the same rules of inheritance that all of the other classes follow. Let's start off by defining two quick classes that we are going to use for the rest of this post:

public class Shape
{
}

public class Circle : Shape
{
}

Here we have our stereotypical class hierarchy, which doesn't do anything yet, but the behavior of these classes is not important. Now, let's define a dummy container class that can hold an instance of any class:

public interface IContainer<T>
{
    T GetItem();
}

public class Container<T>: IContainer<T> 
{
    private T item;

    public Container(T item)
    {
        this.item = item;
    }

    public T GetItem()
    {
        return item;
    }
}

Now that we have our hierarchy and our container class, let's look at something that we can't currently do in C# 3.0:

static void Main(string[] args)
{            
    IContainer<Shape> list = GetList();
}

public static IContainer<Shape> GetList()
{
    return new Container<Circle>(new Circle());
}

We have a method called "GetList" which has a return type of "IContainer<Shape>" and then returns a "Container<Circle>" instance. Since Circle descends from Shape and Container implements IContainer, you would think that this would just work. But in C# 3.0, it doesn't.

In C# 4.0, we have a way to make this work: we can simply add the word "out" in front of the type parameter on our interface declaration (note that variance in C# 4.0 is limited to interfaces and delegate types):

public interface IContainer<out T>
{
    T GetItem();
}

This tells the C# compiler that T is covariant, which means that an IContainer<T> reference can point at a container of any type equal to or more specific than T. As we saw above, the return type was IContainer<Shape>, but with the "out" modifier on our interface we have no problem returning a Container<Circle>.
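Putting the pieces above together, here is a minimal sketch of a complete program. The class name CovarianceDemo is made up for illustration. The last lines of Main also rely on the fact that IEnumerable<T> is itself declared covariant in .net 4.0, so the same conversion works for the built-in collection interfaces.

```csharp
using System;
using System.Collections.Generic;

public class Shape { }
public class Circle : Shape { }

// "out" marks T as covariant: T may only appear in output positions.
public interface IContainer<out T>
{
    T GetItem();
}

public class Container<T> : IContainer<T>
{
    private readonly T item;

    public Container(T item) { this.item = item; }

    public T GetItem() { return item; }
}

public static class CovarianceDemo
{
    public static IContainer<Shape> GetList()
    {
        // Implicit conversion from Container<Circle> to IContainer<Shape>.
        return new Container<Circle>(new Circle());
    }

    public static void Main()
    {
        IContainer<Shape> list = GetList();
        Console.WriteLine(list.GetItem().GetType().Name); // prints "Circle"

        // The same rule applies to the covariant IEnumerable<out T>.
        IEnumerable<Shape> shapes = new List<Circle> { new Circle() };
        foreach (Shape shape in shapes)
        {
            Console.WriteLine(shape.GetType().Name); // prints "Circle"
        }
    }
}
```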

So why did they decide to use the word "out"? Well, it is because whenever you define a type parameter as covariant, you can only return that type out of the interface. For example, this is invalid:

public interface IContainer<out T>
{
    void SetItem(T item);
    T GetItem();
}

But why won't that work? Because if that doesn't work, then that means that the IList<T> interface can't be covariant! Noooo! Well, the reason is actually pretty simple: type safety. Let's look at what would happen if the interface above were allowed:

static void Main(string[] args)
{
    IContainer<Shape> container = new Container<Circle>(new Circle());
    SetItem(container);
}

// Assumes a second subclass: public class Square : Shape { }
public static void SetItem(IContainer<Shape> container)
{
    container.SetItem(new Square()); // BOOM!!!
}

Since T is covariant, we can assign a "Container<Circle>" to our variable of type "IContainer<Shape>". We then pass it into our method "SetItem", which accepts a parameter of type "IContainer<Shape>", and try to add a "Square" to it. It looks like this should be valid: the parameter type is "IContainer<Shape>", and a Square is a Shape, right? Wrong. If the compiler allowed it, the line above would explode, because we would actually be trying to put a square into a container that holds circles. This is why a covariant type parameter is limited to a single direction: out.

Are you wondering how all of this is implemented in the CLR? Well, there is no need to. The CLR has supported variance on generic interfaces and delegates since generics were worked into it in .net 2.0; C# just didn't expose it until now. As an interesting side note, arrays in C# have always been covariant, but without this compile-time safety: storing the wrong element type in an array compiles fine and fails at runtime, so go try it out! I hope that you enjoyed this post, and the next in the series will be here soon!
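If you want to try the array case, here is a small sketch. The Square class and the ArrayCovarianceDemo name are assumptions for illustration. The assignment compiles because arrays are covariant, and the runtime rejects the store with an ArrayTypeMismatchException.

```csharp
using System;

public class Shape { }
public class Circle : Shape { }
public class Square : Shape { }

public static class ArrayCovarianceDemo
{
    // Returns true if storing a Square into a Circle[] fails at runtime.
    public static bool StoreSquare()
    {
        Shape[] shapes = new Circle[1]; // compiles: arrays are covariant
        try
        {
            shapes[0] = new Square();   // compiles too, but the runtime checks the element type
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(StoreSquare()); // prints "True"
    }
}
```

This is exactly the hole that the "out" restriction closes for generic interfaces: the mistake is caught by the compiler instead of at runtime.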

C# 4.0 New Features Part 2.1 - default parameter intrigue

With all of the new C# 4.0 stuff coming out, I feel like a kid in a candy store. Sorry for post overload, but I just can't help myself! I also think I need to stop putting numbers on these posts, because obviously I have just completely thrown them to the curb. I hate to put a "Part 3" on this post though, since it is just an extension of a previous one.

Jonathan Pryor pointed out on my last post that the default parameter feature in C# 4.0 was implemented in the same way that the default parameter feature in VB.net has been implemented. He also points out a seemingly obvious way that they could have made it better, but then he points out why it wouldn't work when combined with the named parameter feature.

So, since Jonathan is a freakin' smart guy, and most of us (including me) aren't that smart, I am going to elaborate on his comment and explain in detail what he is talking about.

So, to start off let's look at the implementation of default parameters in C# 4.0. It all starts with two attributes called OptionalAttribute and DefaultParameterValueAttribute. If you go look these attributes up, you will see that they have been around since .net 1.1 and .net 2.0 respectively. The reason for this is that other languages besides C# have supported these features going back to .net 1.1. In fact, you could add a DefaultParameterValueAttribute to one of your parameters in a method in C# and it would work perfectly fine; it is just that you cannot consume it from C#, since C# does not support this feature (until C# 4.0).

In my previous post I created a class that looked like this:

public class TestClass
{    
    public void PerformOperation(string val1 = "val", int val2 = 10, double val3 = 12.2)
    {
        Console.WriteLine("{0},{1},{2}", val1, val2, val3);
    }
}

So, you see that this class has a method which has three default parameters. This means that we can call this method without passing any arguments to it and the default values would be "filled in" for us. So, how does the C# compiler implement this behavior?

Your first idea may be that C# just generates overloads, something that looks like this:

public void PerformOperation()
{
    PerformOperation("val", 10, 12.2);
}

public void PerformOperation(string val1)
{
    PerformOperation(val1, 10, 12.2);
}

public void PerformOperation(string val1, int val2)
{
    PerformOperation(val1, val2, 12.2);
}
        
public void PerformOperation(string val1, int val2, double val3)
{
    Console.WriteLine("{0},{1},{2}", val1, val2, val3);
}

Well, in reality, it looks like this (this is reflected code):

public void PerformOperation([Optional, DefaultParameterValue("val")] string val1, 
    [Optional, DefaultParameterValue(10)] int val2, [Optional, DefaultParameterValue(12.2)] double val3)
{
    Console.WriteLine("{0},{1},{2}", val1, val2, val3);
}

Hmmmmm. So, instead of just generating overloads for each method with the values filled in, it just applies some attributes to the parameters that declare them as optional and then specifies their default values. But, how does that work?

If you are familiar with attributes in .net then you will know that you have to use reflection to read out the properties of these attributes, and you have to have code running somewhere to process these attributes. All they are is meta-data assigned to the method, not code that executes at runtime. So, is C# doing reflection every time I call a method with default parameters? Fortunately the answer to that question is "no".

The answer to how this works may be a little bit surprising though. If we want to call the above method with no parameters:

var testClass = new TestClass();
testClass.PerformOperation();

What does this compile to? Interestingly it looks like this:

var testClass = new TestClass();
testClass.PerformOperation("val", 10, 12.2);

You'll notice that the default parameter values for this method have just been compiled right into the calling code. The C# compiler reads those attributes off the method at compile time and inserts the values into the calling code. So, what happens if I change the default values and don't recompile my entire system? Well, any calling code that wasn't recompiled will still have the old values. That is definitely something that you will have to look out for.
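To see that the defaults really are just metadata sitting on the method, here is a small sketch that reads them back with reflection. The DefaultsMetadataDemo class is made up for illustration; ParameterInfo.IsOptional and ParameterInfo.DefaultValue are the standard reflection properties. Nothing like this runs when you make a normal call; it only shows what the compiler has to work with.

```csharp
using System;
using System.Reflection;

public class TestClass
{
    public void PerformOperation(string val1 = "val", int val2 = 10, double val3 = 12.2)
    {
        Console.WriteLine("{0},{1},{2}", val1, val2, val3);
    }
}

public static class DefaultsMetadataDemo
{
    public static void Main()
    {
        MethodInfo method = typeof(TestClass).GetMethod("PerformOperation");

        // Each parameter carries its "optional" flag and default value as metadata.
        foreach (ParameterInfo p in method.GetParameters())
        {
            Console.WriteLine("{0}: optional={1}, default={2}", p.Name, p.IsOptional, p.DefaultValue);
        }
    }
}
```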

So, why did they choose to implement it this way? Well, as Jonathan pointed out, if they dynamically generated overloads one thing that wouldn't work is the new named parameters feature. Why wouldn't it work? Well, I'm glad you asked.

Let's say we had the overloaded methods that I put in above, and I wanted to call my method like this:

var testClass = new TestClass();
testClass.PerformOperation(val3: 15.1);

Hmmm. What overload would I call? I don't have an overload to call. Even though we generated overloads, we still can't leave out parameters and we would be stuck with inserting values into our IL again. Then we would have a mixed system where sometimes it would bake in values, and other times it wouldn't. No good.

Now, you might say, what about just generating overloads for every combination of parameters? Well, since we have three parameters of different types, that would work for our particular instance. It would not work for all instances though. What if we had three string parameters? The overloads taking only val1, only val2, or only val3 would all have the same signature, a single string, so they couldn't even be declared, and method resolution would be impossible.
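Here is a quick sketch of the three-string case, using a hypothetical Join method. With optional and named parameters a single method handles every combination, where generated overloads would have collided on Join(string).

```csharp
using System;

public static class NamedArgsDemo
{
    // Hypothetical method with three optional parameters of the same type.
    // Overloads for "only a", "only b" and "only c" would all be Join(string).
    public static string Join(string a = "a", string b = "b", string c = "c")
    {
        return a + "," + b + "," + c;
    }

    public static void Main()
    {
        Console.WriteLine(Join(c: "z"));         // prints "a,b,z"
        Console.WriteLine(Join(b: "y"));         // prints "a,y,c"
        Console.WriteLine(Join(c: "z", a: "x")); // prints "x,b,z"
    }
}
```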

It appears for now that these two features just won't interact, and I'm sure that if there was a way in the current .net runtime to make it work without baking in the values, they would have. But for now we just have to accept the way it works and move on. Maybe in the future the runtime will have a way to tag parameters with default values that can stay with the method and then use those values when parameters aren't provided. Who knows. Hopefully you found this little adventure into the default parameter to be interesting, and hopefully you'll come back for part 3 which will be coming along shortly.

C# 4.0 New Features Part 2 - default and named parameters

In a previous post, we talked about the new "dynamic" keyword.

The next new feature in C# 4.0 is one that I have been waiting on for years! In the past its absence has always been explained away as an explicit design decision. Well, apparently pragmatism has won out and we now have default parameters in C#. In order to make default parameters even more useful, they threw in named parameters as a bonus! We will look at those in just a minute, but first, defaults.

Let's say we have a class like this:

public class TestClass
{
    public void PerformOperation(string val1, int val2, double val3)
    {
        Console.WriteLine("{0},{1},{2}", val1, val2, val3);
    }
}

Now we can instantiate and call this method on our class like this:

var testClass = new TestClass();
testClass.PerformOperation("val", 10, 12.2);

But what if we knew that the values we were already passing in were good defaults? Well, currently our option would be to create overloads and pass in defaults like this:

public class TestClass
{
    public void PerformOperation()
    {
        PerformOperation("val", 10, 12.2);
    }
    
    public void PerformOperation(string val1)
    {
        PerformOperation(val1, 10, 12.2);
    }

    public void PerformOperation(string val1, int val2)
    {
        PerformOperation(val1, val2, 12.2);
    }

    public void PerformOperation(string val1, int val2, double val3)
    {
        Console.WriteLine("{0},{1},{2}", val1, val2, val3);
    }
}

Pretty lengthy option. But C# 4.0 gives us an even better option in the form of parameter defaults.

public class TestClass
{
    public void PerformOperation(string val1 = "val", int val2 = 10, double val3 = 12.2)
    {
        Console.WriteLine("{0},{1},{2}", val1, val2, val3);
    }
}

How much cleaner is that? So, how would we call this? Just as you would with the overloads:

var testClass = new TestClass();
testClass.PerformOperation("val", 10);

Very nice. The third parameter in this call will be defaulted to 12.2, just as if we had passed it explicitly. Now all of the VB.net developers can stop making fun of us. You will also be happy to know that this works for constructors as well.

public class TestClass
{
    public TestClass(string someValue = "testValue")
    {
    }

    public void PerformOperation(string val1 = "val", int val2 = 10, double val3 = 12.2)
    {
        Console.WriteLine("{0},{1},{2}", val1, val2, val3);
    }
}

No more multiple constructor overloads to just specify a few default values.

So, what happens if we want to leave out "val2" in the call above? We want to fill in val1 (the first parameter) and pass in val3 (the third parameter), but default val2. We couldn't call it like this:

var testClass = new TestClass();
testClass.PerformOperation("val", 10.2);

That wouldn't compile: the compiler matches 10.2 to the second parameter (defaulting the third), and 10.2 cannot be converted to an int. So what option do we have? We can use named parameters. Named parameters simply consist of putting the parameter name and then a colon in front of the value you are passing. So the call above would look like this:

var testClass = new TestClass();
testClass.PerformOperation("val", val3: 10.2);

Kinda neat, although I'm not sure how I feel about the fact that this will now make changing a parameter name a breaking change. I guess only time will tell how this plays out in large application development. Although I'm sure that people in other languages have been dealing with this for years.

Well, there you have it, yet another cool new feature of C# 4.0 and yet another reason to look forward to VS2010.

C# 4.0 New Features Part 1.1 - dynamic keyword second look

Part 1 - Dynamic Keyword

I was originally going to just update my original post with some performance data regarding the dynamic keyword versus MethodInfo.Invoke, but I started looking a bit deeper at the performance implications of the dynamic keyword. In my previous post I posted these performance numbers:

Compile Time Bound: 6 ms - yep, 2 million calls, 6 ms. Fast.

Dynamically Bound: 2106ms - so, roughly 351 times slower than the strongly typed calls

But these numbers don't really tell the whole story. The reality is that in the above test I was factoring in the overhead of initializing the callsites used by the dynamic invoke. Since I ran the test over and over from the start, this was included every time. So, I decided to update my testing script to look like this:

private static void Method1()
{
    Console.WriteLine("compile bound");
    var test = new TestClass();
    test.TestMethod1();
    test.TestMethod2();

    long totalMilliseconds = 0;

    for (int j = 0; j < 5; j++)
    {
        var sw = new Stopwatch();
        sw.Start();
        for (int i = 0; i < 1000000; i++)
        {
            test.TestMethod1();
            test.TestMethod2();
        }
        sw.Stop();

        Console.WriteLine(sw.ElapsedMilliseconds);
        totalMilliseconds += sw.ElapsedMilliseconds;

        sw.Reset();
    }

    Console.WriteLine("Average: " + totalMilliseconds / 5);
}

What we are doing here is calling the two methods once before we start timing them in order to get the startup overhead out of the way. Then we run an outer loop 5 times, and each pass runs our 1,000,000 iterations calling both methods. So, we are still doing 2,000,000 calls per pass, but we are now doing them five times in a row, for 10,000,000 total calls per approach. I also tested the above code using the "dynamic" keyword and MethodInfo.Invoke.
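If you want to rerun this yourself without copying the loop for each approach, the warm-up-then-average pattern can be pulled into a helper. This is only a sketch: AverageMilliseconds and the Benchmark class are names I made up, not the code used for the numbers in this post.

```csharp
using System;
using System.Diagnostics;

public class TestClass
{
    public int i;
    public void TestMethod1() { i += 1; }
    public void TestMethod2() { i += 2; }
}

public static class Benchmark
{
    // Hypothetical helper: one untimed warm-up call, then the average of `rounds` timed runs.
    public static long AverageMilliseconds(Action body, int rounds, int iterations)
    {
        body(); // warm-up, excluded from the timings

        long total = 0;
        for (int j = 0; j < rounds; j++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                body();
            }
            sw.Stop();
            total += sw.ElapsedMilliseconds;
        }
        return total / rounds;
    }

    public static void Main()
    {
        var test = new TestClass();
        long average = AverageMilliseconds(() => { test.TestMethod1(); test.TestMethod2(); }, 5, 1000000);
        Console.WriteLine("Average: " + average);
    }
}
```

Keep in mind that routing every call through an Action adds a delegate invocation to each iteration, so the compile bound numbers from a helper like this will be a bit higher than with the inlined loop above.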

The numbers that I got were pretty startling. The first method didn't change: there is no startup time for compile time binding, so we don't really see any difference testing it this way.

The second method (the dynamic keyword) is where the real changes started to show up. The startup time for the dynamic keyword is pretty high right now (I was recording about 3.5 seconds to make the first two calls). But the CallSites are stored in static fields on a compiler generated class, so they are only created once, when first called. Once the first calls were out of the way, the performance was dramatically better. The calls to warm it up just looked like this:

dynamic test = new TestClass();
test.TestMethod1();
test.TestMethod2();

Third, I tested MethodInfo.Invoke, where we also looked the methods up and invoked them once up front. The difference here is that each time the method is called, we have significant overhead to make the call, even though we are not retrieving our MethodInfo objects each time.

var test = new TestClass();
MethodInfo mi1 = typeof(TestClass).GetMethod("TestMethod1");
MethodInfo mi2 = typeof(TestClass).GetMethod("TestMethod2");
mi1.Invoke(test, null);
mi2.Invoke(test, null);

So, what could we do in C# 3.0 that would be similar to what is happening with the C# 4.0 dynamic keyword? Well, if you have never played around with it, let's look at the DynamicMethod class. The DynamicMethod class is a helper for creating a method at runtime that we can then invoke. It is doing something similar to what we have above, only in order to use DynamicMethod we have to emit the IL ourselves. This is no simple task, but the easiest way to do it is to compile a handwritten method yourself that does something similar, then reflect it and look at the IL. Then you will at least have a good starting point.

With DynamicMethod we have to write quite a bit of code to get this all wired up, but here it is:

MethodInfo mi1 = typeof(TestClass).GetMethod("TestMethod1");
MethodInfo mi2 = typeof(TestClass).GetMethod("TestMethod2");

var test = new TestClass();

DynamicMethod method1 = new DynamicMethod("CallMethod1", null, new Type[] {typeof(TestClass)}, typeof(TestClass).Module);
ILGenerator il = method1.GetILGenerator(128);
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Callvirt, mi1);
il.Emit(OpCodes.Ret);

DynamicMethod method2 = new DynamicMethod("CallMethod2", null, new Type[] { typeof(TestClass) }, typeof(TestClass).Module);
il = method2.GetILGenerator(128);
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Callvirt, mi2);
il.Emit(OpCodes.Ret);            

Action method1Delegate = (Action)method1.CreateDelegate(typeof(Action), test);
Action method2Delegate = (Action)method2.CreateDelegate(typeof(Action), test);

method1Delegate();
method2Delegate();

Now that we have our delegates that we can call, we can run the same loops against them. Again, we call each method once, but in this instance all of the work is being done when we emit the IL.
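If emitting IL by hand feels like too much, a different technique (not the one measured in this post) is to bind the MethodInfo to a delegate with Delegate.CreateDelegate. The sketch below creates an "open instance" delegate, where the target object is passed as the first argument. The Bind helper and DelegateDemo class are made-up names, and I haven't benchmarked this against the numbers here.

```csharp
using System;
using System.Reflection;

public class TestClass
{
    public int i;
    public void TestMethod1() { i += 1; }
    public void TestMethod2() { i += 2; }
}

public static class DelegateDemo
{
    // Looks the method up once and binds it to a reusable delegate.
    public static Action<TestClass> Bind(string methodName)
    {
        MethodInfo mi = typeof(TestClass).GetMethod(methodName);

        // Open-instance delegate: the target object becomes the first argument.
        return (Action<TestClass>)Delegate.CreateDelegate(typeof(Action<TestClass>), mi);
    }

    public static void Main()
    {
        var test = new TestClass();
        Action<TestClass> call1 = Bind("TestMethod1");
        Action<TestClass> call2 = Bind("TestMethod2");

        call1(test);
        call2(test);
        Console.WriteLine(test.i); // prints "3"
    }
}
```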

Our four methods are planned out and now we can lay out what you really came here for. Here are the averages:

Compile Time Bound: 6 ms

Dynamically Bound with dynamic keyword: 45 ms

Dynamically Bound with MethodInfo.Invoke: 10943 ms

Dynamically Bound with DynamicMethod: 8 ms

Compile time still wins out, but the dynamic keyword is a much much better looking option now. The overhead of the first call is pretty high, in my tests almost 3 seconds. I'm sure that they will work on the speed of this before the final release. Every successive call is actually really fast though. The next method is MethodInfo.Invoke which is quite slow compared to the rest, but in terms of 2 million calls is still pretty fast. Finally the last method is DynamicMethod, which is practically as fast as the compile bound method because essentially we are just manually compiling it to IL.

There you go, the dynamic keyword has some overhead, but you only have to deal with it once for each dynamic call. In a long running application this isn't really a concern. I hope you enjoyed this post and I hope that I have answered most of the performance questions that you may have for the dynamic keyword.

C# 4.0 New Features Part 1 - dynamic keyword

UPDATE: I have posted some more performance numbers on the dynamic keyword in a second post

One of the coolest new features in C# 4.0 that has been announced at PDC is the new dynamic keyword. This keyword allows the developer to declare an object whose method calls will be resolved at runtime. The interesting part about it is that the class doesn't need to be declared in any special way to use this keyword, it is all up to the consumer.

So, let's just declare a normal class like this:

public class TestClass
{
    public void TestMethod1()
    {
        Console.WriteLine("test method 1");
    }

    public void TestMethod2()
    {
        Console.WriteLine("test method 2");
    }        
}

And now we could instantiate and call a few methods on this class just like this:

var test = new TestClass();
test.TestMethod1();
test.TestMethod2();

At this point everything is working exactly as you expect it, and it all compiles just fine. But now, let's throw in the dynamic keyword:

dynamic test = new TestClass();
test.TestMethod1();
test.TestMethod2();

Okay, so nothing changed, right? Wrong. Everything builds just as before, but now those method calls on our test variable are not being resolved at compile time, they are being resolved at runtime! So, say we did this:

dynamic test = new TestClass();
test.TestMethod1();
test.TestMethod2();
test.TestMethod3();

This would still compile! But at runtime we would see something like this:

[Screenshot: Runtime Error]
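In the released version of C# 4.0, a failed dynamic call like this surfaces as a RuntimeBinderException, which you can catch like any other exception. Here is a minimal sketch; the MissingMemberDemo class is a made-up name, and the project needs a reference to the Microsoft.CSharp assembly.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public class TestClass
{
    public void TestMethod1() { Console.WriteLine("test method 1"); }
    public void TestMethod2() { Console.WriteLine("test method 2"); }
}

public static class MissingMemberDemo
{
    // Returns true if the call to the nonexistent method fails at bind time.
    public static bool CallMissingMethod()
    {
        dynamic test = new TestClass();
        try
        {
            test.TestMethod3(); // compiles, but TestClass has no such method
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CallMissingMethod()); // prints "True"
    }
}
```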

Pretty big implications, huh? So, what is happening here? This is where our friend Reflector comes in:

private static void Main(string[] args)
{
    object test = new TestClass();
    if (<Main>o__SiteContainer0.<>p__Site1 == null)
    {
        <Main>o__SiteContainer0.<>p__Site1 = CallSite<Action<CallSite, object>>.Create(new CSharpCallPayload(Microsoft.CSharp.RuntimeBinder.RuntimeBinder.GetInstance(), false, false, "TestMethod1", typeof(object), null));
    }
    <Main>o__SiteContainer0.<>p__Site1.Target(<Main>o__SiteContainer0.<>p__Site1, test);
    if (<Main>o__SiteContainer0.<>p__Site2 == null)
    {
        <Main>o__SiteContainer0.<>p__Site2 = CallSite<Action<CallSite, object>>.Create(new CSharpCallPayload(Microsoft.CSharp.RuntimeBinder.RuntimeBinder.GetInstance(), false, false, "TestMethod2", typeof(object), null));
    }
    <Main>o__SiteContainer0.<>p__Site2.Target(<Main>o__SiteContainer0.<>p__Site2, test);
}

So this is what our main method looks like when we reflect it. It may be hard to follow because of the long lines. First, the compiler generated the "<Main>o__SiteContainer0" class, whose static fields hold our callsite info. Next you will see that our "test" variable is just of type "object" now! There really isn't a dynamic type, it is just a helper.

Next you see that we are checking if these callsites are null, and if they are, then we are using "CallSite.Create" and passing in a "CSharpCallPayload" object which has all the info about the method that we are trying to call. Once the callsite is defined, then we just invoke that callsite on our "test" instance by passing the callsite data and the object. The compiler has done all of this for us, we just need to sit back and let it happen.

So, in this instance what we are doing is pretty useless, but the power of this feature comes in when we are using a type whose methods we do not necessarily know at compile time. This could be because it is coming from some dynamic code (like IronRuby!), or it is a generated class that we don't have compile time type info for, or anywhere that you are currently using heavy reflection.
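As a small illustration of that last point, here is a sketch where two unrelated classes (Duck and Robot, both made up for this example) share no interface or base class, yet the same dynamic call works on either because the method is looked up at runtime.

```csharp
using System;

// Two unrelated types that happen to expose a method with the same name.
public class Duck
{
    public string Speak() { return "Quack"; }
}

public class Robot
{
    public string Speak() { return "Beep"; }
}

public static class DuckTypingDemo
{
    // No shared interface needed: Speak is resolved at runtime on whatever is passed in.
    public static string Describe(dynamic thing)
    {
        return thing.Speak();
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new Duck()));  // prints "Quack"
        Console.WriteLine(Describe(new Robot())); // prints "Beep"
    }
}
```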

So, the first thing that popped into my mind when I saw this was, "what are the performance implications of this?" I know that this is a super early CTP, but I figured that I would run a few unscientific tests anyways.

So, here is the highly scientific process I used:

I changed to Release build and I put the two method calls in a loop surrounded by a stopwatch. The loop ran 1 million times, and so invoked a total of 2 million method calls. I then wrote out the number of milliseconds it took to execute to the screen. I also modified the TestClass to not write to the screen, since that would take significantly more time than the method calls. Instead, I changed the class to just add numbers, like this:

public class TestClass
{
    public int i;
    public void TestMethod1()
    {
        i += 1;
    }

    public void TestMethod2()
    {
        i += 2;
    }        
}

I ran each of them once, and then threw away the first result. I then ran them each 7 times and threw away the highest and the lowest. Then averaged the 5 numbers left.

Compile Time Bound: 6 ms - yep, 2 million calls, 6 ms. Fast.

Dynamically Bound: 2106ms - so, roughly 351 times slower than the strongly typed calls

Now, this is a super early CTP and they will obviously optimize this, but they still are probably going to be many times slower than the compile time calls. The thing to realize here is that while these numbers are very different, we are still talking about 2 million calls in about 2 seconds. So, about one thousandth of a millisecond per call.

In most applications we are using reflection because we have to, or because it solves a problem that would be much harder to solve with strongly typed code. The overhead of making these kinds of calls is small, and unless they are being called with extremely high frequency would likely not make a noticeable impact on your application.

I hope that you enjoyed this quick little spin around the dynamic keyword, and I'll be sure to come back shortly with more C# 4.0 goodies!