codethinked (kōdthĭngked) adj. To be consumed by or obsessed with code.

Taking the Magic out of Expression<T>

I have seen many people gawk at the use of Expression<T> in all sorts of ways, and to someone coming from a static language it may seem like pure magic (or witchcraft depending on your outlook!). In reality though, Expression<T> is really nothing to stand in awe of, and you too can start using it in very measured ways in your applications.

Let's first start off by declaring a simple lambda and assigning it to a Func delegate.

Sidenote:

If you aren't familiar with lambdas, then go check out the second post in my functional programming series.

Func<string, int> length = s => s.Length;

Now that we have defined our length delegate we can call it just like any delegate:

string myString = "some string";
int stringLength = length(myString);

But we can wrap our "Func<string,int>" in an Expression<T> like this:

Expression<Func<string, int>> length = s => s.Length;

But now we can't call it anymore. Why is that? Well, "length" is no longer a delegate, but instead it is an expression tree. An expression tree is simply a tree structure that represents the lambda "s => s.Length". Instead of the C# compiler turning this into an executable method, it simply goes through the syntax and forms a tree that expresses what the lambda is doing.
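To make the tree idea a bit more concrete, here is a small sketch that looks at the pieces of the lambda instead of running it. The "TreePeek" class and its "Describe" method are names I made up for illustration; only the Expression properties they read come from the framework.

```csharp
using System;
using System.Linq.Expressions;

public static class TreePeek
{
    // Hypothetical helper: reports the shape of a one-parameter lambda
    // without ever executing it.
    public static string Describe(Expression<Func<string, int>> expr)
    {
        // The root node is the lambda itself, Parameters holds "s",
        // and Body is the member access "s.Length".
        return expr.NodeType + " | "
             + expr.Parameters[0].Name + " | "
             + expr.Body.NodeType + " | "
             + expr.Body;
    }
}
```

Calling TreePeek.Describe(s => s.Length) should give back "Lambda | s | MemberAccess | s.Length", which is the compiler's description of the lambda rather than the result of running it.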

In fact, there is a method on the Expression type called "Compile" that lets us turn this expression tree into a Func<string, int> that we can run:

Func<string,int> lengthMethod = length.Compile();
int stringLength = lengthMethod(myString);

Pretty cool stuff. But why would we want to get expression trees instead of a compiled delegate? Well, the answer is both simple and complex. We would want an expression tree because we might want to use the information about the lambda to do something different than what the compiled code might actually do. Whaaaaat? Yep, you read that right, we might want an expression because when we say "Where(t => t.StartsWith("a"))" in a LINQ expression we don't actually want to call the "StartsWith" method. Instead, we want to translate that into a "t LIKE 'a%'" SQL statement. Got that?
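As a rough illustration of that translation idea, here is a minimal sketch that turns one very specific lambda shape into a LIKE clause. The "LikeTranslator" class is hypothetical and is nowhere near what a real LINQ provider does; it only understands a call to string.StartsWith with a string literal and treats the lambda's parameter name as the column name.

```csharp
using System;
using System.Linq.Expressions;

public static class LikeTranslator
{
    // Hypothetical helper: handles only lambdas shaped like t => t.StartsWith("literal").
    public static string ToSqlLike(Expression<Func<string, bool>> predicate)
    {
        var call = predicate.Body as MethodCallExpression;
        if (call == null
            || call.Method.DeclaringType != typeof(string)
            || call.Method.Name != "StartsWith")
        {
            throw new NotSupportedException("Only string.StartsWith calls are supported in this sketch.");
        }

        // The argument of the call must be a constant string, e.g. "a".
        var argument = call.Arguments[0] as ConstantExpression;
        string literal = argument == null ? null : argument.Value as string;
        if (literal == null)
        {
            throw new NotSupportedException("The prefix must be a string literal.");
        }

        // Assumption: the lambda parameter name stands in for the column name.
        string column = predicate.Parameters[0].Name;

        // A real provider would use SQL parameters and escape LIKE wildcards;
        // this sketch only doubles single quotes.
        return column + " LIKE '" + literal.Replace("'", "''") + "%'";
    }
}
```

With this in place, LikeTranslator.ToSqlLike(t => t.StartsWith("a")) should produce "t LIKE 'a%'" without "StartsWith" ever being called.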

If we can walk the tree that describes the lambda, then we can figure out what the lambda is trying to do and then perform some other task. For example, one of the things that I am currently doing at work is developing a way to have rules get tied to properties on a class and then in turn tie those back to validation displays that the user can see on the front end. One option would be to pass strings that represent the names of the properties, and another would be to generate enumerations that represent the properties on a class. The problem with these is that in the first one, if names of properties changed then you could potentially have issues where you forget to update an interface to the new string, and in the second we have to deal with giant enumerations and using those things all over the place. Plus we either have to create them manually or use some sort of Code-Gen to create them (yuck!).

But how do we solve this with expression trees? Well, let's define a class like this:

public class Rule<T>
{
    public Rule<T> BindRuleTo<U>(Expression<Func<T,U>> expression)
    {
        //do something with the expression here
        return this;
    }
}

Here we are declaring a Rule class and we can now bind the rule to a property. I have yet to implement the actual work, but let's look at how we would call it first:

var rule = new Rule<User>();
rule.BindRuleTo(u => u.Username);

Here we have a Rule associated with a User class and then we bind the rule to the Username property on the user class. We get strong typing and full intellisense here because C# can infer that the type going into the lambda is a "User" type due to us declaring the Rule class as Rule<T>. Because C# knows the user class goes in, once we specify a property it can also infer "U" because it knows the type coming out of the lambda as well! It makes our syntax much cleaner without all of the angle brackets.

But how do we now turn this expression into something that we can use? All we have to do is inspect our expression. Since we are only allowing a property to be passed in (this isn't constrained at compile time, so we have to check it at runtime), we can very easily limit the amount of parsing that we have to do. Since this is an expression tree built from a lambda, we know that the root is always going to be a "LambdaExpression" (Expression<T> derives from it), so we can safely cast to that type. The two "LambdaExpression" properties that matter here are "Body" and "Parameters". These hold the expression for the body of the lambda and the parameters passed to the lambda. In this instance we know that the User class is being passed in, so we don't really care about the parameters and are only going to work with the body.

public Rule<T> BindRuleTo<U>(Expression<Func<T,U>> expression)
{
    var lambda = (LambdaExpression)expression;
    
    if (lambda.Body.NodeType != ExpressionType.MemberAccess)
    {
        throw new InvalidOperationException("Expression must be a MemberExpression.");
    }

    var memberExpression = (MemberExpression)lambda.Body;

    
    return this;
}

Here you can see that we are expecting the body of the lambda to be a "MemberExpression". This is the type of expression that represents an access to a field or property on a class. In this case, the member expression would represent "u.Username". This also has two properties on it, one called "Expression" and the other is called "Member". The "Expression" property holds an Expression that represents the class on the left of the dot and the "Member" property holds a MemberInfo object that has info about the member on the right of the dot.

If you have dealt with reflection at any point, then you have probably seen MemberInfo objects all over the place before. In this case, since we are expecting a property then this MemberInfo object should actually be a PropertyInfo, so we can add a check for that:

public Rule<T> BindRuleTo<U>(Expression<Func<T,U>> expression)
{
    var lambda = (LambdaExpression)expression;
    
    if (lambda.Body.NodeType != ExpressionType.MemberAccess)
    {
        throw new InvalidOperationException("Expression must be a MemberExpression.");
    }

    var memberExpression = (MemberExpression)lambda.Body;
    
    var propertyInfo = memberExpression.Member as PropertyInfo;
    if (propertyInfo == null)
    {
        throw new InvalidOperationException("Expression must be a property reference.");
    }

    
    return this;
}

Now that we have our PropertyInfo object, all we have to do now is pull the data off of it that we need. In our case, all we really need is the name of the property to use to tie it to another place where we will pass in the same property.

Now you are going to obviously want to refactor this out into a library that you can use anywhere you need to. So most likely this method would look something like this:

public Rule<T> BindRuleTo<U>(Expression<Func<T,U>> expression)
{                
    this.PropertyName = ExpressionUtil.GetPropertyNameFromExpression(expression);                
    return this;
}
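I haven't shown "ExpressionUtil" anywhere, so here is a minimal sketch of what such a helper might look like. It simply packages up the two checks from the earlier listings; the "User" class at the bottom is only there so the sketch stands on its own, and the real helper in your own library may well look different.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class ExpressionUtil
{
    // Sketch of the helper named above: returns the name of the property
    // referenced by a lambda such as u => u.Username.
    public static string GetPropertyNameFromExpression<T, U>(Expression<Func<T, U>> expression)
    {
        if (expression == null)
        {
            throw new ArgumentNullException("expression");
        }

        // The body must be a member access, e.g. "u.Username".
        var memberExpression = expression.Body as MemberExpression;
        if (memberExpression == null)
        {
            throw new InvalidOperationException("Expression must be a MemberExpression.");
        }

        // The member must be a property rather than a field.
        var propertyInfo = memberExpression.Member as PropertyInfo;
        if (propertyInfo == null)
        {
            throw new InvalidOperationException("Expression must be a property reference.");
        }

        return propertyInfo.Name;
    }
}

// Stand-in class so the sketch is self-contained.
public class User
{
    public string Username { get; set; }
}
```

One thing to keep in mind is that this sketch only looks at the outermost member access, so a chained expression like "u => u.Manager.Username" would also pass and return "Username". If that matters for your rules, you would want an extra check that "memberExpression.Expression" is the lambda's parameter.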

Now you can define a method somewhere else that looks up rules based on an expression that you pass in. And now you have intellisense everywhere, strong typing, and if a property name changes you'll get compile time errors without lots of enumerations! If you want to extend this concept further, then you'll just need to do some research to find the expression types that represent the lambdas that you want to express. There is an expression tree visualizer that comes in the samples with Visual Studio that will allow you to look at your expression trees:

[Image: Expression Tree]

This will make it much easier for you to discover what expressions you will need to accomplish your task! I hope you found this useful!

Addicted To MEF - Part 1

As with any new tool that hits the scene, I always feel the need to explore and evaluate it. Right now there are many many new tools coming out of Microsoft and so I have to cherry pick the ones I am most interested in. Well, I saw just a few days ago that Microsoft had just released Preview 3 of its Managed Extensibility Framework, also known as MEF. I've been reading about it for a while, and I know that two people that I admire, Glenn Block and Hamilton Verissimo (of Castle fame) are both working as PMs on the team. I have also recently been working on a lot of architecture design stuff for my employer (Dominion Digital) and so I was very interested to explore MEF further and see what it could offer me.

Before I go further, you must realize that I am exploring this technology, so if I state something wrong, please call me out!

I've heard in a few spots that MEF is a bit like a DI Container, but it is more about application composition than dependency injection. In its most basic use case you can define exports (using the "ExportAttribute") and then define imports (using the "ImportAttribute"). We will start off with a very simple example, and then elaborate upon it a bit to show you how this works.

So let's say that we have an interface which looks like this:

public interface IConsoleWriter
{
    void Write(string message);
}

We are simply going to use this interface to write out to the console, and if you didn't pick up on that, then you are a tad slow. :-) Anyway, we need to consume this in our application, so we are going to define a class which implements this interface:

public class RedConsoleWriter : IConsoleWriter
{
    public void Write(string message)
    {
        ConsoleColor originalColor = Console.ForegroundColor;
        try
        {
            Console.ForegroundColor = ConsoleColor.Red;
            Console.WriteLine(message);    
        }
        finally
        {
            Console.ForegroundColor = originalColor;    
        }
    }
}

This particular implementation will now write out the value to the console in red! How cool! (sarcasm) Normally if we wanted to use this class, we could do something like this:

IConsoleWriter consoleWriter = new RedConsoleWriter();
consoleWriter.Write("Hello from the console!");

But we obviously are not interested in doing that here. MEF is all about extensibility, so let's make this puppy extensible. The first step that we have to take is to put an "ExportAttribute" on the "RedConsoleWriter" class to tell MEF that this is a type which we are exporting. In this attribute we will also tell MEF which type this class is exported for:

[Export(typeof(IConsoleWriter))]
public class RedConsoleWriter : IConsoleWriter
{
    // Write(string message) is unchanged from the version above
}

MEF now knows that this class is being exported for the "IConsoleWriter" interface. The next step is for us to import this guy. In order for us to do that we have to use two different classes. This is where the framework starts to feel a bit overly complex, but I'm sure as I explore it more I'll see why this complexity was introduced.

The first class that we have to use is the "AttributedAssemblyPartCatalog", which is a bit of a mouthful. This is one of four types of catalogs available to you. Catalogs are merely a way of gathering up exports in different ways. The four types of catalogs are:

AggregatingComposablePartCatalog - Combines multiple catalogs into a single catalog. Useful if you need to gather exports in multiple different ways.

AttributedAssemblyPartCatalog - This is the one we are going to use. It takes an assembly and gathers up all types which are marked with the "ExportAttribute".

AttributedTypesPartCatalog - Looks for exports given particular types.

DirectoryPartCatalog - Watches a folder and enumerates assemblies to find exports.

Next we are also going to have to use a "CompositionContainer" class, which takes a catalog and provides the ability to expose the catalog's exports. Make sense? Let's take a look:

private void Compose()
{        
    var catalog = new AttributedAssemblyPartCatalog(Assembly.GetExecutingAssembly());
    var container = new CompositionContainer(catalog);
    container.AddPart(this);
    container.Compose();
}

Here we are creating our catalog by passing in the current assembly, then we pass our catalog to the container constructor. The next line just passes in the current class (the "Program" class, since this is a console app) to the container so that its imports will be resolved. Then we call "Compose" on our container, and that is it, our container has now been configured with all of our exports. But how do we get our exports into our application?

We do this by putting an "ImportAttribute" on the properties that we want to have "composed" by the container. So, we declare a property with the "IConsoleWriter" type, and we put the attribute on it:

[Import]
public IConsoleWriter ConsoleWriter { get; set; }

Pretty simple. Now we have to run our Program, but we have to create a new instance of our "Program" class since our property is not static. Our whole "Program" class ends up looking like this:

internal class Program
{
    [Import]
    public IConsoleWriter ConsoleWriter { get; set; }

    public static void Main(string[] args)
    {
        var p = new Program();
        p.Run();
    }

    public void Run()
    {
        Compose();
        ConsoleWriter.Write("Hello from the console!");
    }

    private void Compose()
    {        
        var catalog = new AttributedAssemblyPartCatalog(Assembly.GetExecutingAssembly());
        var container = new CompositionContainer(catalog);
        container.AddPart(this);
        container.Compose();
    }
}

Notice that we are just using the "ConsoleWriter" property without ever assigning anything to it. By adding "this" to the container and calling "Compose", all "Import" attributes will be found on this type and MEF will assign a part to each of them. This works because we only have one export for the "IConsoleWriter" type. If we had more than one in this assembly, then MEF would not know how to resolve the import and composition would fail:

[Image: MEF Multiple type fail]

But if we change our application to have an IEnumerable<IConsoleWriter> property, then we will get a list of all of the exported types:

internal class Program
{
    [Import]
    public IEnumerable<IConsoleWriter> ConsoleWriters { get; set; }

    public static void Main(string[] args)
    {
        var p = new Program();
        p.Run();
    }

    public void Run()
    {
        Compose();
        foreach (IConsoleWriter consoleWriter in ConsoleWriters)
        {
            consoleWriter.Write("Hello from the console!");   
        }        
    }

    private void Compose()
    {        
        var catalog = new AttributedAssemblyPartCatalog(Assembly.GetExecutingAssembly());
        var container = new CompositionContainer(catalog);
        container.AddPart(this);
        container.Compose();
    }
}

That is pretty cool. Now we are looping through our "RedConsoleWriter" and a second exported writer, "BlueConsoleWriter", and writing each out to the console.

Neato.
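If the composition step still feels like magic, here is a rough sketch of the general idea using nothing but reflection. To be clear, this is not how MEF is implemented and none of these types come from MEF: "SketchExportAttribute", "SketchImportAttribute", "SketchComposer", "IGreeter", "HelloGreeter" and "Host" are all stand-ins I made up. It only shows the two jobs we saw above: a "catalog" step that finds exported types in an assembly, and a "container" step that assigns an instance to each import.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Stand-in attributes for this sketch only; these are NOT the MEF types.
[AttributeUsage(AttributeTargets.Class)]
public sealed class SketchExportAttribute : Attribute
{
    public Type ContractType { get; private set; }
    public SketchExportAttribute(Type contractType) { ContractType = contractType; }
}

[AttributeUsage(AttributeTargets.Property)]
public sealed class SketchImportAttribute : Attribute { }

public interface IGreeter { string Greet(); }

[SketchExport(typeof(IGreeter))]
public class HelloGreeter : IGreeter
{
    public string Greet() { return "Hello from the sketch!"; }
}

public class Host
{
    [SketchImport]
    public IGreeter Greeter { get; set; }
}

public static class SketchComposer
{
    // Fills every [SketchImport] property on target from exports found in the assembly.
    public static void Compose(object target, Assembly assembly)
    {
        foreach (PropertyInfo property in target.GetType().GetProperties())
        {
            if (!property.IsDefined(typeof(SketchImportAttribute), false)) continue;

            // "Catalog" step: find exported types whose contract matches the property type.
            Type[] matches = assembly.GetTypes()
                .Where(t => t.GetCustomAttributes(typeof(SketchExportAttribute), false)
                    .Cast<SketchExportAttribute>()
                    .Any(a => a.ContractType == property.PropertyType))
                .ToArray();

            // Zero or several candidates cannot be resolved for a single-valued import.
            if (matches.Length != 1)
            {
                throw new InvalidOperationException(
                    "Expected exactly one export for " + property.PropertyType.Name +
                    " but found " + matches.Length + ".");
            }

            // "Container" step: create the part and assign it to the import.
            property.SetValue(target, Activator.CreateInstance(matches[0]), null);
        }
    }
}
```

After calling SketchComposer.Compose(host, typeof(Host).Assembly) on a new "Host", its "Greeter" property should be populated even though we never assigned it ourselves. The exception in the middle is the toy version of the multiple-export failure we ran into earlier.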

Well, that was just a quick look at some of MEF's most basic features, but in the next post I'll take a look at some of the other catalog types and how we can use them in order to really start adding external plugins to our application. Hope you found this useful!

Download the Full Source Here

Don't Be Afraid of Easy

Construction has never been easy. Hundreds of years ago, there were buildings being constructed which were absolutely amazing and complex structures. They were complex even by today's standards. Take a look at the Cologne Cathedral for example. It is one of the largest cathedrals in the world, composed of many thousands and thousands of stones and worked on by thousands and thousands of workers. It is roughly 515 feet tall, which is the equivalent of about a 35 to 40 story building! So what is the difference between this structure and a modern 35 story building?

There are quite a few differences actually. First and foremost are the materials. In a modern building our construction materials would include steel beams, glass sheets, concrete, drywall, cinder block, etc... This is quite different from the stone, brick, and stained glass that a cathedral would be made out of. We know that building materials have changed over the centuries, and we also know that there are many reasons for this. Whether it is ease of assembly, ease of manufacture, strength, or weight, it all boils down to the fact that these materials have made construction easier. And by making it easier, they reduce the time it takes to construct a building.

And that is the biggest difference between the Cologne Cathedral and a modern skyscraper. All of the physical details aside, the time and cost required to construct the Cologne Cathedral is absolutely astronomical compared to modern buildings. The Cologne Cathedral's construction was started in 1248 and wasn't completed until 1880! Even if you consider the fact that there was an almost 400 year halt in construction, it still took almost 200 years to build! Contrast that with the Taipei 101, which is currently the world's tallest skyscraper, and it only took from 1999 to 2004 to build with a cost of 1.76 billion dollars. If it had taken 200 years to build the Taipei 101, then at that rate (1.76 billion for roughly five years of work, times forty) it would have cost about 70 billion dollars to build! That is more than Microsoft's yearly revenue!

But how does this fit into software development? Well, I think that people in software development are sometimes afraid of easy. People in the construction world are never afraid of easy. They have actively sought out materials and manufacturing techniques to make their jobs easier because they know that they will be asked to build larger and more complex buildings on shorter timelines and leaner budgets. Often times it seems that people in the software world actively seek out a complex solution to a problem, simply because the complex solution is seen as requiring more technical ability. Other times software developers will choose to build something that has already been built, simply because they want to have more control over the process without consideration (or at least under-consideration) of how much wasted effort is going to go into it.

Imagine if every time someone wanted to construct a building they set about to design the wall outlet again. First of all we would end up with a hideously large number of different outlet types, and nothing would ever plug in right, but secondly the cost and effort involved would be such a huge waste. The socket has been designed and built over and over, just go get one from the store. Now that may sound silly, but many people do this on a daily basis in their software. They repeat the same code for the same basic processes over and over. Part of it is that they have not taken the time to identify a problem's constituent parts so that they can separate the repeatable parts.

For some items this may be intuitive, but let's look at the wall outlet analogy again for a second... imagine if you had never built a building before and you were told to identify what is repeatable. You look at the outlets and they are all connected to a bunch of different length wires going to other outlets all over the house. You designed these outlets from the ground up, so these wires are physically connected to the outlets, which means you see this network of outlets strewn throughout the house as a single system. Surely you can't mass manufacture these items, each one of these outlets requires a different length of cable to be connected to it in order to reach the next outlet! It is only when the builder starts looking at the wires and the outlets as separate systems that they realize that each of these items can easily be stamped out in large numbers.

For us, looking at a modern outlet, we have screws on the back where we can attach a wire. Then we can also go buy wire, wire strippers, wire cutters, and a screwdriver. The system is broken down for us, we have the tools we need, we just need to make it happen. In software these tools are becoming more readily available to us on a daily basis. We have Inversion of Control containers, object relational mappers, AOP tools, refactoring tools, etc... Most of us need to start seriously looking at our tool-belts and asking ourselves if we are using the tools that will make our jobs easy. But even with all of these fancy tools, we still need to take the time to look at our software and try to ascertain what is repeatable. And trust me, it isn't going to be all ponies and gumdrops, you will have to put some serious effort into finding patterns and leveraging these tools effectively. But once you do, you will never look back. Just as most of our friends in construction would never dream of going back and hand chiseling a piece of stone.

Emergent Complexity

This weekend, after the Raleigh Code Camp, James Avery put on an Open Spaces conference called Shadow Camp. It was a small group of people who came together on Sunday morning to discuss the topic of complexity in software. While people were discussing topics for the different slots, one of the topics which was suggested by Corey Haines was "Emergent Design". The idea of Emergent Design is not a new one in the agile world, but discussing this in relation to software complexity led to an interesting discussion on entropy in software.

For many people the idea that complexity is emergent may sound like an obvious statement. We all know from the second law of thermodynamics that all systems trend toward chaos, but we think of those systems as uncontrolled natural systems. We don't think of our software as a natural system with forces that are out of our control which changes and evolves on its own. But our software is constantly changing and evolving in ways that are more than the sum of the changes we put into it. Every time a developer touches a piece of code, the design of the system becomes more divergent from the original design.

Without someone to constantly guide the design at a high level, it will slowly descend into chaos. Small changes in different places in the application will combine to form larger changes which will affect larger swaths of the application. So what do we do? Do we constantly keep an eye on the architecture of our systems? Well, the short answer is yes, but the long answer is that we can make decisions in our architectures that will allow us to minimize the impact of entropy.

Complexity in software is all about interactions. Now obviously interactions must happen, if they didn't then the software couldn't do anything. But complexity isn't just simply about the surface of your software and the number of methods on objects, it is about the combination of the number of methods along with the number of other methods which can call that method. Let's say that we have two classes and each class has five public methods. Then we have another two classes, each with 1 public method, and 4 private methods.

Judging from what we said above, you might think that we could just say 5 * 5 and figure that we have 25 possible interactions in the first example, and one possible interaction in the second example. But the reality is much worse. In the first example single methods could call multiple other methods, and methods within one class could call other methods from the same class. Now you may be saying to yourself, what does it matter if methods in a class call methods in the same class? If they are in the same class, can't we just change them? No we can't. If we have exposed them publicly we have created a contract on that method. If we decide to change this method, then we have to create a new method to support the new contract.

What all of this means is that as you expose more and more methods from your classes the potential for complexity increases exponentially. As you add more and more classes, the numbers just start increasing at a startling rate. So, let's just assume that our simple 5 * 5 numbers above are accurate. If we had 3 classes, then this turns from 25 into 125. If we have 5 classes then we are now at a staggering 3125 possible interactions. If we stick with the 5 method number and go with 25 classes, which is still a fairly small application, then the number becomes almost incomprehensible at 2.98023224 × 10^17. This potential for interaction is what allows your so beautifully architected application to slowly descend into a ball of chaos if these interactions aren't constantly managed.
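If you want to check those numbers yourself, here is a tiny sketch of the arithmetic. It follows the same simplification as the text, the number of public methods per class raised to the number of classes, and the "InteractionEstimate" name is just something I made up for the example.

```csharp
using System;

public static class InteractionEstimate
{
    // Follows the simplification in the text: methodsPerClass raised to classCount.
    public static long PossibleInteractions(int methodsPerClass, int classCount)
    {
        long result = 1;
        for (int i = 0; i < classCount; i++)
        {
            // checked makes an overflow throw instead of silently wrapping around.
            result = checked(result * methodsPerClass);
        }
        return result;
    }
}
```

PossibleInteractions(5, 2) gives 25, (5, 3) gives 125, (5, 5) gives 3125, and (5, 25) gives 298023223876953125, which is the 2.98 × 10^17 figure above.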

One of the tools that you can use to manage this complexity is partitioning. Divide up your application into chunks, and then manage the interactions between the chunks through strictly defined interfaces. In DDD this is referred to as a bounded context and without them, not only do you have ever increasing complexity, but it gets harder and harder to manage the complexity. The reason for this is that as you add more and more classes to your application you have to consider them when designing new classes.

Even with five public methods on each class, if each context holds two classes we have 25 possible interactions in each context, with only minimal interactions between contexts. In extremely large applications, this can be one of the only ways in which to greatly reduce the possible number of interactions.

At a lower level another approach you can take is to try and make methods private or protected. But be careful with this approach! Going overboard can cause your application to be overly rigid, but you also have to remember that every method you expose is a contract that you have tied yourself to in the future, especially if your class is exposed outside of your module.

Yet another approach you can take is to implement the Principle of Least Knowledge (also referred to as the Law of Demeter). This basically says that an object should only interact with methods on objects that it is directly holding, and should not call through an object to another object. For example:

public void Method(SomeObject obj){
    obj.OtherObject.MethodOnOtherObject(); //don't do this
}

By making these kinds of calls you are instantly increasing the number of other classes that your class is interacting with directly. Instead the call to "MethodOnOtherObject" should be wrapped in a method on "SomeObject" that does the interaction on behalf of this method. So, something like this:

public void Method(SomeObject obj){
    obj.PerformAction();
}
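To round that out, here is a minimal sketch of what the classes on the other side of that call might look like. The "Calls" counter, the "ActionsPerformed" property and the "Caller" class are made up purely so the example can be run and checked; the point is only that "OtherObject" stays private to "SomeObject".

```csharp
using System;

public class OtherObject
{
    // Hypothetical counter so we can observe that the method was called.
    public int Calls { get; private set; }

    public void MethodOnOtherObject() { Calls++; }
}

public class SomeObject
{
    // The collaborator is private, so callers cannot reach through SomeObject to it.
    private readonly OtherObject otherObject = new OtherObject();

    public int ActionsPerformed { get { return otherObject.Calls; } }

    // Does the interaction with OtherObject on behalf of the caller.
    public void PerformAction() { otherObject.MethodOnOtherObject(); }
}

public class Caller
{
    // Only talks to the object it was handed, never to that object's collaborators.
    public void Method(SomeObject obj) { obj.PerformAction(); }
}
```

Now "Caller" knows about "SomeObject" and nothing else, so "OtherObject" can be reshaped or replaced without "Caller" ever noticing.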

Any way you can find to reduce the coupling between your objects will help you refactor later to reduce the complexity that is always going to bleed into your application. Managing complexity, and therefore keeping our applications agile, is our primary job as architects and developers. Next time you are designing an application, class, or just a method, ask yourself if you are doing everything you can in order to manage the complexity.

IronRuby talk at Raleigh Code Camp This Weekend

I am giving a talk on IronRuby at the Raleigh Code Camp this weekend. It is titled "Microsoft and Ruby Sittin' in a Tree" and is basically a 100-200 level overview of Ruby along with a few IronRuby specifics such as .NET integration.

If you are at the Raleigh Code Camp, and you don't suck, then you should come check out my talk!