codethinked (kōdthĭngked) adj. To be consumed by or obsessed with code.

Making the Entity Framework Fit Your Domain - Part 2

Wow, I didn't realize when I started this series that it would take me this long to get to part 2. Sorry about that guys (and ladies)! If you have forgotten about the first part of this post, then you can go check it out.

In the first part I talked about getting up to the point where I realized that without going IPOCO I would not be able to use the Entity Framework with any sort of approximation of a real application domain. In this post we are going to go over the entity that I have created and talk about the issues that I had along the way.

In the previous post I showed that in order to create an entity that didn't involve descending from a base class, I would need to implement a class like this:

public class Entity : IEntityWithKey, IEntityWithChangeTracker, IEntityWithRelationships

The first interface, "IEntityWithKey", is actually optional, but according to the docs, leaving it out will decrease performance and increase memory usage. Well, hmmmmm, that makes it sound not so optional anymore, so I went ahead and implemented it. This interface can be implemented entirely within the base entity class:

EntityKey _entityKey;        
EntityKey IEntityWithKey.EntityKey
{
    get
    {
        return _entityKey;
    }
    set
    {
        SetMemberChanging(StructuralObject.EntityKeyPropertyName);
        _entityKey = value;
        SetMemberChanged(StructuralObject.EntityKeyPropertyName);
    }
}

This is a property that the consumer doesn't really need to worry about; the Entity Framework uses it for its own internal purposes.

The second interface, "IEntityWithChangeTracker", is required, and lets your object inform the Entity Framework when properties have changed. The Entity Framework obviously wants to do change tracking, and it requires the implementer to report changes themselves. If you have ever used NHibernate, you will know that it uses transparent proxies in order to do automatic change tracking. Since the Entity Framework doesn't implement dynamic proxies and also doesn't provide any compile time generated proxies, we are forced to implement this ourselves.

My first reaction was to just use DynamicProxy2 in order to generate proxies and do change tracking like that! This is the same library from the Castle Project that NHibernate uses to create its proxies. The issue immediately came up when I realized that the Entity Framework had no way to construct objects from factory methods. In order to create these proxies I would need to set up my entity like this:

protected User()
{            
}

public static User Create()
{
    var proxy = new ProxyGenerator();
    return proxy.CreateClassProxy<User>(new PropertyChangeInterceptor());
}

Not only would the entity framework not be able to construct these proxies, but it even blew up when I constructed one of these classes on my own and tried to pass it as a new entity to the Entity framework. Clearly the entity framework did not like working with runtime generated proxies. The reason for this is that DynamicProxy2 does its magic by creating a subclass of our "User" class at runtime and then passes that class back as a "User" class. So for the code in your application, it looks exactly like a "User" class. This requires that all of your properties and methods be virtual in order to intercept them though. If you look below you will see how the proxy wraps the calls to the underlying type and allows code to be inserted.

 image
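To make the subclassing trick a little more concrete, here is a hand-written sketch of roughly what a runtime generated proxy amounts to. This is not what DynamicProxy2 actually emits; "UserProxy" and its "ChangedProperties" list are names I made up for illustration, and the "User" class here is a stripped-down stand-in with a single virtual property.

```csharp
using System;
using System.Collections.Generic;

// Plain domain class. The property must be virtual so a subclass can intercept it.
public class User
{
    public virtual string Username { get; set; }
}

// Hypothetical hand-written stand-in for the subclass a proxy generator
// would create at runtime.
public class UserProxy : User
{
    // Records the name of every property whose setter was intercepted, in order.
    public readonly List<string> ChangedProperties = new List<string>();

    public override string Username
    {
        get { return base.Username; }
        set
        {
            // Code injected "around" the real setter.
            ChangedProperties.Add("Username");
            base.Username = value;
        }
    }
}
```

Calling code only ever sees a "User": you can write User user = new UserProxy(); and setting user.Username still goes through the override. That is also why a non-virtual property can't be intercepted this way.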

But since the Entity Framework doesn't like this kind of proxy, we are going to have to look elsewhere. So instead of runtime generated proxies, I decided to look into using PostSharp to do some compile time method interception. This is exactly what it sounds like: PostSharp actually modifies your assemblies post-compile and allows you to inject code around methods and field accesses.

What we will first need to do is put methods in our base entity class to report property changes:

public void SetMemberChanging(string member)
{
    if (_changeTracker != null)
    {
        _changeTracker.EntityMemberChanging(member);    
    }            
}

public void SetMemberChanged(string member)
{
    if (_changeTracker != null)
    {
        _changeTracker.EntityMemberChanged(member);    
    }            
}

The "_changeTracker" variable is actually passed into the entity through the "SetChangeTracker" method that is defined by the IEntityWithChangeTracker interface. If we were to implement this manually, then the "Username" property would look like this:

[EdmScalarProperty(IsNullable = false)]        
public string Username
{
    get
    {
        return this._Username;
    }
    set
    {                        
        SetMemberChanging("Username");
        this._Username = value;
        SetMemberChanged("Username");
    }
}

And we would have to do this with every single property that we were tracking with the Entity Framework. Instead, with PostSharp we will need to create a class which inherits from "OnMethodBoundaryAspect". This aspect allows us to inject code around a method call. I would explain this further, but this isn't a tutorial for PostSharp, so I recommend that you go check it out. The aspect that we are going to create will be applied to our entity classes.

[Serializable] // PostSharp serializes aspect instances at compile time
public class ChangeTrackingAspectAttribute : OnMethodBoundaryAspect
{
    // Only weave this aspect into property setters.
    public override bool CompileTimeValidate(MethodBase method)
    {
        return method.Name.StartsWith("set_");
    }

    public override void OnEntry(MethodExecutionEventArgs eventArgs)
    {
        string propertyName;
        Entity entity = GetTrackedEntity(eventArgs, out propertyName);
        if (entity != null)
        {
            entity.SetMemberChanging(propertyName);
        }
    }

    public override void OnExit(MethodExecutionEventArgs eventArgs)
    {
        string propertyName;
        Entity entity = GetTrackedEntity(eventArgs, out propertyName);
        if (entity != null)
        {
            entity.SetMemberChanged(propertyName);
        }
    }

    // Returns the entity being modified if the setter belongs to a mapped
    // scalar property on an Entity; otherwise returns null.
    private static Entity GetTrackedEntity(MethodExecutionEventArgs eventArgs, out string propertyName)
    {
        propertyName = eventArgs.Method.Name.Substring(4); // strip the "set_" prefix
        if (eventArgs.Instance == null)
        {
            return null; // static setter, nothing to track
        }
        PropertyInfo pi = eventArgs.Instance.GetType().GetProperty(propertyName);
        if (pi == null || !pi.IsDefined(typeof(EdmScalarPropertyAttribute), false))
        {
            return null;
        }
        return eventArgs.Instance as Entity;
    }
}

This class looks for methods that start with "set_" and applies the aspect to them. Then at runtime we look for the "EdmScalarPropertyAttribute" on the property in order to decide whether to call the "SetMemberChanging" and "SetMemberChanged" methods. There could certainly be some more caching and optimizations, but for now this will do. We have implemented some very basic "change tracking" on entities while only placing an attribute on our entity class.
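As for the caching I mentioned, one option is to remember the result of the reflection lookup per type and property so that it only runs once. The sketch below is just one way to do it and assumes .NET 4 or later for ConcurrentDictionary (on 3.5 a Dictionary guarded by a lock would do the same job). "TrackedAttribute", "TrackedPropertyCache" and "SampleEntity" are hypothetical names; "TrackedAttribute" stands in for "EdmScalarPropertyAttribute" so that the example compiles on its own.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Hypothetical marker attribute standing in for EdmScalarPropertyAttribute.
[AttributeUsage(AttributeTargets.Property)]
public sealed class TrackedAttribute : Attribute { }

public static class TrackedPropertyCache
{
    // Key: (entity type, property name). Value: whether the property carries the marker.
    private static readonly ConcurrentDictionary<Tuple<Type, string>, bool> Cache =
        new ConcurrentDictionary<Tuple<Type, string>, bool>();

    public static bool IsTracked(Type type, string propertyName)
    {
        // The reflection lookup only runs the first time a key is requested.
        return Cache.GetOrAdd(Tuple.Create(type, propertyName), key =>
        {
            PropertyInfo pi = key.Item1.GetProperty(key.Item2);
            return pi != null && pi.IsDefined(typeof(TrackedAttribute), false);
        });
    }
}

// Small sample type used to exercise the cache.
public class SampleEntity
{
    [Tracked]
    public string Username { get; set; }

    public string Notes { get; set; }
}
```

With something like this in place, the aspect's OnEntry and OnExit could ask the cache instead of calling GetProperty and IsDefined on every single property set.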

The last interface "IEntityWithRelationships" requires a bit more work to get implemented. It also requires us to dirty up our domain entity a bit more than we have had to so far. In the next entry in this series, I'll show you how I implemented "IEntityWithRelationships" and then provide the full source for the base entity. You'll start to see that you can use the Entity Framework within your own domain, but is all of this work worth it? Well that all depends on why you are using the Entity Framework. If you remember, I started this series because I wanted to see if I could fit the Entity Framework into my own domain and be happy with it. So far the results haven't been too bad, and we'll see in the next post where this will go. Stay tuned!

Making the Entity Framework Fit Your Domain - Part 1

I'm assuming that like myself, many of you out there work for companies that base much of their IT infrastructure (or at least software development tools) around Microsoft products. So, when a new tool like the Entity Framework comes out, even if you are not a fan, you still need to have a solid knowledge of it because you are going to have to use it at some point. At this point most of my ORM experiences have been with NHibernate, but I still feel the need to explore the Entity Framework to see if I can make it palatable for me to use. I say "palatable" because of the fact that the Entity Framework is designed almost entirely around database first design, which is not the way that I like to design my applications.

My goal with this post is not to trash talk the entity framework, but instead to take it as far as I can toward a usable solution that I would be okay with putting into a production application. This post is going to be written as I explore, so please let me know if you see anything that is wrong or missing.

Let's first talk about the domain that we are getting ready to look at. It is going to be a very simple domain, because otherwise it would just overwhelm the blog post by introducing too much complexity. I do want to have enough entities though so that you can see where each technology differs. What we are going to do is start off with a scenario that everyone is familiar with... a user with groups and roles. The user will also have a list of addresses associated with it.

Database Schema

We have 4 main tables along with two join tables. I would explain this schema to you, but if you don't get the schema then this article might be confusing anyway so I'm not going to waste everybody else's time with it. Basically we have already started designing this application in a way that would bother most people who are using DDD. Normally I wouldn't start with the database, but since the Entity Framework essentially forces you into starting with the database first, we are going to take this approach. The reason that I say that the Entity Framework forces you into database first design is because the primary method of generating an EF model is to generate it off the database. And then later on as you make changes, you can then update your model to reflect those changes.

At this point in the process the Entity Framework allows us to get up and running very quickly. We simply add a new Entity Data Model:

image

Then we get a wizard that lets us connect the model to our database and generate our entities. So, in just a few seconds we are looking at this:

Entity Data Model

Kinda cool actually. It knows about our join tables and generates many to many relationships automatically. It doesn't however know about join tables with payloads, but then again there is ambiguity about how we might want that sort of data modeled in our app. So now we have our entities in our Entity Data Model, but where are we really at this point? Well, we are actually already at the point where we can create new entities and save them off to the database.

var user = new User();
user.Username = "TestUser";
user.EmailAddress = "test@test.com";

var address1 = new Address();
address1.Street = "111 Test Street";
address1.City = "Test City";
address1.State = "Virginia";
address1.PostalCode = "22055";

var address2 = new Address();
address2.Street = "222 Test Street";
address2.City = "Test City";
address2.State = "Virginia";
address2.PostalCode = "23000";

user.Addresses.Add(address1);
user.Addresses.Add(address2);

var entities = new TestEFAppEntities();
entities.AddToUserSet(user);
entities.SaveChanges(true);

That was painless, wasn't it? But where did the "User" and "Address" classes come from? I don't remember creating any classes... But that is because we didn't. The Entity framework spit out all of these classes into a file that is hidden under our Entity Model called "Domain.Designer.cs". Here is the user class that was generated for the model (I removed all the comments so that it would only be kinda huge) ;-)

[global::System.Data.Objects.DataClasses.EdmEntityTypeAttribute(NamespaceName="TestEFAppModel", Name="User")]
[global::System.Runtime.Serialization.DataContractAttribute(IsReference=true)]
[global::System.Serializable()]
public partial class User : global::System.Data.Objects.DataClasses.EntityObject
{
    public static User CreateUser(int id, string username, string emailAddress)
    {
        User user = new User();
        user.Id = id;
        user.Username = username;
        user.EmailAddress = emailAddress;
        return user;
    }

    [global::System.Data.Objects.DataClasses.EdmScalarPropertyAttribute(EntityKeyProperty=true, IsNullable=false)]
    [global::System.Runtime.Serialization.DataMemberAttribute()]
    public int Id
    {
        get
        {
            return this._Id;
        }
        set
        {
            this.OnIdChanging(value);
            this.ReportPropertyChanging("Id");
            this._Id = global::System.Data.Objects.DataClasses.StructuralObject.SetValidValue(value);
            this.ReportPropertyChanged("Id");
            this.OnIdChanged();
        }
    }
    private int _Id;
    partial void OnIdChanging(int value);
    partial void OnIdChanged();

    [global::System.Data.Objects.DataClasses.EdmScalarPropertyAttribute(IsNullable=false)]
    [global::System.Runtime.Serialization.DataMemberAttribute()]
    public string Username
    {
        get
        {
            return this._Username;
        }
        set
        {
            this.OnUsernameChanging(value);
            this.ReportPropertyChanging("Username");
            this._Username = global::System.Data.Objects.DataClasses.StructuralObject.SetValidValue(value, false);
            this.ReportPropertyChanged("Username");
            this.OnUsernameChanged();
        }
    }
    private string _Username;
    partial void OnUsernameChanging(string value);
    partial void OnUsernameChanged();

    [global::System.Data.Objects.DataClasses.EdmScalarPropertyAttribute(IsNullable=false)]
    [global::System.Runtime.Serialization.DataMemberAttribute()]
    public string EmailAddress
    {
        get
        {
            return this._EmailAddress;
        }
        set
        {
            this.OnEmailAddressChanging(value);
            this.ReportPropertyChanging("EmailAddress");
            this._EmailAddress = global::System.Data.Objects.DataClasses.StructuralObject.SetValidValue(value, false);
            this.ReportPropertyChanged("EmailAddress");
            this.OnEmailAddressChanged();
        }
    }
    private string _EmailAddress;
    partial void OnEmailAddressChanging(string value);
    partial void OnEmailAddressChanged();

    [global::System.Data.Objects.DataClasses.EdmRelationshipNavigationPropertyAttribute("TestEFAppModel", "FK_Addresses_Users", "Addresses")]
    [global::System.Xml.Serialization.XmlIgnoreAttribute()]
    [global::System.Xml.Serialization.SoapIgnoreAttribute()]
    [global::System.Runtime.Serialization.DataMemberAttribute()]
    public global::System.Data.Objects.DataClasses.EntityCollection<Address> Addresses
    {
        get
        {
            return ((global::System.Data.Objects.DataClasses.IEntityWithRelationships)(this)).RelationshipManager.GetRelatedCollection<Address>("TestEFAppModel.FK_Addresses_Users", "Addresses");
        }
        set
        {
            if ((value != null))
            {
                ((global::System.Data.Objects.DataClasses.IEntityWithRelationships)(this)).RelationshipManager.InitializeRelatedCollection<Address>("TestEFAppModel.FK_Addresses_Users", "Addresses", value);
            }
        }
    }

    [global::System.Data.Objects.DataClasses.EdmRelationshipNavigationPropertyAttribute("TestEFAppModel", "UserXGroups", "Groups")]
    [global::System.Xml.Serialization.XmlIgnoreAttribute()]
    [global::System.Xml.Serialization.SoapIgnoreAttribute()]
    [global::System.Runtime.Serialization.DataMemberAttribute()]
    public global::System.Data.Objects.DataClasses.EntityCollection<Group> Groups
    {
        get
        {
            return ((global::System.Data.Objects.DataClasses.IEntityWithRelationships)(this)).RelationshipManager.GetRelatedCollection<Group>("TestEFAppModel.UserXGroups", "Groups");
        }
        set
        {
            if ((value != null))
            {
                ((global::System.Data.Objects.DataClasses.IEntityWithRelationships)(this)).RelationshipManager.InitializeRelatedCollection<Group>("TestEFAppModel.UserXGroups", "Groups", value);
            }
        }
    }
}

Hmmmm.... so where is my domain object in there? In fact, all of the domain objects are generated into a single file like this. You get to extend your domain objects by using partial classes. The partial classes that we implement allow us to take advantage of some partial methods that are in the generated classes. As you can see from this code in one of the above setters:

set
{
    this.OnUsernameChanging(value);
    this.ReportPropertyChanging("Username");
    this._Username = global::System.Data.Objects.DataClasses.StructuralObject.SetValidValue(value, false);
    this.ReportPropertyChanged("Username");
    this.OnUsernameChanged();
}

We have two partial methods, "OnUsernameChanging" and "OnUsernameChanged", along with calls to two base class methods, "ReportPropertyChanging" and "ReportPropertyChanged", which notify the change tracker. So when we create our partial classes we can tap into the partial methods. This way if we wanted to intercept the setting of our Username property we could implement a partial class like this (if you haven't ever used partial classes, then go here):

public partial class User
{
    partial void OnUsernameChanging(string value)
    {            
        // check value and do something here
    }
}
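If partial methods are new to you, here is a small self-contained sketch of the mechanism, with both halves of the class in one listing. The "Account" class, its "Name" property and "ChangeCount" are made up for illustration; the first half plays the role of the designer-generated file and the second half is what you would write by hand.

```csharp
using System;

// Stand-in for the generated half of the class (heavily simplified).
public partial class Account
{
    private string _name;

    public string Name
    {
        get { return _name; }
        set
        {
            // If nobody implements these partial methods, the compiler
            // removes the calls entirely.
            OnNameChanging(value);
            _name = value;
            OnNameChanged();
        }
    }

    partial void OnNameChanging(string value);
    partial void OnNameChanged();
}

// Hand-written half that hooks into the setter.
public partial class Account
{
    public int ChangeCount { get; private set; }

    partial void OnNameChanging(string value)
    {
        // Reject the new value before it is stored.
        if (string.IsNullOrEmpty(value))
        {
            throw new ArgumentException("Name must not be empty.", "value");
        }
    }

    partial void OnNameChanged()
    {
        ChangeCount++;
    }
}
```

Note that partial methods always return void and are implicitly private, so they work as hooks but they can't change the value being set or expose anything to callers.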

But what if we want to do something when the property is retrieved? We don't really have a lot of options. The getter in the generated user class looks like this:

get
{
    return this._Username;
}

Hmmm. So I guess they don't want us to hook into the getter! Having user code hooked into the getter might have caused issues at some infrastructure level for them, so I'm going to give them the benefit of the doubt! But not too much. At first I thought that maybe I could go to the EDM (Entity Data Model), make the getter and setter on the property private or protected, and then rename the property to something like "UsernameInternal". Then I could introduce a completely new property in the partial class to expose this property:

public string Username
{
    get
    {
        return this.UsernameInternal;
    }
    set
    {
        this.UsernameInternal = value;
    }
}

The only problem is that because we have wrapped the EDM generated property, we can no longer query against it! How would I write this query?

var userQuery = from u in entities.UserSet 
    where u.Username == "TestUser" 
    select u;

Sure we can put "Username" in the query because we have exposed our own property, but since this property is not in the EDM, the Entity Framework has no knowledge of this property so we can't query against it! Dang. And to think I was going to try and use this same technique to implement some lazy loading. Hmmmmmmm. So if we want to be able to query against any property then we are going to have to directly expose the property that has the "EdmScalarPropertyAttribute" on it. This is the attribute that signifies to the Entity Framework that this property is a mapped scalar property.

This property has to share the name of the property tag in our CSDL file. Sadly, in order to get any control over my classes, it looks like I am going to have to ditch the Entity Designer altogether. Which isn't necessarily a bad thing considering that all of the xml and class files are generated into only two files which creates an interesting situation for teams that are editing these files independently. Definitely a merge nightmare. What blows my mind is that they also actually store the metadata for the EDM diagram right in with the mapping xml.

Before you start thinking that I ditched the Entity Designer too soon, another issue here is that all of the generated entities descend from a base EntityObject class that keeps us from forming our own object hierarchy. And we can't edit any of the generated classes or else all of our changes will be overwritten whenever the model is regenerated. Another issue is that we don't really have any control over our object lifetime. The Entity Designer spits out default public constructors and a default factory method containing parameters for all required properties of the object. Why? I don't understand why they didn't just leave off the constructor and factory method and let me define them in my partial classes. I can't remove them from the generated classes, but I could easily add them if they weren't there. Geeeez.

Alright, so I mentioned that we need to ditch the EDM designer, so what do we use now? Thankfully Microsoft provided us a tool called EdmGen.exe. This is a command-line tool that we can use to generate the mapping files for our database:

 image

EdmGen actually spits out different files for everything. And when I say "everything" I mean only the different file types. So we get these files:

image

What we are going to do here is ditch the ObjectLayer.cs and Views.cs files and create our own Entities. Hopefully we can isolate most of the EF specific code into a base entity class. But since we are going to remove generated entities we are going to have to keep the behavior that the previously generated classes had. This is where that IPOCO stuff comes in. IPOCO allows you to implement a few interfaces (in our case three) instead of using the base EntityObject class. Our base entity class' definition will end up looking like this:

public class Entity : IEntityWithKey, IEntityWithChangeTracker, IEntityWithRelationships

These are the three interfaces that we implement in order to use this with our EDM. The first just gives the Entity Framework something to identify your entities by, the second provides a change tracking mechanism, and the third provides a way for your entity to hold its relationships. These are fairly self-explanatory, and luckily we can isolate most of the behavior for these into our base class. So our goal is to have business entities that lack Entity Framework specific details, and which expose no Entity Framework specific types. This post is getting a bit long, so I am going to leave it off right there for now...

In future parts of this series we will break down the base entity class that I have created and take a look at the different entities and how we can create them. We will also take a look at creating an ObjectContext and show you how we can use and query these entities just like we could if we created them through the EDM designer. I will also provide you with the full source to the project that I am using in this series, so stay tuned!

Exploring System.Web.Routing

Most of you have probably been hearing a lot recently about the new ASP.NET MVC framework and the many features that it has which will hopefully simplify web development for those of us that want to get "closer to the browser". One of the features that they were initially implementing for the MVC framework was a new routing engine that gives you a flexible way of mapping urls to specific pages in your application. Early on they realized that the routing infrastructure was not only applicable to ASP.NET MVC, but could be used in any ASP.NET application to allow for much easier url rewriting. (They also realized that they wanted to use it in the Dynamic Data stuff!) Because of this they moved Routing out of the System.Web.Mvc namespace and into its own System.Web.Routing namespace. Well, this namespace just shipped as part of .NET 3.5 SP1, so let's take a look at how it works!

The first thing that you need to do in order to use these features is to add a reference to the System.Web.Routing dll:

image

System.Web.Routing has two core concepts. One is the concept of a route and the second is the concept of a route handler. A route is simply a class that holds a pattern which can be matched by urls coming into your application. Each url coming into the application will be matched against the list of Routes that you have defined, and if one matches then it will be used. A route will look something like this:

"Catalog/{Category}/{ProductId}"

This will match any url coming into the application that has exactly three segments and starts with "/Catalog/", so something like "/Catalog/Computers/3444" would match this route. The items between the curly braces are called segments, and these will be captured and used later in our route handler. Routes are held in a static property of System.Web.Routing.RouteTable called "Routes", and they are registered in the "Application_Start" method of your global.asax file like this:

void Application_Start(object sender, EventArgs e)
{
    RegisterRoutes(RouteTable.Routes);
}

public static void RegisterRoutes(RouteCollection routes)
{
    routes.Add(new Route("Catalog/{Category}/{ProductId}", new CatalogRouteHandler()));
}

Now, you may have noticed the "CatalogRouteHandler" class that we are creating in the "RegisterRoutes" method. This is a class that we need to create in order to process a request that comes in matching the provided route. You can have any number of these route handlers to handle different types of requests, but for right now we are just going to have one.

These "Route Handlers" implement the IRouteHandler interface, which only has a single method called "GetHttpHandler" which, as you probably guessed, returns an IHttpHandler. So, what is IHttpHandler? It is not a new interface, and is actually part of the System.Web namespace. This interface also supports just one method called ProcessRequest which takes an HttpContext. In the context of a normal ASP.NET application, the class that you are going to see which implements this is the Page class. If you are an ASP.NET developer I hope that you are familiar with the Page class. If not, then get a book! So, we have an interface which returns an IHttpHandler, which in our case is a Page object.

So, I hope you see where this is going. We are just matching a bunch of urls against our list of patterns, and then when we find a matching pattern we will use the associated IRouteHandler to get the IHttpHandler which is going to be able to respond to our request! It is actually a lot simpler than it sounds!

As we already said, in our case, the IRouteHandler is going to be returning a Page object. What page object? It doesn't really matter. We can map any number of urls to any number of Page objects. We could map all of our urls to a single Page object if we wanted. So, now that we know that we can map to any number of pages that we want, how does our IRouteHandler decide which page to map to? What kind of data do we get that allows it to make a decision? This is where the System.Web.Routing.RequestContext class comes in. The "GetHttpHandler" method that I mentioned above returns an IHttpHandler, but it also takes a parameter of type RequestContext. The RequestContext class has two properties. One is called "HttpContext", which is of type System.Web.HttpContextBase. The other is "RouteData", which is of type System.Web.Routing.RouteData.

If you saw "System.Web.HttpContextBase" and did a double-take then you are probably not alone. This is another class that was added in .net 3.5 SP1 which is simply an abstract wrapper for our old untestable friend the HttpContext. Remember earlier when I was talking about the nice new testable design, well, this is a big part of it. Just keep in mind that this is part of System.Web.Abstractions and so you'll need to add a reference:

image

The HttpContext property just allows access to all of the normal information that we would garner from our HttpContext, so that we could make potential routing decisions based on the request data itself. For example if we wanted to do something different if we were operating under https or http. Or if we wanted to redirect different places based on the domain, or sub-domain coming in on the request.

The "RouteData" property is where all of our data concerning our route and the segments are stored. First it has a property called "Route" which is of type RouteBase, and this represents the matching route that was picked for this particular request.

Next there is a property called "Values" which holds all of the values and default values for the segments that were specified in the route that we created. For example, the route that we created at the beginning of this post ("Catalog/{Category}/{ProductId}") had "Category" and "ProductId" segments, so for the url that we had above ("/Catalog/Computers/3444") the "Values" property would have values for each:

routeData.Values["Category"] == "Computers"

routeData.Values["ProductId"] == "3444"
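If you are wondering how a pattern turns into those values, here is a toy matcher that I wrote purely to illustrate the idea. It is not how System.Web.Routing is implemented, it ignores defaults, constraints and catch-all segments, and "SimpleRouteMatcher" is a made-up name. It compares literal segments case-insensitively, which is a choice I made for this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class SimpleRouteMatcher
{
    // Returns the captured segment values, or null when the path does not match.
    public static Dictionary<string, string> Match(string pattern, string path)
    {
        string[] patternParts = pattern.Trim('/').Split('/');
        string[] pathParts = path.Trim('/').Split('/');

        // No defaults in this sketch, so every segment is required.
        if (patternParts.Length != pathParts.Length)
        {
            return null;
        }

        var values = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < patternParts.Length; i++)
        {
            string part = patternParts[i];
            bool isPlaceholder = part.Length > 2
                && part.StartsWith("{", StringComparison.Ordinal)
                && part.EndsWith("}", StringComparison.Ordinal);

            if (isPlaceholder)
            {
                if (pathParts[i].Length == 0)
                {
                    return null; // a placeholder needs a non-empty value
                }
                // Strip the braces to get the key, e.g. "{Category}" -> "Category".
                values[part.Substring(1, part.Length - 2)] = pathParts[i];
            }
            else if (!string.Equals(part, pathParts[i], StringComparison.OrdinalIgnoreCase))
            {
                return null; // literal segment differs
            }
        }
        return values;
    }
}
```

Calling SimpleRouteMatcher.Match("Catalog/{Category}/{ProductId}", "/Catalog/Computers/3444") gives back a dictionary with "Category" set to "Computers" and "ProductId" set to "3444", while "/Catalog/Computers" on its own gives back null because there are no defaults to fill in the missing segment.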

So, now we could pass these values to our ASP.NET page through the HttpContext.Items collection! This would be accomplished by implementing that IRouteHandler that we talked about earlier. Here is a basic IRouteHandler for you to take a look at. Take note that we have not implemented any security or anything in this.

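public class CatalogRouteHandler : IRouteHandler
{
    public IHttpHandler GetHttpHandler(RequestContext requestContext)
    {
        // Copy every route value (Category, ProductId) into HttpContext.Items
        // so that the page can read them. The indexer is used instead of Add
        // so that an existing key is overwritten rather than throwing.
        foreach (KeyValuePair<string, object> token in requestContext.RouteData.Values)
        {
            requestContext.HttpContext.Items[token.Key] = token.Value;
        }

        // Create the page that will actually handle the request.
        IHttpHandler result = BuildManager
            .CreateInstanceFromVirtualPath("~/Product.aspx", typeof(Product)) as IHttpHandler;
        return result;
    }
}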

From here you can see that we are shoving our Category and ProductId into the "HttpContext.Items" collection.

Now that we have defined our routes, and we have defined our CatalogRouteHandler, let's look at a few more features of the routes. The first thing that we can do is set up defaults for our routes. So, let's say that we want to default the "ProductId" to 0001 when the "Catalog/{Category}/{ProductId}" route is used without the "ProductId" segment. Let us go ahead and default "Category" to "Default" when that is not passed in as well. This is actually quite easy and is simply an overload of the Route constructor. To define the route above with both defaults would look like this:

private static void RegisterRoutes(RouteCollection routes)
{
    routes.Add(new Route("Catalog/{Category}/{ProductId}",
        new RouteValueDictionary(new { Category = "Default", ProductId = "0001" }),
        new CatalogRouteHandler()));
}

As you can see we are creating a new RouteValueDictionary and we are initializing it using the constructor that just uses the properties off a given object to initialize it. Here we are using an anonymous type with Category and ProductId properties. This could also be done by just using a collection initializer and it would look like this:

private static void RegisterRoutes(RouteCollection routes)
{                                   
    routes.Add(new Route("Catalog/{Category}/{ProductId}",
        new RouteValueDictionary { {"Category", "Default"}, {"ProductId", "0001"} },  
        new CatalogRouteHandler()));            
}

Now, there is another part to adding a route that we need to discuss and that is route constraints. Route constraints actually take more than one form. First we have regular expressions. Let's say that we want to limit the ProductId on the route above to a max of 4 digits. Then we could implement the constraint like this:

routes.Add(new Route("Catalog/{Category}/{ProductId}",
    new RouteValueDictionary { {"Category", "Default"}, {"ProductId", "0001"} },
    new RouteValueDictionary { {"ProductId", @"\d{1,4}"} },
    new CatalogRouteHandler()));

Here you see that we have another RouteValueDictionary that has a single entry for "ProductId" which takes the regular expression "\d{1,4}". This regex limits us to 1-4 digits, so if we enter a 5 digit product id, then this route will no longer be matched. You can check any value that you can validate using a regex.

Constraints also take on a second form: classes that implement the IRouteConstraint interface. This interface has a single method called "Match" that passes all relevant info about the request to your class, and allows you to put custom logic for route constraints into your own class. Pretty powerful stuff. There is one already built in called HttpMethodConstraint and it allows you to limit your route to a particular http verb such as "get" or "post". These classes are just inserted into the same RouteValueDictionary that we used above, and for HttpMethodConstraint the key doesn't matter. The routing infrastructure will inspect the value, see that it implements the IRouteConstraint interface, and call its "Match" method. If we wanted to limit the above route to just "get", it would look like this:

routes.Add(new Route("Catalog/{Category}/{ProductId}",
    new RouteValueDictionary { {"Category", "Default"}, {"ProductId", "0001"} },
    new RouteValueDictionary { {"ProductId", @"\d{1,4}"}, {"httpMethod", new HttpMethodConstraint("get")} },
    new CatalogRouteHandler()));

Here we are giving it a key of "httpMethod", but you could really call it anything you want. The HttpMethodConstraint just takes a list of http verbs as constructor arguments and then checks the request against them when trying to match the route.
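If you want to try out a regex constraint before wiring it into a route, here is a minimal sketch. It assumes that routing requires the whole parameter value to match the pattern (as far as I know it wraps the pattern in "^(" and ")$" and ignores case), so treat it as an approximation of the behaviour and not the framework's own code. ConstraintSketch and MatchesConstraint are hypothetical names of my own.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConstraintSketch
{
    // Hypothetical helper: approximates how a string constraint is checked.
    // Assumption: the whole value has to match, so the pattern is anchored.
    public static bool MatchesConstraint(string pattern, string value)
    {
        return Regex.IsMatch(
            value,
            "^(" + pattern + ")$",
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }
}
```

With the "\d{1,4}" pattern from above, "0001" and "42" pass, while "12345" and "12ab" do not, so requests with those product ids would fall through to the next route.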

At this point I think that you have a pretty darn good overview of how the new System.Web.Routing namespace works, but there is just one more quick thing that I want to touch on. It is the StopRoutingHandler class. All this class does is cause the routing infrastructure to stop trying to route a particular request. This is good if you want to stop requests for a particular extension from being checked against each and every route. Instead of allowing this, you can just put in routes at the top that will stop the process. So, if we wanted to stop looking for a route on an *.asmx request, we could add this to the top of our route definitions:

routes.Add(new Route("{service}.asmx/{*path}", new StopRoutingHandler()));

Here we are saying that any request which ends in ".asmx", optionally followed by any path, matches this route. Once this is matched, the request stops being checked against the remaining routes. Very useful if you have a huge number of routes.

So, at this point you may be reading this and thinking "I see how this all works, but how in the world does this actually process requests? Don't I need to get some module in the request pipeline?" And that answer is of course, "yes". There is a class called System.Web.Routing.UrlRoutingModule which is an IHttpModule which plugs into your request pipeline and intercepts the urls as they are coming through to allow the routing support to work. This is done using the exact same mechanism that most of us are using now in order to do url rewriting. Add this to the "modules" section in your web.config:

<add name="UrlRoutingModule" type="System.Web.Routing.UrlRoutingModule, System.Web.Routing, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />

And then add this to the handlers section (if you are using IIS7):

<add name="UrlRoutingHandler" preCondition="integratedMode" verb="*" path="UrlRouting.axd" type="System.Web.HttpForbiddenHandler, System.Web, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" />

Well, that about wraps this up, I hope that you now have a good idea of how or if you can leverage the new routing infrastructure in .net 3.5 SP1.

Dissecting Linq Expression Trees - Part 2

In my previous post I talked about Linq expression trees and how you can create them in several ways. We talked about how you could create them simply using a lambda expression:

Expression<Func<int, int>> lammy = n => n * 2;

Or how you could create them manually:

ParameterExpression param = Expression.Parameter(typeof (int), "n");
BinaryExpression body = 
    Expression.Multiply(param, Expression.Constant(2, typeof (int)));
lammy = Expression.Lambda<Func<int, int>>(body, param);
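As a quick sanity check, both versions can be compiled into a delegate and invoked. This is just a small sketch; the LambdaTreeSketch class and its method names are mine and are not part of the original project.

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaTreeSketch
{
    // Let the compiler build the tree from a lambda, then compile it.
    public static Func<int, int> FromLambda()
    {
        Expression<Func<int, int>> lammy = n => n * 2;
        return lammy.Compile();
    }

    // Build the same tree by hand, then compile it.
    public static Func<int, int> Manual()
    {
        ParameterExpression param = Expression.Parameter(typeof(int), "n");
        BinaryExpression body =
            Expression.Multiply(param, Expression.Constant(2, typeof(int)));
        return Expression.Lambda<Func<int, int>>(body, param).Compile();
    }
}
```

Calling either delegate with 21 returns 42, so the hand-built tree behaves the same as the one the compiler generates.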

I then showed you some code that would allow us to parse out a string of numbers and math operations and build up an expression tree. Everything looked great, all we had to do was keep parsing our string and chaining the Expressions together using our GetMathExpression method:

public Expression GetMathExpression(char operation, 
    Expression leftExpression, Expression rightExpression)
{
    switch (operation)
    {
        case '*':
            return Expression.MultiplyChecked(leftExpression, rightExpression);
        case '+':
            return Expression.AddChecked(leftExpression, rightExpression);
        case '/':
            return Expression.Divide(leftExpression, rightExpression);
        case '-':
            return Expression.SubtractChecked(leftExpression, rightExpression);
    }

    return null;
}

So, we just parse through our math operation building up our expression tree. But, as we discussed in the last post, there is a problem. The tree that we built up is going to be in the order that we had in the math equation that we entered. It is totally ignoring order of operations and just going sequentially down the string. So, for a string of "5+2*5+3" we would end up with a tree that looks like this:

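[Image: expression tree for ((5 + 2) * 5) + 3, with the last + at the root, the * as its left child, and 5 + 2 beneath the *]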

Which ends up producing an incorrect answer to our equation, since it would be evaluated as: ((5 + 2) * 5) + 3.
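To see the problem in numbers, here is a minimal sketch of a left-to-right builder. It is not the parser from the first post; it assumes single-digit operands and no whitespace, and SequentialMathParser, Parse and Evaluate are names I made up for the illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class SequentialMathParser
{
    private static Expression GetMathExpression(char operation,
        Expression left, Expression right)
    {
        switch (operation)
        {
            case '*': return Expression.MultiplyChecked(left, right);
            case '+': return Expression.AddChecked(left, right);
            case '/': return Expression.Divide(left, right);
            case '-': return Expression.SubtractChecked(left, right);
        }
        throw new ArgumentException("Unsupported operation: " + operation);
    }

    // Assumption: input looks like "5+2*5+3" (single digits, no spaces).
    // Each operator simply wraps everything parsed so far as its left child.
    public static Expression Parse(string input)
    {
        Expression tree = Expression.Constant(input[0] - '0');
        for (int i = 1; i + 1 < input.Length; i += 2)
        {
            Expression operand = Expression.Constant(input[i + 1] - '0');
            tree = GetMathExpression(input[i], tree, operand);
        }
        return tree;
    }

    // Compile the tree into a parameterless delegate and run it.
    public static int Evaluate(Expression tree)
    {
        return Expression.Lambda<Func<int>>(tree).Compile()();
    }
}
```

Evaluating Parse("5+2*5+3") gives 38, the result of ((5 + 2) * 5) + 3, instead of the 18 that the order of operations calls for.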

So, what would we have to do in order to get this to evaluate correctly? Well, we would have to manipulate this tree until it took the form of 5 + (2 * 5) + 3. This would create a tree that looks like this:

[Image: expression tree for 5 + (2 * 5) + 3, with the 2 * 5 node beneath the + nodes]

This way, the 2 * 5 would get evaluated first instead of 5 + 2. So you will see that the logic we are employing is simply to start from the top of the tree and walk down it, looking for nodes where the order of operations of the child is lower than that of the parent. This is true when we hit the multiply node in the first tree (since its add child has a lower order of operations than the multiply parent). So, we need to move the + sign up and move the multiplication down.

The steps that you will follow in order to do this are:

  1. Move the + node into the place of the * node, effectively removing the * node's left child
  2. Remove the + node's right child and replace it with the * node
  3. Set the + node's former right child as the left child of the * node

This will move the add above the multiply which will cause it to be evaluated later. So, how do we do this in code?

Well, the first thing we have to do is to be able to walk our expression tree. In this example we are using almost all binary expressions, so it probably wouldn't be too hard to walk this tree, but not all linq expressions have two children. Some have more properties and expressions to them, and it would require quite a bit of code to walk all the different types of expressions. Luckily Microsoft has helped us out with this. In this MSDN page Microsoft shows you how to create an expression visitor, which is just a class that has knowledge of how to walk an expression tree. All of the methods to handle each type of expression are virtual, so you can plug your own logic in as you walk the tree.

Whatever gets returned from each of these methods is used to build up the resulting tree. In this way, by changing what is returned from these methods, we can manipulate the tree. So, first I am going to create a class called OrderOfOperationsVisitor that descends from the ExpressionVisitor class that I pulled from Microsoft.

public class OrderOfOperationsVisitor : ExpressionVisitor

Okay, so we have our class signature and now we just need to override our method:

protected override Expression VisitBinary(BinaryExpression b)
{
    if ((b.Left is BinaryExpression) && IsChildOrderOfOperationsLower(b, b.Left))
    {
        this.SwapOccurred = true;
        return SwapLeft(b, (BinaryExpression)b.Left);
    }
 
    if ((b.Right is BinaryExpression) && IsChildOrderOfOperationsLower(b, b.Right))
    {
        this.SwapOccurred = true;
        return SwapRight(b, (BinaryExpression) b.Right);
    }
 
    return base.VisitBinary(b);
}

Okay, so this is where the magic begins. You can see that we are looking at each child of our BinaryExpression and testing whether its order of operations is lower. If it is, then we set a property on our class called "SwapOccurred" to true (remember this for later) and then we swap the current node with that child. If neither child is swapped, then we simply continue on down our tree by calling base.VisitBinary(b), which will walk the children of this node.

Since we are only supporting +, -, /, and * our "IsChildOrderOfOperationsLower" method will look like this:

private bool IsChildOrderOfOperationsLower(Expression current, Expression child)
{
    if (current.NodeType == ExpressionType.MultiplyChecked 
        || current.NodeType == ExpressionType.Divide)
    {
        if (child.NodeType == ExpressionType.AddChecked ||
            child.NodeType == ExpressionType.SubtractChecked)
        {
            return true;
        }                
    }            
    return false;
}

And our swap methods look like this:

private Expression SwapLeft(BinaryExpression expression, BinaryExpression left)
{
    Expression right = Expression.MakeBinary(expression.NodeType, left.Right, expression.Right);
    return Expression.MakeBinary(left.NodeType, left.Left, right);            
}
 
private Expression SwapRight(BinaryExpression expression, BinaryExpression right)
{
    Expression left = Expression.MakeBinary(expression.NodeType, expression.Left, right.Left);
    return Expression.MakeBinary(right.NodeType, left, right.Right);
}

Easy enough to see what is happening there. I am creating new expressions to do exactly the process that we outlined above for swapping a node with its parent.

Now we just have to use this class on our newly constructed tree:

var orderOfOperationsVisitor = new OrderOfOperationsVisitor();
mathExpression = orderOfOperationsVisitor.Visit(mathExpression);
while (orderOfOperationsVisitor.SwapOccurred)
{
    orderOfOperationsVisitor = new OrderOfOperationsVisitor();
    mathExpression = orderOfOperationsVisitor.Visit(mathExpression);
}

Notice that we run it once, then we start looping on "SwapOccurred". This is because after we reorder the nodes, we simply return the nodes back from our method and stop walking the tree there. We need to check below that node in the tree to make sure that there isn't anything below it that needs to be swapped. Another issue is that a node may need to be moved up multiple levels. Like this:

[Image: a tree in which a + node has to move up more than one level]

Since we can't walk the tree backwards, we have to make multiple passes in order to get the '+' node to the top.

So, now that we have done all of this, we can look at our expression tree before:

[Image: the expression tree before reordering]

and after:

[Image: the expression tree after reordering]

Success!
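If you want to check the result without the pictures, here is a self-contained sketch that puts the pieces together. It assumes the ExpressionVisitor that ships in System.Linq.Expressions in later versions of the framework, not the MSDN sample class used above. The OrderOfOperationsSketch class, the two example trees and the pass counter are my own additions for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Assumption: derives from the built-in System.Linq.Expressions.ExpressionVisitor.
public class OrderOfOperationsVisitor : ExpressionVisitor
{
    public bool SwapOccurred { get; private set; }

    protected override Expression VisitBinary(BinaryExpression b)
    {
        if ((b.Left is BinaryExpression) && IsChildOrderOfOperationsLower(b, b.Left))
        {
            SwapOccurred = true;
            return SwapLeft(b, (BinaryExpression)b.Left);
        }
        if ((b.Right is BinaryExpression) && IsChildOrderOfOperationsLower(b, b.Right))
        {
            SwapOccurred = true;
            return SwapRight(b, (BinaryExpression)b.Right);
        }
        return base.VisitBinary(b);
    }

    private static bool IsChildOrderOfOperationsLower(Expression current, Expression child)
    {
        bool parentIsMulOrDiv = current.NodeType == ExpressionType.MultiplyChecked
            || current.NodeType == ExpressionType.Divide;
        bool childIsAddOrSub = child.NodeType == ExpressionType.AddChecked
            || child.NodeType == ExpressionType.SubtractChecked;
        return parentIsMulOrDiv && childIsAddOrSub;
    }

    private static Expression SwapLeft(BinaryExpression expression, BinaryExpression left)
    {
        Expression right = Expression.MakeBinary(expression.NodeType, left.Right, expression.Right);
        return Expression.MakeBinary(left.NodeType, left.Left, right);
    }

    private static Expression SwapRight(BinaryExpression expression, BinaryExpression right)
    {
        Expression left = Expression.MakeBinary(expression.NodeType, expression.Left, right.Left);
        return Expression.MakeBinary(right.NodeType, left, right.Right);
    }
}

public static class OrderOfOperationsSketch
{
    private static Expression C(int value) { return Expression.Constant(value); }

    // ((5 + 2) * 5) + 3: the left-to-right tree for "5+2*5+3".
    public static Expression FirstExample()
    {
        return Expression.AddChecked(
            Expression.MultiplyChecked(Expression.AddChecked(C(5), C(2)), C(5)), C(3));
    }

    // ((1 + 2) * 3) * 4: the left-to-right tree for "1+2*3*4".
    public static Expression SecondExample()
    {
        return Expression.MultiplyChecked(
            Expression.MultiplyChecked(Expression.AddChecked(C(1), C(2)), C(3)), C(4));
    }

    // Same loop as above, but also counts how many times the tree was walked.
    public static Expression Reorder(Expression tree, out int passes)
    {
        passes = 0;
        OrderOfOperationsVisitor visitor;
        do
        {
            visitor = new OrderOfOperationsVisitor();
            tree = visitor.Visit(tree);
            passes++;
        } while (visitor.SwapOccurred);
        return tree;
    }

    public static int Evaluate(Expression tree)
    {
        return Expression.Lambda<Func<int>>(tree).Compile()();
    }
}
```

The first example evaluates to 38 before reordering and 18 afterwards, after two passes (one that swaps and one that finds nothing left to do). The second example is the "move up multiple levels" case: it goes from 36 to 25 and needs three passes, because the + has to climb past two * nodes one level at a time.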

Well, I hope that you have found this little two part series on expression trees to be useful. I have attached the source for this project here, so feel free to download it and play around with it!

Added parallel abilities to Dizzy

I have added an "AsParallel()" method to Dizzy which mimics the way that the Parallel Extensions to the .net framework work. So, if you want to call a parallel map function you simply need to do this:

list.AsParallel().Map(n => n.ToUpper());

The AsParallel() method just looks like this:

public static IParallelEnumerable<T> AsParallel<T>(this IEnumerable<T> list)
{
    return new ParallelEnumerable<T>(list);
}

As you can see, it just takes an IEnumerable<T> and wraps it in an IParallelEnumerable<T>. This Interface/Class combo doesn't do any work itself, it is just a flag to use the Parallel version of methods. Currently "Map" is the only method with a parallel version. The ParallelEnumerable class just looks like this:

public class ParallelEnumerable<T> : IParallelEnumerable<T>
{
    private IEnumerable<T> enumerable;        
 
    public ParallelEnumerable(IEnumerable<T> list)
    {
        if (list == null) throw new ArgumentNullException("list");
        this.enumerable = list;
    }
 
    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.enumerable.GetEnumerator();            
    }
 
    public IEnumerator<T> GetEnumerator()
    {
        return this.enumerable.GetEnumerator();            
    }
}

So, you can see that we are just wrapping an IEnumerable and returning the items we need to support the IEnumerable interface. Then we just define a "Map" method that takes an IParallelEnumerable instead of an IEnumerable and voila! Instant parallelism without the need for different method names. The map method is just the same as it was in this post, only the method signature is now changed to look like this:

public static IEnumerable<TResult> Map<TArg, TResult>(
    this IParallelEnumerable<TArg> list,
    Func<TArg, TResult> func)

So, whenever we have an IParallelEnumerable we will call this method instead of our other Map method. We still return an IEnumerable because we don't want additional methods chained to this call to automatically try to use parallel versions.
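Here is a self-contained sketch of how the two Map overloads can sit side by side. It assumes that IParallelEnumerable<T> simply extends IEnumerable<T>, and the body of the parallel Map is a stand-in of mine built on Parallel.For from System.Threading.Tasks, not Dizzy's actual implementation. DizzySketch is a hypothetical class name.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading.Tasks;

// Assumption: the marker interface adds nothing beyond IEnumerable<T>.
public interface IParallelEnumerable<T> : IEnumerable<T> { }

public class ParallelEnumerable<T> : IParallelEnumerable<T>
{
    private readonly IEnumerable<T> enumerable;

    public ParallelEnumerable(IEnumerable<T> list)
    {
        if (list == null) throw new ArgumentNullException("list");
        this.enumerable = list;
    }

    IEnumerator IEnumerable.GetEnumerator() { return this.enumerable.GetEnumerator(); }
    public IEnumerator<T> GetEnumerator() { return this.enumerable.GetEnumerator(); }
}

public static class DizzySketch
{
    public static IParallelEnumerable<T> AsParallel<T>(this IEnumerable<T> list)
    {
        return new ParallelEnumerable<T>(list);
    }

    // Sequential Map: picked when the receiver is a plain IEnumerable<T>.
    public static IEnumerable<TResult> Map<TArg, TResult>(
        this IEnumerable<TArg> list, Func<TArg, TResult> func)
    {
        foreach (TArg item in list) yield return func(item);
    }

    // Parallel Map: picked when the receiver is an IParallelEnumerable<T>,
    // because the compiler prefers the exact parameter type match.
    // Stand-in body: results are written by index so the order is preserved.
    public static IEnumerable<TResult> Map<TArg, TResult>(
        this IParallelEnumerable<TArg> list, Func<TArg, TResult> func)
    {
        var input = new List<TArg>(list);
        var output = new TResult[input.Count];
        Parallel.For(0, input.Count, i => { output[i] = func(input[i]); });
        return output;
    }
}
```

Both list.Map(n => n.ToUpper()) and list.AsParallel().Map(n => n.ToUpper()) give back the same items in the same order; only the second call goes through the parallel overload. One thing to watch for if you try this yourself: importing System.Linq in the same file brings in its own AsParallel() extension method, which would make the call ambiguous.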

So, I hope you found this useful, and I hope you also check out Dizzy. I've added a few other methods, and plan to keep adding methods regularly. If you have any ideas for new methods, please let me know.