codethinked (kōdthĭngked) adj. To be consumed by or obsessed with code.

Keep Your IQueryable In Check

Updated: I've posted a follow-up on this post here.

There is a good chance that you have seen Oren’s recent “Repository Is The New Singleton” post. I wanted to have a quick discussion of his approach, and discuss why I don’t 100% agree with it. Let me first say that I don’t have a problem with the repository pattern. I do have some issue with the way that it is implemented sometimes, but overall I think it provides a good layer in which to encapsulate the queries of your application.

On that note, I do see value in the pattern which Oren refers to as the Query Object pattern. Some people may look at the pattern Oren is using and see “class explosion”, but I think that it all depends on what your preferences are. I think that the repository method can be implemented in most medium-sized applications without much trouble, but when you get to really large applications, you can get into trouble with lots of little queries that can create huge repositories. I’d much rather have one hundred query objects than have repositories with a hundred methods.

Okay, so I agree with Oren that encapsulating queries into objects is not a bad idea, especially if your application has a large number of complex queries. My problem with Oren’s approach comes from the example that he gives of one of these “Query Objects” in a later post (shortened to save space):

public class LatePayingCustomerQuery
{
    /* snip */
    public IQueryable<Customer> GetQuery(ISession session)
    {
        var payments = from payment in session.Linq<Payment>()
                       where payment.IsLate
                       select payment;

        /* snip */

        return from customer in session.Linq<Customer>()
               where payments.Any(payment => customer == payment.Customer)
               select customer;
    }
}

My problem is in returning IQueryable from the “GetQuery” method. Oren does it in order to provide the ability to sort, page, etc… closer to the UI layer, and I agree that this is necessary, but I think that the result needs to be constrained a bit more. Passing back the IQueryable allows much more than just sorting and paging, it really allows heavy modification of the query itself, and in a way that you are really no longer able to control. So it seems to me that Oren’s problem really isn’t so much with the repository pattern as it is with the idea of completely encapsulating all interaction with the ORM inside of the repository. He thinks that this causes us to lose much of the power and flexibility that ORMs provide us. But why can’t there be a middle ground?

So let’s assume that we have a fairly large application with a bunch of these finely crafted query objects, and you have a developer who has the above query, but decides that they just need to add one more filter to it. Now I know that people will say that if you have a developer who breaks the rules then you probably don’t want them to work on your team, but this is the real world. And in the real world people cut corners. So they do something like this:

latePayingCustomerQuery
  .Where(c => c.LastName.StartsWith("E"))
  .Skip((page - 1) * itemCount)
  .Take(itemCount);

This developer may have known better, or maybe they didn’t, but what they did was change the query so that we are now potentially filtering against a non-indexed string column in the database. And there is really no limit to what the developer could do to the query. I can already hear people screaming about bad developers and not wanting someone on the team who cuts corners, etc…. but I honestly don’t want to hear it. Quite often I don’t get to choose who I have to work with, especially when working with clients. So I don’t want to hear about your fairytale land where all developers do the right thing all the time.

So we really have two conflicting requirements: we want to provide the flexibility of paging and sorting up closer to the UI, but we also want to lock down the queries a bit. So what if we defined our own interfaces which we could use to wrap IQueryable? Something like this:

public interface IPageable<T>: IEnumerable<T>
{
    ISortable<T> Page(int page, int itemCount);
}

public interface ISortable<T>: IEnumerable<T>
{
    ISortable<T> OrderBy<U>(Expression<Func<T,U>> orderBy);
    ISortable<T> OrderByDescending<U>(Expression<Func<T,U>> orderBy);
}

This would allow us to have a result type of IPageable<T> which the developer could use like this:

query.FindAll().Page(1, 3).OrderBy(q => q.PostedDate);

Looks pretty good. The implementation of the IPageable interface could look something like this:

public class Pageable<T>: IPageable<T>
{
    private readonly IQueryable<T> queryable;

    public Pageable(IQueryable<T> queryable)
    {
        this.queryable = queryable;
    }

    public ISortable<T> Page(int page, int itemCount)
    {
        return new Sortable<T>(
            queryable
                .Skip((page - 1) * itemCount)
                .Take(itemCount));
    }

    public IEnumerator<T> GetEnumerator()
    {
        return queryable.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

Creating a pageable item would be as simple as just passing any IQueryable to it. Then all you have to do is pass a page number and an item count and it returns a sortable object. The sortable object would look something like this:

public class Sortable<T>: ISortable<T>
{
    private IQueryable<T> queryable;

    public Sortable(IQueryable<T> queryable)
    {
        this.queryable = queryable;
    }

    public ISortable<T> OrderBy<U>(Expression<Func<T,U>> orderBy)
    {
        queryable = queryable.OrderBy(orderBy);
        return this;
    }

    public ISortable<T> OrderByDescending<U>(Expression<Func<T,U>> orderBy)
    {
        queryable = queryable.OrderByDescending(orderBy);
        return this;
    }

    public IEnumerator<T> GetEnumerator()
    {
        return queryable.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

Here you can see that it allows us to chain sortable calls so that we can still order by multiple columns like this:

sortable.OrderBy(q => q.PostedDate).OrderBy(q => q.Title);
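One caveat on the chaining above: with the standard LINQ operators, a second OrderBy starts a new primary sort rather than adding a tie-breaker (a database provider may even drop the earlier ordering entirely), and ThenBy is the operator meant for secondary keys. Here is a minimal sketch of a wrapper that keeps the same fluent OrderBy surface but switches to ThenBy after the first call. The class name "ChainedSortable" is my own invention for illustration and is not part of the code above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical variant of Sortable<T>: the first ordering call sorts,
// later calls add tie-breakers through ThenBy / ThenByDescending.
public class ChainedSortable<T> : IEnumerable<T>
{
    private readonly IQueryable<T> source;
    private IOrderedQueryable<T> ordered; // null until the first ordering call

    public ChainedSortable(IQueryable<T> source)
    {
        this.source = source;
    }

    public ChainedSortable<T> OrderBy<U>(Expression<Func<T, U>> orderBy)
    {
        ordered = ordered == null
            ? source.OrderBy(orderBy)
            : ordered.ThenBy(orderBy);
        return this;
    }

    public ChainedSortable<T> OrderByDescending<U>(Expression<Func<T, U>> orderBy)
    {
        ordered = ordered == null
            ? source.OrderByDescending(orderBy)
            : ordered.ThenByDescending(orderBy);
        return this;
    }

    public IEnumerator<T> GetEnumerator()
    {
        // Fall back to the unsorted source if no ordering was requested.
        IQueryable<T> query = ordered;
        if (query == null)
        {
            query = source;
        }
        return query.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

With this variant, calling OrderBy on "PostedDate" and then on "Title" sorts by date first and uses the title only to break ties, which is what most callers would expect from the chained syntax.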

At this point you may be thinking that by exposing the “OrderBy” and “OrderByDescending” methods we have opened up the bad querying box just as much as passing back IQueryable. Well, with the code provided this is true…but if you use a bit of imagination you can also see how we could use these classes to limit the columns on which different types are allowed to be ordered by simply walking the expression tree when the enumerator is requested or intercepting the calls to “OrderBy” and “OrderByDescending”. It is also possible to create similar interfaces in this same way to provide constrained interfaces to other types of functionality such as projections.
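To make the "intercepting the calls" idea a little more concrete, here is a rough sketch of a sortable wrapper that only accepts a whitelist of members. The class name "RestrictedSortable" and the choice to pass allowed member names as strings are assumptions made for this illustration; a real version might read the whitelist from attributes or configuration instead.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical wrapper: ordering is only permitted on whitelisted members.
public class RestrictedSortable<T> : IEnumerable<T>
{
    private IQueryable<T> queryable;
    private readonly HashSet<string> allowedMembers;

    public RestrictedSortable(IQueryable<T> queryable, params string[] allowedMembers)
    {
        this.queryable = queryable;
        this.allowedMembers = new HashSet<string>(allowedMembers);
    }

    public RestrictedSortable<T> OrderBy<U>(Expression<Func<T, U>> orderBy)
    {
        EnsureAllowed(orderBy);
        queryable = queryable.OrderBy(orderBy);
        return this;
    }

    public RestrictedSortable<T> OrderByDescending<U>(Expression<Func<T, U>> orderBy)
    {
        EnsureAllowed(orderBy);
        queryable = queryable.OrderByDescending(orderBy);
        return this;
    }

    private void EnsureAllowed(LambdaExpression orderBy)
    {
        // Accept only a direct member access on the lambda parameter
        // (q => q.PostedDate). Method calls, nested members, and computed
        // keys all fail this check and are rejected.
        var member = orderBy.Body as MemberExpression;
        if (member == null
            || member.Expression != orderBy.Parameters[0]
            || !allowedMembers.Contains(member.Member.Name))
        {
            throw new InvalidOperationException(
                "Sorting by '" + orderBy.Body + "' is not allowed.");
        }
    }

    public IEnumerator<T> GetEnumerator()
    {
        return queryable.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

This version checks each call as it is made rather than walking the whole tree when the enumerator is requested, so a disallowed sort fails at the line that asked for it.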

Overall I think that ORMs have certainly revolutionized the way that developers think about data access, but I think we really need to be careful about letting IQueryable (or any other querying mechanism) flow up through our applications because all we are doing is letting queries spread all over. I pretty much agree with Oren in his thoughts that the repository might not be the solution to everything, and that we need to think about data access in a slightly different way, but I also think that there is some serious need for a bit of control.

So what are your thoughts? What are the problems that you see in this approach? Do you feel like we should let IQueryable go throughout our applications? Do you think I am being too Byzantine? Let me know!

Making the Entity Framework Fit Your Domain - Part 2

Wow, I didn't realize when I started this series that it would take me this long to get to part 2. Sorry about that guys (and ladies)! If you have forgotten about the first part of this post, then you can go check it out.

In the first part I talked about getting up to the point where I realized that without going IPOCO I would not be able to use the Entity Framework with any sort of approximation of a real application domain. In this post we are going to go over the entity that I have created and talk about the issues that I had along the way.

In the previous post I showed that in order to create an entity that didn't involve descending from a base class, I would need to implement a class like this:

public class Entity : IEntityWithKey, IEntityWithChangeTracker, IEntityWithRelationships

The first interface "IEntityWithKey" is actually optional, but according to the docs, leaving it out will decrease performance and increase memory usage. Well, hmmmmm, that makes it sound not so optional anymore, so I went ahead and implemented it. This interface can be implemented entirely within the base entity class:

EntityKey _entityKey;        
EntityKey IEntityWithKey.EntityKey
{
    get
    {
        return _entityKey;
    }
    set
    {
        SetMemberChanging(StructuralObject.EntityKeyPropertyName);
        _entityKey = value;
        SetMemberChanged(StructuralObject.EntityKeyPropertyName);
    }
}

This is a property that the consumer doesn't really need to worry about; the entity framework uses it for its own internal purposes.

The second interface is required, and lets your object inform the entity framework when properties have changed. The entity framework obviously wants to do change tracking, and it requires the implementer to report changes themselves. If you have ever used NHibernate, you will know that it uses transparent proxies in order to do automatic change tracking. Since the entity framework doesn't implement dynamic proxies and also doesn't provide any compile time generated proxies, we are forced to implement this ourselves.

My first reaction was to just use DynamicProxy2 in order to generate proxies and do change tracking like that! This is the same library from the Castle Project that NHibernate uses to create its proxies. The issue immediately came up when I realized that the Entity Framework had no way to construct objects from factory methods. In order to create these proxies I would need to set up my entity like this:

protected User()
{            
}

public static User Create()
{
    var proxy = new ProxyGenerator();
    return proxy.CreateClassProxy<User>(new PropertyChangeInterceptor());
}

Not only would the entity framework not be able to construct these proxies, but it even blew up when I constructed one of these classes on my own and tried to pass it as a new entity to the Entity framework. Clearly the entity framework did not like working with runtime generated proxies. The reason for this is that DynamicProxy2 does its magic by creating a subclass of our "User" class at runtime and then passes that class back as a "User" class. So for the code in your application, it looks exactly like a "User" class. This requires that all of your properties and methods be virtual in order to intercept them though. If you look below you will see how the proxy wraps the calls to the underlying type and allows code to be inserted.

 image
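To make the subclassing idea concrete, here is roughly what a hand-written equivalent of such a proxy looks like. This is only an illustration: "HandWrittenUserProxy" and the callback it takes are names I made up, and DynamicProxy2 generates its subclass at runtime and routes calls through interceptors rather than a simple delegate.

```csharp
using System;
using System.Collections.Generic;

public class User
{
    protected User()
    {
    }

    // Must be virtual, otherwise a subclass has no way to intercept it.
    public virtual string Username { get; set; }
}

// Hand-written stand-in for the subclass a proxy generator would emit.
public class HandWrittenUserProxy : User
{
    private readonly Action<string> onChanged;

    public HandWrittenUserProxy(Action<string> onChanged)
    {
        this.onChanged = onChanged;
    }

    public override string Username
    {
        get { return base.Username; }
        set
        {
            base.Username = value;   // forward to the real implementation
            onChanged("Username");   // the injected behavior
        }
    }
}
```

Code holding a "User" reference can't tell the difference, but the runtime type is the subclass and not "User" itself, which is the part that the Entity Framework trips over.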

But since the Entity Framework doesn't like this kind of proxy, we are going to have to look elsewhere. So instead of runtime generated proxies, I decided to look into using PostSharp to do some compile time method interception. This is exactly what it sounds like, PostSharp actually modifies your assemblies post-compile and allows you to inject code around methods and field accesses.

What we will first need to do is put methods in our base entity class to report property changes:

public void SetMemberChanging(string member)
{
    if (_changeTracker != null)
    {
        _changeTracker.EntityMemberChanging(member);    
    }            
}

public void SetMemberChanged(string member)
{
    if (_changeTracker != null)
    {
        _changeTracker.EntityMemberChanged(member);    
    }            
}

The "_changeTracker" variable is actually passed into the entity through a method that is defined by the "IEntityWithChangeTracker" interface. If we were to implement this manually, then the "Username" property would look like this:

[EdmScalarProperty(IsNullable = false)]        
public string Username
{
    get
    {
        return this._Username;
    }
    set
    {                        
        SetMemberChanging("Username");
        this._Username = value;
        SetMemberChanged("Username");
    }
}

And we would have to do this with every single property that we were tracking with the entity framework. Instead, with PostSharp we will need to create a class which inherits from "OnMethodBoundaryAspect". This aspect allows us to inject code around a method call. I would explain this further, but this isn't a tutorial for PostSharp, so I recommend that you go check it out. This aspect that we are going to create will be applied to our entity classes.

public class ChangeTrackingAspectAttribute: OnMethodBoundaryAspect
{
    public override bool CompileTimeValidate(MethodBase method)
    {            
        if (method.Name.StartsWith("set_"))
        {
            return true;
        }            
        return false;
    }

    public override void OnEntry(MethodExecutionEventArgs eventArgs)
    {
        string propertyName = eventArgs.Method.Name.Substring(4);
        PropertyInfo pi = eventArgs.Instance.GetType().GetProperty(propertyName);
        if (pi.IsDefined(typeof(EdmScalarPropertyAttribute), false))
        {
            var changeTrackingEntity = eventArgs.Instance as Entity;
            if (changeTrackingEntity != null)
            {
                changeTrackingEntity.SetMemberChanging(propertyName);
            }    
        }                        
    }

    public override void OnExit(MethodExecutionEventArgs eventArgs)
    {
        string propertyName = eventArgs.Method.Name.Substring(4);
        PropertyInfo pi = eventArgs.Instance.GetType().GetProperty(propertyName);
        if (pi.IsDefined(typeof(EdmScalarPropertyAttribute), false))
        {
            var changeTrackingEntity = eventArgs.Instance as Entity;
            if (changeTrackingEntity != null)
            {
                changeTrackingEntity.SetMemberChanged(propertyName);
            }
        }            
    }
}

This class looks for methods that start with "set_" and applies the aspect to them. Then at runtime we look for the "EdmScalarPropertyAttribute" in order to decide whether to call the "SetMemberChanging" and "SetMemberChanged" methods. There could certainly be some more caching and optimizations, but for now this will do. We have implemented some very basic "change tracking" on entities while only placing an attribute on our entity class.
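As for the caching, one possible shape is to remember, per setter, whether its property is tracked so that the reflection lookup only happens once. This is a sketch under a few assumptions: "TrackedAttribute" is a stand-in for "EdmScalarPropertyAttribute" so that the snippet compiles on its own, "SampleEntity" and "TrackedPropertyCache" are made-up names, the property is looked up on the setter's declaring type rather than on the instance type, and ConcurrentDictionary requires .NET 4 or later.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Stand-in for EdmScalarPropertyAttribute (assumption, see above).
[AttributeUsage(AttributeTargets.Property)]
public class TrackedAttribute : Attribute
{
}

// Made-up entity used only to exercise the cache.
public class SampleEntity
{
    [Tracked]
    public string Name { get; set; }

    public int Age { get; set; }
}

public static class TrackedPropertyCache
{
    // Maps a setter to its property name, or to null when the
    // property is not tracked.
    private static readonly ConcurrentDictionary<MethodBase, string> cache =
        new ConcurrentDictionary<MethodBase, string>();

    // Returns the property name for a tracked setter, otherwise null.
    public static string GetTrackedName(MethodBase setter)
    {
        return cache.GetOrAdd(setter, Resolve);
    }

    // Runs the reflection lookup; GetOrAdd stores the result for next time.
    private static string Resolve(MethodBase setter)
    {
        string propertyName = setter.Name.Substring(4); // strip "set_"
        PropertyInfo pi = setter.DeclaringType.GetProperty(propertyName);
        bool tracked = pi != null && pi.IsDefined(typeof(TrackedAttribute), false);
        return tracked ? propertyName : null;
    }
}
```

The aspect's "OnEntry" and "OnExit" methods could then ask the cache for the name using "eventArgs.Method" and skip the call when it comes back null, instead of repeating the "GetProperty" and "IsDefined" calls on every set.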

The last interface "IEntityWithRelationships" requires a bit more work to get implemented. It also requires us to dirty up our domain entity a bit more than we have had to so far. In the next entry in this series, I'll show you how I implemented "IEntityWithRelationships" and then provide the full source for the base entity. You'll start to see that you can use the Entity Framework within your own domain, but is all of this work worth it? Well that all depends on why you are using the Entity Framework. If you remember, I started this series because I wanted to see if I could fit the Entity Framework into my own domain and be happy with it. So far the results haven't been too bad, and we'll see in the next post where this will go. Stay tuned!

Making the Entity Framework Fit Your Domain - Part 1

I'm assuming that like myself, many of you out there work for companies that base much of their IT infrastructure (or at least software development tools) around Microsoft products. So, when a new tool like the Entity Framework comes out, even if you are not a fan, you still need to have a solid knowledge of it because you are going to have to use it at some point. At this point most of my ORM experiences have been with NHibernate, but I still feel the need to explore the Entity Framework to see if I can make it palatable for me to use. I say "palatable" because of the fact that the Entity Framework is designed almost entirely around database first design, which is not the way that I like to design my applications.

My goal with this post is not to trash talk the entity framework, but instead to take it as far as I can toward a usable solution that I would be okay with putting into a production application. This post is going to be written as I explore, so please let me know if you see anything that is wrong or missing.

Let's first talk about the domain that we are getting ready to look at. It is going to be a very simple domain, because otherwise it would just overwhelm the blog post by introducing too much complexity. I do want to have enough entities though so that you can see where each technology differs. What we are going to do is start off with a scenario that everyone is familiar with... a user with groups and roles. The user will also have a list of addresses associated with it.

Database Schema

We have 4 main tables along with two join tables. I would explain this schema to you, but if you don't get the schema then this article might be confusing anyway so I'm not going to waste everybody else's time with it. Basically we have already started designing this application in a way that would bother most people who are using DDD. Normally I wouldn't start with the database, but since the Entity Framework essentially forces you into starting with the database first, we are going to take this approach. The reason that I say that the Entity Framework forces you into database first design is because the primary method of generating an EF model is to generate it off the database. And then later on as you make changes, you can then update your model to reflect those changes.

At this point in the process the Entity Framework allows us to get up and running very quickly. We simply add a new Entity Data Model:

image

Then we get a wizard that lets us connect the model to our database and generate our entities. So, in just a few seconds we are looking at this:

Entity Data Model

Kinda cool actually. It knows about our join tables and generates many to many relationships automatically. It doesn't however know about join tables with payloads, but then again there is ambiguity about how we might want that sort of data modeled in our app. So now we have our entities in our Entity Data Model, but where are we really at this point? Well, we are actually already at the point where we can create new entities and save them off to the database.

var user = new User();
user.Username = "TestUser";
user.EmailAddress = "test@test.com";

var address1 = new Address();
address1.Street = "111 Test Street";
address1.City = "Test City";
address1.State = "Virginia";
address1.PostalCode = "22055";

var address2 = new Address();
address2.Street = "222 Test Street";
address2.City = "Test City";
address2.State = "Virginia";
address2.PostalCode = "23000";

user.Addresses.Add(address1);
user.Addresses.Add(address2);

var entities = new TestEFAppEntities();
entities.AddToUserSet(user);
entities.SaveChanges(true);

That was painless, wasn't it? But where did the "User" and "Address" classes come from? I don't remember creating any classes... But that is because we didn't. The Entity framework spit out all of these classes into a file that is hidden under our Entity Model called "Domain.Designer.cs". Here is the user class that was generated for the model (I removed all the comments so that it would only be kinda huge) ;-)

[global::System.Data.Objects.DataClasses.EdmEntityTypeAttribute(NamespaceName="TestEFAppModel", Name="User")]
[global::System.Runtime.Serialization.DataContractAttribute(IsReference=true)]
[global::System.Serializable()]
public partial class User : global::System.Data.Objects.DataClasses.EntityObject
{
    public static User CreateUser(int id, string username, string emailAddress)
    {
        User user = new User();
        user.Id = id;
        user.Username = username;
        user.EmailAddress = emailAddress;
        return user;
    }

    [global::System.Data.Objects.DataClasses.EdmScalarPropertyAttribute(EntityKeyProperty=true, IsNullable=false)]
    [global::System.Runtime.Serialization.DataMemberAttribute()]
    public int Id
    {
        get
        {
            return this._Id;
        }
        set
        {
            this.OnIdChanging(value);
            this.ReportPropertyChanging("Id");
            this._Id = global::System.Data.Objects.DataClasses.StructuralObject.SetValidValue(value);
            this.ReportPropertyChanged("Id");
            this.OnIdChanged();
        }
    }
    private int _Id;
    partial void OnIdChanging(int value);
    partial void OnIdChanged();

    [global::System.Data.Objects.DataClasses.EdmScalarPropertyAttribute(IsNullable=false)]
    [global::System.Runtime.Serialization.DataMemberAttribute()]
    public string Username
    {
        get
        {
            return this._Username;
        }
        set
        {
            this.OnUsernameChanging(value);
            this.ReportPropertyChanging("Username");
            this._Username = global::System.Data.Objects.DataClasses.StructuralObject.SetValidValue(value, false);
            this.ReportPropertyChanged("Username");
            this.OnUsernameChanged();
        }
    }
    private string _Username;
    partial void OnUsernameChanging(string value);
    partial void OnUsernameChanged();

    [global::System.Data.Objects.DataClasses.EdmScalarPropertyAttribute(IsNullable=false)]
    [global::System.Runtime.Serialization.DataMemberAttribute()]
    public string EmailAddress
    {
        get
        {
            return this._EmailAddress;
        }
        set
        {
            this.OnEmailAddressChanging(value);
            this.ReportPropertyChanging("EmailAddress");
            this._EmailAddress = global::System.Data.Objects.DataClasses.StructuralObject.SetValidValue(value, false);
            this.ReportPropertyChanged("EmailAddress");
            this.OnEmailAddressChanged();
        }
    }
    private string _EmailAddress;
    partial void OnEmailAddressChanging(string value);
    partial void OnEmailAddressChanged();

    [global::System.Data.Objects.DataClasses.EdmRelationshipNavigationPropertyAttribute("TestEFAppModel", "FK_Addresses_Users", "Addresses")]
    [global::System.Xml.Serialization.XmlIgnoreAttribute()]
    [global::System.Xml.Serialization.SoapIgnoreAttribute()]
    [global::System.Runtime.Serialization.DataMemberAttribute()]
    public global::System.Data.Objects.DataClasses.EntityCollection<Address> Addresses
    {
        get
        {
            return ((global::System.Data.Objects.DataClasses.IEntityWithRelationships)(this)).RelationshipManager.GetRelatedCollection<Address>("TestEFAppModel.FK_Addresses_Users", "Addresses");
        }
        set
        {
            if ((value != null))
            {
                ((global::System.Data.Objects.DataClasses.IEntityWithRelationships)(this)).RelationshipManager.InitializeRelatedCollection<Address>("TestEFAppModel.FK_Addresses_Users", "Addresses", value);
            }
        }
    }

    [global::System.Data.Objects.DataClasses.EdmRelationshipNavigationPropertyAttribute("TestEFAppModel", "UserXGroups", "Groups")]
    [global::System.Xml.Serialization.XmlIgnoreAttribute()]
    [global::System.Xml.Serialization.SoapIgnoreAttribute()]
    [global::System.Runtime.Serialization.DataMemberAttribute()]
    public global::System.Data.Objects.DataClasses.EntityCollection<Group> Groups
    {
        get
        {
            return ((global::System.Data.Objects.DataClasses.IEntityWithRelationships)(this)).RelationshipManager.GetRelatedCollection<Group>("TestEFAppModel.UserXGroups", "Groups");
        }
        set
        {
            if ((value != null))
            {
                ((global::System.Data.Objects.DataClasses.IEntityWithRelationships)(this)).RelationshipManager.InitializeRelatedCollection<Group>("TestEFAppModel.UserXGroups", "Groups", value);
            }
        }
    }
}

Hmmmm.... so where is my domain object in there? In fact, all of the domain objects are generated into a single file like this. You get to extend your domain objects by using partial classes. The partial classes that we implement allow us to take advantage of some partial methods that are in the generated classes. As you can see from this code in one of the above setters:

set
{
    this.OnUsernameChanging(value);
    this.ReportPropertyChanging("Username");
    this._Username = global::System.Data.Objects.DataClasses.StructuralObject.SetValidValue(value, false);
    this.ReportPropertyChanged("Username");
    this.OnUsernameChanged();
}

We have two partial methods "OnUsernameChanging" and "OnUsernameChanged" along with calls to two methods inherited from the base class, "ReportPropertyChanging" and "ReportPropertyChanged". So when we create our partial classes we can tap into the partial methods. This way if we wanted to intercept the setting of our Username property we could implement a partial class like this (if you haven't ever used partial classes, then go here):

public partial class User
{
    partial void OnUsernameChanging(string value)
    {            
        // check value and do something here
    }
}
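If you want to see the whole mechanism in one place without the Entity Framework involved, here is a small self-contained sketch. The "Account" class is made up for this example; the first half plays the role of the generated file and the second half is the part you would write yourself.

```csharp
using System;

// Plays the role of the designer-generated half.
public partial class Account
{
    private string _name;

    public string Name
    {
        get { return _name; }
        set
        {
            OnNameChanging(value);
            _name = value;
            OnNameChanged();
        }
    }

    // Declarations only. If no other part supplies a body, the compiler
    // removes the calls to these methods entirely.
    partial void OnNameChanging(string value);
    partial void OnNameChanged();
}

// The hand-written half, which would live in its own file.
public partial class Account
{
    partial void OnNameChanging(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            throw new ArgumentException("Name must not be empty.", "value");
        }
    }
}
```

Here "OnNameChanged" is never implemented, so the call in the setter simply disappears at compile time, while "OnNameChanging" runs before the backing field is assigned and can veto the change by throwing.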

But what if we want to do something when the property is retrieved? We don't really have a lot of options. The getter in the generated user class looks like this:

get
{
    return this._Username;
}

Hmmm. So I guess they don't want us to hook into the getter! Having user code hooked into the getter might have caused issues at some infrastructure level for them, so I'm going to give them the benefit of the doubt! But not too much. At first I thought that maybe I could go to the EDM (Entity Data Model) and make the getter and setter on the property private or protected, then rename the property to something like "UsernameInternal". Then I could introduce a completely new property in the partial class to expose this property:

public string Username
{
    get
    {
        return this.UsernameInternal;
    }
    set
    {
        this.UsernameInternal = value;
    }
}

The only problem is that because we have wrapped the EDM generated property, we can no longer query against it! How would I write this query?

var userQuery = from u in entities.UserSet 
    where u.Username == "TestUser" 
    select u;

Sure we can put "Username" in the query because we have exposed our own property, but since this property is not in the EDM, the Entity Framework has no knowledge of this property so we can't query against it! Dang. And to think I was going to try and use this same technique to implement some lazy loading. Hmmmmmmm. So if we want to be able to query against any property then we are going to have to directly expose the property that has the "EdmScalarPropertyAttribute" on it. This is the attribute that signifies to the Entity Framework that this property is a mapped scalar property.

This property has to share the name of the property tag in our CSDL file. Sadly, in order to get any control over my classes, it looks like I am going to have to ditch the Entity Designer altogether. Which isn't necessarily a bad thing considering that all of the xml and class files are generated into only two files which creates an interesting situation for teams that are editing these files independently. Definitely a merge nightmare. What blows my mind is that they also actually store the metadata for the EDM diagram right in with the mapping xml.

Before you start thinking that I ditched the Entity Designer too soon, another issue here is that all of the generated entities descend from a base EntityObject class that keeps us from forming our own object hierarchy. And we can't edit any of the generated classes or else all of our changes will be overwritten whenever the model is regenerated. Another issue is that we don't really have any control over our object lifetime. The Entity Designer spits out default public constructors and a default factory method containing parameters for all properties of the object. Why? I don't understand why they just didn't leave off the constructor and factory method and let me define them in my partial classes. I can't remove them from the generated classes, but I could easily add them if they weren't there. Geeeez.

Alright, so I mentioned that we need to ditch the EDM designer, so what do we use now? Thankfully Microsoft provided us a tool called EdmGen.exe. This is a command-line tool that we can use to generate the mapping files for our database:

 image

EdmGen actually spits out different files for everything. And when I say "everything" I mean only the different file types. So we get these files:

image

What we are going to do here is ditch the ObjectLayer.cs and Views.cs files and create our own Entities. Hopefully we can isolate most of the EF specific code into a base entity class. But since we are going to remove generated entities we are going to have to keep the behavior that the previously generated classes had. This is where that IPOCO stuff comes in. IPOCO allows you to implement a few interfaces (in our case three) instead of using the base EntityObject class. Our base entity class' definition will end up looking like this:

public class Entity : IEntityWithKey, IEntityWithChangeTracker, IEntityWithRelationships

These are the three interfaces that we must implement in order to use this with our EDM. The first just gives the entity framework something to identify your entities by, the second provides a change tracking mechanism, and the third provides a way for your entity to hold its relationships. These are fairly self-explanatory, and luckily we can isolate most of the behavior for these into our base class. So our goal is to have business entities that lack Entity Framework specific details, and which expose no Entity Framework specific types. This post is getting a bit long, so I am going to leave it off right there for now...

In future parts of this series we will break down the base entity class that I have created and take a look at the different entities and how we can create them. We will also take a look at creating an ObjectContext and show you how we can use and query these entities just like we could if we created them through the EDM designer. I will also provide you with the full source to the project that I am using in this series, so stay tuned!

The Data Disconnect

I saw yesterday on Twitter that Sam Gentile had responded to Oren Eini's "Impedance Mismatch and System Evolution" post that was actually a response to Stephen Forte's "Impedance Mismatch" post. Phew, you got that? That is why I love the blogger world, it is like one big distributed, asynchronous conversation! In these posts there was a line stated by Stephen Forte that Oren had a problem with:

My first problem with ORMs in general is that they force you into a "objects first" box. Design your application and then click a button and magically all the data modeling and data access code will work itself out. This is wrong because it makes you very application centric and a lot of times a database model is going to support far more than your application.

Oren then went on to say that he "absolutely rejected" the statement that the database model is going to support far more than your application. Sam Gentile then chimed in that he fully agreed with this statement. I also agree with Oren's statement, but that isn't really why I am writing this blog post. I am writing this blog post because I think that we are at the edge of a huge schism in the data-centric application development community.

If you read Stephen Forte's post you will see that it was actually started by talking about the Entity Framework, and I think that is an important note to make. Why? Because these tools' power is manifested by a fundamental shift in who owns an application's data. So, who does own the application's data? Well, I think that you can tell which way I lean just by the way I worded my last sentence. In my mind, the data is owned by the application that is reading and writing the data, and hopefully most people will agree with this. But what happens when you have numerous applications reading and writing the same data? Well, you have to move the control over the data into the database because the database becomes the single point through which all data passes.

Historically, stored procedures have been viewed as the data gatekeepers. You put a bunch of stored procs in your database and they shield your tables from bad input, incorrect or incomplete data, etc. In fact, many people have argued for putting business logic in stored procedures, because how else would you make sure valid data is entering the database if you have a multitude of applications hitting these stored procs? Well, you can't really. If you don't put business logic into your stored procs, then your data is only as good as the individual application that is entering the data. It is the proverbial "weakest link" problem.

How do you combat this? It is simple, really: you move from a database-centric view to an application-centric view, the very thing that Stephen was speaking out against above. And I know that there are going to be people out there who will throw their hands up in the air and say "blasphemy!" when they read that. But I am not arguing that we throw the database out, or marginalize its importance. I am saying that the single point through which all data passes needs to be pushed up. There needs to be application code responsible for transforming and validating data. In fact, most well-thought-out applications are using an architecture like this already. They have an abstraction layer that sits between the database and the domain model and shields them from schema changes, and they also have business logic that ensures good data. The issue is that there may be several applications hitting a database, each with its own abstraction layer operating directly against the database schema.

All of that data should be funneled through a single application! If you see the need for two applications to be hitting the same schema, then you should at least define a thin service layer between the database and those two applications. Even if the applications need the same schema now, they most likely won't need the same schema a few months, or years, from now. Business logic just doesn't work like that.
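
To make that a bit more concrete, here is a minimal sketch of what such a thin service layer might look like. Treat it as an illustration only: the names (ICustomerService, CustomerService, CustomerDto) are made up for this example, the validation rules are deliberately naive, and an in-memory dictionary stands in for the real database. The point is simply that both applications talk to the same contract, and the validation and schema knowledge live in exactly one place.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contract shared by every client application.
// Nothing in it exposes table or column names.
public class CustomerDto
{
    public string Name { get; set; }
    public string Email { get; set; }
}

public interface ICustomerService
{
    int Register(CustomerDto customer);
    CustomerDto Find(int id);
}

// The single place where validation and schema knowledge live.
public class CustomerService : ICustomerService
{
    // Assumption: this dictionary stands in for the real database.
    private readonly Dictionary<int, string[]> rows = new Dictionary<int, string[]>();
    private int nextId = 1;

    public int Register(CustomerDto customer)
    {
        if (customer == null)
            throw new ArgumentNullException("customer");
        if (string.IsNullOrEmpty(customer.Name) || customer.Name.Trim().Length == 0)
            throw new ArgumentException("Name is required.");
        if (string.IsNullOrEmpty(customer.Email) || !customer.Email.Contains("@"))
            throw new ArgumentException("Email is not valid.");

        // Normalize before storing, so every caller gets the same clean data.
        int id = nextId++;
        rows[id] = new[] { customer.Name.Trim(), customer.Email.Trim().ToLowerInvariant() };
        return id;
    }

    public CustomerDto Find(int id)
    {
        string[] row;
        if (!rows.TryGetValue(id, out row))
            return null;

        // Map the stored shape back to the public contract.
        return new CustomerDto { Name = row[0], Email = row[1] };
    }
}

public static class Program
{
    public static void Main()
    {
        ICustomerService service = new CustomerService();

        // One application (say, billing) writes through the service...
        int id = service.Register(new CustomerDto { Name = " Ada ", Email = "Ada@Example.com" });

        // ...and another (say, support) reads through the same service.
        CustomerDto found = service.Find(id);
        Console.WriteLine(found.Name + " <" + found.Email + ">");

        // Bad input is rejected no matter which application sends it.
        try
        {
            service.Register(new CustomerDto { Name = "", Email = "nobody" });
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Rejected: " + ex.Message);
        }
    }
}
```

If the storage layout changes later, only the mapping inside CustomerService has to change. Neither calling application ever sees it.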

I am not advocating anything new here. People have been discussing it for years; it is called Service Oriented Architecture. And yes, I know that SOA is probably the most overloaded term in the history of software development, but in its most basic form it represents an architecture where you have a bunch of loosely connected services. So, what are those services? Well, they could take many forms, but they are all going to be some application that is sitting in front of a data back-end. And in this layer, all of our bajillion lines of business and application logic reside. The data is abstracted behind many layers so that we can mold, transform, and even query from multiple sources. All of this is hidden from the application or person that requests the data, just as it should be.

If you have a single rich layer of business and application logic that can be as thin or as thick as you need it, that can be changed and molded to produce the data you need, and that can keep knowledge of your schema out of the hands of many other applications, then how can this be a bad thing? I would think that even the data people would love this, because it means that the database schema can be much more flexible and can change more rapidly to suit the needs of the company. Or it may not need to change at all, since the service layer can transform the data, if the needs of the enterprise dictate this.

But where do the ORM tools come into this? If you are looking at a database as the center of the application, with multiple applications hitting the same database, then an ORM solution will probably look less realistic to you. An ORM's biggest advantages come into play when it generates its own SQL (so as to avoid having to maintain separate stored procs) and when you let the database schema stay relatively simple and have the translation layer massage the data into the exact format you need. These goals can often conflict with the database-centric view of an application, which likes to keep more of this control inside the database.

So, is the data divide that I am talking about just an argument between the data-centric and service-centric approaches? That is the way that I see it. Did I think that we would still be arguing this point in 2008? Nope.