Showing posts with label IRepository. Show all posts
Showing posts with label IRepository. Show all posts

Friday, August 08, 2008

The Queryable Domain Property Problem

LINQ has revolutionised the way we do data access. Being able to fluently describe queries in C# means that you never have to write a single line of SQL again. Of course LINQ isn't the only game in town. NHibernate has a rich API for describing queries as do most mature ORM tools. But to be a player in the .NET ORM game you simply have to provide a LINQ IQueryable API. It's been really nice to see the NHibernate-to-LINQ project take off and apparently LLBLGen Pro has an excellent LINQ implementation too.

Now that we can write our queries in C# it should mean that we can have completely DRY business logic. No more duplicate rules, one set in SQL, the other in the domain classes. But there's a problem: LINQ doesn't understand IL. If you write a query that includes a property or method, LINQ-to-SQL can't turn the logic encapsulated by it into a SQL statement.

To illustrate the problem take this simple schema for an order:

queriable_scema

Let's use the LINQ-to-SQL designer to create some classes:

queriable_classes

Now lets create a 'Total' property for the order that calculates the total by summing the order lines' quantities times their product's price.

public decimal Total
{
get
{
    return OrderLines.Sum(line => line.Quantity * line.Product.Price);
}
}

Here's a test to demonstrate that it works

[Test]
public void Total_ShouldCalculateCorrectTotal()
{
const decimal expectedTotal = 23.21m + 14.30m * 2 + 7.20m * 3;

var widget = new Product { Price = 23.21m };
var gadget = new Product { Price = 14.30m };
var wotsit = new Product { Price = 7.20m };

var order = new Order
{
    OrderLines =
    {
        new OrderLine { Quantity = 1, Product = widget},
        new OrderLine { Quantity = 2, Product = gadget},
        new OrderLine { Quantity = 3, Product = wotsit}
    }
};

Assert.That(order.Total, Is.EqualTo(expectedTotal));
}

Now, what happens when we use the Total property in a LINQ query like this:

[Test]
public void Total_ShouldCalculateCorrectTotalOfItemsInDb()
{
var total = dataContext.Orders.Select(order => order.Total).First();
Assert.That(total, Is.EqualTo(expectedTotal));
}

The test passes, but when we look at the SQL that was generated by LINQ-to-SQL we get this:

SELECT TOP (1) [t0].[Id]
FROM [dbo].[Order] AS [t0]

SELECT [t0].[Id], [t0].[OrderId], [t0].[Quantity], [t0].[ProductId]
FROM [dbo].[OrderLine] AS [t0]
WHERE [t0].[OrderId] = @p0
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [1]

SELECT [t0].[Id], [t0].[Price]
FROM [dbo].[Product] AS [t0]
WHERE [t0].[Id] = @p0
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [1]

SELECT [t0].[Id], [t0].[Price]
FROM [dbo].[Product] AS [t0]
WHERE [t0].[Id] = @p0
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [2]

SELECT [t0].[Id], [t0].[Price]
FROM [dbo].[Product] AS [t0]
WHERE [t0].[Id] = @p0
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [3]

LINQ-to-SQL doesn't know anything about the Total property, so it does as much as it can. It loads the Order. When the Total property executes, OrderLines is evaluated which causes the order lines to be loaded with a single select statement. Next each Product property of each OrderLine is evaluated in turn causing each Product to be selected individually. So we've had five SQL statements executed and the entire Order object graph loaded into memory just to find out the order total. Yes of course we could add data load options to eagerly load the entire object graph with one query, but we would still end up with the entire object graph in memory. If all we wanted was the order total this is very inefficient.

Now, if we construct a query where we explicitly ask for the sum of order line quantities times product prices, like this:

[Test]
public void CalculateTotalWithQuery()
{
var total = dataContext.OrderLines
    .Where(line => line.Order.Id == 1)
    .Sum(line => line.Quantity * line.Product.Price);

Assert.That(total, Is.EqualTo(expectedTotal));
}

We get this SQL

SELECT SUM([t3].[value]) AS [value]
FROM (
SELECT (CONVERT(Decimal(29,4),[t0].[Quantity])) * [t2].[Price] AS [value], [t1].[Id]
FROM [dbo].[OrderLine] AS [t0]
INNER JOIN [dbo].[Order] AS [t1] ON [t1].[Id] = [t0].[OrderId]
INNER JOIN [dbo].[Product] AS [t2] ON [t2].[Id] = [t0].[ProductId]
) AS [t3]
WHERE [t3].[Id] = @p0
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [1]

One SQL statement has been created that returns a scalar value for the total. Much better. But now we've got duplicate business logic. We have definition of the order total calculation in the Total property of Order and another in the our query.

So what's the solution?

What we need is a way of creating our business logic in a single place that we can use in both our domain properties and in our queries. This brings me to two guys who have done some excellent work in trying to solve this problem: Fredrik Kalseth and Luke Marshall. I'm going to show you Luke's solution which is detailed in this series of blog posts.

It's based on the specification pattern. If you've not come across this before, Ian Cooper has a great description here. The idea with specifications is that you factor out your domain business logic into small composable classes. You can then test small bits of business logic in isolation and then compose them to create more complex rules; because we all know that rules rely on rules :)

The neat trick is to implement the specification as a lambda expression that can be executed against in-memory object graphs or inserted into an expression tree to be compiled into SQL.

Here's our Total property as a specification, or as Luke calls it, QueryProperty.

static readonly TotalProperty total = new TotalProperty();

[QueryProperty(typeof(TotalProperty))]
public decimal Total
{
get
{
    return total.Value(this);
}
}

class TotalProperty : QueryProperty<Order, decimal>
{
public TotalProperty()
    : base(order => order.OrderLines.Sum(line => line.Quantity * line.Product.Price))
{
}
}

We factored out the Total calculation into a specification called TotalProperty which passes the rule into the constructor of the QueryProperty base class. We also have a static instance of the TotalProperty specification. This is simply for performance reasons and acts a specification cache. Then in the Total property getter we ask the specification to calculate its value for the current instance.

Note that the Total property is decorated with a QueryPropertyAttribute. This is so that the custom query provider can recognise that this property also supplies a lambda expression via its specification, which is the type specified in the attribute constructor. This is the main weakness of this approach because there's an obvious error waiting to happen. The type passed in the QueryPropertyAttribute has to match the type of the specification. It's also very invasive since we have various bits of the framework (QueryProperty, QueryPropertyAttribute) surfacing in our domain code.

These days simply everyone has a generic repository and Luke is no different. His repository chains a custom query provider before the LINQ-to-SQL query provider that knows how to insert the specification expressions into the expression tree. We can use the repository like this:

[Test]
public void TotalQueryUsingRepository()
{
var repository = new RepositoryDatabase<Order>(dataContext);

var total = repository.AsQueryable().Select(order => order.Total).First();
Assert.That(total, Is.EqualTo(expectedTotal));
}

Note how the LINQ expression is exactly the same as one we ran above which caused five select statements to be executed and the entire Order object graph to be loaded into memory. When we run this new test we get this SQL:

SELECT TOP (1) [t4].[value]
FROM [dbo].[Order] AS [t0]
OUTER APPLY (
SELECT SUM([t3].[value]) AS [value]
FROM (
    SELECT (CONVERT(Decimal(29,4),[t1].[Quantity])) * [t2].[Price] AS [value], [t1].[OrderId]
    FROM [dbo].[OrderLine] AS [t1]
    INNER JOIN [dbo].[Product] AS [t2] ON [t2].[Id] = [t1].[ProductId]
    ) AS [t3]
WHERE [t3].[OrderId] = [t0].[Id]
) AS [t4]

A single select statement that returns a scalar value for the total. It's very nice, and with the caveats above it's by far the nicest solution to this problem that I've seen yet.

Wednesday, April 30, 2008

Generic reordering

Do you have items in your application that need to be ordered in some arbitrary, user defined order? Countries could very well be ordered alphabetically, but let's assume that one of our requirements is for arbitrarily ordered countries. Here's a user interface for this. The little green up and down arrows can be clicked to reorder the countries:

IOrderable

Here's the country table, note that it's got a position field that tells us the position of the country.

CountryTable

So, when I click an up arrow I just need to find the country above's position number and swap the current position number with it. This is all standard stuff. But what I want to show you today is how we can make a generic ordering component that can do all this stuff for us for any entity that implements an IOrderable interface:

public interface IOrderable
{
   int Position { get; set; }
}

Here's a snippet of my CountryController, note that I've got an instance of IOrderableService<Country> and two actions, 'MoveUp' and 'MoveDown'. All these two actions have to do is call MoveItemAtPosition(x).UpOne() or DownOne() in order to reorder the country.

public class CountryController : ControllerBase
{
 IOrderableService<Country> countryOrderableService;

 public CountryController(IOrderableService<Country> countryOrderableService)
 {
  this.countryOrderableService = countryOrderableService;
 }

 public ActionResult MoveUp(int id)
 {
  countryOrderableService.MoveItemAtPosition(id).UpOne();
  return RenderIndexView();
 }

 public ActionResult MoveDown(int id)
 {
  countryOrderableService.MoveItemAtPosition(id).DownOne();
  return RenderIndexView();
 }
}

This is one of the cool side effects of having a generic repository. See my previous post on Using the IRepository pattern with LINQ to SQL. Because of the generic repository we can write a generic ordering service for any entity.

Now consider these categories:

Categories

These are orderable but also nested. We only want to order an item within it's nested level. Here's a snippet of the category controller:

public class CategoryController : ControllerBase
{
   IOrderableService<Category> orderableService;

   public CategoryController(
       IOrderableService<Category> orderableService)
   {
       this.orderableService = orderableService;
   }

   public ActionResult MoveUp(int id)
   {
       Category category = categoryRepository.GetById(id);
       orderableService
           .MoveItemAtPosition(category.Position)
           .ConstrainedBy(c => c.ParentId == category.ParentId).UpOne();
       return RenderIndexView();
   }

   public ActionResult MoveDown(int id)
   {
       Category category = categoryRepository.GetById(id);
       orderableService
           .MoveItemAtPosition(category.Position)
           .ConstrainedBy(c => c.ParentId == category.ParentId).DownOne();
       return RenderIndexView();
   }
}

As you can see, this time we've got the same IOrderableService<T>, but now we're using ConstrainedBy and telling the service that we only want to reorder within the set of categories that have the same parent id. Since the constraint is a lambda expression we can have considerable scope for defining the constraint. In another part of the application products are constrained by their categoryId when reordering.

All the code for these examples is in my suteki shop application that you can check out from google code here:

http://code.google.com/p/sutekishop/

Have a look at the OrderableService and Move classes. You can see Suteki Shop in action here:

http://sutekishop.co.uk/home

Happy ordering!

Friday, April 11, 2008

MVC Storefront

Rob Conery, the genius behind Subsonic, and now assimilated by the Borg, has just started a series of posts describing the process of building an eCommerce application using the MVC Framework. What's really cool about it is that he's soliciting input from some well known figures in the community as he evolves the project, so in the first screencast of the series he talked to Ayende and Steve Harman about some initial architectural choices including the repository pattern.

So far there are three episodes:

ASP.NET MVC: Introducing The MVC Storefront Series

ASP.NET MVC: MVC Storefront, Part 2

ASP.NET MVC: MVC Storefront, Part 3

And the source is available on Codeplex here.

The brilliant thing is, I'm currently working on an MVC eCommerce application as well. It's called sutekishop. It has a real first customer who's actually paying for it, but it's also open source. I've set up a Google code project here:

http://code.google.com/p/sutekishop/

I'm going to be keeping a very close eye on Rob's progress and incorporating as many of his ideas as possible into my project. So far it's very early days and I've only got a repository and the IoC container set up, but I'll be posting on my progress, so watch this space.

Thursday, April 03, 2008

Repository.GetById using LINQ Expression Syntax

A while ago I talked about using the IRepository<T> pattern with LINQ to SQL. One of the methods of my repository is GetById. The slight difficulty here is that we need to discover the primary key property of the (generated) type at runtime so we can't use a vanilla lambda expression in the Where clause. Before I used the DynamicQueriable helper library, but I've just re-written the function using an explicit expression tree which removes the need to reference the helper library.

public T GetById(int id)
{
    var itemParameter = Expression.Parameter(typeof(T), "item");

    var whereExpression = Expression.Lambda<Func<T, bool>>
        (
            Expression.Equal(
                Expression.Property(
                    itemParameter,
                    typeof(T).GetPrimaryKey().Name
                ),
                Expression.Constant(id)
            ),
            new ParameterExpression[] { itemParameter }
        );

    return dataContext.GetTable<T>().Where(whereExpression).Single();
}

Here is the extension method that finds the primary key:

public static PropertyInfo GetPrimaryKey(this Type entityType)
{
    foreach (PropertyInfo property in entityType.GetProperties())
    {
        ColumnAttribute[] attributes = (ColumnAttribute[])property.GetCustomAttributes(typeof(ColumnAttribute), true);
        if (attributes.Length == 1)
        {
            ColumnAttribute columnAttribute = attributes[0];
            if (columnAttribute.IsPrimaryKey)
            {
                if (property.PropertyType != typeof(int))
                {
                    throw new ApplicationException(string.Format("Primary key, '{0}', of type '{1}' is not int",
                        property.Name, entityType));
                }
                return property;
            }
        }
    }
    throw new ApplicationException(string.Format("No primary key defined for type {0}", entityType.Name));
}

Thursday, March 20, 2008

Using the IRepository pattern with LINQ to SQL

27th March 2009. It's now a year since I wrote this post. Thanks to some comments by Janus2007 I've realised that it needs updating. I've replaced the code with the current version of the Suteki Shop LINQ generic repository. There are a number of changes in the way it works. The most obvious, and one that I should have updated a long time ago is the GetAll method returns  an IQueryable<T> rather than an array. I actually changed this soon after I wrote the post, but totally forgot about the naive implementation given here.

The other major change is marking the SubmitChanges methods as obsolete. Jeremy Skinner, who has been doing some excellent work on Suteki Shop has pushed this change. UoW (DataContext) management is now handled by attributes on action methods.

Please have a look at the Suteki Shop code to see the generic repository in action:

LINQ to SQL is a quantum leap in productivity for most mainstream .NET developers. Some folks may have been using NHibernate or some other ORM tool for years, but my experience in a number of .NET shops has been that the majority of developers still hand code their data access layer. LINQ is going to bring some fundamental changes to the way we architect our applications. Especially being able to write query style syntax directly in C# against both a SQL Server database and against in memory object graphs begs some interesting questions about application architecture.

So where is our point of separation between data access and domain? Surely I'm not recommending that we abandon a layered architecture and write all our data access directly into our domain classes?

My current project is based on the new MVC Framework. I've been using LINQ to SQL for data access as well as an IoC container (Windsor) and NUnit plus Rhino Mocks for testing. For my data access layer I've used the IRepository pattern popularized by Ayende in his excellent MSDN article on IoC and DI. My Repository looks like this:

using System;
using System.Linq;
using System.Linq.Expressions;
using System.Data.Linq;
using Suteki.Common.Extensions;
namespace Suteki.Common.Repositories
{
    public interface IRepository<T> where T : class
    {
        T GetById(int id);
        IQueryable<T> GetAll();
        void InsertOnSubmit(T entity);
        void DeleteOnSubmit(T entity);
		[Obsolete("Units of Work should be managed externally to the Repository.")]
        void SubmitChanges();
    }
    public interface IRepository
    {
        object GetById(int id);
        IQueryable GetAll();
        void InsertOnSubmit(object entity);
        void DeleteOnSubmit(object entity);
		[Obsolete("Units of Work should be managed externally to the Repository.")]
        void SubmitChanges();
    }
    public class Repository<T> : IRepository<T>, IRepository where T : class
    {
        readonly DataContext dataContext;
        public Repository(IDataContextProvider dataContextProvider)
        {
            dataContext = dataContextProvider.DataContext;
        }
        public virtual T GetById(int id)
        {
            var itemParameter = Expression.Parameter(typeof(T), "item");
            var whereExpression = Expression.Lambda<Func<T, bool>>
                (
                Expression.Equal(
                    Expression.Property(
                        itemParameter,
                        typeof(T).GetPrimaryKey().Name
                        ),
                    Expression.Constant(id)
                    ),
                new[] { itemParameter }
                );
            return GetAll().Where(whereExpression).Single();
        }
        public virtual IQueryable<T> GetAll()
        {
            return dataContext.GetTable<T>();
        }
        public virtual void InsertOnSubmit(T entity)
        {
            GetTable().InsertOnSubmit(entity);
        }
        public virtual void DeleteOnSubmit(T entity)
        {
            GetTable().DeleteOnSubmit(entity);
        }
        public virtual void SubmitChanges()
        {
            dataContext.SubmitChanges();
        }
        public virtual ITable GetTable()
        {
            return dataContext.GetTable<T>();
        }
        IQueryable IRepository.GetAll()
        {
            return GetAll();
        }
        void IRepository.InsertOnSubmit(object entity)
        {
            InsertOnSubmit((T)entity);
        }
        void IRepository.DeleteOnSubmit(object entity)
        {
            DeleteOnSubmit((T)entity);
        }
        object IRepository.GetById(int id)
        {
            return GetById(id);
        }
    }
}

As you can see, this generic repository insulates the rest of the application from the LINQ to SQL DataContext and provides basic data access methods for any domain class. Here's an example of it being used in a simple controller.

using System.Web.Mvc;
using Suteki.Common.Binders;
using Suteki.Common.Filters;
using Suteki.Common.Repositories;
using Suteki.Common.Validation;
using Suteki.Shop.Filters;
using Suteki.Shop.Services;
using Suteki.Shop.ViewData;
using Suteki.Shop.Repositories;
using MvcContrib;
namespace Suteki.Shop.Controllers
{
	[AdministratorsOnly]
    public class UserController : ControllerBase
    {
        readonly IRepository<User> userRepository;
        readonly IRepository<Role> roleRepository;
    	private readonly IUserService userService;
    	public UserController(IRepository<User> userRepository, IRepository<Role> roleRepository, IUserService userService)
        {
            this.userRepository = userRepository;
            this.roleRepository = roleRepository;
        	this.userService = userService;
        }
        public ActionResult Index()
        {
            var users = userRepository.GetAll().Editable();
            return View("Index", ShopView.Data.WithUsers(users));
        }
        public ActionResult New()
        {
            return View("Edit", EditViewData.WithUser(Shop.User.DefaultUser));
        }
		[AcceptVerbs(HttpVerbs.Post), UnitOfWork]
		public ActionResult New(User user, string password)
		{
			if(! string.IsNullOrEmpty(password))
			{
				user.Password = userService.HashPassword(password);
			}
			try
			{
				user.Validate();
			}
			catch(ValidationException ex)
			{
				ex.CopyToModelState(ModelState, "user");
				return View("Edit", EditViewData.WithUser(user));
			}
			userRepository.InsertOnSubmit(user);
			Message = "User has been added.";
			return this.RedirectToAction(c => c.Index());
		}
        public ActionResult Edit(int id)
        {
            User user = userRepository.GetById(id);
            return View("Edit", EditViewData.WithUser(user));
        }
		[AcceptVerbs(HttpVerbs.Post), UnitOfWork]
		public ActionResult Edit([DataBind] User user, string password)
		{
			if(! string.IsNullOrEmpty(password))
			{
				user.Password = userService.HashPassword(password);
			}
			try
			{
				user.Validate();
			}
			catch (ValidationException validationException) 
			{
				validationException.CopyToModelState(ModelState, "user");
				return View("Edit", EditViewData.WithUser(user));
			}
			return View("Edit", EditViewData.WithUser(user).WithMessage("Changes have been saved")); 
		}
        public ShopViewData EditViewData
        {
            get
            {
                return ShopView.Data.WithRoles(roleRepository.GetAll());
            }
        }
    }
}

Because I'm using an IoC container I don't have to do any more than request an instance of IRepository<User> in the constructor and because the Windsor Container understands generics I only have a single configuration entry for all my generic repositories:

<?xml version="1.0"?>
<configuration>
  <!-- windsor configuration. 
  This is a web application, all components must have a lifesytle of 'transient' or 'preWebRequest' -->
  <components>
    <!-- repositories -->
    <!-- data context provider (this must have a lifestyle of 'perWebRequest' to allow the same data context
    to be used by all repositories) -->
    <component
      id="datacontextprovider"
      service="Suteki.Common.Repositories.IDataContextProvider, Suteki.Common"
      type="Suteki.Common.Repositories.DataContextProvider, Suteki.Common"
      lifestyle="perWebRequest"
     />
	<component
		id="menu.repository" 
		service="Suteki.Common.Repositories.IRepository`1[[Suteki.Shop.Menu, Suteki.Shop]], Suteki.Common"
		type="Suteki.Shop.Models.MenuRepository, Suteki.Shop"
		lifestyle="transient"
		/>
		
    <component
      id="generic.repository"
      service="Suteki.Common.Repositories.IRepository`1, Suteki.Common"
      type="Suteki.Common.Repositories.Repository`1, Suteki.Common"
      lifestyle="transient" />
    
	....
  </components>
  
</configuration>

The IoC Container also provides the DataContext. Note that the DataContext's lifestyle is perWebRequest. This means that a single DataContext is shared between all the repositories in a single request.

To test my UserController I can pass an object graph from a mock IRepository<User>.

[Test]
public void IndexShouldDisplayListOfUsers()
{
    User[] users = new User[] { };
    UserListViewData viewData = null;

    using (mocks.Record())
    {
        Expect.Call(userRepository.GetAll()).Return(users);

        userController.RenderView(null, null);
        LastCall.Callback(new Func<string, string, object, bool>((v, m, vd) => 
        {
            viewData = (UserListViewData)vd;
            return true;
        }));
    }

    using (mocks.Playback())
    {
        userController.Index();
        Assert.AreSame(users, viewData.Users);
    }
}

Established layered architecture patterns insulate the domain model (business objects) of a database from the data access code by layering that code into a data access layer that provides services for persisting and de-persisting objects from and to the database. This layered approach becomes essential as soon as you start doing Test Driven Development which requires you to test you code in isolation from your database.

So is LINQ data access code? I don't think so. Because the syntax for querying in-memory object graphs is identical to that for querying the database it makes sense to place LINQ queries in your domain layer. During testing, the component under test can work with in memory object graphs, but when integrated with the data access layer those same queries become SQL queries to a database.

Here's a simple example. I've got the canonical Customer->Orders->OrderLines business entities. Now say I get an Customer from my IRepository<Customer>, I can then query my Customer's Orders using LINQ inside my controller:

Customer customer = customerRepository.GetById(customerId);
int numberOfOrders = customer.Orders.Count(); // LINQ

I unit test this by returning a fully formed Customer with Orders from my mock customerRepository, but when I run this code against a concrete repository LINQ to SQL will construct a SQL statement, something like:

SELECT COUNT(*) FROM Order WHERE CustomerId = 34.