Thursday, March 20, 2008

Using the IRepository pattern with LINQ to SQL

27th March 2009. It's now a year since I wrote this post. Thanks to some comments by Janus2007 I've realised that it needs updating. I've replaced the code with the current version of the Suteki Shop LINQ generic repository. There are a number of changes in the way it works. The most obvious, and one that I should have updated a long time ago is the GetAll method returns  an IQueryable<T> rather than an array. I actually changed this soon after I wrote the post, but totally forgot about the naive implementation given here.

The other major change is marking the SubmitChanges methods as obsolete. Jeremy Skinner, who has been doing some excellent work on Suteki Shop has pushed this change. UoW (DataContext) management is now handled by attributes on action methods.

Please have a look at the Suteki Shop code to see the generic repository in action:

LINQ to SQL is a quantum leap in productivity for most mainstream .NET developers. Some folks may have been using NHibernate or some other ORM tool for years, but my experience in a number of .NET shops has been that the majority of developers still hand code their data access layer. LINQ is going to bring some fundamental changes to the way we architect our applications. Especially being able to write query style syntax directly in C# against both a SQL Server database and against in memory object graphs begs some interesting questions about application architecture.

So where is our point of separation between data access and domain? Surely I'm not recommending that we abandon a layered architecture and write all our data access directly into our domain classes?

My current project is based on the new MVC Framework. I've been using LINQ to SQL for data access as well as an IoC container (Windsor) and NUnit plus Rhino Mocks for testing. For my data access layer I've used the IRepository pattern popularized by Ayende in his excellent MSDN article on IoC and DI. My Repository looks like this:

using System;
using System.Linq;
using System.Linq.Expressions;
using System.Data.Linq;
using Suteki.Common.Extensions;
namespace Suteki.Common.Repositories
{
    public interface IRepository<T> where T : class
    {
        T GetById(int id);
        IQueryable<T> GetAll();
        void InsertOnSubmit(T entity);
        void DeleteOnSubmit(T entity);
		[Obsolete("Units of Work should be managed externally to the Repository.")]
        void SubmitChanges();
    }
    public interface IRepository
    {
        object GetById(int id);
        IQueryable GetAll();
        void InsertOnSubmit(object entity);
        void DeleteOnSubmit(object entity);
		[Obsolete("Units of Work should be managed externally to the Repository.")]
        void SubmitChanges();
    }
    public class Repository<T> : IRepository<T>, IRepository where T : class
    {
        readonly DataContext dataContext;
        public Repository(IDataContextProvider dataContextProvider)
        {
            dataContext = dataContextProvider.DataContext;
        }
        public virtual T GetById(int id)
        {
            var itemParameter = Expression.Parameter(typeof(T), "item");
            var whereExpression = Expression.Lambda<Func<T, bool>>
                (
                Expression.Equal(
                    Expression.Property(
                        itemParameter,
                        typeof(T).GetPrimaryKey().Name
                        ),
                    Expression.Constant(id)
                    ),
                new[] { itemParameter }
                );
            return GetAll().Where(whereExpression).Single();
        }
        public virtual IQueryable<T> GetAll()
        {
            return dataContext.GetTable<T>();
        }
        public virtual void InsertOnSubmit(T entity)
        {
            GetTable().InsertOnSubmit(entity);
        }
        public virtual void DeleteOnSubmit(T entity)
        {
            GetTable().DeleteOnSubmit(entity);
        }
        public virtual void SubmitChanges()
        {
            dataContext.SubmitChanges();
        }
        public virtual ITable GetTable()
        {
            return dataContext.GetTable<T>();
        }
        IQueryable IRepository.GetAll()
        {
            return GetAll();
        }
        void IRepository.InsertOnSubmit(object entity)
        {
            InsertOnSubmit((T)entity);
        }
        void IRepository.DeleteOnSubmit(object entity)
        {
            DeleteOnSubmit((T)entity);
        }
        object IRepository.GetById(int id)
        {
            return GetById(id);
        }
    }
}

As you can see, this generic repository insulates the rest of the application from the LINQ to SQL DataContext and provides basic data access methods for any domain class. Here's an example of it being used in a simple controller.

using System.Web.Mvc;
using Suteki.Common.Binders;
using Suteki.Common.Filters;
using Suteki.Common.Repositories;
using Suteki.Common.Validation;
using Suteki.Shop.Filters;
using Suteki.Shop.Services;
using Suteki.Shop.ViewData;
using Suteki.Shop.Repositories;
using MvcContrib;
namespace Suteki.Shop.Controllers
{
	[AdministratorsOnly]
    public class UserController : ControllerBase
    {
        readonly IRepository<User> userRepository;
        readonly IRepository<Role> roleRepository;
    	private readonly IUserService userService;
    	public UserController(IRepository<User> userRepository, IRepository<Role> roleRepository, IUserService userService)
        {
            this.userRepository = userRepository;
            this.roleRepository = roleRepository;
        	this.userService = userService;
        }
        public ActionResult Index()
        {
            var users = userRepository.GetAll().Editable();
            return View("Index", ShopView.Data.WithUsers(users));
        }
        public ActionResult New()
        {
            return View("Edit", EditViewData.WithUser(Shop.User.DefaultUser));
        }
		[AcceptVerbs(HttpVerbs.Post), UnitOfWork]
		public ActionResult New(User user, string password)
		{
			if(! string.IsNullOrEmpty(password))
			{
				user.Password = userService.HashPassword(password);
			}
			try
			{
				user.Validate();
			}
			catch(ValidationException ex)
			{
				ex.CopyToModelState(ModelState, "user");
				return View("Edit", EditViewData.WithUser(user));
			}
			userRepository.InsertOnSubmit(user);
			Message = "User has been added.";
			return this.RedirectToAction(c => c.Index());
		}
        public ActionResult Edit(int id)
        {
            User user = userRepository.GetById(id);
            return View("Edit", EditViewData.WithUser(user));
        }
		[AcceptVerbs(HttpVerbs.Post), UnitOfWork]
		public ActionResult Edit([DataBind] User user, string password)
		{
			if(! string.IsNullOrEmpty(password))
			{
				user.Password = userService.HashPassword(password);
			}
			try
			{
				user.Validate();
			}
			catch (ValidationException validationException) 
			{
				validationException.CopyToModelState(ModelState, "user");
				return View("Edit", EditViewData.WithUser(user));
			}
			return View("Edit", EditViewData.WithUser(user).WithMessage("Changes have been saved")); 
		}
        public ShopViewData EditViewData
        {
            get
            {
                return ShopView.Data.WithRoles(roleRepository.GetAll());
            }
        }
    }
}

Because I'm using an IoC container I don't have to do any more than request an instance of IRepository<User> in the constructor and because the Windsor Container understands generics I only have a single configuration entry for all my generic repositories:

<?xml version="1.0"?>
<configuration>
  <!-- windsor configuration. 
  This is a web application, all components must have a lifesytle of 'transient' or 'preWebRequest' -->
  <components>
    <!-- repositories -->
    <!-- data context provider (this must have a lifestyle of 'perWebRequest' to allow the same data context
    to be used by all repositories) -->
    <component
      id="datacontextprovider"
      service="Suteki.Common.Repositories.IDataContextProvider, Suteki.Common"
      type="Suteki.Common.Repositories.DataContextProvider, Suteki.Common"
      lifestyle="perWebRequest"
     />
	<component
		id="menu.repository" 
		service="Suteki.Common.Repositories.IRepository`1[[Suteki.Shop.Menu, Suteki.Shop]], Suteki.Common"
		type="Suteki.Shop.Models.MenuRepository, Suteki.Shop"
		lifestyle="transient"
		/>
		
    <component
      id="generic.repository"
      service="Suteki.Common.Repositories.IRepository`1, Suteki.Common"
      type="Suteki.Common.Repositories.Repository`1, Suteki.Common"
      lifestyle="transient" />
    
	....
  </components>
  
</configuration>

The IoC Container also provides the DataContext. Note that the DataContext's lifestyle is perWebRequest. This means that a single DataContext is shared between all the repositories in a single request.

To test my UserController I can pass an object graph from a mock IRepository<User>.

[Test]
public void IndexShouldDisplayListOfUsers()
{
    User[] users = new User[] { };
    UserListViewData viewData = null;

    using (mocks.Record())
    {
        Expect.Call(userRepository.GetAll()).Return(users);

        userController.RenderView(null, null);
        LastCall.Callback(new Func<string, string, object, bool>((v, m, vd) => 
        {
            viewData = (UserListViewData)vd;
            return true;
        }));
    }

    using (mocks.Playback())
    {
        userController.Index();
        Assert.AreSame(users, viewData.Users);
    }
}

Established layered architecture patterns insulate the domain model (business objects) of a database from the data access code by layering that code into a data access layer that provides services for persisting and de-persisting objects from and to the database. This layered approach becomes essential as soon as you start doing Test Driven Development which requires you to test you code in isolation from your database.

So is LINQ data access code? I don't think so. Because the syntax for querying in-memory object graphs is identical to that for querying the database it makes sense to place LINQ queries in your domain layer. During testing, the component under test can work with in memory object graphs, but when integrated with the data access layer those same queries become SQL queries to a database.

Here's a simple example. I've got the canonical Customer->Orders->OrderLines business entities. Now say I get an Customer from my IRepository<Customer>, I can then query my Customer's Orders using LINQ inside my controller:

Customer customer = customerRepository.GetById(customerId);
int numberOfOrders = customer.Orders.Count(); // LINQ

I unit test this by returning a fully formed Customer with Orders from my mock customerRepository, but when I run this code against a concrete repository LINQ to SQL will construct a SQL statement, something like:

SELECT COUNT(*) FROM Order WHERE CustomerId = 34.

15 comments:

Mike Hadlow said...

There was a good discussion about this post on the ALT.NET mailing list:

http://tech.groups.yahoo.com/group/altdotnet/message/4706

Dragan Panjkov said...

How would you use it without IoC container?

Andrei Rinea said...

There is something that I don’t understand in using LINQ to SQL in Repository pattern :

Let’s say that I will create an assembly for each of the Repository providers (LINQ to SQL, MySQL, Oracle, WebService, XML files etc. etc.). I will define the IRepository interface in a common assembly (let’s call it ‘Interfaces’) that will be referenced by any Repository provider assembly. But where will the entity classes be placed? Since they are generated by Visual Studio or by SQL Metal they would live in the LINQ to SQL provider but they are required in the ‘Interfaces’ assembly so I have a circular problem.

How do I go about this?

Mike Hadlow said...

Hi Andrei.

Your domian should be separate from any particular repository implementation. If you generate your 'domain' using the LINQ designer you won't get a persistence ignorant domain model. You should code your entities to be independent of any particular persistence technique. You can persuade LINQ-to-SQL persist POCOs, but it's a PITA. NHibernate is a much better choice of ORM in my opinion because it takes persistence ignorance as its starting point.

Google for onion or hexagonal architecture. The core idea is that at the centre of your application are your entities (domain model), surrounded by application services, including repositories. Your services can have a dependency on your domain but not vis-versa. The generic repository pattern allows you do implement repositories independent of domain, but it does rely on your proviers not expecting the domain to have knowledge of them. My IRepository pattern also expects your provider to have a LINQ implementation and unit of work. That's not always the case so you might find yourself providing more application specific services that in turn wrap specific persistence technologies.

It's a huge subject and difficult to summerise in a brief comment. If you want to really good introduction Domain Driven Development. Check out Eric Evan's book.

Janus007 said...

Once again we see a strange way of solving things.. I've been there, done that - read my comments, remember I'm not trying to offend you, but not pedagogical :-)

All too many bloggers write about LINQ without leveraging the pure value in data-driven applications using LINQ.

Why would anybody choose a Repository-pattern and especially a Generic one?
A Repository-pattern with GetByID, which retrieves one row - remember the old rule of thumb? Never ask for fields you don't need :)

Where do you create a join? Where do you finetune the loading? etc. etc.

And why a GetAll? A GetAll-records in our table with +100K records?

And what about Delete / Add, give me one good reason why this should have a generic approach? This is a good example of a Repository-antipattern. Looks like everybody have been infected with a typeof(Syndrome) *LOL* , just because you can, doesn't mean you're forced to do it.

I will not mention your Exception-handling - Throw a new one?

I will not mention your DataContext -webrequest life-cycle.

Mike Hadlow -> You're also mixing things, this isn't about DDD, but about LINQ - huge difference ;-) You said it yourself... a whole other story, so why even bother to comment?

Mike Hadlow said...

Hi Janus,

Thanks for making some great points so robustly :)

So I guess you are saying that there's no place for a generic interface for data access? we should create specific repositories for each aggregate root and the repository implementation should directly access the underlying ORM (LINQ-to-SQL, NHibernate or whatever)?

I think that's a good approach. However I still see some value in a generic abstraction, because there are things that are common to each repository. You will always want to get an entity by id. You will always need to insert and entity, and, most of the time, delete it.

There's nothing to say that you can't then wrap the generic repository in an entity specific one.

GetAll returns IQueryable<T>. This allows us to have nicely factored filtering logic. We can separate domain concerns: customerRepository.GetAll().WithOutstandingOrders() from UI concerns: results.ToPagedList(page, pageSize).

Separating query specifications from the repository is a pattern that Eric Recommends in the Blue Book. We can also raise specific exceptions from these specifications. Decoupling specifications from repositories also promotes reuse of generic specifications: ToPagedList, Active, InOrder etc.

I think the only real drawback of this approach (specifications as extension methods) is the difficulty of mocking extension methods.

I think there's general agreement that UoW, transaction boundaries and loading strategies shouldn't be a concern of the repository, but handled by other infrastructure elements. Take a look at recent commits of Suteki Shop by Jeremy Skinner. He's doing some very interesting work attributing controller actions to satisfy these concerns.

I hope I don't give the impression that I think I'm some kind of authority on application architecture, because I'm not. My blog merely documents my slow and bumbling efforts to try and find a better way of doing things by learning from far more intelligent and experienced people like yourself. And I do learn an awful lot from the comments people leave. My views on the repository pattern have changed quite considerably since this post.

I think having a debate like this is by far the best way of exploring ideas and if nobody 'bothered to comment' as you say, we wouldn't get anywhere.

Janus007 said...

FIrst of all, sorry for the last comment to Mike (it was stupid), you are absolutely right in "we wouldn't go anywhere"

Back to your post :)

Yes I would say that the need for a generic approach will be very limited when using a repository pattern because of the nature of the pattern ( http://www.martinfowler.com/eaaCatalog/repository.html )
When using L2S and the RepositoryPattern then the object graph needs to be loaded fully to comply with the pattern - we both know that this would be a not so wise performance decision. By implementing a generic IRepository you are indicating that all tables should be loaded equally and also the fact that all should be handled with the same C(R)UD. Of cause there will be some similarity between tables, but.... I'm almost sure that in a real-world scenario there will exist different constraints for different tables. I cannot see how this should be accomplished used a generic approach? Everything can be programmed, no doubt about that, but I'm a KISS-guy - not that I don't use generics, I certainly do, but only in some highly needed occasions.

Try this:
public T[] GetAll()
{
return Table.ToArray();
}

It will retrieve the Table and convert it to an array, when that happens you kill the object-graph and you can say goodbye to LINQ. If you then for some reason (not a very strange reason though :-) ) needed a child/ reference/ foreign key relation to another table you would need to call the Repository yet again, this time maybe using the GetById, afterwards you are on your own by joining those etc.

I've never heard about Eric and the Blue Book, but if Eric would say that specifications should be seperated from the Repository and the query against the datastore, then Eric should revise his understanding of how a database works. - Sorry to say :-| or maybe use another approach than Repository and LINQ :-) (I don't know the details of his opinion, so I could easily be wrong here)

Believe it or not.. But I'm using a Repository-pattern myself *LOL*, well... Every repository inherits from a BaseRepository where I manage the datacontext etc., and then from my specialized Repository I call base.Find().AsQueryAble();

Using that approach I can construct my LINQToSQL in the servicelayer whilst still having the freedom for using the object-graph and deferred loading until usage etc.

I think you better understand now why I did bother to post ;-)

Some would argue that having the service layer to construct the queries is totally wrong and against common sense in a n-tiered scenario. But I'm not working in a n-tiered scenario and if I were I would use a WCF-service between my WebLayer and Service.
To those I will say: Read carefully about the RepositoryPattern (snippet: Client objects construct query specifications declaratively and submit them to Repository for satisfaction)

However I do work with n-layer architecture, but decoupled with dependency-injection like yours :)

Mike Hadlow said...

Hi Janus007,

I've updated the post to reflect some of your comments. To be honest I hadn't read it again until just now and it does show a very early version of my repository. Of course you are absolutely right that the GetAll method should return IQueryable<T> rather than an array.

You should check out Eric Evan's book Domain Driven Design it's become a core reference for DDD. There's a whole section on factoring specifications from repositories that you might be interested in.

James said...

Hi,

I was wondering if you had any interest in separating the generic bits (repository, unit of work, etc) into a common library that could be hosted at Codeplex or Google code? I ask because I see alot of implementations of IRepository and would like to commit to one, but as part of a larger framework.

Thanks!

James

PS - I know Oren has largely deprecated IRepository, do you think it introduces unnecessary limitations?

Mike Hadlow said...

Hi James,

We've separated out the common bits (things that could be used in any web application) into a separate project, Suteki.Common, but I'd agree that that's a lot of code to reference if you are only interested in the repository related stuff.

I would take a close look at Sharp Architecture if you are interested in adopting a whole framework, there's a very strong community around it. You *could* use the Suteki.Common stuff, but it's completely undocumented and really only used by Suteki Shop and a couple of commercial projects that I've worked on.

I think Oren has a very good point. As I read it, he wasn't saying that the generic repository introduces limitations, but that it doesn't really do the job it's intended for; to insulate your domain model from your data access technology. Simply having a generic repository doesn't insulate you from the behaviour of your ORM. It could, but then you'd have to encapsulate it to a degree that negated the benefits of the ORM in the first place. If your repository isn't insulating you from your data access technology then what is it for?

oggy said...

I completely agree with Andrei. Each of the repository frameworks uses its own entity classes, so using a concrete framework enforces you to use these Entity Classes inside your controller. Once you place them in your controller you make a strong dependency on the repository framework.
As you Mike said, your entity classes should be independent, but I don't know how??
As far as for ADO.NET Entity Framework, it won't you allow to make your own entity classes. You have to use these classes provided by the framework. Even more, I thought of making my entity classes abstract and inherit them in the ADO.NET entity framework, but it happened to be impossible since they are already inherited from:
System.Data.Objects.DataClasses.EntityObject, and as we all know there are no multiple inhereritance in C#.

So, what is the solution to this problem?

Mike Hadlow said...

Hi Oggy,

I can't comment on Entity Framework, since I haven't used it. NHibernate supports a POCO domain model. I'd recommend checking it out.

oggy said...

But isn't there the same problem? NHibernate also creates its own enetity classes. Right?

Anyways, my solution to the problem was to have both your entity classes and entity classes made by certain Repository(e.g. NHibernate, LINQ to SQL etc). On the IRepository interface your entitiy classes should be tossed back and forth. Inside each of the IRepository method there should be translating between your entity classes and the ones provided by Repository.

This would work fine, except there is a little performance issue with translating.

Does anybody have any other idea?

Mike Hadlow said...

Hi Oggy,

No, NHibernate works with entity classes that you write. It requires you to define mappings between your classes and your database schema, but it doesn't actually build either - that's its main advantage over most other ORMs. It's mappings are pretty flexible so you can break away from the 'class == table row' straight-jacket.

asava samuel said...

Mike Hadlow

It looks like Kellerman Software has a MySQL LINQ Provider:
https://www.kellermansoftware.com/p-47-net-data-access-layer.aspx