Showing posts with label Linq. Show all posts

Friday, November 05, 2010

Composing Sequential Tasks With Linq

This is a continuation of my Task Parallel Library investigations. Yesterday I wrote about using TPL with MVC.

Say we have a number of asynchronous tasks that we want to execute in series because the result of the first task is an input value to the second task, and the result of the second task is an input to the third. To demonstrate, I’ve created two simple task generators:

static Task<int> CreateInt(int a)
{
    return Task<int>.Factory.StartNew(() =>
    {
        Console.WriteLine("Starting CreateInt   {0}", a);
        Thread.Sleep(1000);
        Console.WriteLine("Completing CreateInt {0}", a);
        return a;
    });
}

static Task<int> AddInts(int a, int b)
{
    return Task<int>.Factory.StartNew(() =>
    {
        Console.WriteLine("Starting AddInts     {0} + {1}", a, b);
        Thread.Sleep(1000);
        Console.WriteLine("Completing AddInts   {0} + {1}", a, b);
        return a + b;
    });
}

I want to create an int and then add 3 to it, and then add 4. It’s difficult to compose these using the standard ‘ContinueWith’ callback:

public void ComposeWithContinueWith()
{
    var result = CreateInt(2)
        .ContinueWith(t1 => AddInts(t1.Result, 3)
            .ContinueWith(t2 => AddInts(t2.Result, 4))
            );

    // result is the first task, how do you get the third task's result?
}

It turns out you can get at it by simply appending two ‘Unwrap()’ calls to the end of the expression:

public void ComposeWithContinueWith()
{
    var result = CreateInt(2)
        .ContinueWith(t1 => AddInts(t1.Result, 3)
            .ContinueWith(t2 => AddInts(t2.Result, 4))
            ).Unwrap().Unwrap();

    Console.WriteLine("Completed with result {0}", result.Result);
}
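The two ‘Unwrap()’ calls are needed because each nested ContinueWith wraps its result in another Task. Here's a minimal illustration of the flattening (my own demo code, using the same Task.Factory style as the examples above):

```csharp
using System;
using System.Threading.Tasks;

class UnwrapDemo
{
    static void Main()
    {
        // The continuation itself returns a task, so the chain's type is
        // Task<Task<int>>: a task whose Result is another task.
        Task<Task<int>> nested = Task<int>.Factory.StartNew(() => 2)
            .ContinueWith(t => Task<int>.Factory.StartNew(() => t.Result + 3));

        // Unwrap flattens Task<Task<int>> into a proxy Task<int> that
        // completes when the inner task completes.
        Task<int> flattened = nested.Unwrap();

        Console.WriteLine(flattened.Result); // prints 5
    }
}
```

In the ComposeWithContinueWith example the nesting is two levels deep, hence two Unwrap() calls.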

But there is a much nicer way: because tasks are monadic, you can compose them using Linq:

Update / Correction: The out-of-the-box Task<T> doesn’t have Linq methods (SelectMany etc) built in.

However, there is an implementation in the ParallelExtensionsExtras assembly. I confused myself (it happens a lot) because I’d included ParallelExtensionsExtras for the Task extension methods on SmtpClient and SqlDataReader, and had simply assumed that Task<T> has the Linq extension methods built in.

You can build the ParallelExtensionsExtras.dll yourself from the Samples for Parallel Programming. Alternatively, you can just grab a compiled ParallelExtensionsExtras.dll from my sample solution at: https://github.com/mikehadlow/Suteki.AsyncMvcTpl. Stephen Toub has a great write-up on the goodies in the Parallel Extensions Extras library here; it’s well worth a read.
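To get an intuition for what the library provides, the query syntax below only needs SelectMany and Select extension methods with the right signatures. A simplified sketch might look like this (my own illustrative code, not the actual ParallelExtensionsExtras implementation, which also deals properly with faults and cancellation):

```csharp
using System;
using System.Threading.Tasks;

public static class TaskLinqSketch
{
    // Monadic bind for Task<T>: when 'source' completes, pass its result
    // to 'selector', then combine both results with 'resultSelector'.
    // This is the signature 'from a in X from b in Y(a)' compiles to.
    public static Task<TResult> SelectMany<TSource, TNext, TResult>(
        this Task<TSource> source,
        Func<TSource, Task<TNext>> selector,
        Func<TSource, TNext, TResult> resultSelector)
    {
        return source
            .ContinueWith(t => selector(t.Result)
                .ContinueWith(u => resultSelector(t.Result, u.Result)))
            .Unwrap();
    }

    // Supports a trailing 'select' clause.
    public static Task<TResult> Select<TSource, TResult>(
        this Task<TSource> source,
        Func<TSource, TResult> selector)
    {
        return source.ContinueWith(t => selector(t.Result));
    }
}
```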

Anyway, so once you have a reference to ParallelExtensionsExtras, you can compose Tasks using Linq expressions:

public void CanComposeTasksWithLinq()
{
    var result = from a in CreateInt(2)
                 from b in AddInts(a, 3)
                 from c in AddInts(b, 4)
                 select c;

    Console.WriteLine("Completed with result {0}", result.Result);
}

Which outputs:

Starting CreateInt   2
Completing CreateInt 2
Starting AddInts     2 + 3
Completing AddInts   2 + 3
Starting AddInts     5 + 4
Completing AddInts   5 + 4
Completed with result 9

This is a really nice pattern to use if you have a number of async IO tasks to do in series.

Friday, August 06, 2010

NHibernate Linq Eager Fetching

The new NHibernate Linq provider provides eager fetching out of the box. Here’s how you do it:

var customers = session.Query<Customer>().Fetch(c => c.Orders).ToList();

Note, Query<T> is an extension method in the NHibernate.Linq namespace.

The statement above will cause the following SQL to be executed:

select customer0_.CustomerId   as CustomerId0_0_,
       orders1_.OrderId        as OrderId3_1_,
       customer0_.CompanyName  as CompanyN2_0_0_,
       customer0_.ContactName  as ContactN3_0_0_,
       customer0_.ContactTitle as ContactT4_0_0_,
       customer0_.Address      as Address0_0_,
       customer0_.City         as City0_0_,
       customer0_.Region       as Region0_0_,
       customer0_.PostalCode   as PostalCode0_0_,
       customer0_.Country      as Country0_0_,
       customer0_.Phone        as Phone0_0_,
       customer0_.Fax          as Fax0_0_,
       orders1_.CustomerId     as CustomerId3_1_,
       orders1_.EmployeeId     as EmployeeId3_1_,
       orders1_.OrderDate      as OrderDate3_1_,
       orders1_.RequiredDate   as Required5_3_1_,
       orders1_.ShippedDate    as ShippedD6_3_1_,
       orders1_.ShipVia        as ShipVia3_1_,
       orders1_.Freight        as Freight3_1_,
       orders1_.ShipName       as ShipName3_1_,
       orders1_.ShipAddress    as ShipAdd10_3_1_,
       orders1_.ShipCity       as ShipCity3_1_,
       orders1_.ShipRegion     as ShipRegion3_1_,
       orders1_.ShipPostalCode as ShipPos13_3_1_,
       orders1_.ShipCountry    as ShipCou14_3_1_,
       orders1_.CustomerId     as CustomerId0__,
       orders1_.OrderId        as OrderId0__
from   Customers customer0_
       left outer join Orders orders1_
         on customer0_.CustomerId = orders1_.CustomerId

As you can see a single statement returns the customer and all the customer’s orders, just as expected.

Note that if you want to mix Fetch with other clauses, Fetch must always come last. So for example:

var customers = session.Query<Customer>().Fetch(c => c.Orders).Where(c => c.CustomerId == "ANATR").ToList();

Will throw a nasty parse exception:

Test 'NHibernate.Test.Linq.EagerLoadTests.WhereWorksWithFetch' failed: System.NotSupportedException : Specified method is not supported.

But this will work fine:

var customers = session.Query<Customer>().Where(c => c.CustomerId == "ANATR").Fetch(c => c.Orders).ToList();

Be careful not to eagerly fetch multiple collection properties at the same time. Although this statement will work fine:

var employees = session.Query<Employee>()
    .Fetch(e => e.Subordinates)
    .Fetch(e => e.Orders).ToList();

It executes a Cartesian product query against the database, so the number of rows returned will be the number of Subordinates multiplied by the number of Orders. Ayende discusses this behaviour here.

You can fetch grandchild collections too. Here we use ‘FetchMany’ and ‘ThenFetchMany’:

var customers = session.Query<Customer>()
    .FetchMany(c => c.Orders)
    .ThenFetchMany(o => o.OrderLines).ToList();

Which produces the following SQL:

select customer0_.CustomerId    as CustomerId0_0_,
       orders1_.OrderId         as OrderId3_1_,
       orderlines2_.OrderLineId as OrderLin1_4_2_,
       customer0_.CompanyName   as CompanyN2_0_0_,
       customer0_.ContactName   as ContactN3_0_0_,
       customer0_.ContactTitle  as ContactT4_0_0_,
       customer0_.Address       as Address0_0_,
       customer0_.City          as City0_0_,
       customer0_.Region        as Region0_0_,
       customer0_.PostalCode    as PostalCode0_0_,
       customer0_.Country       as Country0_0_,
       customer0_.Phone         as Phone0_0_,
       customer0_.Fax           as Fax0_0_,
       orders1_.CustomerId      as CustomerId3_1_,
       orders1_.EmployeeId      as EmployeeId3_1_,
       orders1_.OrderDate       as OrderDate3_1_,
       orders1_.RequiredDate    as Required5_3_1_,
       orders1_.ShippedDate     as ShippedD6_3_1_,
       orders1_.ShipVia         as ShipVia3_1_,
       orders1_.Freight         as Freight3_1_,
       orders1_.ShipName        as ShipName3_1_,
       orders1_.ShipAddress     as ShipAdd10_3_1_,
       orders1_.ShipCity        as ShipCity3_1_,
       orders1_.ShipRegion      as ShipRegion3_1_,
       orders1_.ShipPostalCode  as ShipPos13_3_1_,
       orders1_.ShipCountry     as ShipCou14_3_1_,
       orders1_.CustomerId      as CustomerId0__,
       orders1_.OrderId         as OrderId0__,
       orderlines2_.OrderId     as OrderId4_2_,
       orderlines2_.ProductId   as ProductId4_2_,
       orderlines2_.UnitPrice   as UnitPrice4_2_,
       orderlines2_.Quantity    as Quantity4_2_,
       orderlines2_.Discount    as Discount4_2_,
       orderlines2_.OrderId     as OrderId1__,
       orderlines2_.OrderLineId as OrderLin1_1__
from   Customers customer0_
       left outer join Orders orders1_
         on customer0_.CustomerId = orders1_.CustomerId
       left outer join OrderLines orderlines2_
         on orders1_.OrderId = orderlines2_.OrderId

Once again, exactly as expected.

Happy Fetching!

Tuesday, September 02, 2008

What's up with Linq to NHibernate?

With my current clients I'm building an MVC Framework application using NHibernate. I love NHibernate. Like all the best frameworks it just works for most common scenarios. I'm a new user, but I've been able to get up to speed with it very quickly. The power and flexibility of the mapping options make it easy to use with the legacy database that we're lumbered with. But the best thing about it is that it's built from the ground up for Domain Driven Development. This means it's possible to build a finely grained domain model unencumbered by data access concerns.

Seriously, if you're building .NET enterprise-level business software and you haven't considered using NHibernate (or a competing ORM), you're missing out on a huge productivity win.

Any ORM targeting .NET has to support Linq. Here the NHibernate story's not so good at the moment. The coding phenomenon that is Ayende kicked off the first Linq to NHibernate implementation a while back. It was originally part of his Rhino Tools project and layered the Linq implementation on top of the NHibernate criteria API. This has now been moved to the NHibernate contrib project. The source is here:

http://sourceforge.net/projects/nhcontrib/

I've been using this in my current project; it works, but only with the most straightforward queries. Anything tricky tends to fail, and I've had to go through a process of trial and error to find the right patterns. The problem seems to be mostly around the criteria API: it doesn't support a lot of the features that Linq requires, especially nested queries.

Recently a new NHibernate.Linq project has been started in the NHibernate trunk that takes a different approach. The authors are building SQL directly from the expression tree, which obviously gives a lot more flexibility. The intention is also to convert HQL to an expression tree, thus having a single uniform query translation.

This is very exciting for NHibernate but there's going to be a hiatus while the new implementation evolves.

Thanks to Tuna Toksöz for the info.

Friday, August 08, 2008

The Queryable Domain Property Problem

LINQ has revolutionised the way we do data access. Being able to fluently describe queries in C# means that you never have to write a single line of SQL again. Of course LINQ isn't the only game in town. NHibernate has a rich API for describing queries as do most mature ORM tools. But to be a player in the .NET ORM game you simply have to provide a LINQ IQueryable API. It's been really nice to see the NHibernate-to-LINQ project take off and apparently LLBLGen Pro has an excellent LINQ implementation too.

Now that we can write our queries in C# it should mean that we can have completely DRY business logic. No more duplicate rules, one set in SQL, the other in the domain classes. But there's a problem: LINQ doesn't understand IL. If you write a query that includes a property or method, LINQ-to-SQL can't turn the logic encapsulated by it into a SQL statement.

To illustrate the problem take this simple schema for an order:

[image: order database schema]

Let's use the LINQ-to-SQL designer to create some classes:

[image: LINQ-to-SQL designer classes]

Now let's create a 'Total' property for the order that calculates the total by summing the order lines' quantities times their product's price.

public decimal Total
{
    get
    {
        return OrderLines.Sum(line => line.Quantity * line.Product.Price);
    }
}

Here's a test to demonstrate that it works:

[Test]
public void Total_ShouldCalculateCorrectTotal()
{
    const decimal expectedTotal = 23.21m + 14.30m * 2 + 7.20m * 3;

    var widget = new Product { Price = 23.21m };
    var gadget = new Product { Price = 14.30m };
    var wotsit = new Product { Price = 7.20m };

    var order = new Order
    {
        OrderLines =
        {
            new OrderLine { Quantity = 1, Product = widget },
            new OrderLine { Quantity = 2, Product = gadget },
            new OrderLine { Quantity = 3, Product = wotsit }
        }
    };

    Assert.That(order.Total, Is.EqualTo(expectedTotal));
}

Now, what happens when we use the Total property in a LINQ query like this:

[Test]
public void Total_ShouldCalculateCorrectTotalOfItemsInDb()
{
    var total = dataContext.Orders.Select(order => order.Total).First();
    Assert.That(total, Is.EqualTo(expectedTotal));
}

The test passes, but when we look at the SQL that was generated by LINQ-to-SQL we get this:

SELECT TOP (1) [t0].[Id]
FROM [dbo].[Order] AS [t0]

SELECT [t0].[Id], [t0].[OrderId], [t0].[Quantity], [t0].[ProductId]
FROM [dbo].[OrderLine] AS [t0]
WHERE [t0].[OrderId] = @p0
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [1]

SELECT [t0].[Id], [t0].[Price]
FROM [dbo].[Product] AS [t0]
WHERE [t0].[Id] = @p0
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [1]

SELECT [t0].[Id], [t0].[Price]
FROM [dbo].[Product] AS [t0]
WHERE [t0].[Id] = @p0
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [2]

SELECT [t0].[Id], [t0].[Price]
FROM [dbo].[Product] AS [t0]
WHERE [t0].[Id] = @p0
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [3]

LINQ-to-SQL doesn't know anything about the Total property, so it does as much as it can. It loads the Order. When the Total property executes, OrderLines is evaluated which causes the order lines to be loaded with a single select statement. Next each Product property of each OrderLine is evaluated in turn causing each Product to be selected individually. So we've had five SQL statements executed and the entire Order object graph loaded into memory just to find out the order total. Yes of course we could add data load options to eagerly load the entire object graph with one query, but we would still end up with the entire object graph in memory. If all we wanted was the order total this is very inefficient.
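For completeness, the eager-loading alternative mentioned above would look something like this with LINQ-to-SQL's DataLoadOptions (a fragment using this post's class names; note it still pulls the whole object graph into memory):

```csharp
using System.Data.Linq;

// Tell the data context to fetch order lines with each order,
// and products with each order line, in a single round trip.
var options = new DataLoadOptions();
options.LoadWith<Order>(o => o.OrderLines);
options.LoadWith<OrderLine>(l => l.Product);
dataContext.LoadOptions = options;
```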

Now, if we construct a query where we explicitly ask for the sum of order line quantities times product prices, like this:

[Test]
public void CalculateTotalWithQuery()
{
    var total = dataContext.OrderLines
        .Where(line => line.Order.Id == 1)
        .Sum(line => line.Quantity * line.Product.Price);

    Assert.That(total, Is.EqualTo(expectedTotal));
}

We get this SQL:

SELECT SUM([t3].[value]) AS [value]
FROM (
SELECT (CONVERT(Decimal(29,4),[t0].[Quantity])) * [t2].[Price] AS [value], [t1].[Id]
FROM [dbo].[OrderLine] AS [t0]
INNER JOIN [dbo].[Order] AS [t1] ON [t1].[Id] = [t0].[OrderId]
INNER JOIN [dbo].[Product] AS [t2] ON [t2].[Id] = [t0].[ProductId]
) AS [t3]
WHERE [t3].[Id] = @p0
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [1]

One SQL statement has been created that returns a scalar value for the total. Much better. But now we've got duplicate business logic: one definition of the order total calculation in the Total property of Order, and another in our query.

So what's the solution?

What we need is a way of creating our business logic in a single place that we can use in both our domain properties and in our queries. This brings me to two guys who have done some excellent work in trying to solve this problem: Fredrik Kalseth and Luke Marshall. I'm going to show you Luke's solution which is detailed in this series of blog posts.

It's based on the specification pattern. If you've not come across this before, Ian Cooper has a great description here. The idea with specifications is that you factor out your domain business logic into small composable classes. You can then test small bits of business logic in isolation and then compose them to create more complex rules; because we all know that rules rely on rules :)

The neat trick is to implement the specification as a lambda expression that can be executed against in-memory object graphs or inserted into an expression tree to be compiled into SQL.
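To make that concrete, a minimal composable specification along those lines might look like this (an illustrative sketch of the pattern; Spec<T> and its members are my own names, not Luke's or Fredrik's types):

```csharp
using System;
using System.Linq.Expressions;

// A tiny composable specification: it holds its rule as an expression
// tree, so the rule can be compiled for in-memory checks or spliced
// into a LINQ query's expression tree.
public class Spec<T>
{
    private readonly Expression<Func<T, bool>> predicate;

    public Spec(Expression<Func<T, bool>> predicate)
    {
        this.predicate = predicate;
    }

    public Expression<Func<T, bool>> Predicate
    {
        get { return predicate; }
    }

    // In-memory evaluation against a single object.
    public bool IsSatisfiedBy(T candidate)
    {
        return predicate.Compile()(candidate);
    }

    // Compose two simple rules into a more complex one.
    public Spec<T> And(Spec<T> other)
    {
        var param = Expression.Parameter(typeof(T), "x");
        var body = Expression.AndAlso(
            Expression.Invoke(predicate, param),
            Expression.Invoke(other.predicate, param));
        return new Spec<T>(Expression.Lambda<Func<T, bool>>(body, param));
    }
}
```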

Here's our Total property as a specification, or as Luke calls it, QueryProperty.

static readonly TotalProperty total = new TotalProperty();

[QueryProperty(typeof(TotalProperty))]
public decimal Total
{
    get
    {
        return total.Value(this);
    }
}

class TotalProperty : QueryProperty<Order, decimal>
{
    public TotalProperty()
        : base(order => order.OrderLines.Sum(line => line.Quantity * line.Product.Price))
    {
    }
}

We've factored out the Total calculation into a specification called TotalProperty, which passes the rule into the constructor of the QueryProperty base class. We also have a static instance of the TotalProperty specification. This is simply for performance reasons and acts as a specification cache. Then in the Total property getter we ask the specification to calculate its value for the current instance.
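A QueryProperty<TEntity, TResult> base class along these lines might be sketched as follows (my simplified guess at the shape, not Luke's actual implementation; I've made the constructor public for brevity):

```csharp
using System;
using System.Linq.Expressions;

// Holds the rule once, in both compiled and expression-tree form.
public class QueryProperty<TEntity, TResult>
{
    private readonly Expression<Func<TEntity, TResult>> expression;
    private readonly Func<TEntity, TResult> compiled;

    public QueryProperty(Expression<Func<TEntity, TResult>> expression)
    {
        this.expression = expression;
        this.compiled = expression.Compile(); // compiled once for in-memory use
    }

    // Called from the domain property getter.
    public TResult Value(TEntity entity)
    {
        return compiled(entity);
    }

    // Exposed so a custom query provider can splice the rule
    // into a LINQ expression tree.
    public Expression<Func<TEntity, TResult>> Expression
    {
        get { return expression; }
    }
}
```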

Note that the Total property is decorated with a QueryPropertyAttribute. This is so that the custom query provider can recognise that this property also supplies a lambda expression via its specification, which is the type specified in the attribute constructor. This is the main weakness of this approach because there's an obvious error waiting to happen. The type passed in the QueryPropertyAttribute has to match the type of the specification. It's also very invasive since we have various bits of the framework (QueryProperty, QueryPropertyAttribute) surfacing in our domain code.

These days seemingly everyone has a generic repository, and Luke is no different. His repository chains a custom query provider, which knows how to insert the specification expressions into the expression tree, in front of the LINQ-to-SQL query provider. We can use the repository like this:

[Test]
public void TotalQueryUsingRepository()
{
    var repository = new RepositoryDatabase<Order>(dataContext);

    var total = repository.AsQueryable().Select(order => order.Total).First();
    Assert.That(total, Is.EqualTo(expectedTotal));
}

Note how the LINQ expression is exactly the same as the one we ran above, which caused five select statements to be executed and the entire Order object graph to be loaded into memory. When we run this new test we get this SQL:

SELECT TOP (1) [t4].[value]
FROM [dbo].[Order] AS [t0]
OUTER APPLY (
SELECT SUM([t3].[value]) AS [value]
FROM (
    SELECT (CONVERT(Decimal(29,4),[t1].[Quantity])) * [t2].[Price] AS [value], [t1].[OrderId]
    FROM [dbo].[OrderLine] AS [t1]
    INNER JOIN [dbo].[Product] AS [t2] ON [t2].[Id] = [t1].[ProductId]
    ) AS [t3]
WHERE [t3].[OrderId] = [t0].[Id]
) AS [t4]

A single select statement that returns a scalar value for the total. It's very nice, and with the caveats above it's by far the nicest solution to this problem that I've seen yet.

Friday, August 01, 2008

ADO.NET Data Services: Creating a custom Data Context #3: Updating

This follows part 1 where I create a custom data context and part 2 where I create a client application to talk to it.

For your custom data context to allow for updates you have to implement IUpdatable.

[image: the IUpdatable interface members]

This interface has a number of cryptic methods and it wasn't at all clear at first how to write an implementation for it. I'm sure there must be some documentation somewhere, but I couldn't find it. I resorted to adding trace writes to the empty methods and firing inserts and updates at my web service. You can then use Sysinternals' DebugView to watch what happens.

First of all, let's try an insert:

public void AddANewTeacher()
{
  Console.WriteLine("\r\nAddANewTeacher");

  var frankyChicken = new Teacher
                        {
                            ID = 3,
                            Name = "Franky Chicken"
                        };
  service.AddObject("Teachers", frankyChicken);
  var response = service.SaveChanges();
}

We get this result:

[image: DebugView output for the insert]

So you can see that first all the existing Teachers are returned, then a new Teacher instance is created, its properties are set, SaveChanges is called and then ResolveResource. For my simple in-memory implementation I just added the new Teacher to my static list of teachers:

public object CreateResource(string containerName, string fullTypeName)
{
  Trace.WriteLine(string.Format("CreateResource('{0}', '{1}')", containerName, fullTypeName));

  var type = Type.GetType(fullTypeName);
  var resource = Activator.CreateInstance(type);

  switch (containerName)
  {
      case "Teachers":
          Root.Teachers.Add((Teacher)resource);
          break;
      case "Courses":
          Root.Courses.Add((Course)resource);
          break;
      default:
          throw new ApplicationException("Unknown containerName");
  }

  return resource;
}

public void SetValue(object targetResource, string propertyName, object propertyValue)
{
  Trace.WriteLine(string.Format("SetValue('{0}', '{1}', '{2})", targetResource, propertyName, propertyValue));

  Type type = targetResource.GetType();
  var property = type.GetProperty(propertyName);
  property.SetValue(targetResource, propertyValue, null);
}

Next let's try an update:

public void UpdateATeacher()
{
  Console.WriteLine("\r\nUpdateATeacher");

  var fredJones = service.Teachers.Where(t => t.ID == 2).Single();

  fredJones.Name = "Fred B Jones";

  service.UpdateObject(fredJones);
  service.SaveChanges();
}

We get this result:

[image: DebugView output for the update]

This time the Teacher to be updated is returned by GetResource, then SetValue is called to update the Name property. Finally SaveChanges and ResolveResource are called again.

The GetResource implementation is straight from Shawn Wildermuth's LINQ to SQL implementation.

public object GetResource(IQueryable query, string fullTypeName)
{
  Trace.WriteLine(string.Format("GetResource('query', '{0}')", fullTypeName));

  // Get the first result
  var results = (IEnumerable)query;
  object returnValue = null;
  foreach (object result in results)
  {
      if (returnValue != null) break;
      returnValue = result;
  }

  // Check the Typename if needed
  if (fullTypeName != null)
  {
      if (fullTypeName != returnValue.GetType().FullName)
      {
          throw new DataServiceException("Incorrect Type Returned");
      }
  }

  // Return the resource
  return returnValue;
}

Now all I have to do is work on creating relationships between entities, possibly more on this next week.

Code is here:

http://static.mikehadlow.com/Mike.DataServices.InMemory.zip

ADO.NET Data Services: Creating a custom Data Context #2: The Client

In part 1 I showed how to create a simple in-memory custom data context for ADO.NET Data Services. Creating a managed client is also very simple. First we need to provide a similar domain model to our server. In this case the classes are identical except that now Teacher has a List<Course> rather than a simple array (Course[]) as its Courses property:

using System.Collections.Generic;

namespace Mike.DataServices.Client.Model
{
   public class Teacher
   {
       public int ID { get; set; }
       public string Name { get; set; }
       public List<Course> Courses { get; set; }
   }
}

Next I wrote a class to extend DataServiceContext with properties for Teachers and Courses that are both DataServiceQuery<T>. Both DataServiceContext and DataServiceQuery<T> live in the System.Data.Services.Client assembly. You don't have to create this class, but it makes the subsequent use of the DataServiceContext simpler. You can also use the 'Add Service Reference' menu item, but I don't like the very verbose code that this generates.

using System;
using System.Data.Services.Client;
using Mike.DataServices.Client.Model;

namespace Mike.DataServices.Client
{
   public class SchoolServiceProxy : DataServiceContext
   {
       private const string url = "http://localhost:4246/SchoolService.svc";

       public SchoolServiceProxy() : base(new Uri(url))
       {
       }

       public DataServiceQuery<Teacher> Teachers
       {
           get
           {
               return CreateQuery<Teacher>("Teachers");
           }
       }

       public DataServiceQuery<Course> Courses
       {
           get
           {
               return CreateQuery<Course>("Courses");
           }
       }
   }
}

Here's a simple console program that outputs teacher John Smith and his courses and then the complete list of courses. The nice thing is that DataServiceQuery<T> implements IQueryable<T>, so we can write LINQ queries against our RESTful service.

using System;
using System.Linq;
using System.Data.Services.Client;
using Mike.DataServices.Client.Model;

namespace Mike.DataServices.Client
{
   class Program
   {
       readonly SchoolServiceProxy service = new SchoolServiceProxy();

       static void Main(string[] args)
       {
           var program = new Program();
           program.GetJohnSmith();
           program.GetAllCourses();
       }

       public void GetJohnSmith()
       {
           Console.WriteLine("\r\nGetJohnSmith");

           var teachers = service.Teachers.Where(c => c.Name == "John Smith");

           foreach (var teacher in teachers)
           {
               Console.WriteLine("Teacher: {0}", teacher.Name);

               // N+1 issue here
               service.LoadProperty(teacher, "Courses");

               foreach (var course in teacher.Courses)
               {
                   Console.WriteLine("\tCourse: {0}", course.Name);
               }
           }
       }

       public void GetAllCourses()
       {
           Console.WriteLine("\r\nGetAllCourses");

           var courses = service.Courses;

           foreach (var course in courses)
           {
               Console.WriteLine("Course: {0}", course.Name);
           }
       }
   }
}

We get this output:

[image: console output]

Code is here:

http://static.mikehadlow.com/Mike.DataServices.InMemory.zip

ADO.NET Data Services: Creating a custom Data Context #1

Yesterday I wrote a quick overview of ADO.NET Data Services. We saw how it exposes a RESTful API on top of any IQueryable<T> data source. The IQueryable<T> interface is of course at the core of any LINQ enabled data service. It's very easy to write your own custom Data Context if you already have a data source that supports IQueryable<T>. It's worth remembering that anything that provides IEnumerable<T> can be converted to IQueryable<T> by the AsQueryable() extension method, which means we can simply export an in-memory object graph in a RESTful fashion with ADO.NET Data Services. That's what I'm going to show how to do today.
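As a quick illustration of that AsQueryable() conversion (a toy example, unrelated to the school model below):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class AsQueryableDemo
{
    static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };

        // AsQueryable wraps the in-memory list in an IQueryable<int>, so
        // operators like Where build expression trees that are then
        // executed against the underlying enumerable.
        IQueryable<int> queryable = numbers.AsQueryable();
        var evens = queryable.Where(n => n % 2 == 0).ToList();

        Console.WriteLine(string.Join(",", evens)); // prints "2,4"
    }
}
```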

I got these techniques from an excellent MSDN Magazine article by Elisa Flasko and Mike Flasko, Expose And Consume Data in A Web Services World.

The first thing we need to do is provide a Domain Model to export. Here is an extremely simple example, two classes: Teacher and Course. Note that each entity must have an ID property that the Data Service can recognize as its primary key.

[image: Teacher and Course domain model diagram]

For a read-only service (I'll show insert, update and delete in part 2) you simply need a data context that exports the entities of the domain model as IQueryable<T> properties:

using System.Linq;
using Mike.DataServices.Model;

namespace Mike.DataServices
{
   public class SchoolDataContext
   {
       private static Teacher[] teachers;
       private static Course[] courses;

       public SchoolDataContext()
       {
           var johnSmith = new Teacher
                                   {
                                       ID = 1,
                                       Name = "John Smith"
                                   };
           var fredJones = new Teacher
                                   {
                                       ID = 2,
                                       Name = "Fred Jones"
                                   };

           var programming101 = new Course
                                    {
                                        ID = 1,
                                        Name = "programming 101",
                                        Teacher = johnSmith
                                    };
           var howToMakeAnything = new Course
                                       {
                                           ID = 2,
                                           Name = "How to make anything",
                                           Teacher = johnSmith
                                       };
           johnSmith.Courses = new[] {programming101, howToMakeAnything};

           var yourInnerFish = new Course
                                   {
                                       ID = 3,
                                       Name = "Your inner fish",
                                       Teacher = fredJones
                                   };
           fredJones.Courses = new[] {yourInnerFish};

           teachers = new[] {johnSmith, fredJones};
           courses = new[] {programming101, howToMakeAnything, yourInnerFish};
       }

       public IQueryable<Teacher> Teachers
       {
           get { return teachers.AsQueryable(); }
       }

       public IQueryable<Course> Courses
       {
           get { return courses.AsQueryable(); }
       }
   }
}

Note that we're building our object graph in the constructor in this demo. In a realistic implementation you'd probably have your application create its model somewhere else.

Now we simply have to set the type parameter of the DataService to our data context (DataService<SchoolDataContext>):

using System.Data.Services;

namespace Mike.DataServices
{
   [System.ServiceModel.ServiceBehavior(IncludeExceptionDetailInFaults = true)]
   public class SchoolService : DataService<SchoolDataContext>
   {
       public static void InitializeService(IDataServiceConfiguration config)
       {
           config.SetEntitySetAccessRule("*", EntitySetRights.AllRead);
       }
   }
}

And we can query our model via our RESTful API:

[image: the service root in the browser]

And here's all the courses that the first teacher teaches:

[image: the first teacher's courses in the browser]

Code is here:

http://static.mikehadlow.com/Mike.DataServices.InMemory.zip

Thursday, June 05, 2008

LINQ to CSV

I thought it would be nice to be able to produce a CSV file by doing something like this:

string ordersCsv = orderRepository.GetAll().Select(o => new 
{ 
    OrderId = o.OrderId,
    Email = o.Email,
    OrderStatus = o.OrderStatus.Name,
    CreatedDate = o.CreatedDate,
    Total = o.Basket.Total
}).AsCsv();

So here's an extension method to do just that:

public static string AsCsv<T>(this IEnumerable<T> items)
    where T : class
{
    var csvBuilder = new StringBuilder();
    var properties = typeof (T).GetProperties();
    foreach (T item in items)
    {
        string line = string.Join(",",
            properties.Select(p => p.GetValue(item, null).ToCsvValue()).ToArray());
        csvBuilder.AppendLine(line);
    }
    return csvBuilder.ToString();
}

private static string ToCsvValue<T>(this T item)
{
    if (item == null)
    {
        return "\"\"";
    }
    if (item is string)
    {
        return string.Format("\"{0}\"", item.ToString().Replace("\"", "\\\""));
    }
    double dummy;
    if (double.TryParse(item.ToString(), out dummy))
    {
        return item.ToString();
    }
    return string.Format("\"{0}\"", item);
}

It works with anything that implements IEnumerable<T>: that includes the results of LINQ-to-SQL queries, arrays, List<T> and pretty much any kind of collection. Here's its unit test:

[TestFixture]
public class EnumerableExtensionsTests
{
    [Test]
    public void GetCsv_ShouldRenderCorrectCsv()
    {
        IEnumerable<Thing> things = new List<Thing>()
            {
                new Thing
                    {
                        Id = 12,
                        Name = "Thing one",
                        Date = new DateTime(2008, 4, 20),
                        Child = new Child
                                    {
                                        Name = "Max"
                                    }
                    },
                new Thing
                    {
                        Id = 13,
                        Name = "Thing two",
                        Date = new DateTime(2008, 5, 20),
                        Child = new Child
                                    {
                                        Name = "Robbie"
                                    }
                    }
            };

        string csv = things.Select(t => new { Id = t.Id, Name = t.Name, Date = t.Date, Child = t.Child.Name }).AsCsv();

        Assert.That(csv, Is.EqualTo(expectedCsv));
    }

    const string expectedCsv = 
@"12,""Thing one"",""20/04/2008 00:00:00"",""Max""
13,""Thing two"",""20/05/2008 00:00:00"",""Robbie""
";

    public class Thing
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public DateTime Date { get; set; }
        public Child Child { get; set; }
    }

    public class Child
    {
        public string Name { get; set; }
    }
}

Saturday, May 31, 2008

Interface + Extension Methods = Mixin

Mixins are a language idea similar to multiple inheritance, but rather than extending a class a mixin works by defining an interface plus various functionality associated with that interface. It avoids the pathologies of multiple inheritance and is a flexible way of providing add-on functionality.

In C# we can create a mixin with a combination of an interface plus extension methods. LINQ is the canonical example of this, with two core interfaces, IEnumerable<T> and IQueryable<T>, and a collection of extension methods on those interfaces.

It's easy to create your own mixins. For example, say I want to provide location functionality to various entities. I can define an interface like this:

public interface ILocatable
{
   int Longitude { get; set; }
   int Latitude { get; set; }
}

And then some extension methods for that interface:

public static class LocatableExtensions
{
   public static void MoveNorth(this ILocatable locatable, int degrees)
   {
       // ...
   }

   public static void MoveWest(this ILocatable locatable, int degrees)
   {
       // ..
   }
}

Now, if we've got some entities like:

public class Ship : ILocatable
{
   public int Longitude { get; set; }
   public int Latitude { get; set; }
}

public class Car : ILocatable
{
   public int Longitude { get; set; }
   public int Latitude { get; set; }
}

We can use the mixin like this:

Ship ship = new Ship();
ship.MoveNorth(10);

Car car = new Car();
car.MoveWest(23);

It's a very nice pattern for adding capabilities to entities.
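Spelled out as a self-contained sketch, the whole mixin looks something like this. The movement arithmetic is my own assumption, since the post elides the method bodies:

```csharp
using System;

public interface ILocatable
{
    int Longitude { get; set; }
    int Latitude { get; set; }
}

public static class LocatableExtensions
{
    // Assumed implementation: moving north increases latitude
    public static void MoveNorth(this ILocatable locatable, int degrees)
    {
        locatable.Latitude += degrees;
    }

    // Assumed implementation: moving west decreases longitude
    public static void MoveWest(this ILocatable locatable, int degrees)
    {
        locatable.Longitude -= degrees;
    }
}

public class Ship : ILocatable
{
    public int Longitude { get; set; }
    public int Latitude { get; set; }
}
```

Any class that implements ILocatable picks up MoveNorth and MoveWest for free, which is the essence of the pattern.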

Thursday, March 20, 2008

Using the IRepository pattern with LINQ to SQL

27th March 2009. It's now a year since I wrote this post. Thanks to some comments by Janus2007 I've realised that it needs updating. I've replaced the code with the current version of the Suteki Shop LINQ generic repository. There are a number of changes in the way it works. The most obvious, and one that I should have made a long time ago, is that the GetAll method now returns an IQueryable<T> rather than an array. I actually changed this soon after I wrote the post, but totally forgot about the naive implementation given here.

The other major change is marking the SubmitChanges methods as obsolete. Jeremy Skinner, who has been doing some excellent work on Suteki Shop has pushed this change. UoW (DataContext) management is now handled by attributes on action methods.

Please have a look at the Suteki Shop code to see the generic repository in action.

LINQ to SQL is a quantum leap in productivity for most mainstream .NET developers. Some folks may have been using NHibernate or some other ORM tool for years, but my experience in a number of .NET shops has been that the majority of developers still hand-code their data access layer. LINQ is going to bring some fundamental changes to the way we architect our applications. In particular, being able to write query syntax directly in C# against both a SQL Server database and in-memory object graphs raises some interesting questions about application architecture.

So where is our point of separation between data access and domain? Surely I'm not recommending that we abandon a layered architecture and write all our data access directly into our domain classes?

My current project is based on the new MVC Framework. I've been using LINQ to SQL for data access as well as an IoC container (Windsor) and NUnit plus Rhino Mocks for testing. For my data access layer I've used the IRepository pattern popularized by Ayende in his excellent MSDN article on IoC and DI. My Repository looks like this:

using System;
using System.Linq;
using System.Linq.Expressions;
using System.Data.Linq;
using Suteki.Common.Extensions;
namespace Suteki.Common.Repositories
{
    public interface IRepository<T> where T : class
    {
        T GetById(int id);
        IQueryable<T> GetAll();
        void InsertOnSubmit(T entity);
        void DeleteOnSubmit(T entity);
		[Obsolete("Units of Work should be managed externally to the Repository.")]
        void SubmitChanges();
    }
    public interface IRepository
    {
        object GetById(int id);
        IQueryable GetAll();
        void InsertOnSubmit(object entity);
        void DeleteOnSubmit(object entity);
		[Obsolete("Units of Work should be managed externally to the Repository.")]
        void SubmitChanges();
    }
    public class Repository<T> : IRepository<T>, IRepository where T : class
    {
        readonly DataContext dataContext;
        public Repository(IDataContextProvider dataContextProvider)
        {
            dataContext = dataContextProvider.DataContext;
        }
        public virtual T GetById(int id)
        {
            // Dynamically build the expression 'item => item.<PrimaryKey> == id',
            // since the name of the primary key property isn't known at compile time
            var itemParameter = Expression.Parameter(typeof(T), "item");
            var whereExpression = Expression.Lambda<Func<T, bool>>
                (
                Expression.Equal(
                    Expression.Property(
                        itemParameter,
                        typeof(T).GetPrimaryKey().Name
                        ),
                    Expression.Constant(id)
                    ),
                new[] { itemParameter }
                );
            return GetAll().Where(whereExpression).Single();
        }
        public virtual IQueryable<T> GetAll()
        {
            return dataContext.GetTable<T>();
        }
        public virtual void InsertOnSubmit(T entity)
        {
            GetTable().InsertOnSubmit(entity);
        }
        public virtual void DeleteOnSubmit(T entity)
        {
            GetTable().DeleteOnSubmit(entity);
        }
        public virtual void SubmitChanges()
        {
            dataContext.SubmitChanges();
        }
        public virtual ITable GetTable()
        {
            return dataContext.GetTable<T>();
        }
        IQueryable IRepository.GetAll()
        {
            return GetAll();
        }
        void IRepository.InsertOnSubmit(object entity)
        {
            InsertOnSubmit((T)entity);
        }
        void IRepository.DeleteOnSubmit(object entity)
        {
            DeleteOnSubmit((T)entity);
        }
        object IRepository.GetById(int id)
        {
            return GetById(id);
        }
    }
}
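GetById relies on a GetPrimaryKey extension on Type that isn't shown here. To keep things self-contained, here's a sketch that resolves the key by naming convention — a pure assumption on my part; the real Suteki.Common implementation would more likely inspect LINQ to SQL's [Column(IsPrimaryKey = true)] mapping attribute:

```csharp
using System;
using System.Reflection;

public static class TypeExtensions
{
    // Assumed sketch: find the primary key by convention - a property named
    // 'Id' or '<TypeName>Id'. The real implementation would probably look for
    // LINQ to SQL's [Column(IsPrimaryKey = true)] attribute instead.
    public static PropertyInfo GetPrimaryKey(this Type type)
    {
        var key = type.GetProperty("Id") ?? type.GetProperty(type.Name + "Id");
        if (key == null)
        {
            throw new InvalidOperationException(
                "No primary key property found on " + type.Name);
        }
        return key;
    }
}

// A sample entity to exercise the convention
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}
```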

As you can see, this generic repository insulates the rest of the application from the LINQ to SQL DataContext and provides basic data access methods for any domain class. Here's an example of it being used in a simple controller.

using System.Web.Mvc;
using Suteki.Common.Binders;
using Suteki.Common.Filters;
using Suteki.Common.Repositories;
using Suteki.Common.Validation;
using Suteki.Shop.Filters;
using Suteki.Shop.Services;
using Suteki.Shop.ViewData;
using Suteki.Shop.Repositories;
using MvcContrib;
namespace Suteki.Shop.Controllers
{
	[AdministratorsOnly]
    public class UserController : ControllerBase
    {
        readonly IRepository<User> userRepository;
        readonly IRepository<Role> roleRepository;
    	private readonly IUserService userService;
    	public UserController(IRepository<User> userRepository, IRepository<Role> roleRepository, IUserService userService)
        {
            this.userRepository = userRepository;
            this.roleRepository = roleRepository;
        	this.userService = userService;
        }
        public ActionResult Index()
        {
            var users = userRepository.GetAll().Editable();
            return View("Index", ShopView.Data.WithUsers(users));
        }
        public ActionResult New()
        {
            return View("Edit", EditViewData.WithUser(Shop.User.DefaultUser));
        }
		[AcceptVerbs(HttpVerbs.Post), UnitOfWork]
		public ActionResult New(User user, string password)
		{
			if(! string.IsNullOrEmpty(password))
			{
				user.Password = userService.HashPassword(password);
			}
			try
			{
				user.Validate();
			}
			catch(ValidationException ex)
			{
				ex.CopyToModelState(ModelState, "user");
				return View("Edit", EditViewData.WithUser(user));
			}
			userRepository.InsertOnSubmit(user);
			Message = "User has been added.";
			return this.RedirectToAction(c => c.Index());
		}
        public ActionResult Edit(int id)
        {
            User user = userRepository.GetById(id);
            return View("Edit", EditViewData.WithUser(user));
        }
		[AcceptVerbs(HttpVerbs.Post), UnitOfWork]
		public ActionResult Edit([DataBind] User user, string password)
		{
			if(! string.IsNullOrEmpty(password))
			{
				user.Password = userService.HashPassword(password);
			}
			try
			{
				user.Validate();
			}
			catch (ValidationException validationException) 
			{
				validationException.CopyToModelState(ModelState, "user");
				return View("Edit", EditViewData.WithUser(user));
			}
			return View("Edit", EditViewData.WithUser(user).WithMessage("Changes have been saved")); 
		}
        public ShopViewData EditViewData
        {
            get
            {
                return ShopView.Data.WithRoles(roleRepository.GetAll());
            }
        }
    }
}

Because I'm using an IoC container I don't have to do any more than request an instance of IRepository<User> in the constructor and because the Windsor Container understands generics I only have a single configuration entry for all my generic repositories:

<?xml version="1.0"?>
<configuration>
  <!-- windsor configuration. 
  This is a web application, all components must have a lifestyle of 'transient' or 'perWebRequest' -->
  <components>
    <!-- repositories -->
    <!-- data context provider (this must have a lifestyle of 'perWebRequest' to allow the same data context
    to be used by all repositories) -->
    <component
      id="datacontextprovider"
      service="Suteki.Common.Repositories.IDataContextProvider, Suteki.Common"
      type="Suteki.Common.Repositories.DataContextProvider, Suteki.Common"
      lifestyle="perWebRequest"
     />
	<component
		id="menu.repository" 
		service="Suteki.Common.Repositories.IRepository`1[[Suteki.Shop.Menu, Suteki.Shop]], Suteki.Common"
		type="Suteki.Shop.Models.MenuRepository, Suteki.Shop"
		lifestyle="transient"
		/>
		
    <component
      id="generic.repository"
      service="Suteki.Common.Repositories.IRepository`1, Suteki.Common"
      type="Suteki.Common.Repositories.Repository`1, Suteki.Common"
      lifestyle="transient" />
    
	....
  </components>
  
</configuration>

The IoC Container also provides the DataContext. Note that the data context provider's lifestyle is perWebRequest, which means a single DataContext is shared between all the repositories within a single request.

To test my UserController I can pass an object graph from a mock IRepository<User>.

[Test]
public void IndexShouldDisplayListOfUsers()
{
    User[] users = new User[] { };
    UserListViewData viewData = null;

    using (mocks.Record())
    {
        Expect.Call(userRepository.GetAll()).Return(users);

        userController.RenderView(null, null);
        LastCall.Callback(new Func<string, string, object, bool>((v, m, vd) => 
        {
            viewData = (UserListViewData)vd;
            return true;
        }));
    }

    using (mocks.Playback())
    {
        userController.Index();
        Assert.AreSame(users, viewData.Users);
    }
}

Established layered architecture patterns insulate the domain model (business objects) from the database by factoring persistence code into a data access layer that provides services for persisting and de-persisting objects to and from the database. This layered approach becomes essential as soon as you start doing Test Driven Development, which requires you to test your code in isolation from your database.

So is LINQ data access code? I don't think so. Because the syntax for querying in-memory object graphs is identical to that for querying the database, it makes sense to place LINQ queries in your domain layer. During testing, the component under test can work with in-memory object graphs; when integrated with the data access layer, those same queries become SQL queries against the database.

Here's a simple example. I've got the canonical Customer->Orders->OrderLines business entities. Now say I get a Customer from my IRepository<Customer>; I can then query my Customer's Orders using LINQ inside my controller:

Customer customer = customerRepository.GetById(customerId);
int numberOfOrders = customer.Orders.Count(); // LINQ

I unit test this by returning a fully formed Customer with Orders from my mock customerRepository, but when I run this code against a concrete repository LINQ to SQL will construct a SQL statement, something like:

SELECT COUNT(*) FROM Order WHERE CustomerId = 34.

Saturday, March 15, 2008

LINQ to String

[image: Linq in Action book cover]

I'm currently reading the excellent Linq in Action by Marguerie, Eichert and Wooley. It's a great exposition of all things Linq and there are lots of really well explained examples. Something that tickled me, which I hadn't realized before, is that you can use Linq over strings. Well, of course: System.String implements IEnumerable<char>. It gives you some interesting alternatives to the standard string functions:

string text = "The quick brown fox jumped over the lazy dog.";

// substring
text.Skip(4).Take(5).Write();   // "quick"

// remove characters
text.Where(c => char.IsLetter(c)).Write(); // "Thequickbrownfoxjumpedoverthelazydog"

string map = "abcdefg";

// strip out only the characters in map
text.Join(map, c1 => c1, c2 => c2, (c1, c2) => c1).Write(); // "ecbfedeeadg"

// does text contain q?
text.Contains('q').Write(); // true

The 'Write()' at the end of each expression is just a little extension method that writes the result to the console.
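Write isn't shown in the post; a minimal sketch might look like this (my assumption: one overload for sequences and one for single values, so that the bool result of the Contains example also works):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WriteExtensions
{
    // Assumed helper: write each element of a sequence to the console
    public static void Write<T>(this IEnumerable<T> values)
    {
        foreach (T value in values)
        {
            Console.Write(value);
        }
        Console.WriteLine();
    }

    // Assumed helper: write a single value (covers text.Contains('q').Write())
    public static void Write<T>(this T value)
    {
        Console.WriteLine(value);
    }
}
```

Overload resolution picks the sequence version for IEnumerable<T> arguments because its declared parameter type is more specific than the bare T.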

OK, so most of these aren't that useful and there are built-in string functions for most of them (except the mapping, I think). The cool thing is that you can write a little function, called 'Chars' in my example, that exposes a file stream as IEnumerable<char>. This means you can play the same tricks on a file, and since we're just building up a decorator chain of enumerators, the entire file isn't loaded into memory, only one character at a time. In this example, showing the 'substring' again, the file stops being read after the first nine characters.

string text = "The quick brown fox jumped over the lazy dog.";
string myDocuments = System.Environment.GetFolderPath(Environment.SpecialFolder.Personal);
string path = Path.Combine(myDocuments, "someText.txt");

File.WriteAllText(path, text);

using (StreamReader reader = File.OpenText(path))
{
    reader.Chars().Skip(4).Take(5).Write();
}

Here's the code for the 'Chars' function.

public static class StreamExtensions
{
    public static IEnumerable<char> Chars(this StreamReader reader)
    {
        if (reader == null) throw new ArgumentNullException("reader");

        char[] buffer = new char[1];
        while(reader.Read(buffer, 0, 1) > 0)
        {
            yield return buffer[0];
        }
    }
}

In the book the authors show a similar function that reads a file line by line and they use it to enumerate over a csv file. The resulting syntax is extremely neat, but I'll let you buy the book and see it for yourself:)
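For reference, here's a sketch of what such a line-by-line extension might look like — my own guess, not the book's code:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class StreamReaderExtensions
{
    // Assumed sketch: lazily yield one line at a time, in the same
    // decorator-chain style as the Chars extension above
    public static IEnumerable<string> Lines(this StreamReader reader)
    {
        if (reader == null) throw new ArgumentNullException("reader");

        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }
}
```

Because the iterator is lazy, composing it with Take or Where stops reading the file as soon as the consumer stops pulling.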

Sunday, February 03, 2008

Never write a for loop again! Fun with Linq style extension methods.

One of the things I like about Ruby is the range operator. In most C-style languages, in order to create a list of numbers you would usually use a for loop like this:

List<int> numbers = new List<int>();
for (int i = 0; i < 10; i++)
{
    numbers.Add(i);
}
int[] myArray = numbers.ToArray();

But in Ruby you just write this:

myArray = (0..9).to_a

But now with extension methods and custom iterators we can do the same thing in C#. Here's a little extension method 'To':

public static IEnumerable<int> To(this int initialValue, int maxValue)
{
    for (int i = initialValue; i <= maxValue; i++)
    {
        yield return i;
    }
}

You can use it like this:

1.To(10).WriteAll();

1 2 3 4 5 6 7 8 9 10

Note the WriteAll() method, that's an extension method too, it simply writes each item in the list to the console:

public static void WriteAll<T>(this IEnumerable<T> values)
{
    foreach (T value in values)
    {
        Console.Write("{0} ", value);
    }
    Console.WriteLine();
}

You can mix your custom extension methods with the built in Linq methods, let's count up to fifty in steps of ten:

1.To(5).Select(i => i * 10).WriteAll();

10 20 30 40 50

Or maybe just output some even numbers:

1.To(20).Where(i => i % 2 == 0).WriteAll();

2 4 6 8 10 12 14 16 18 20

Here's another extension method, 'Each'; it just applies a function to each value:

// Today you'd use the built-in Action<T> delegate rather than declaring this one
public delegate void Func<T>(T value);

public static void Each<T>(this IEnumerable<T> values, Func<T> function)
{
    foreach (T value in values)
    {
        function(value);
    }
}

Let's use it to output a ten by ten square of zero to ninety nine:

0.To(9).Each(i => 0.To(9).Select(j => (i * 10) + j).WriteAll());

0 1 2 3 4 5 6 7 8 9 
10 11 12 13 14 15 16 17 18 19 
20 21 22 23 24 25 26 27 28 29 
30 31 32 33 34 35 36 37 38 39 
40 41 42 43 44 45 46 47 48 49 
50 51 52 53 54 55 56 57 58 59 
60 61 62 63 64 65 66 67 68 69 
70 71 72 73 74 75 76 77 78 79 
80 81 82 83 84 85 86 87 88 89 
90 91 92 93 94 95 96 97 98 99 

Now here's something really cool, and more than a little dangerous. Because chaining extension methods on IEnumerable<T> effectively builds a decorator chain of enumerators, no iteration actually executes until you ask for the results. This means we can write infinite loop generators and then bound them by only asking for some of their members. Best to demonstrate. Here's a property that returns an infinite sequence of integers (not quite infinite: since the increment is unchecked, it will silently wrap around at int.MaxValue rather than throw):

public static IEnumerable<int> Integers
{
    get
    {
        int i = 0;
        while (true)
        {
            yield return i++;
        }
    }
}
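You can see this laziness directly with a small experiment (a hypothetical helper, not from the post) that counts how many values the generator actually produces:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts how many values the generator has actually produced
    public static int Produced;

    // An unbounded generator that records each value it yields
    public static IEnumerable<int> Integers()
    {
        int i = 0;
        while (true)
        {
            Produced++;
            yield return i++;
        }
    }
}
```

Calling `LazyDemo.Integers().Take(3).ToArray()` leaves Produced at 3: the while(true) loop only runs as far as the consumer pulls.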

We can use it with the built in Linq method 'Take'. For example here we are printing out zero to nine:

Numbers.Integers.Take(10).WriteAll();

0 1 2 3 4 5 6 7 8 9

And here is Five to Fifteen:

Numbers.Integers.Skip(5).Take(11).WriteAll();

5 6 7 8 9 10 11 12 13 14 15

Why is it dangerous? If you put any operation in the chain that needs to iterate all the values, you'll get an infinite loop. So you couldn't say:

Numbers.Integers.Reverse().Take(10).WriteAll();

Let's wrap up with an infinite Fibonacci series:

public static IEnumerable<int> Fibonacci
{
    get
    {
        int a = 0;
        int b = 1;
        int t = 0;

        yield return a;
        yield return b;

        while (true)
        {
            yield return t = a + b;
            a = b;
            b = t;
        }
    }
}

And use it to print out the first ten Fibonacci numbers:

Numbers.Fibonacci.Take(10).WriteAll();

0 1 1 2 3 5 8 13 21 34

I'm really enjoying learning the possibilities of Linq-style extension methods. It makes C# feel much more like a functional language and lets you write in a nicer, more declarative style. If I've tickled your interest, check out Wes Dyer's blog, where he writes an ASCII art program using Linq. Nice!