Tuesday, September 27, 2011

Some Thoughts On Service Oriented Architecture (Part 2)

I’ve been writing a high-level ‘architectural vision’ document for my current clients. I thought it might be nice to republish bits of it here. This is part 2. The first part is here.

My client has a core product that is heavily customised for each customer. In this post we look at the different kinds of components that make up this architecture: some are common services that make up the core product, while others might be bespoke pieces for a particular customer. We also examine the difference between workflow, services and endpoints.

It is very important that we make a clear distinction between components that we write as part of the product and bespoke components that we write for a particular customer. We should not put customer specific code into product components and we should not replicate common product code in customer specific pieces.

Because we are favouring small single-purpose components over large multi-purpose monolithic applications, it should be easy for us to differentiate between product and customer pieces.

There are three main kinds of components that make up a working system: services, workflow and endpoints. The diagram below shows how they communicate via EasyNetQ, our open-source infrastructure layer. The green parts are product pieces. The blue parts are bespoke customer specific pieces.



Services are components that implement a piece of the core product. An example of a service is a component called Renderer that takes templates and data and does a kind of mail-merge. Because Renderer is a service it should never contain any client specific code. Of course customer requirements might mean that enhancements need to be made to Renderer, but these enhancements should always be done with the understanding that Renderer is part of the product. We should be able to deploy the enhanced Renderer to all our customers without the enhancement affecting them.

Services (in fact all components) should maintain their own state using a service specific database. This database should not be shared with other services. The service should communicate via EasyNetQ with other services and not use a shared database as a back-channel. In the case of an updated Renderer, templates would be stored in Renderer’s local database. Any new or updated templates would arrive as messages via EasyNetQ. Each data item to be rendered would also arrive as a message, and once the data has been rendered, the document should also be published via EasyNetQ.

The core point here is that each service should have a clear API, defined by the message types that it subscribes to and publishes. We should be able to fully exercise a component via messages independently of other services. Because the service’s database is only used by the service, we should be able to flexibly modify its schema in response to changing requirements, without having to worry about the impact that will have on other parts of the system.
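To make the idea of "an API defined by message types" concrete, here is a minimal sketch of what Renderer's contract might look like. All the message and class names are hypothetical, the template store is an in-memory dictionary standing in for the service's private database, and the `publish` delegate stands in for publishing via EasyNetQ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message contracts: illustrative names, not taken from the product.
public class TemplateUpdated
{
    public string TemplateName { get; set; }
    public string Body { get; set; }
}

public class RenderRequest
{
    public string TemplateName { get; set; }
    public IDictionary<string, string> Data { get; set; }
}

public class DocumentRendered
{
    public string TemplateName { get; set; }
    public string Document { get; set; }
}

// A toy Renderer: its whole public surface is the messages it handles and publishes.
public class RendererService
{
    // Stands in for the service's own database; nothing else can reach it.
    private readonly Dictionary<string, string> templates = new Dictionary<string, string>();

    // Stands in for publishing on the message bus (an assumption of this sketch).
    private readonly Action<object> publish;

    public RendererService(Action<object> publish)
    {
        if (publish == null) throw new ArgumentNullException("publish");
        this.publish = publish;
    }

    // New or updated templates arrive as messages.
    public void Handle(TemplateUpdated message)
    {
        templates[message.TemplateName] = message.Body;
    }

    // Each data item to render arrives as a message; the result is published.
    public void Handle(RenderRequest message)
    {
        var document = templates[message.TemplateName];
        foreach (var item in message.Data)
        {
            // A deliberately naive mail-merge: replace {key} with its value.
            document = document.Replace("{" + item.Key + "}", item.Value);
        }
        publish(new DocumentRendered { TemplateName = message.TemplateName, Document = document });
    }
}
```

Because the only way in is a message and the only way out is a published message, the schema behind `templates` can change freely without any other component noticing.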

It’s important that services do not implement workflow. As we’ll see in the next section, a core feature of this architecture is that workflow and services are separate. Renderer, for example, should not make decisions about what happens to a document after it is rendered, or implement batching logic. These are separate concerns.


Workflow components are customer specific components that describe what happens in response to a specific business trigger. They also implement customer specific business rules. For an airline, an example would be workflow that is triggered by a flight event, say a delay. When the component receives the delay message, it might first retrieve the manifest for the flight by sending a request to a manifest service, then render a message telling each passenger about the delay by sending a render request to Renderer, then finally send that message by email by publishing an email request. It would typically implement business rules describing when a delay is considered important, and so on.
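The flight delay workflow could be sketched roughly as follows. This is only an illustration: every message type, the 30 minute threshold and the `publish` delegate (standing in for the message bus) are assumptions made for the example, and the saga state lives in a dictionary rather than a database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical messages for the airline example; none of these names come from the product.
public class FlightDelayed { public string FlightNumber; public int DelayMinutes; }
public class ManifestRequest { public string FlightNumber; }
public class ManifestResponse { public string FlightNumber; public List<string> PassengerEmails; }
public class RenderDelayNotice { public string FlightNumber; public string PassengerEmail; public int DelayMinutes; }
public class DelayNoticeRendered { public string PassengerEmail; public string Document; }
public class EmailRequest { public string To; public string Body; }

public class FlightDelaySaga
{
    // Customer specific business rule (assumed): shorter delays are not worth an email.
    private const int ImportantDelayMinutes = 30;

    // Saga state; a real saga would keep this in its own database.
    private readonly Dictionary<string, int> delaysByFlight = new Dictionary<string, int>();
    private readonly Action<object> publish;

    public FlightDelaySaga(Action<object> publish)
    {
        if (publish == null) throw new ArgumentNullException("publish");
        this.publish = publish;
    }

    // Step 1: the business trigger. Ask for the manifest if the delay matters.
    public void Handle(FlightDelayed message)
    {
        if (message.DelayMinutes < ImportantDelayMinutes) return;
        delaysByFlight[message.FlightNumber] = message.DelayMinutes;
        publish(new ManifestRequest { FlightNumber = message.FlightNumber });
    }

    // Step 2: the manifest arrives. Request one rendered notice per passenger.
    public void Handle(ManifestResponse message)
    {
        int delayMinutes;
        if (!delaysByFlight.TryGetValue(message.FlightNumber, out delayMinutes)) return;
        foreach (var email in message.PassengerEmails)
        {
            publish(new RenderDelayNotice
            {
                FlightNumber = message.FlightNumber,
                PassengerEmail = email,
                DelayMinutes = delayMinutes
            });
        }
        delaysByFlight.Remove(message.FlightNumber);
    }

    // Step 3: a notice has been rendered. Send it by email.
    public void Handle(DelayNoticeRendered message)
    {
        publish(new EmailRequest { To = message.PassengerEmail, Body = message.Document });
    }
}
```

Note that the saga contains only sequencing and business rules; fetching manifests, rendering and sending email all stay in their own services.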

By separating workflow from services, we can flexibly implement customer requirements by creating custom workflows without having to customise our services. We can deliver bespoke customer solutions on a common product platform.

We call these workflow pieces ‘Sagas’, a commonly used term in the industry for a long-running business process. Because sagas all need a common infrastructure for hosting, EasyNetQ includes a ‘SagaHost’. SagaHost is a Windows service that hosts sagas, just like it says on the box. This means that the sagas themselves are written as simple assemblies that can be xcopy deployed.

Sagas will usually require a database to store their state. Once again, this should be saga specific and not a database shared by other services. However, a single customer workflow might well consist of several distinct sagas, and it makes sense for these to be thought of as a unit. These may well share a single database.


Endpoints are components that communicate with the outside world. They are a bridge between our internal AMQP messaging infrastructure and our external HTTP API. The only way into and out of our product should be via this API. We want to be able to integrate with diverse customer systems, but these integration pieces should be implemented as bridges between the customer system and our official API, rather than as bespoke pieces that publish or subscribe directly to the message bus.

Endpoints come in two flavours, externally triggered and internally triggered. An externally triggered endpoint is where communication is initiated by the customer. An example of this would be a flight event. These components are best implemented as web services that simply wait to be called and then publish an appropriate message using EasyNetQ.

An internally triggered endpoint is where communication is triggered by an internal event. An example of this would be the completion of a workflow with the final step being an update of a customer system. The API would be implemented as a Windows Service that subscribes to the update message using EasyNetQ and implements an HTTP client that makes a web service request to a configured endpoint.

The Importance of Testability

A core requirement for any component is that it should be testable. It should be possible to test Services and Workflow (Sagas) simply by sending them messages and checking that they respond with the correct messages. Because ‘back-channel’ communication, especially via shared databases, is not allowed we can treat these components as black-boxes that always respond with the same output to the same input.
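As a rough illustration of this kind of black-box test, the sketch below uses a tiny in-memory bus in place of RabbitMQ. `InMemoryBus`, `EchoService` and the `Ping`/`Pong` messages are all hypothetical names invented for the example; the point is only that the test talks to the component through messages and nothing else.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A hypothetical in-memory stand-in for the message bus, for tests only.
public class InMemoryBus
{
    private readonly Dictionary<Type, List<Action<object>>> handlers =
        new Dictionary<Type, List<Action<object>>>();

    // Every message published, in order, so a test can inspect the output.
    public readonly List<object> Published = new List<object>();

    public void Subscribe<T>(Action<T> handler)
    {
        List<Action<object>> list;
        if (!handlers.TryGetValue(typeof(T), out list))
        {
            list = new List<Action<object>>();
            handlers[typeof(T)] = list;
        }
        list.Add(message => handler((T)message));
    }

    public void Publish<T>(T message)
    {
        Published.Add(message);
        List<Action<object>> list;
        if (handlers.TryGetValue(typeof(T), out list))
        {
            // Copy the list so handlers may subscribe while we iterate.
            foreach (var handler in list.ToArray()) handler(message);
        }
    }
}

public class Ping { public string Text; }
public class Pong { public string Text; }

// The component under test only knows about the bus and its message types.
public class EchoService
{
    public EchoService(InMemoryBus bus)
    {
        bus.Subscribe<Ping>(ping => bus.Publish(new Pong { Text = ping.Text.ToUpperInvariant() }));
    }
}

public static class EchoServiceTests
{
    // Send a message in, check the message that comes out.
    public static bool RespondsToPing()
    {
        var bus = new InMemoryBus();
        new EchoService(bus);

        bus.Publish(new Ping { Text = "hello" });

        var pongs = bus.Published.OfType<Pong>().ToList();
        return pongs.Count == 1 && pongs[0].Text == "HELLO";
    }
}
```

The same shape works for a saga: publish the trigger message, then assert on the sequence of messages it publishes in response.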

Endpoints are slightly more complicated to test. It should be possible to send an externally triggered endpoint a request and watch for the message that’s published. An internally triggered endpoint should make a web request when it receives the correct message.

Developers should provide automated tests to QA for any modification to a component.

Monday, September 26, 2011

Some Thoughts On Service Oriented Architecture

I’ve been writing a high-level ‘architectural vision’ document for my current clients. I thought it might be nice to republish bits of it here. This is the section that makes a justification for a service oriented architecture based on messaging.

I’ve taken out anything client-specific.

SOA is one of those snake-oil consultancy terms that seem to mean different things depending on who you talk to. My own views on SOA have been formed from three main sources. Firstly there’s the bible on SOA, Hohpe & Woolf’s Enterprise Integration Patterns. This really is essential reading for anyone attempting to get systems to work together. Next there’s the work of Udi Dahan. He is the author of the excellent NServiceBus messaging framework for .NET. I’ve attended his course, and his blog is a fantastic mine of knowledge on all things SOA. Lastly there is the fantastic series of blog posts by Bill Poole on SOA. Just check JBOWS is Bad for a taster.

So what is SOA? Software has a complexity problem. There appears to be a geometric relationship between the complexity of a monolithic system and its stability and maintainability. So a system that is twice as complex as another one will be maybe four times more expensive to maintain and a quarter as stable. There is also the human and organisational side of software. Individual teams tend to build or buy their own solutions. The organisation then needs to find ways for these disparate systems to share information and workflow. Small single purpose systems are always a better choice than large monolithic ones. If we build our systems as components, we can build and maintain them independently. SOA is a set of design patterns that guide us in building and integrating these mini-application pieces.

What is a component?

A component in SOA terms is very different from a component in Object-Oriented terms. A component in OO terms is a logical component that is not independently compiled and deployed. A component or ‘service’ in SOA is a physical component that is an independently built and deployed stand-alone application. It is usually designed as a Windows service or possibly as a web service or an executable. It may or may not have a UI, but it will have some mechanism to communicate with other services.

Each component/service should have the following characteristics:

  • Encapsulated. The service should not share its internal state with the outside world. The most common way people break this rule is by having services share a single database. The service should only change state in response to input from its public interface.
  • Contract. The service should have a clear contract with the outside world. Typically this will be a web-service API and/or the message types that it publishes and consumes.
  • Single Purpose. A component should have one job within the system as a whole. Rendering PDFs for example.
  • Context Free. A component should not be dependent on other components, it should only depend on contracts. For example, if my business process component relies on getting a flight manifest from somewhere, it should simply be able to publish a flight manifest request, it shouldn’t expect a specific flight manifest service to be present.
  • Independently deployable. I should be able to deploy the component/service independently.
  • Independently testable. It should be possible to test the service in isolation without having other services in the system running.

How do components communicate?

Now that we’ve defined the characteristics of a component in our SOA, the next challenge is to select the technology and patterns that they use to communicate with each other. We can list the things that we want from our communication technology:

  • Components should only be able to communicate via a well defined contract - logically-decoupled. They should not be able to dip into each other’s internal state.
  • Components should not have to be configured to communicate with specific endpoints – decoupled configuration. It should be possible to deploy a new service without having to reconfigure the services that it talks to.
  • The communication technology should support ‘temporal-decoupling’. This means that all the services do not have to be running at the same time for the system to work.
  • The communication should be low latency. There should be a minimal delay between a component sending a message and the consumer receiving it.
  • The communication should be based on open standards.

Let’s consider how some common communication technologies meet these criteria…

| Technology | Logically decoupled | Decoupled configuration | Temporal decoupling | Low latency | Open standards |
|---|---|---|---|---|---|
| File Transfer | yes | yes | yes | no | yes |
| Shared Database | no | yes | yes | no | no |
| RPC (remoting, COM+) | yes | no | no | yes | no |
| SOAP Web Services | yes | no | no | yes | yes |
| Message Queue (MSMQ) | yes | no | yes | yes | no |
| Pub/Sub Messaging (AMQP) | yes | yes | yes | yes | yes |


From the table we can see that there’s only one technology that ticks all our boxes, and that is pub/sub messaging based on an open standard like AMQP. AMQP is a bit like HTTP for messaging, an open wire-level protocol for brokers and their clients. The leading AMQP implementation is RabbitMQ, and this is the technology we’ve chosen as our core messaging platform.

There is a caveat however. For communicating with external clients and their systems, building endpoints with open-standards outweighs all other considerations. Now AMQP is indeed an open standard, but it’s a relatively new one that’s only supported by a handful of products. To be confident that we can interoperate with a wide variety of 3rd party systems over the internet we need a ubiquitous technology, so for all external communication we will use HTTP based web-services.

So, AMQP internally, HTTP externally.

Wednesday, September 14, 2011

Thoughts on Windows 8 (part 2)

Back in June I wrote some thoughts on Windows 8 after the initial announcement. Now that we’ve got more details from the Build conference, I thought I’d do a little update.

Microsoft have climbed down and made a significant concession to WPF/Silverlight/.NET devs. Gone is the previous message that Metro applications will only be written in HTML/Javascript; developers can now choose which technology they want to use on the new platform. There still seems to be a bias towards HTML/Javascript, judging by the number of sessions on each, and it seems like MS would prefer developers to go down the HTML/Javascript route. How much this double-headed personality affects the development experience is yet to be seen.

Somebody high up in MS must have banged some heads together to get the Windows and Developer Divisions talking to each other. They'd become two opposing camps after the fallout from the failed Vista WinFX experiment. Now Windows is forced to support XAML/.NET and Dev-Division arm-wrestled into supporting the HTML/Javascript model in Visual Studio and Blend. Once again the depth of this rapprochement will be the deciding factor when it comes to getting a consistent message across to us developers.

The message is still clear though: Javascript has won the latest round of the language wars. Whatever you think about it as a language, it's becoming as ubiquitous as C. But .NET developers are going to have to be dragged kicking and screaming to this party.

The big question is still 'will it work?' Will Windows 8 be enough to get MS back into the game? There are two main problems I can see:

1. Microsoft has a strategic problem. They make money by selling operating systems. Their two main competitors, Google and Apple, don't. Will they be able to make Windows 8 financially attractive to tablet developers when they can get Android licence free and they are competing on price with the dominant iPad? Having said that, Android has been struggling on tablets, so there's still an opportunity for Microsoft to get traction on that form factor.

2. Windows 8 is a hybrid. There's the traditional Windows mouse-and-keyboard UI that they have to keep, and there's the new Metro UI that they want everyone to develop for. What's the experience going to be like for a tablet user when they end up in Windows classic, or for the desktop user when they switch to Metro? The development experience for both sets of UI is going to be very different too, almost like developing for two different platforms. It will be interesting if Microsoft sells a business version of Windows 8 without Metro, or indeed a tablet version without the classic UI.

Despite having raised all these questions, I do think Microsoft have a workable strategy for Windows 8, and it’s going to be an exciting time over the next few years to see how it pans out. There is still a window (sorry) of opportunity in the tablet form factor for Microsoft to challenge Apple if they can get this right. Let’s hope they can.

Tuesday, September 13, 2011

Why Write a .NET API For RabbitMQ?

Anyone who reads this blog knows that my current focus is writing a .NET API for RabbitMQ which I’ve named EasyNetQ. This is being paid for by my excellent clients 15Below who build high volume messaging solutions for the airline industry. EasyNetQ is a core part of their strategy moving forwards, so it’s going to get plenty of real-world use.
One question I’ve been asked quite a lot is ‘why?’ Why am I building a .NET API when one is already available: the C# AMQP client from RabbitMQ? Think of AMQP as the HTTP of messaging. It’s a relatively low-level protocol. You typically wouldn’t build a web application directly against a low-level HTTP API such as System.Net.WebRequest; instead you would use a higher level toolkit such as WCF or ASP.NET MVC. Think of EasyNetQ as the ASP.NET MVC of AMQP.
AMQP is designed to be cross platform and language agnostic. It is also designed to flexibly support a wide range of messaging patterns based on the Exchange/Binding/Queue model. It’s great having this flexibility, but with flexibility comes complexity. It means that you will need to write a significant amount of code in order to implement a RabbitMQ client. Typically this code would include:
  • Implement messaging patterns such as Publish/Subscribe or Request/Response. Although, to be fair, the .NET client does provide some support here.
  • Implement a routing strategy. How will you design your exchange-queue bindings, and how will you route messages between producers and consumers?
  • Implement message serialization/deserialization. How will you convert the binary representation of messages in AMQP to something your programming language understands?
  • Implement a consumer thread for subscriptions. You will need to have a dedicated consumer loop waiting for messages you have subscribed to. How will you deal with multiple subscribers, or transient subscribers, like those waiting for responses from a request?
  • Implement a versioning strategy for your messages. What happens when your message schema needs to change in response to business requirements?
  • Implement subscriber reconnection. If the connection is disrupted or the RabbitMQ server bounces, how do you detect it and make sure all your subscriptions are rebuilt?
  • Understand and implement quality of service settings. What settings do you need to make to ensure that you have a reliable client?
  • Implement an error handling strategy. What should your client do if it receives a malformed message, or if an unexpected exception is thrown?
  • Implement monitoring tools. How will you monitor your client applications so that you are alerted if there are any problems?
With EasyNetQ, you get all these out-of-the-box. You lose some of the flexibility in exchange for a model based on .NET types for routing, but it saves an awful lot of code.

Monday, September 12, 2011

Restarting RabbitMQ With Running EasyNetQ Clients

Here’s a screenshot of one of my tests of EasyNetQ (my easy-to-use .NET API for RabbitMQ). I’m running two publishing applications (top) and two subscribing applications (bottom), all publishing and subscribing to the same queue. We’re getting a throughput of around 5000 messages / second on my local machine. Once they’re all humming along nicely, I bounce the RabbitMQ service. As you can see, some messages get logged and once the RabbitMQ service comes back, things recover nicely and all the applications continue publishing and subscribing as before. I think that’s pretty sweet :)
The publisher and subscriber console apps are both in the EasyNetQ repository on GitHub:

Friday, August 26, 2011

How to stop System.Uri un-escaping forward slash characters

Sometimes you want to construct a URI that has an escaped forward slash. For example, the RabbitMQ Management API requires that you encode the default rabbit VirtualHost ‘/’ as ‘%2f’. Here is the URL to get the details of a queue:

But if I try to use WebRequest or WebClient, the ‘%2f’ is un-escaped to a ‘/’, so the URL becomes:

And I get a 404 not found back :(

Both WebRequest and WebClient use System.Uri internally. It’s easy to demonstrate this behaviour with the following code:

var uri = new Uri(url);
Console.Out.WriteLine("uri = {0}", uri.PathAndQuery);
// outputs /api/queues///EasyNetQ_Default_Error_Queue

A bit of digging in the System.Uri code thanks to the excellent ReSharper 6.0, and help from this Stack Overflow question, shows that it’s possible to reset some flags and stop this behaviour. Here’s my LeaveDotsAndSlashesEscaped method (it’s .NET 4.0 specific):

private void LeaveDotsAndSlashesEscaped()
{
    // UriParser.GetSyntax is a non-public static method, so we have to reach it via reflection.
    var getSyntaxMethod =
        typeof (UriParser).GetMethod("GetSyntax", BindingFlags.Static | BindingFlags.NonPublic);
    if (getSyntaxMethod == null)
    {
        throw new MissingMethodException("UriParser", "GetSyntax");
    }

    // Get the parser instance used for the "http" scheme.
    var uriParser = getSyntaxMethod.Invoke(null, new object[] { "http" });

    var setUpdatableFlagsMethod =
        uriParser.GetType().GetMethod("SetUpdatableFlags", BindingFlags.Instance | BindingFlags.NonPublic);
    if (setUpdatableFlagsMethod == null)
    {
        throw new MissingMethodException("UriParser", "SetUpdatableFlags");
    }

    // Reset the flags so that escaped dots and slashes are left alone.
    setUpdatableFlagsMethod.Invoke(uriParser, new object[] {0});
}

The usual caveats of poking into system assemblies with reflection apply. Don’t expect this to work with any other version of .NET than 4.0.
Now if we re-run our test…
const string url = "";
var uri = new Uri(url);
Console.Out.WriteLine("uri = {0}", uri.PathAndQuery);
// outputs /api/queues/%2f/EasyNetQ_Default_Error_Queue

If you have a Web.config or App.config file you can also try this configuration setting, which should do the same thing (from this SO question) but I haven’t tried it personally:
<add name="http" genericUriParserOptions="DontUnescapePathDotsAndSlashes" />
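For context, that element belongs inside the uri/schemeSettings section of the configuration file. A minimal sketch of a complete App.config, assuming nothing else needs configuring (and, as above, untested here):

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <uri>
    <schemeSettings>
      <!-- Leave escaped dots and slashes (such as %2f) alone for http URIs. -->
      <add name="http" genericUriParserOptions="DontUnescapePathDotsAndSlashes" />
    </schemeSettings>
  </uri>
</configuration>
```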

Monday, July 18, 2011

An Action Cache

Do you ever find yourself in a loop calling a method that expects an Action or a Func as an argument? Here’s an example from an EasyNetQ test method where I’m doing just that:

[Test, Explicit("Needs a Rabbit instance on localhost to work")]
public void Should_be_able_to_do_simple_request_response_lots()
{
    for (int i = 0; i < 1000; i++)
    {
        var request = new TestRequestMessage { Text = "Hello from the client! " + i.ToString() };
        bus.Request<TestRequestMessage, TestResponseMessage>(request, response =>
            Console.WriteLine("Got response: '{0}'", response.Text));
    }
}


My initial naive implementation of IBus.Request set up a new response subscription each time Request was called. Obviously this is inefficient. It would be much nicer if I could identify when Request is called more than once with the same callback and re-use the subscription.

The question I had was: how can I uniquely identify each callback? It turns out that action.Method.GetHashCode() reliably identifies a unique action. I can demonstrate this with the following code:

public class UniquelyIdentifyDelegate
{
    readonly IDictionary<int, Action> actionCache = new Dictionary<int, Action>();

    public void DemonstrateActionCache()
    {
        for (var i=0; i < 3; i++)
        {
            RunAction(() => Console.Out.WriteLine("Hello from A {0}", i));
            RunAction(() => Console.Out.WriteLine("Hello from B {0}", i));

            // Blank line between iterations.
            Console.Out.WriteLine();
        }
    }

    public void RunAction(Action action)
    {
        Console.Out.WriteLine("Method = {0}, Cache Size = {1}", action.Method.GetHashCode(), actionCache.Count);
        if (!actionCache.ContainsKey(action.Method.GetHashCode()))
        {
            actionCache.Add(action.Method.GetHashCode(), action);
        }

        var actionFromCache = actionCache[action.Method.GetHashCode()];

        // Invoke the cached delegate rather than the one passed in.
        actionFromCache();
    }
}

Here, I’m creating an action cache keyed on the action method’s hashcode. Then I’m calling RunAction a few times with two distinct action delegates. Note that they also close over a variable, i, from the outer scope.

Running DemonstrateActionCache() outputs the expected result:

Method = 59022676, Cache Size = 0
Hello from A 0
Method = 62968415, Cache Size = 1
Hello from B 0

Method = 59022676, Cache Size = 2
Hello from A 1
Method = 62968415, Cache Size = 2
Hello from B 1

Method = 59022676, Cache Size = 2
Hello from A 2
Method = 62968415, Cache Size = 2
Hello from B 2

Rather nice I think :)

Task Parallel Library: How To Write a Simple Delay Task

I just had a need for a delay task: a simple method that turns a Func<T> into a Task<T> that executes after a given delay.

The starting point for any Task creation based on an external asynchronous operation, like a Timer callback, is the TaskCompletionSource class. It provides methods to transition the task it creates to different states. You call SetResult when the operation completes, SetException if the operation fails, and SetCanceled if you want to cancel the task.

Here’s my RunDelayed method:

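private static Task<T> RunDelayed<T>(int millisecondsDelay, Func<T> func)
{
    if (func == null)
        throw new ArgumentNullException("func");
    if (millisecondsDelay < 0)
        throw new ArgumentOutOfRangeException("millisecondsDelay");

    var taskCompletionSource = new TaskCompletionSource<T>();

    var timer = new Timer(self =>
    {
        // The single-argument Timer constructor passes the timer itself as state;
        // dispose it so the callback only fires once.
        ((Timer) self).Dispose();
        try
        {
            var result = func();
            taskCompletionSource.SetResult(result);
        }
        catch (Exception exception)
        {
            taskCompletionSource.SetException(exception);
        }
    });
    timer.Change(millisecondsDelay, millisecondsDelay);

    return taskCompletionSource.Task;
}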

I simply create a new TaskCompletionSource and a Timer where the callback calls SetResult with the result of the given Func<T>. If the Func<T> throws, we simply catch the exception and call SetException. Finally we start the timer and return the Task.

You would use it like this:

var task = RunDelayed(1000, () => "Hello World!");
task.ContinueWith(t =>
    // 'Hello World' is output a second later on a threadpool thread.
    Console.WriteLine(t.Result));

You can use the same technique to turn any asynchronous operation into a Task.

Note however if your operation exposes an APM API, it’s much easier to use the Task.Factory.FromAsync method.
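As an illustration of the FromAsync route, here is a minimal sketch that wraps the Stream.BeginRead/EndRead pair in a Task<int>. The class and method names are made up for the example; the FromAsync call itself goes through Task<int>.Factory, the generic counterpart of Task.Factory.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class FromAsyncExample
{
    // Wraps an APM pair (BeginRead/EndRead) in a Task<int> that completes
    // with the number of bytes read into the buffer.
    public static Task<int> ReadAsTask(Stream stream, byte[] buffer)
    {
        if (stream == null) throw new ArgumentNullException("stream");
        if (buffer == null) throw new ArgumentNullException("buffer");

        return Task<int>.Factory.FromAsync(
            stream.BeginRead,   // begin method
            stream.EndRead,     // end method, supplies the task's result
            buffer,             // the three BeginRead arguments...
            0,
            buffer.Length,
            null);              // ...and the APM state object
    }
}
```

You would use it much like RunDelayed: call ReadAsTask(stream, buffer) and attach a continuation with ContinueWith. There is no TaskCompletionSource to manage because FromAsync does that plumbing for you.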

Thursday, July 14, 2011

EasyNetQ: How Should a Messaging Client Handle Errors?

EasyNetQ is my simple .NET API for RabbitMQ.

I’ve started thinking about the best patterns for implementing error handling in EasyNetQ. One of the aims of EasyNetQ is to remove as many infrastructure concerns from the application developer as possible. This means that the API should correctly handle any exceptions that bubble up from the application layer.

One of the core requirements is that we shouldn’t lose messages when the application throws. The question then becomes: where should the message, that the application was consuming when it threw, go? There seem to be three choices:

  1. Put the failed message back on the queue it was consumed from.
  2. Put the failed message on an error queue.
  3. A combination of 1 and 2.

Option 1 has the benefit that it’s the out-of-the-box behaviour of AMQP. In the case of EasyNetQ, I would simply catch any exceptions, log them, and just send a noAck command back to RabbitMQ. Rabbit would put the message at the back of the queue and then resend it when it got to the front.

Another advantage of this technique is that it gives competing consumers the opportunity to process the message. If you have more than one consumer on a queue, Rabbit will send the messages to them in turn, so this is out-of-the-box.

The drawback of this method is that there’s the possibility of the queue filling up with failed messages. The consumer would just be cycling around throwing exceptions and any messages that it might be able to consume would be slowed down by having to wait their turn amongst a long queue of failed messages.

Another problem is that it’s difficult to manually inspect the messages and selectively delete or retry them.

Option 2 is harder to implement. When an error occurs I would wrap the failed message in a special error message wrapper. This can include details about the type and location of the exception and other information such as stack traces. I would then publish the error message to an error exchange. Each consumer queue should have a matching error exchange. This gives the opportunity to bind generic error queues to all error exchanges, but also to have special case error consumers for particular queues.

I would need to write an error queue consumer to store the messages in a database. I would then need to provide the user with some way to inspect the messages alongside the error that caused them to arrive in the error queue so that they could make an ignore/retry decision.

I could also implement some kind of wait-and-retry function on the error queue, but that would also add additional complexity.

It has the advantage that the original queue remains clear of failing messages. Failed messages and the error condition that caused the failure can be inspected together, and failed messages can be manually ignored or retried.

With the failed messages sitting in a database, it would also be simple to create a mechanism where those messages could be replayed on a developer machine to aid in debugging.

A combination of 1 and 2. I’m moving towards thinking that a combination of 1 & 2 might be the best strategy. When a message fails initially, we simply noAck it and it goes back to the queue. AMQP provides a Redelivered flag, so when the message is consumed a second time we can be aware that it’s a retry. Unfortunately there doesn’t seem to be a retry count in AMQP, so the best we can do is allow for a single retry. This has the benefit that it gives a competing consumer a chance to process the message.

No retry count is a problem. One option some people use is to roll their own ‘nack’ mechanism. In this case, when an error occurs in the consumer, rather than sending a ‘nack’ to Rabbit and relying on the built-in behaviour, the client ‘acks’ the message to remove it from the queue, and then re-publishes it via the default exchange back to the originating queue. Doing this gives the client access to the message and allows a ‘retry count’ header to be set.

After the single retry we fall back to Option 2. The message is passed to the error queue on the second failure.
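The combined strategy can be sketched without any RabbitMQ code at all. In this illustration the `requeue` and `publishError` delegates stand in for a nack-with-requeue and a publish to the error exchange, and `ErrorMessage` and `RetryOnceConsumer` are hypothetical names; it is a sketch of the decision logic, not of how EasyNetQ implements it.

```csharp
using System;

// Hypothetical wrapper for a failed message (Option 2); the property names are illustrative.
public class ErrorMessage
{
    public string OriginalBody { get; set; }
    public string ExceptionType { get; set; }
    public string ExceptionMessage { get; set; }
    public string StackTrace { get; set; }
}

public class RetryOnceConsumer
{
    private readonly Action<string> handler;            // the application's message handler
    private readonly Action<string> requeue;            // stands in for a nack with requeue
    private readonly Action<ErrorMessage> publishError; // stands in for publishing to the error exchange

    public RetryOnceConsumer(Action<string> handler, Action<string> requeue, Action<ErrorMessage> publishError)
    {
        if (handler == null) throw new ArgumentNullException("handler");
        if (requeue == null) throw new ArgumentNullException("requeue");
        if (publishError == null) throw new ArgumentNullException("publishError");
        this.handler = handler;
        this.requeue = requeue;
        this.publishError = publishError;
    }

    // 'redelivered' mirrors the AMQP Redelivered flag on the incoming delivery.
    public void Consume(string body, bool redelivered)
    {
        try
        {
            handler(body);
        }
        catch (Exception exception)
        {
            if (!redelivered)
            {
                // First failure: put it back and give it (or a competing consumer) one more go.
                requeue(body);
                return;
            }

            // Second failure: wrap the message with the error details and park it.
            publishError(new ErrorMessage
            {
                OriginalBody = body,
                ExceptionType = exception.GetType().FullName,
                ExceptionMessage = exception.Message,
                StackTrace = exception.StackTrace
            });
        }
    }
}
```

A roll-your-own retry count would replace the boolean with a counter read from a message header, at the cost of re-publishing the message yourself.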

I would be very interested in hearing how other people have implemented error handling with AMQP/RabbitMQ.

Updated based on feedback on the 15th July

Wednesday, July 13, 2011

MEF DirectoryCatalog Fails to Load Assemblies

I had an interesting problem with the Managed Extensibility Framework yesterday. I’m using the DirectoryCatalog to load assemblies from a given directory. Pretty standard stuff. When I tested my host on my developer machine, it got the ‘works on my machine’ badge, but when I ran the host on one of our servers, it ignored all the assemblies.

Nothing loaded …

Hmm …

It turns out, after much digging and help from my Twitter crew, that the assembly loader that MEF’s DirectoryCatalog uses ignores any files that have a URL Zone set. I described these zones in detail in my previous post here:


Because we copy our plugins from a file share, Windows was marking them as belonging to the Intranet Zone. Thus the odd only-when-deployed behaviour.

How you deal with this depends on whether you think that files marked in this way represent a security threat or not. If you do, the best policy is to detect any assemblies in your DirectoryCatalog directory that have a Zone set and log them. You can do that with the System.Security.Policy.Zone class:

var zone = Zone.CreateFromUrl("file:///C:/temp/ZoneTest.doc");
if (zone.SecurityZone != SecurityZone.MyComputer)
{
    Console.WriteLine("File is blocked");
}
Console.Out.WriteLine("zone.SecurityZone = {0}", zone.SecurityZone);

If you don’t consider files copied from elsewhere a security concern, but rather a feature of your operating procedure, then you can clear the Zone flags from all the assemblies in the directory with the help of Richard Deeming’s Trinet.Core.IO.Ntfs library. I wrote a little class using this:

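public class UrlZoneService
{
    public static void ClearUrlZonesInDirectory(string directoryPath)
    {
        foreach (var filePath in Directory.EnumerateFiles(directoryPath))
        {
            var fileInfo = new FileInfo(filePath);
            // The Zone flag lives in the "Zone.Identifier" alternate data stream.
            fileInfo.DeleteAlternateDataStream("Zone.Identifier");
        }
    }
}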

I just run this before initiating my DirectoryCatalog and now network copied assemblies load as expected.

Detecting and Changing a File’s Internet Zone in .NET: Alternate Data Streams

I spent most of yesterday investigating some weird behaviour in MEF, which I’ll discuss in another post. I was saved by Twitter in the guise of @Grumpydev, @jordanterrell and @SQLChap who came to the rescue and led me down a very interesting rabbit hole, to a world of URL Zones and Alternate Data Streams. Thanks chaps!

If you download a file from the internet on Windows 2003 or later, right click, and select properties, you’ll see something like this:


The file is ‘blocked’, which means that you will get various dialogues if you try to, say, run an executable with this flag set.

Any file on NTFS can have a ‘Zone’ as the flag is called. The values are described in this enumeration:

typedef enum tagURLZONE {

The Zone is not standard security information stored in the file’s ACL. Instead it uses a little known feature of NTFS, ‘Alternate Data Streams’ (ADS).

Sysinternals provide a command line utility streams.exe that you can use to inspect and remove ADSs, including the Zone flag, on a file or a whole directory tree of files.

You can access a file’s Zone in .NET by using the System.Security.Policy.Zone class. Like this:

var zone = Zone.CreateFromUrl("file:///C:/temp/ZoneTest.doc");
if (zone.SecurityZone != SecurityZone.MyComputer)
{
    Console.WriteLine("File is blocked");
}
Console.Out.WriteLine("zone.SecurityZone = {0}", zone.SecurityZone);

If you want to create, view and delete ADSs in .NET you will need to resort to P/Invoke; there is no support for them in the BCL. Luckily for us, Richard Deeming has done the work and created a set of classes that wrap the NTFS API. You can read about it here and get the code from GitHub here.

Using Richard’s library, you can list the ADSs for a file and their values like this:

var fileInfo = new FileInfo(path);

foreach (var alternateDataStream in fileInfo.ListAlternateDataStreams())
{
    Console.WriteLine("{0} - {1}", alternateDataStream.Name, alternateDataStream.Size);
}

// Read the "Zone.Identifier" stream, if it exists:
if (fileInfo.AlternateDataStreamExists("Zone.Identifier"))
{
    Console.WriteLine("Found zone identifier stream:");

    var s = fileInfo.GetAlternateDataStream("Zone.Identifier", FileMode.Open);
    using (TextReader reader = s.OpenText())
    {
        Console.WriteLine(reader.ReadToEnd());
    }
}
else
{
    Console.WriteLine("No zone identifier stream found.");
}

When I run this against a file downloaded from the internet I get this output:

Zone.Identifier - 26
Found zone identifier stream:
[ZoneTransfer]
ZoneId=3

You can see that the ZoneId = 3, so this file’s Zone is URLZONE_INTERNET.
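As a rough illustration of what that stream contains, here is a small Python sketch (not part of the .NET code in this post). It assumes the stream text is the usual two-line INI fragment, which happens to be exactly 26 bytes with Windows line endings, and maps the ZoneId onto the predefined URLZONE values. The names UrlZone and parse_zone_identifier are made up for the example.

```python
import configparser
from enum import IntEnum


class UrlZone(IntEnum):
    """The predefined URLZONE values, with Python-style names."""
    LOCAL_MACHINE = 0
    INTRANET = 1
    TRUSTED = 2
    INTERNET = 3
    UNTRUSTED = 4


def parse_zone_identifier(text: str) -> UrlZone:
    """Parse the INI-style text of a Zone.Identifier stream (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return UrlZone(int(parser["ZoneTransfer"]["ZoneId"]))


# Assumed stream content for a file downloaded from the internet.
stream_text = "[ZoneTransfer]\r\nZoneId=3\r\n"
zone = parse_zone_identifier(stream_text)
print(len(stream_text), zone.name)
```

The length printed lines up with the stream size in the listing above, and the zone comes out as the internet zone.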

You can delete an ADS like this:

var fileInfo = new FileInfo(path);
fileInfo.DeleteAlternateDataStream("Zone.Identifier");

And lastly you can set the ZoneId like this. Here I’m changing a file to have an Internet zone:

var fileInfo = new FileInfo(path);

var ads = new AlternateDataStreamInfo(path, "Zone.Identifier", null, false);
using (var stream = ads.OpenWrite())
using (var writer = new StreamWriter(stream))
{
    writer.WriteLine("[ZoneTransfer]");
    writer.WriteLine("ZoneId=3");
}

ADSs are very interesting, and open up a whole load of possibilities. Imagine storing application specific metadata in an ADS for example. I’d be very interested to hear if anyone has used them in this way.

Monday, July 11, 2011

RabbitMQ Subscriptions with the DotNet Client

RabbitMQ comes with a nice .NET client called, appropriately enough, ‘RabbitMQ DotNet Client’. It does a good job of implementing the AMQP protocol in .NET and comes with excellent documentation, which is good because there are some interesting subtleties in its usage. This is because AMQP is designed with flexibility in mind and supports a mind boggling array of possible messaging patterns. But as with any API, with flexibility comes complexity.

The aim of EasyNetQ, my simple messaging API for RabbitMQ on .NET, is to hide much of this complexity and provide a very simple to use interface. But in order to make it simple I have had to take away much of the flexibility of AMQP and instead provide a strongly opinionated view of one way of using RabbitMQ with .NET.

Today I’m going to discuss how Subscriptions work with the RabbitMQ DotNet Client (RDC) and some of the choices that I’ve made in EasyNetQ.

You create a subscription using the RDC with the AMQP command ‘basic consume’. You pass in the name of the queue you want to consume from.

channel.BasicConsume(ackNackQueue, noAck, consumer);

If you use the default QueueingBasicConsumer, the RabbitMQ server then takes messages from the queue you specified and sends them over the network to the RDC. The RDC has a dedicated worker thread that listens to a TCP socket and pulls the messages off as they arrive and places them on a shared thread-safe queue. The client application, in my case EasyNetQ, pulls messages off the shared queue on its own thread and processes them as required. Once it has processed the message it can acknowledge that it has completed by sending an AMQP ‘basic ack’ command. At that point the RabbitMQ server removes the message from its queue.
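The hand-off described here is a classic producer/consumer arrangement. The following Python sketch models it with the standard library only; it is not the RDC's implementation, and the function names and the list of acknowledged tags are invented for illustration.

```python
import queue
import threading

shared_queue = queue.Queue()   # thread-safe hand-off between the two threads
acked = []                     # delivery tags the application has acknowledged


def network_worker(deliveries):
    """Stand-in for the client library's socket-reading thread."""
    for tag, body in deliveries:
        shared_queue.put((tag, body))
    shared_queue.put(None)     # sentinel: no more deliveries


def application_loop():
    """Stand-in for the application thread that processes and acks."""
    while True:
        item = shared_queue.get()
        if item is None:
            break
        tag, body = item
        # ... process body here ...
        acked.append(tag)      # a real client would send 'basic ack' here


deliveries = [(1, b"first"), (2, b"second"), (3, b"third")]
worker = threading.Thread(target=network_worker, args=(deliveries,))
consumer = threading.Thread(target=application_loop)
worker.start()
consumer.start()
worker.join()
consumer.join()
print(acked)
```

Because a single consumer thread drains the queue in order, the tags are acknowledged in the order they were delivered.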


Now, what happens if messages are arriving faster than the user application can process them? The shared queue will gradually fill up with messages and eventually the process will run out of memory. That’s a bad thing. To fix this, you can limit the number of messages that RabbitMQ will send to the RDC before they are acknowledged with the Quality of Service prefetchCount setting.

channel.BasicQos(0, prefetchCount, false);

The default value for prefetchCount is zero, which means that there is no limit. If you set prefetchCount to any other positive value, that will be the maximum number of messages that the RDC’s queue will hold at any one time. Setting the prefetchCount to a reasonably high number will allow RabbitMQ to more efficiently stream messages across the network.

What happens if the shared queue is full of messages and my client application crashes? Won’t all the messages be lost? No, because messages are only removed from the RabbitMQ queue when the user application sends the basic ack message. The messages queued in the RDC’s shared queue are not acknowledged and so will not yet have been removed from the RabbitMQ queue.

However, if when you call ‘basic consume’ you pass in true for ‘noAck’ then the messages will be removed from the RabbitMQ queue as they are transmitted across the network. You would use this setting if you’re not worried about losing some messages, but need them to be transmitted as efficiently as possible.
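To make the interplay of prefetchCount, acknowledgements and noAck concrete, here is a toy Python model of one queue with one consumer. It is only a sketch of the behaviour described above, not RabbitMQ's API: the ToyBroker class and its methods are invented, and it assumes that unacknowledged messages return to the queue when the client goes away.

```python
from collections import deque


class ToyBroker:
    """A toy model of one queue with one consumer; not the real RabbitMQ API."""

    def __init__(self, messages, prefetch_count=0, no_ack=False):
        self.queue = deque(messages)   # messages still held by the broker
        self.unacked = []              # sent to the client, not yet acknowledged
        self.prefetch_count = prefetch_count
        self.no_ack = no_ack

    def deliver(self):
        """Send as many messages as the prefetch limit allows; return them."""
        # A limit of zero means 'no limit'; with no_ack there is nothing to count.
        limited = self.prefetch_count > 0 and not self.no_ack
        sent = []
        while self.queue:
            if limited and len(self.unacked) >= self.prefetch_count:
                break
            message = self.queue.popleft()
            if not self.no_ack:
                self.unacked.append(message)
            sent.append(message)
        return sent

    def ack(self, message):
        self.unacked.remove(message)

    def client_crashed(self):
        """Unacknowledged messages go back to the front of the queue."""
        self.queue.extendleft(reversed(self.unacked))
        self.unacked.clear()


reliable = ToyBroker(range(10), prefetch_count=3)
first_batch = reliable.deliver()     # stops at the prefetch limit
reliable.ack(0)
second_batch = reliable.deliver()    # one ack frees one slot
reliable.client_crashed()
survivors = list(reliable.queue)

lossy = ToyBroker(range(10), no_ack=True)
lossy.deliver()                      # everything is sent and forgotten
lossy.client_crashed()
lost_forever = 10 - len(lossy.queue)
print(first_batch, second_batch, survivors, lost_forever)
```

In the acknowledged case only message 0, which was processed and acked, has left the queue after the crash. In the noAck case all ten messages are gone.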

For EasyNetQ, I’ve made the default settings as follows: 1000 messages for the prefetchCount and noAck to be false. I’m assuming that most users will value reliability over performance. Eventually I hope to provide a dial with settings like ‘high throughput, low reliability’ and ‘low throughput, high reliability’, but for now I’m going for reliability.

I’d be very interested to hear from anyone who’s using RabbitMQ with .NET and how they have configured these settings.

Sunday, July 10, 2011

What is a Closure?

This question came up at the last Brighton ALT.NET Beers. It proved almost impossible to discuss in words without seeing some code, so here’s my attempt to explain closures in C#. Wikipedia says:

In computer science, a closure (also lexical closure, function closure or function value) is a function together with a referencing environment for the nonlocal names (free variables) of that function. Such a function is said to be "closed over" its free variables. The referencing environment binds the nonlocal names to the corresponding variables in scope at the time the closure is created, additionally extending their lifetime to at least as long as the lifetime of the closure itself.

So a closure is a function that ‘captures’ or ‘closes over’ variables that it references from the scope in which it was created. Yes, hard to picture, but actually much easier to understand when you see some code.

var x = 1;

Action action = () =>
{
    var y = 2;
    var result = x + y;
    Console.Out.WriteLine("result = {0}", result);
};

action();

Here we first define a variable ‘x’ with a value of 1. We then define an anonymous function delegate (a lambda expression) of type Action. Action takes no parameters and returns no result, but if you look at the definition of ‘action’, you can see that ‘x’ is used. It is ‘captured’ or ‘closed over’ and automatically added to action’s environment.

When we execute action it prints out the expected result. Note that the original ‘x’ can be out of scope by the time we execute action and it will still work.
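The same idea can be shown in Python, where the captured variable is easy to inspect. This is a sketch of the general concept rather than of what the C# compiler generates; make_action is a name invented for the example.

```python
def make_action():
    x = 1                       # local to make_action

    def action():
        y = 2
        return x + y            # x is a free variable captured from make_action

    return action


action = make_action()          # make_action has returned, so x is out of scope
result = action()
free_names = action.__code__.co_freevars
captured = action.__closure__[0].cell_contents
print(result, free_names, captured)
```

The function still works after its defining scope has gone, and the captured value lives in a cell attached to the function, which plays roughly the role of the compiler-generated class mentioned below.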

It’s interesting to look at ‘action’ in the debugger. We can see that the C# compiler has created a Target class for us and populated it with x:


Closures (along with higher order functions) are incredibly useful. If you’ve ever done any serious Javascript programming you’ll know that they can be used to replace much of the functionality of object oriented languages like C#. I wrote an example playing with this idea in C# a while back.

As usual, Jon Skeet covers closures in far more detail. Check this chapter from C# in Depth for more information, including the common pitfalls you can run into.

Tuesday, July 05, 2011

The First Rule of Threading: You Don’t Need Threads!

I’ve recently been introduced to a code base that illustrates a very common threading anti-pattern. Say you’ve got a batch of data that you need to process, but processing each item takes a significant amount of time. Doing each item sequentially means that the entire batch takes an unacceptably long time. A naive approach to solving this problem is to create a new thread to process each item. Something like this:

foreach (var item in batch)
{
    var itemToProcess = item;
    var thread = new Thread(_ => ProcessItem(itemToProcess));
    thread.Start();
}

The problem with this is that each thread takes significant resources to set up and maintain. If there are hundreds of items in the batch we could find ourselves short of memory.

It’s worth considering why ProcessItem takes so long. Most business applications don’t do processor-intensive work. If you’re not protein folding, the reason your process is taking a long time is usually because it’s waiting on IO – communicating with the database or web services somewhere, or reading and writing files. Remember, IO operations aren’t somewhat slower than processor bound ones, they are many many orders of magnitude slower. As Gustavo Duarte says in his excellent post What Your Computer Does While You Wait:

Reading from L1 cache is like grabbing a piece of paper from your desk (3 seconds), L2 cache is picking up a book from a nearby shelf (14 seconds), and main system memory is taking a 4-minute walk down the hall to buy a Twix bar. Keeping with the office analogy, waiting for a hard drive seek is like leaving the building to roam the earth for one year and three months.

You don’t need to keep a thread around while you’re waiting for an IO operation to complete. Windows will look after the IO operation for you, so long as you use the correct API. If you are writing these kinds of batch operations, you should always favour asynchronous IO over spawning threads. Most (but not all unfortunately) IO operations in the Base Class Library (BCL) have asynchronous versions based on the Asynchronous Programming Model (APM). So, for example:

string MyIoOperation(string arg);

Would have an equivalent pair of APM methods:

IAsyncResult BeginMyIoOperation(string arg, AsyncCallback callback, object state);
string EndMyIoOperation(IAsyncResult asyncResult);

You typically ignore the return value from BeginXXX and call the EndXXX inside a delegate you provide for the AsyncCallback:

BeginMyIoOperation("Hello World", asyncResult =>
{
    var result = EndMyIoOperation(asyncResult);
}, null);

Your main thread doesn’t block when you call BeginMyIoOperation, so you can run hundreds of them in short order. Eventually your IO operations will complete and the callback you defined will be run on a worker thread in the CLR’s thread pool. Profiling your application will show that only a handful of threads are used while your hundreds of IO operations happily run in parallel. Much nicer!
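For readers who want to see the effect without a .NET project to hand, here is a Python sketch of the same principle using asyncio instead of the APM. The operation my_io_operation is hypothetical, and asyncio.sleep merely stands in for waiting on a database or web service.

```python
import asyncio
import threading
import time


async def my_io_operation(arg: str) -> str:
    """Hypothetical IO-bound call; asyncio.sleep stands in for the wait."""
    await asyncio.sleep(0.1)
    return arg.upper()


async def process_batch(batch):
    # Start every operation without blocking, then collect the results in order.
    return await asyncio.gather(*(my_io_operation(item) for item in batch))


batch = [f"item {i}" for i in range(500)]
started = time.perf_counter()
results = asyncio.run(process_batch(batch))
elapsed = time.perf_counter() - started
print(len(results), f"{elapsed:.2f}s", threading.active_count(), "thread(s)")
```

Five hundred simulated waits of a tenth of a second each overlap rather than queue up, so the whole batch should finish in a fraction of the 50 seconds a sequential loop would need, with no thread created per item.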

Of course all this will become much easier with the async features of C# 5, but that’s no excuse not to do the right thing today with the APM.

Wednesday, June 29, 2011

I Don’t Have Time for Unit Tests

I’ve helped several organisations adopt Test Driven Development (TDD). The initial worry that almost everyone has is that it will hurt productivity. It seems intuitively correct because, of course, the developer has to write the unit tests as well as the production code. However, when you look at how developers spend their time, writing code is a small part of it. Check out this study by Peter Hallam from the Visual Studio team at Microsoft:


According to Peter, developers actually spend their days like this (while not reading Code Rant of course :):

  • Writing new code 5%
  • Modifying existing code 25%
  • Understanding Code 70%

If some technique allowed you to halve the modifying/understanding parts of the job by doubling the new-code part, you’d now be taking only ~60% of the time you previously took to deliver a feature, almost doubling your productivity.
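The arithmetic behind that ~60% figure, spelled out as a quick Python calculation using the percentages from the list above:

```python
before = {"new": 5, "modify": 25, "understand": 70}   # percent of time today
after = {
    "new": before["new"] * 2,                 # twice as much new code (tests)
    "modify": before["modify"] / 2,           # half the modifying
    "understand": before["understand"] / 2,   # half the understanding
}
total_after = sum(after.values())             # percent of the original time
speedup = 100 / total_after
print(total_after, round(speedup, 2))         # 57.5 1.74
```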

TDD is that technique. In my experience the productivity gains come from:

  • Safe modifications. If you break something when you modify some code, the unit tests fail.
  • A shorter iteration cycle between writing and running code. No more need to step through your application to the part where your new code gets exercised.
  • Shallow, obvious mistakes. No more need to step through code in the debugger, wondering which part of your application is broken.
  • Self-documenting code. The unit tests explicitly show how the author expected the code to be used.
  • Decoupled code. You can’t do TDD without decoupling. This alone makes the code easier to understand as a unit and much safer to modify.

Note that I’m just talking about feature-delivery productivity here. I haven’t mentioned the huge gains in stability and drop in bug-counts that you also get.

Now I don’t deny that TDD is a fundamentally different way of working that takes time to learn. Productivity may well drop while a developer is getting up to speed. But in my experience the argument that a developer doing TDD is slower than a developer who doesn’t do it is simply not true.

Monday, June 27, 2011

RabbitMQ, Subscription, and Bouncing Servers in EasyNetQ

If you are a regular reader of my blog, you’ll know that I’m currently working on a .NET friendly API for RabbitMQ, EasyNetQ. EasyNetQ is opinionated software. It takes away much of the complexity of AMQP and replaces it with a simple interface that relies on the .NET type system for routing messages.

One of the things that I want to remove from the ‘application space’ and push down into the API is all the plumbing for reporting and handling error conditions. One side of this is to provide infrastructure to record and handle exceptions thrown by applications that use EasyNetQ. I’ll be covering this in a future post. The other consideration, and the one I want to address in this post, is how EasyNetQ should gracefully handle network connection or server failure.

The Fallacies of Distributed Computing tell us that, no matter how reliable RabbitMQ and the Erlang platform might be, there will still be times when a RabbitMQ server will go away for whatever reason.

One of the challenges of programming against a messaging system as compared with a relational database, is the length of time that the application holds connections open. A typical database connection is opened, some operation is run over it – select, insert, update, etc – and then it’s closed. Messaging system subscriptions, however, require that the client, or subscriber, holds an open connection for the lifetime of the application.

If you simply program against the low level C# AMQP API provided by RabbitMQ to create a simple subscription, you’ll notice that after a RabbitMQ server bounce, the subscription no longer works. This is because the channel you opened to subscribe to the queue, and the consumption loop attached to it, are no longer valid. You need to detect the closed channel and then attempt to rebuild the subscription once the server is available again.

The excellent RabbitMQ in Action by Videla and Williams describes how to do this in chapter 6, ‘Writing code that survives failure’. Here’s their Python code example:


EasyNetQ needs to do something similar, but as a generic solution so that all subscribers automatically get re-subscribed after a server bounce.

Here’s how it works.

Firstly, all subscriptions are created in a closure:

public void Subscribe<T>(string subscriptionId, Action<T> onMessage)
{
    if (onMessage == null)
    {
        throw new ArgumentNullException("onMessage");
    }

    var typeName = serializeType(typeof(T));
    var subscriptionQueue = string.Format("{0}_{1}", subscriptionId, typeName);

    Action subscribeAction = () =>
    {
        var channel = connection.CreateModel();
        DeclarePublishExchange(channel, typeName);

        var queue = channel.QueueDeclare(
            subscriptionQueue,  // queue
            true,               // durable
            false,              // exclusive
            false,              // autoDelete
            null);              // arguments

        channel.QueueBind(queue, typeName, typeName);

        var consumer = consumerFactory.CreateConsumer(channel,
            (consumerTag, deliveryTag, redelivered, exchange, routingKey, properties, body) =>
            {
                var message = serializer.BytesToMessage<T>(body);
                onMessage(message);
            });

        channel.BasicConsume(
            subscriptionQueue,      // queue
            true,                   // noAck
            consumer.ConsumerTag,   // consumerTag
            consumer);              // consumer
    };

    connection.AddSubscriptionAction(subscribeAction);
}

The connection.AddSubscriptionAction(subscribeAction) line passes the closure to a PersistentConnection class that wraps an AMQP connection and provides all the disconnect detection and re-subscription code. Here’s AddSubscriptionAction:

public void AddSubscriptionAction(Action subscriptionAction)
{
    if (IsConnected) subscriptionAction();
    subscribeActions.Add(subscriptionAction);
}

If there’s an open connection, it runs the subscription straight away. It also stores the subscription closure in a List<Action>.

When the connection gets closed for whatever reason, the AMQP ConnectionShutdown event fires which runs the OnConnectionShutdown method:

void OnConnectionShutdown(IConnection _, ShutdownEventArgs reason)
{
    if (disposed) return;
    if (Disconnected != null) Disconnected();
    // ... short pause here, then:
    TryToConnect();
}

We wait for a little while, and then try to reconnect:

void TryToConnect()
{
    ThreadPool.QueueUserWorkItem(state =>
    {
        while (connection == null || !connection.IsOpen)
        {
            try
            {
                connection = connectionFactory.CreateConnection();
                connection.ConnectionShutdown += OnConnectionShutdown;
                if (Connected != null) Connected();
            }
            catch (RabbitMQ.Client.Exceptions.BrokerUnreachableException) { /* server still down: try again */ }
        }
        foreach (var subscribeAction in subscribeActions) subscribeAction();
    });
}

This spins up a thread that simply loops trying to connect back to the server. Once the connection is established, it runs all the stored subscribe closures (subscribeActions).
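Stripped of the AMQP details, the pattern is: keep every subscription as a closure, and replay the closures whenever a new connection is established. Here is a minimal Python sketch of that idea with a fake server. FlakyServer, the queue names and the retry limit are all invented for the example, and a real version would pause between attempts and run on a background thread.

```python
class FlakyServer:
    """Hypothetical stand-in for a broker that can be bounced."""

    def __init__(self):
        self.up = True
        self.subscriptions = []     # forgotten whenever the server bounces

    def bounce(self):
        self.up = False
        self.subscriptions = []


class PersistentConnection:
    """Keeps subscription closures so they can be replayed after a reconnect."""

    def __init__(self, server):
        self.server = server
        self.subscribe_actions = []
        self.is_connected = server.up

    def add_subscription_action(self, action):
        if self.is_connected:
            action()                        # subscribe straight away
        self.subscribe_actions.append(action)

    def on_connection_shutdown(self):
        self.is_connected = False

    def try_to_connect(self, max_attempts=5):
        for _ in range(max_attempts):
            if self.server.up:
                self.is_connected = True
                for action in self.subscribe_actions:
                    action()                # rebuild every subscription
                return True
        return False


server = FlakyServer()
connection = PersistentConnection(server)
for queue_name in ("orders", "invoices"):
    # The default argument pins the current queue_name inside each closure.
    connection.add_subscription_action(
        lambda name=queue_name: server.subscriptions.append(name))

server.bounce()                             # all server-side state is lost
connection.on_connection_shutdown()
server.up = True                            # the server comes back
reconnected = connection.try_to_connect()
print(reconnected, server.subscriptions)
```

The application code never learns that the bounce happened; it registered its subscriptions once and the connection wrapper restored them.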

In my tests, this solution has worked very nicely. My clients automatically re-subscribe to the same queues and continue to receive messages. One of the main motivations for writing this post, however, was to try and elicit feedback, so if you’ve used RabbitMQ with .NET, I’d love to hear about your experiences and especially any comments about my code or how you solved this problem.

The EasyNetQ code is up on GitHub. It’s still very early days and is in no way production ready. You have been warned.

Thursday, June 02, 2011

Some Thoughts on Windows 8

Microsoft is the rabbit caught in Apple’s headlights… and about to be run over by the Google juggernaut. Microsoft’s income comes from two major sources, Windows and Office. The need to maintain the stream of licence fees for these two products is at the very core of everything Microsoft does. Windows has three major groups of customers: consumer PCs, business PCs and business servers. Microsoft is the incumbent in all three markets, so it can’t grow any more by taking market share; its income can only increase at the rate of growth for these markets as a whole. The only way Microsoft can break out of this static lock is by creating new markets for its products. But it must be careful not to injure its existing Windows franchises in attempts to get market share elsewhere.

But it’s a tough world to be selling operating system licences. Unfortunately for Microsoft, the situation is far from static. Its core source of income is under threat. The iPad has created a new class of consumer product that is eroding Microsoft’s market for consumer PCs. Many people have no need for the full power of a desktop PC. For browsing, reading and writing email and watching YouTube, an iPad, or one of the many Android competitors, is perfectly adequate. Android in particular is becoming more and more able to do PC-like tasks with every release. For many people a PC is overcomplicated and unreliable.

Windows is also under threat in business server rooms. The main challenge here is from cloud based services. Why employ expensive people to look after servers running Exchange when you can simply sign up for Google Apps? For many small and medium businesses, cloud based services are a very attractive alternative to running in-house IT.

Windows is probably safest of all on the business desktop. There’s no real alternative for running productivity applications like Office. However, with many line-of-business applications becoming cloud based, there is a risk that a Chrome OS style browser-only desktop might look attractive to some businesses. Also when everyone’s got an iPad or an Android tablet at home, having a similar device at work will start to make more sense.

So is Windows 8 an answer to any of these challenges? It’s obviously designed to answer the first, the erosion of the consumer PC market by iPad and friends. Fundamentally it looks like a simple UI layer, derived from WP7, stuck on top of Windows 7. Microsoft’s strategy seems to be to offer a simple touch-UI for ‘consumer’ tasks, but which allows you to switch back to Windows 7 for running desktop applications like Office. John Gruber makes the point here, that simply skinning Windows 7 for tablets is probably a mistake.

Microsoft is not the leader in the tablet market, it’s playing catch-up from quite a long way behind. Will consumers be willing to pay the Windows tax when they can simply buy a more mature iPad or Android device? Microsoft can’t start offering it for free like Google does with Android because they would immediately kill one of their main sources of income. There is no way they can do anything other than ask people to pay more, for what will, at least initially, be an inferior device. It doesn’t strike me as a winning strategy.

The Windows 8 developer story is somewhat odd to say the least. No mention of WPF or Silverlight. Instead developers are being asked to build apps for the Windows 8 touch UI in Javascript and HTML. There are more Javascript developers out there than Silverlight ones, but the people who already care about Microsoft platforms have put considerable investment into WPF and Silverlight. The Windows 8 announcement is a real slap-down for them. Just take a look at the anger on Channel 9. It’s also a snub to Microsoft’s developer division who have put considerable effort into building .NET. Is this a hangover from the Vista debacle? Was WinFX such a failure that the Windows team now want nothing to do with .NET? If so, what’s the point of the developer division?

It’s a confusing time for us developers. I think Microsoft still have a very strong position on the business desktop. If you are building line-of-business applications for a living with .NET, there’s probably still some mileage in that. But the feeling is very much that .NET is now middle-aged, like many of its developers. No matter how nice the technology is, and it is very nice, it’s part of a platform that’s perceived to be part of the past, not the future.

But if you are building for the consumer market, or the cloud, then Microsoft is merely a niche player. It’s been obvious for some time that everyone should have Javascript as a core skill. The Windows 8 announcement reinforces that. It’s also clear that UNIX/Linux operating systems (including iOS and Android) are now ubiquitous in the same way that TCP/IP is. You’d probably want to make sure you know how to find your way around a UNIX system. For building server side applications for the cloud the field is wide open. Ruby, Python & Javascript are all well established and for some exotic applications, things like Erlang, Scala and Haskell are very interesting. Personally I’m hoping, optimistically I think, that Mono and Xamarin are a success. The excellent .NET platform deserves more than to be chained to Windows’ declining market share.

Monday, May 30, 2011

Dependency Injection Haskell Style

Today I was thinking about dependency injection and Haskell. If we think about how an IoC container in a language like C# works, there are several pieces:

  1. Services are described by types (usually interfaces).
  2. Components (classes) describe their dependencies with the types of their constructor arguments.
  3. The components in turn describe the services they provide by implementing service interfaces.
  4. Service types are registered against implementation types using some API provided by the IoC container.
  5. A clever piece of framework (the IoC container) that, when asked for a service (described by an interface), creates the entire dependency chain and then passes it to the caller.

The important point is that we don’t have to manually wire up the dependency chain. We simply build our software in terms of service contracts and implementations, register them with the IoC container which then magically does the wiring up for us.
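The five pieces above fit in a few dozen lines if you shrink them to a toy. The Python sketch below is not how any of the real .NET containers are written; it just shows registration plus reflection over constructor parameter types. All the class names are invented for the example.

```python
import inspect


class Container:
    """A toy IoC container: maps service types to implementation classes."""

    def __init__(self):
        self._registrations = {}

    def register(self, service, implementation):
        self._registrations[service] = implementation

    def resolve(self, service):
        implementation = self._registrations[service]
        # Reflection step: read the constructor's annotated parameter types
        # and resolve each one recursively.
        parameters = inspect.signature(implementation).parameters.values()
        dependencies = [self.resolve(p.annotation) for p in parameters]
        return implementation(*dependencies)


# Hypothetical service contracts.
class ReportGetter:
    def get(self, report_id):
        raise NotImplementedError


class ReportSender:
    def send(self, report):
        raise NotImplementedError


class ReportProcessor:
    def process(self, report_id):
        raise NotImplementedError


# Hypothetical implementations.
class FakeGetter(ReportGetter):
    def get(self, report_id):
        return f"Report {report_id}"


class ListSender(ReportSender):
    def __init__(self):
        self.sent = []

    def send(self, report):
        self.sent.append(report)


class DefaultProcessor(ReportProcessor):
    # Dependencies are declared purely by constructor argument types.
    def __init__(self, getter: ReportGetter, sender: ReportSender):
        self.getter, self.sender = getter, sender

    def process(self, report_id):
        self.sender.send(self.getter.get(report_id))


container = Container()
container.register(ReportGetter, FakeGetter)
container.register(ReportSender, ListSender)
container.register(ReportProcessor, DefaultProcessor)

processor = container.resolve(ReportProcessor)
processor.process(23)
print(processor.sender.sent)
```

Note that a missing registration only shows up as a KeyError when resolve is called, which is the run-time failure mode that comes up again at the end of this post.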

If you dig into the source code for any of the .NET IoC containers (such as Windsor, StructureMap, Autofac, Ninject etc) you’ll see they do an awful lot of reflection to work out the dependency graph. I remember reading (or hearing someone say) that reflection is often used to make up for inadequacies in C#’s type system, so I started experimenting to see if Haskell could provide the decoupling of registration and dependency graph building without the need for an IoC container. And indeed it can.

First let’s define some services:

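-- needed for the Resolvable instances on type synonyms further down
{-# LANGUAGE TypeSynonymInstances, FlexibleInstances #-}

-- assumed definition of Report, chosen to match the output shown at the end
data Report = Report Int String deriving (Show)

-- this could talk to a database
type GetReport = Int -> IO Report
-- this could talk to an email API
type SendReport = Report -> IO ()
-- this takes a report id and does something with it
type ProcessReport = Int -> IO ()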

Now let’s define some implementations:

-- getReport simply creates a new report with the given id
getReport :: GetReport
getReport id =
    return $ Report id "Hello"

-- sendReport simply prints the report
sendReport :: SendReport
sendReport report = putStr $ show report

-- processReport uses a GetReport and a SendReport to process a report
processReport :: GetReport -> SendReport -> ProcessReport
processReport get send id = do
    r <- get id
    send r

Partial function application is equivalent to dependency injection in OO. Here our processReport’s dependencies are given as the first two arguments of the processReport function.
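If partial application is unfamiliar, the same move can be sketched in Python with functools.partial. This is only an analogy to the Haskell above: the tuple standing in for a report and the outbox list are assumptions made for the example.

```python
from functools import partial


def get_report(report_id):
    """Stand-in for a database lookup."""
    return ("Report", report_id, "Hello")


def make_send_report(outbox):
    """Returns a sender that records reports instead of emailing them."""
    return outbox.append


def process_report(get, send, report_id):
    """The dependencies come first, the 'real' argument last."""
    send(get(report_id))


outbox = []
# Supplying the first two arguments is the 'injection' step.
process = partial(process_report, get_report, make_send_report(outbox))
process(23)
print(outbox)
```

The caller of process only sees a function from a report id to an effect; which getter and sender it uses was fixed when the partial was built.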

Now let’s define a type class with a ‘resolve’ member. The resolve member isn’t a function as such, it’s just a label for whatever ‘a’ happens to be when we define instances of the type class:

class Resolvable a where
    resolve :: a

Now let’s make each of our services an instance of ‘Resolvable’, and ‘register’ the implementation for each service:

instance Resolvable GetReport where
    resolve = getReport

instance Resolvable SendReport where
    resolve = sendReport

instance Resolvable ProcessReport where
    resolve = processReport resolve resolve

Note that we partially apply processReport with two resolve calls that will provide implementations of the service types.

The whole caboodle compiles and we can use resolve to grab a ProcessReport implementation with its dependencies provided:

> let p = resolve :: ProcessReport
> p 23
Report 23 "Hello"

All the functionality of an IoC container without an IoC container. Wonderful.

So, to sum up, we’ve simply registered implementations against services and let the Haskell type system build the dependency graph for us. The added benefit we get here over reflection based IoC containers in C# and Java, is that this is all type checked at compile time. No need to run it to find missing dependencies.

Please bear in mind that I’m a total Haskell novice and this is probably something that real Haskell programmers would never do. But it’s an interesting illustration of the power of the Haskell type system, and how a lot of what we do with reflection in C# is simply to augment the limitations of the language.