Monday, September 24, 2007

The joys of System.Diagnostics.StackFrame

I see code like this an awful lot where a logging function or whatever is passed a string literal of the current method name:
public void TheOldMethodUnderTest()
{
    Log("TheOldMethodUnderTest");
}

private void Log(string methodName)
{
    Console.WriteLine("The name of the method was: {0}", methodName);
}
I've also seen a lot of code like this where the actual method name and the string literal have become out of sync, either because of refactoring or, more commonly, the cut-and-paste method of code reuse. Fortunately there's a really easy way of getting at the call stack with the System.Diagnostics.StackFrame class. The code below does exactly the same thing as the code above, but the Log() function works out for itself what its caller's name is:
public void TheMethodUnderTest()
{
    Log();
}

private void Log()
{
    System.Diagnostics.StackFrame stackFrame = new System.Diagnostics.StackFrame(1, false);
    System.Reflection.MethodBase method = stackFrame.GetMethod();
    Console.WriteLine("The name of the method was: {0}", method.Name);
}
Doing this kind of thing is a legitimate requirement in many applications, but if you find yourself having to put a lot of boilerplate code in every method because of cross-cutting concerns such as logging, it's worth looking at some kind of AOP framework that provides dynamic proxies that you can intercept.
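One caveat worth knowing about the StackFrame approach: in optimised builds the JIT is free to inline small methods, and if Log() itself gets inlined, StackFrame(1) will report the caller's caller instead of the caller. Decorating Log() with MethodImplOptions.NoInlining guards against that (though the calling method could in principle be inlined too). A sketch:

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public class Logger
{
    // NoInlining stops the JIT folding Log into its caller in optimised
    // builds; if Log were inlined, StackFrame(1) would skip a level.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static string Log()
    {
        StackFrame stackFrame = new StackFrame(1, false);
        string name = stackFrame.GetMethod().Name;
        Console.WriteLine("The name of the method was: {0}", name);
        return name;
    }
}

public class Example
{
    public static string TheMethodUnderTest()
    {
        return Logger.Log(); // prints "TheMethodUnderTest"
    }
}
```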

Thursday, September 13, 2007

MIX 07 UK

ScottGu

I had a great time in London over the last couple of days at MIX 07 UK. Most of the excitement was about Silverlight, but there was also a lot of coverage of VS2008. I've been reading a lot about Silverlight recently, but this was the first time I'd seen a good demonstration of it in use. It's obviously Microsoft's Flash killer, and it fills a big gap in RIA application development where you'd currently have to use Flash. For .net developers like me it opens up some great opportunities, so it can only be good. I guess its success will primarily hinge on how well Microsoft drives adoption; Scott Guthrie explained that they only have to persuade about 15 of the top sites to use it and it will become ubiquitous. One thing that really struck me, that I hadn't properly grokked before, was how core XAML is to the current MS development stack. I really liked the way you could use the Expression * tools to move XAML projects between designers and developers. Those tools (especially Blend) are yet another thing I need to play with.

I went to the second of the VS2008 sessions, but it was pretty much covering stuff I already knew, although it was nice to see all the LINQ to SQL stuff presented. One thing that was new was the Rails-style scaffolding stuff. I don't remember what it was called, but MS have blatantly got their eyes on the Ruby camp. All for the best, I think.

I was also really looking forward to the 'Why IronRuby' and 'IronPython et al' sessions. Dynamic languages are generating some real excitement these days and I was very keen to get some insight into how Microsoft sees them playing with .net. 'Why IronRuby' by Dave Verwer was a bit of a disappointment, not because Dave didn't do an excellent job introducing Ruby, but because I was rather hoping for a more philosophical discussion about dynamic languages in the .net world and how well they play with the existing MS technology stack. I had to leave before the questions, so maybe there were some interesting discussions? I hoped to grab Dave for a chat later on, but I couldn't track him down. Michael Foord's (aka Fuzzyman) IronPython session was much more what I wanted. Michael spent a lot of time digging into the intricacies of building dynamic languages on top of the CLR, the DLR, and some of the cool stuff you can do with IronPython and Silverlight. I'm kinda seeing what the Python and Ruby people are saying about the benefits of dynamism, but I'm still not quite ready to really embrace it yet. The opportunity will come when the IronRuby implementation has been fully developed in a year or so, but I must say, the DSL story from the Ruby camp is very compelling.

My favorite session was 'A nice cup of tea and a sit down', a panel discussion with Scott Guthrie. As you probably know, Scott is the general manager of the .net development group, and his blog is always my first stop to keep up with what's happening at MS. It's not every day that you get to fire some questions at one of the core .net guys, so I really enjoyed myself.

I asked Scott about TDD at Microsoft and how they saw TDD fitting in with their products. I was especially keen to hear what he thought about the problems with mocking the BCL.

He told us about a new MVC framework that's in the pipeline for ASP.NET that sounded very intriguing and how they were consciously trying to keep it interoperable with open source tools. I was especially gratified to hear him mention Windsor and RhinoMocks.

It's a shame there wasn't more time; I really wanted to ask him about functional programming and LINQ, and how explicitly MS is going to encourage more declarative programming styles in the BCL and documentation. Right now the message seems to be that LINQ is about querying, and the functional, declarative stuff is only mentioned if you're prepared to dig deep.

Saturday, September 01, 2007

Microsoft and Innovation

I love reading Scott Hanselman's blog. I missed a great post back in May, 'Is Microsoft losing the Alpha Geeks?' In it he talks about his recent visit to RailsConf where he was very impressed by the energy of the Ruby-on-Rails (RoR) community. Unless you've been avoiding the blogosphere for the last couple of years, you'll know all about the incredible buzz around RoR. It's certainly taking a lot of the most respected people in the development community along with it, not least of all Martin Fowler and his company Thoughtworks. So Scott asks if Microsoft has lost its way; is it 'losing the Alpha Geeks?'

So what's an Alpha Geek? Let's assume for the sake of argument that we're talking about the guys who actually move the industry forward: not just the people who have a really deep understanding of the technology, but the people who invent it. I don't think Microsoft are losing them, because I don't think they ever had them in the first place.

Let's step back and think about innovation for a while. How does technology, or even in the broader sense, culture, change and advance? What kind of conditions promote innovation and which stifle it? The great thinker in this field was the economist Joseph Schumpeter, who coined the phrase 'creative destruction' to describe the driving force of innovation in a capitalist economy. Essentially the rate of innovation is proportional to the number of people able to innovate: the number of minds allowed to tackle a particular question, and the ability of individuals or groups to attempt to profit from their ideas without a severe cost of failure. Capitalism thrives on huge numbers of small companies and entrepreneurs constantly trying and failing; a Darwinian survival of the fittest where the vast majority do not succeed.
Indeed it goes beyond capitalism. Jared Diamond's seminal work 'Guns, Germs and Steel' brilliantly shows how human development progressed in direct proportion to the size of human communities and the speed at which technological developments could be propagated. When you get bored of 'learn WPF in 21 days' it's a must-read! The software industry is a pure manifestation of Schumpeterian economics. The barriers to entry are almost non-existent; anyone with an idea can have a go. It doesn't really make a lot of difference if you're a highly paid Microsoft employee or a Gujarati teenager with a computer, the quality of your ideas is the only thing that matters. In fact there are two good reasons why the Gujarati teenager has an advantage. Firstly, he doesn't have to persuade his line manager that his idea is a good one before he's allowed to explore it; he can have a go at any hare-brained wackiness he likes. His idea doesn't have to play nicely with any particular corporate strategy or favorite technology. Secondly, and conversely, he has to persuade the whole world that his idea is a good one. The Microsoft guy, once he's persuaded the company, will have the whole weight of Microsoft's marketing behind him, and millions of Microsoft shops around the world will use his technology even if it's sub-optimal. But because the Gujarati teenager has none of this support, his idea will only thrive if it's a really, really good one. So Microsoft's thousands of highly paid geeks versus who knows how many millions of the rest of the world's programmers? It's no competition; the cutting edge will always be outside of Microsoft, or any other corporation for that matter. I think Microsoft collectively knows this, and the company has become a lot more responsive to good ideas coming out of the worldwide software community. The new openness of public betas and shared source initiatives is evidence of this.
Microsoft has also become much better at picking the best talent from the community; just take the example of John Lam. So what does this mean for the average Microsoft Mort like you and me? Well, if you're interested in the state of the art and you want to see what the software world might look like in the next five or ten years, then you've really got to raise your eyes from the MSDN front page. You'll almost never see anything truly innovative there. And even when you do see something new there, there's no guarantee at all that it's a good idea; it hasn't been filtered by the community yet. When Microsoft has adopted something, it's not the cutting edge any more, it's the mainstream. Take an interest in what's happening in the rest of the software world and occasionally try out something that's nothing to do with Redmond. Ruby on Rails obviously has something good going for it or there wouldn't be a whole lot of very clever people banging on about it all the time. Now this doesn't mean that you or I should suddenly abandon Visual Studio and everything we know for a Linux box, Emacs and a command line, but it does mean that it's worth keeping an eye on similar things that might be evolving in the .net ecosystem, like Subsonic or Monorail. And don't assume that ASP.NET is the only, or the best, web development framework. I hope this doesn't sound like I'm Microsoft bashing; indeed my whole career (and it's been a relatively prosperous and happy one so far) is built around Microsoft. I'm very happy with the Borg's new-found openness, and these days there's such a flourishing open source community around the Microsoft tools, especially .net, that you don't have to wait for corporate adoption before being able to integrate the best modern development techniques into your work.

Thursday, August 30, 2007

Testing Will Challenge Your Conventions

I've just stumbled upon a fantastic blog post Testing Will Challenge Your Conventions, by Tim Ottinger. In it he lists a number of ways that doing TDD fundamentally changes the way you code:
  1. Interfaces suddenly seem like a good idea.
  2. You stop using singletons and static methods.
  3. Private makes less sense than it used to.
  4. You pass dependencies in the constructor (the so called 'fat constructor')
  5. Smaller methods are the norm.
  6. You read tests before the code; it better explains the intention.
  7. Usability trumps cleverness.
  8. Premature performance optimisation is bad (YAGNI)...
  9. ... but test performance is crucial.
  10. Dependency is bad; you avoid 'God classes' and 'global hubs'.
  11. 'Clever' is dead. 'Clever' is hard to refactor. 'Clever' is hard to isolate, hard to internalize, hard to phrase in tests.
  12. Your IDE is only good if it allows you to do quick write/build/test cycles.
As he says, all this stuff is good practice, but the genius of TDD is that it drives you to do it. It gives you your first excuse to write component-oriented, highly cohesive, loosely coupled code. Without it, these good practices just seem like academic hot air, right up to the point where you discover your application is spaghetti. As I blogged recently, I've had an epiphany where, in my mind, two distinct concerns, TDD and component-oriented architectures, have merged. Although I've been interested in component architectures for a while, they've always seemed to be quite heavyweight beasts, and to be honest, I've never used them for application development. In the case of the MS IComponent framework that was the right call; the benefits would have been outweighed by the ugliness of the framework. But with a lightweight, non-intrusive component framework like Castle Windsor it makes perfect sense. What's interesting, though, is that it's just another example of how I've been led to better programming techniques simply by doing TDD. What started as an easier way of writing test harnesses has had more impact on my development as a programmer than any other practice I've adopted.

Tuesday, August 07, 2007

Masters, Journeymen, and Apprentices

I've really enjoyed reading this series of blog posts by Fred George, Masters, Journeymen, and Apprentices. In it, he says what everyone knows, but few teams seem to accept: all programmers are not created equal. He divides programmers into Masters, Journeymen and Apprentices. I must say that, reading his descriptions, even the lowest level on his scale, the Apprentice, is more skilled at OO development than most of the developers I encounter. He's another ThoughtWorker, a company that consistently seems to attract the best programmers, so it's not really surprising that there's such a high bar. The truth is that your average Microsoft programmer in your average company's development shop knows almost no OO at all. I don't consider myself a great programmer, I'm probably at his Journeyman level, but I'm consistently the only guy doing OO programming (as Fred George describes it) at the majority of my clients. I think it's really down to the fact that it's very hard to measure the ability of a programmer at programming, and so it's ignored by management. There tends to be no concept of growing programming skills by mentoring or training. This is especially true of OO skills. I would say that being skilled at OO provides a step change in any programmer's productivity, but this is rarely measured or encouraged. In fact OO programming practices are relatively uncommon in most Microsoft shops and are often dismissed as an academic waste of time. I was told by one 'architect' that we were doing OO programming because we were using C#, regardless of the fact that most of his procedures were hundreds of lines long and had a cyclomatic complexity of about a million :) Sure, you can read lots of books on technologies and do lots of Microsoft exams, but that's about APIs and technology; the actual art itself is often totally neglected. And it is an art. With no way to measure ability, most management simply measure seniority by length of service.
What I often find is that the quality of the whole team is governed by the quality of the senior developer(s), the guy who does the recruitment interviews. If he's good then he tends to be able to recognise good people and recruit them; if he's not so good then the team too tends to be poor. But even if he's good it's a struggle: you have to be prepared to turn away a lot of people with all the right qualifications, and pay good money to get the few people out there who really know OO. That's a hard sell to management, who far too often see developers as plug-and-play components. It's quite depressing really. I constantly hope that I'm going to get to work with a Master programmer, because I know that's the only way I have any hope of ever attaining that level myself, but apart from the odd great team (only two I can actually think of in my career), it's a vain hope.

Monday, August 06, 2007

The Castle Project's Windsor Container and why I might need it.

I've recently had a bit of a eureka moment after reading about the Castle project. I've been hearing about Castle for a while now, but it's taken until now for me to grok what it's all about and why I need it. A quick look at the home page left me a little bemused. There's a thing called "MicroKernel" that's described as "A lightweight inversion of control container core". Now I know all about inversion of control, it's an essential ingredient in building scalable, testable applications, but what is an 'inversion of control container'? And why would I need one? Below "MicroKernel" is "Windsor Container". Nice, "Windsor Castle", get it? It's obviously aimed at the UK programming community. Windsor is described as "Augments the MicroKernel with features demanded by most enterprise projects", but I still don't see why I would need one. After seeing the Castle project mentioned for about the fifth time by bloggers who have far more brains and sense than me, I decided to knuckle down and RTFM by reading Introducing Castle by Hamilton Verissimo, who's one of the main developers. Now let me take you on a little coding journey to show you how Castle, and more specifically Windsor, solves a problem that's become more and more apparent to me, mainly because I've adopted test driven development (TDD). I'm going to use a similar example to Hamilton's, but try and drive out the need for Windsor from a test driven perspective. OK, for my little example, say I've got a reporting class, 'SimpleReporter', that creates a report and then emails it. The code that uses it might look something like this:
SimpleReporter reporter = new SimpleReporter();
reporter.SendReport();
And the SendReport method might look like this:
public void SendReport()
{
  ReportBuilder builder = new ReportBuilder();
  Email email = builder.CreateReport();
  EmailSender sender = new EmailSender();
  sender.Send(email);
}
It uses a class called ReportBuilder to create an email report and then another class called EmailSender to send it. This is the kind of thing I used to write before I got into TDD. It works, but in order to test it I've got to start up my application and use the UI to create and send a report. I then have to check my emails to see if the report has arrived. I'm not testing just my SimpleReporter class; I'm testing the SimpleReporter, the ReportBuilder and anything it relies on, the EmailSender, my SMTP server, my email client, my application's UI and probably a relational database and my data access layer. In short, I'm testing a large chunk of my application and infrastructure. If something goes wrong I'm into a serious debugging session trying to work out why my email hasn't arrived. I can't automate this test easily, so I'm not going to do it very often, and if a bug is introduced by a change to some other part of the system I'm not going to notice it immediately. Now I would use inversion of control and dependency injection with a mock object framework like NMock to test just my Reporter and nothing else. I blogged about NMock a while back. Here's the new Reporter class; note how I pass a report builder and email sender in the constructor, and how I'm only referencing interfaces, not concrete classes.
public class Reporter
{
  IReportBuilder _reportBuilder;
  IEmailSender _emailSender;

  public Reporter(IReportBuilder reportBuilder, IEmailSender emailSender)
  {
      _reportBuilder = reportBuilder;
      _emailSender = emailSender;
  }

  public void SendReport()
  {
      Email email = _reportBuilder.CreateReport();
      _emailSender.Send(email);
  }
}
And here's its NUnit test.
[Test]
public void ReporterTest()
{
  string theText = "Hello, I'm the report!";

  Mockery mocks = new Mockery();
  IReportBuilder reportBuilder = mocks.NewMock<IReportBuilder>();
  IEmailSender emailSender = mocks.NewMock<IEmailSender>();

  Email email = new Email(theText);

  Expect.Once.On(reportBuilder).Method("CreateReport").Will(Return.Value(email));
  Expect.Once.On(emailSender).Method("Send").With(email);

  Reporter reporter = new Reporter(reportBuilder, emailSender);
  reporter.SendReport();

  mocks.VerifyAllExpectationsHaveBeenMet();
}
NMock provides me with mock object instances to pass to the Reporter class' constructor. I don't even need to have written concrete implementations of ReportBuilder or EmailSender at this stage. I can test that the Reporter class does the correct thing with reportBuilder and emailSender, and I'm not testing anything else about the application at this stage. Most importantly, this is an automated test that can be run in an instant along with every other automated test for my application. If I make a change that breaks Reporter I'll find out instantly. Discovering TDD has made me a much better and more productive developer. Not only are my apps more robust, they are also better architected, because testing forces you to do good software design. However, I've noticed a rather unfortunate side effect of doing things this way: in the finished product we'll have to provide concrete instances of EmailSender and ReportBuilder.
ReportBuilder reportBuilder = new ReportBuilder();
EmailSender emailSender = new EmailSender();
Reporter reporter = new Reporter(reportBuilder, emailSender);
Now that doesn't look too bad, but this is a really, really simple example. In a real application there could be hundreds of classes that need to be knitted together like this, and it becomes a complex piece of code, not only to maintain, but to decide where in your app to put it, because the class(es) that do the knitting have to know about everything. An alternative that I experimented with was to have a default constructor for each class that provided the concrete instances alongside the one that injected the dependencies, so our Reporter class would now look like this:
public class Reporter
{
  IReportBuilder _reportBuilder;
  IEmailSender _emailSender;

  public Reporter(IReportBuilder reportBuilder, IEmailSender emailSender)
  {
      _reportBuilder = reportBuilder;
      _emailSender = emailSender;
  }

  public Reporter()
  {
      _reportBuilder = new ReportBuilder();
      _emailSender = new EmailSender();
  }

  public void SendReport()
  {
      Email email = _reportBuilder.CreateReport();
      _emailSender.Send(email);
  }
}
But this feels really dirty. I've now got a direct dependency from the client back to the server, thus breaking the Inversion of Control principle. I can't easily use a different kind of report builder or email sender without recompiling the Reporter. Another possibility would be to use the provider pattern that's a core part of .NET, but it's not really intended for this scenario, and to use it for every service class in your application would be total overkill. Of course the .NET framework also provides a component architecture, but it's also quite heavy for what we want here, since it requires that each component implement IComponent and provide the plumbing to publish its service. What I need is something that will magically provide concrete instances of the interfaces that each class requires, without me having to maintain a vast wiring exercise somewhere in the startup code of my application. Enter Windsor! Here's how you could use it with our Reporter example:
WindsorContainer container = new WindsorContainer();
container.AddComponent("reportBuilder", typeof(IReportBuilder), typeof(ReportBuilder));
container.AddComponent("emailSender", typeof(IEmailSender), typeof(EmailSender));
container.AddComponent("reporter", typeof(Reporter));

Reporter reporter = container.Resolve<Reporter>();
reporter.SendReport();
You create a new container instance and register each interface in your application along with the concrete class that you wish to implement it. You can get a new instance of a class by using the Resolve method. Here we get an instance of Reporter and call SendReport(). Nowhere in the code did we have to explicitly pass instances of ReportBuilder or EmailSender to Reporter, or even know that Reporter required those instances. Windsor also allows you to describe your components in a configuration file rather than hard coding all those AddComponent statements. You can see the benefits. Say I wanted to write another class that needed to use IEmailSender. I would just write that class with an IEmailSender as one of its constructor parameters and add it to the container. The container would then automatically provide an IEmailSender instance. In fact, by default it would provide the same IEmailSender instance that it supplies to any other class that requires this interface, although that behavior can be modified. Now, say I want to use ExchangeEmailSender rather than my existing EmailSender throughout my application: I only have to change one configuration entry (or line of setup code) to do it, rather than searching through my application looking for every place where an IEmailSender is passed as a constructor parameter. If you want to pass ExchangeEmailSender to some components and EmailSender to others, then that can be configured too.

There's much more to Windsor than just the inversion of control container; it also provides Aspect Oriented Programming features, automatically providing interface proxies so that you can intercept method calls and provide cross-cutting services. This opens up all kinds of exciting possibilities, the obvious ones being logging, auditing, transactions and security, and I hope to blog about it some more in the future.
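For the configuration-file route, the registration above might look something like this sketch (check the Castle documentation for the exact schema; 'MyApp' is a placeholder namespace and assembly name):

```xml
<configuration>
  <components>
    <component id="reportBuilder"
               service="MyApp.IReportBuilder, MyApp"
               type="MyApp.ReportBuilder, MyApp" />
    <component id="emailSender"
               service="MyApp.IEmailSender, MyApp"
               type="MyApp.EmailSender, MyApp" />
    <component id="reporter"
               type="MyApp.Reporter, MyApp" />
  </components>
</configuration>
```

Swapping EmailSender for ExchangeEmailSender is then a one-line change to the emailSender entry, with no recompilation.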
I haven't used any of this in anger yet so I can't say what the drawbacks might be, but I can imagine that maintaining the component config file for a large project could get a bit tiresome. The advantages are obvious. It really is lightweight component based development where you provide service components that can be consumed by other components without ever having to have any direct dependencies. I'd really like to see it used in a major application. It also begs the question: shouldn't this stuff be provided by your programming language? The CLR has the potential to do a lot of this quite easily. It already knows about all the types that an application requires, it knows about interface implementation, it can intercept method calls. It's easy to imagine a future version of C# where component based programming is provided out of the box.

Thursday, July 26, 2007

Jeremy Miller's build your own CAB series

I've been really enjoying Jeremy Miller's build your own CAB series of blog posts. He gives a great introduction to good fat client UI programming techniques. You hear plenty about MVC and MVP, but it's rare to get such a detailed introduction with lots of code.

Thursday, July 19, 2007

How to structure Visual Studio solutions

Because I work as a freelancer, I get to see a lot of different .NET development shops. One of the things that continually surprises and frustrates me is how poorly many teams organise their solutions. By this I mean the way they split their application into projects (and thus assemblies), the way they group those projects into solutions, naming conventions and the way they source control those solutions.
 
Microsoft’s Patterns and Practices group has specific guidance about how you should organise your solutions here, and you can’t go too badly wrong by following their advice. Unfortunately some of the follow-on articles about build practices are way out of date and really should be updated, but that’s for another post; right now I want to give a brief list of dos and don’ts on organising solutions.
 
The most important thing to get right is to understand your team and the systems it builds and maintains: what source is in your control and what is outside it. It’s very important that you don’t build artificial silos within your team that make sharing and reusing code difficult. Ideally, all the source your team writes should be in a single solution file, with all the projects referenced as project references. Not doing this is the single biggest source of solution problems I’ve seen. Never use file references for internal assemblies! It’s such a headache making sure you’ve got the correct version of an internal assembly, especially when it’s busy being developed.
 
I’ve often seen the situation where two projects in the same solution both file-reference the same internal assembly, but different versions of it, so when you build the solution you get errors because Visual Studio complains that it will have to overwrite one version of an assembly with another. There’s also the issue of where you reference those assemblies from. If you reference the build server’s (you do have a build server, right?) latest built assemblies, you can often find that your local build breaks when another developer checks in and builds a new version of the assemblies that you’re referencing. On the other hand, I’ve worked on teams where I’ve had to manually maintain assemblies and file references. All this is bad and unnecessary.
 
When I arrive to work with your team, I should be able to get the latest code from source control, open the solution file, hit F5, and the build should work first time. Never ever put your stuff in the GAC. The only excuse is if you are forced to by COM interop issues. It’s always a bad decision and will give you endless build and deployment nightmares.
 
What about third party assemblies you reference, or assemblies from other parts of your company that your team has no control over? If it’s a .NET API to a non-.NET product and you’re going to have to run its installer on your deployment target, and the installer places the assembly in the GAC, then it’s probably best to just go with that. For pure .NET assemblies that can be xcopy deployed, the best thing is to treat them like other binary resources and put them in your repository along with the source. You can either put them in their own solution folder or include them in each project that references them as a project item (as suggested by the P&P document above). The only problem with the latter approach is that it can be a headache when you want to move to a newer version and you have to hunt down all the projects that reference it.
 
OK, so we have a single solution with all the team’s projects in it. How do we name and organise those projects within a solution? As far as naming goes there’s a simple rule that will make your life much much easier: Project Name = Assembly Name = Assembly File Name = Root Namespace = Project Folder Name.
 
For example, say you’ve got a root namespace like this: MyCompany.MyApplication.DataAccess, then the project name should also be: MyCompany.MyApplication.DataAccess,  in a folder called: MyCompany.MyApplication.DataAccess. The assembly name should be: MyCompany.MyApplication.DataAccess, and the assembly’s file name should be: MyCompany.MyApplication.DataAccess.dll
 
I would expect the solution name to be MyCompany.MyApplication or maybe MyCompany.TeamName if your team’s solution file holds a number of different applications. You do share code between your team’s apps don’t you? On disk you should keep things flat. The solution should go in a folder with the same name as the solution (MyCompany.MyApplication) and all the projects should go in child folders of the solution folder. This is how VS likes it and it’s pointless fighting the power.
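To make the naming rule concrete, here's what the disk layout for the hypothetical MyCompany.MyApplication solution might look like (the Core, DataAccess and Web projects are made-up examples):

```
MyCompany.MyApplication\
    MyCompany.MyApplication.sln
    MyCompany.MyApplication.Core\
        MyCompany.MyApplication.Core.csproj
    MyCompany.MyApplication.DataAccess\
        MyCompany.MyApplication.DataAccess.csproj
    MyCompany.MyApplication.Web\
        MyCompany.MyApplication.Web.csproj
```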
 
 
One place where you do want to fight the power is with web projects. Don’t let VS put them in a virtual directory under WWWRoot. Create a blank solution file first, then create a folder under the solution directory for your web project. Create a virtual directory using inetmgr that points to the project folder. Last of all, create the web project and ask Visual Studio to put it in the virtual directory you just created. This is much easier with VS 2005+, and the problem has mostly disappeared because you can use the Cassini web server to run a web project from wherever it happens to be on disk.
 
Your source repository hierarchy should exactly match the solution structure on disk; not doing this is a recipe for disaster. Don't be tempted to use the file linking feature in Source Safe to share source files between different solutions. It causes endless headaches whenever someone adds a new source file to a project, checks the project file in, but doesn't add the new file to every location the project's linked to. As for Source Safe itself, although it's the default option for every Microsoft shop, it's also probably the worst SCM tool out there. Really consider using something more modern. I haven't had the opportunity to use the new Team System source control tool, but I have used Subversion on one project and it was like moving from a Trabant to a BMW. However, that one experience with Subversion has been the exception in my long career as a Microsoft developer; everywhere else it's been Source Safe. A great pity, and definitely a subject for a future post.

Wednesday, July 18, 2007

Serializing lots of different objects into a single file

Here's a neat trick I discovered a while back that I thought I'd share. We .NET programmers are always doing serialization for one reason or another, and the built-in BCL binary serializer, System.Runtime.Serialization.Formatters.Binary.BinaryFormatter, is a really easy way of persisting objects to disk or any other kind of binary stream. What I didn't realise until I discovered this trick is that you can serialize one object after another onto a single stream and then read them back one by one. You can create a file, serialize some objects to it, close it, then open it again later and append a few more. The objects don't even have to be the same type: the BinaryFormatter just reads to the next object boundary and returns the result cast as object.

Nor do you have to read all the objects back into memory at once. So long as you remember the position of the last object you deserialized, you can just continue from there at some later date. This is really efficient if you've got huge collections of things you want to store and process. Of course, if you want to serialize a lot of independent objects (or object graphs) you could always insert them into some data structure like an ArrayList and then serialize the ArrayList, but that means creating all the objects in memory at once and reading them all back into memory at once; fine for small collections, but not a good strategy for larger amounts of data.

Here's a little demo. The meat of it is the two functions WriteAnimalToFile and ReadAnimalFromFile. WriteAnimalToFile opens a file, writes one Animal object to it and then closes the file. In the demo we do this for 10 different Animals; note that Animal is an abstract base class that's specialized by Cat and Dog. ReadAnimalFromFile opens a file, seeks to the given position, reads one animal back and then closes the file, returning the animal and the new position. In the demo we read back all the animals we created with WriteAnimalToFile. Note that in ReadAnimalFromFile we don't have to tell the BinaryFormatter what kind of object to expect; it just reads to the next object boundary. If the position is at the end of the file, we return null.
using System;
using System.IO;
using NUnit.Framework;

namespace SerializerTest
{
    [TestFixture]
    public class SerializerTests
    {
        [Test]
        public void SerializeLotsOfObjects()
        {
            // get the path for the file we're going to serialize into
            string path = @"c:\SerializedObjects.ser";

            // create the file we're going to use
            using(File.Create(path)){}

            // create some animals
            for(int i=0; i<10; i++)
            {
                Animal animal;
                string name = string.Format("Animal_{0}", i);
                int age = 5+i;

                // make even numbers dogs, odd numbers cats
                if((i % 2) == 0)
                {
                    bool trained = ((i % 3) == 0);
                    animal = new Dog(name, age, trained);
                }
                else
                {
                    int lives = 9-i;
                    animal = new Cat(name, age, lives);
                }

                // write each animal to a file
                WriteAnimalToFile(path, animal);
            }

            // read the animals back one by one.
            long position = 0;
            while(true)
            {
                Animal animal = ReadAnimalFromFile(path, ref position);
                if(animal == null) break;
                Console.WriteLine(animal.Introduce());
            }
        }

        private void WriteAnimalToFile(string path, Animal animal)
        {
            // create a new formatter instance
            System.Runtime.Serialization.Formatters.Binary.BinaryFormatter formatter = 
                new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();

            // open a filestream
            using(FileStream stream = new FileStream(path, FileMode.Append, FileAccess.Write))
            {
                formatter.Serialize(stream, animal);
            }
        }

        private Animal ReadAnimalFromFile(string path, ref long position)
        {
            // create a new formatter instance
            System.Runtime.Serialization.Formatters.Binary.BinaryFormatter formatter = 
                new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
            
            // read back the animal at the given position
            Animal animal = null;
            using(FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
            {
                if(position < stream.Length)
                {
                    stream.Seek(position, SeekOrigin.Begin);
                    animal = (Animal)formatter.Deserialize(stream);
                    position = stream.Position;
                }
            }
            return animal;
        }
    }

    [Serializable]
    public abstract class Animal
    {
        string _name;
        int _age;

        public Animal(string name, int age)
        {
            _name = name;
            _age = age;
        }

        public abstract string Introduce();

        public string Name{ get { return _name; } }
        public int Age{ get { return _age; } }
    }

    [Serializable]
    public class Dog : Animal
    {
        bool _isTrained;

        public Dog(string name, int age, bool isTrained) : base(name, age)
        {
            _isTrained = isTrained;
        }

        public override string Introduce()
        {
            return string.Format("I am a dog called {0}, age {1}, {2}trained.", Name, Age, 
                (_isTrained ? "": "not "));
        }

        public bool IsTrained{ get { return _isTrained; } }
    }

    [Serializable]
    public class Cat : Animal
    {
        int _lives;

        public Cat(string name, int age, int lives) : base(name, age)
        {
            _lives = lives;
        }

        public override string Introduce()
        {
            return string.Format("I am a cat called {0}, age {1}, with {2} lives", 
                Name, Age, _lives);
        }

        public int Lives{ get { return _lives; } }
    }
}
The output should look like this:
I am a dog called Animal_0, age 5, trained.
I am a cat called Animal_1, age 6, with 8 lives
I am a dog called Animal_2, age 7, not trained.
I am a cat called Animal_3, age 8, with 6 lives
I am a dog called Animal_4, age 9, not trained.
I am a cat called Animal_5, age 10, with 4 lives
I am a dog called Animal_6, age 11, trained.
I am a cat called Animal_7, age 12, with 2 lives
I am a dog called Animal_8, age 13, not trained.
I am a cat called Animal_9, age 14, with 0 lives
Note that you can't do this trick with the XmlSerializer, since you have to tell it up front what type to expect. In any case, a single file containing multiple XML documents would be malformed.
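As a variation on the same technique (my own sketch, not from the demo above), the read loop can be wrapped in a C# 2.0 iterator so that callers can just foreach over the file, with only one deserialized object in memory at a time. The LazyDeserializer class name and the use of plain objects rather than Animals are my inventions, to keep the example self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

public class LazyDeserializer
{
    // Append a single serializable object to the end of the file.
    public static void Append(string path, object item)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        using(FileStream stream = new FileStream(path, FileMode.Append, FileAccess.Write))
        {
            formatter.Serialize(stream, item);
        }
    }

    // Lazily yield each object in the file, one at a time, reopening
    // the file and seeking to the remembered position between reads.
    public static IEnumerable<object> ReadAll(string path)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        long position = 0;
        while(true)
        {
            using(FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
            {
                if(position >= stream.Length) yield break;
                stream.Seek(position, SeekOrigin.Begin);
                object item = formatter.Deserialize(stream);
                position = stream.Position;
                yield return item;
            }
        }
    }
}
```

Reopening the file for every object is wasteful if you're reading everything in one go, but it matches the "remember the position and come back later" usage the post describes.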

Monday, June 18, 2007

Four ways of doing a test with a result.

I've been writing a framework for processing emails. These emails can be rejected for various reasons, and when they're rejected we also need to get the reason for the rejection. It's a common and simple scenario: you need to test a boolean value and get some extra information back, but what's the best way of going about it? Well, if you're going to be returning more than one thing from a method, in our case the boolean result plus the reason, the easiest way is to wrap them up in a result object. Here's my first attempt:
RejectResult result = email.IsRejected;
if(result.Rejected)
{
    DoSomethingWith(result.Reason);
}
But I don't really like this, because the if statement looks like it's saying "if result is rejected". The result's not rejected; the email is. I'd like the code to say "if email is rejected", so how about this, using out parameters:
Reason reason;
if(email.IsRejected(out reason))
{
    DoSomethingWith(reason);   
}
OK, this is a bit better, but now it's saying "if email is rejected reason" (leaving out 'out', which is just syntax), and that still isn't what I really want. There's also a practical problem: my favorite unit test mocking framework, NMock2, makes a real meal out of out parameters. So how about this one:
if(email.IsRejected)
{
    DoSomethingWith(email.RejectReason);
}
The if statement reads nicely now, but there's a serious practical problem: we're relying on email's state not changing between testing for rejection and examining RejectReason. That would be especially serious if email weren't thread safe, but even if it is, the code is relying on convention; what does RejectReason mean before IsRejected has been tested? Not wrapping IsRejected and RejectReason up in a single member of email is just plain bad. My favorite solution was suggested by NMock2. I really like its conversational style, 'Expect.Once.On(mymock).With(myparam).Will(Return.Value(myreturnval));', so how about this:
Reason reason;
if(email.IsRejected.For(out reason)) 
{
    DoSomethingWith(reason);
}
It reads really nicely, "if email is rejected for reason". It mocks nicely too, because you can just mock RejectResult (returned from IsRejected), and it's essentially the first, very obvious, approach above with only a small change to the RejectResult class: the addition of the For method. Here are the Email and RejectResult classes:
public class Email
{
    public RejectResult IsRejected
    {
        get{ return new RejectResult(true, new Reason("The reason this was rejected")); }
    }
}

public class RejectResult
{
    bool _rejected;
    Reason _reason;

    public RejectResult(bool rejected, Reason reason)
    {        
        this._rejected = rejected;
        this._reason = reason;
    }

    public bool For(out Reason reason)
    {
        reason = this._reason;
        return this._rejected;
    }
}
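The Reason class itself isn't shown above; a minimal sketch (my assumption, just a wrapper around a descriptive message) might look like this:

```csharp
using System;

// Hypothetical Reason class assumed by the Email and RejectResult
// examples: it simply carries a human-readable description.
public class Reason
{
    readonly string _description;

    public Reason(string description)
    {
        _description = description;
    }

    public string Description { get { return _description; } }

    public override string ToString()
    {
        return _description;
    }
}
```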
I love code that just explains itself without the need for copious comments!