Monday, October 29, 2007

What is Inversion of Control

Inversion of Control is one of the core principles of object-oriented software design. I first came across it about five years ago when reading Robert Martin's wonderful book Agile Software Development. Uncle Bob calls it 'The Dependency Inversion Principle', but these days 'Inversion of Control' seems to be the preferred term. It's what Martin Fowler calls it in this article.

I like to think of Inversion of Control as the ghostly opposite of subroutines. Instead of letting you break out sub-pieces of code from a larger piece (the literal meaning of 'sub-routine'), it lets you break out the surrounding code: the framework that's left when all the little sub-routines are removed. Being able to break out that remaining skeleton and reuse it is a very powerful tool.

However, while subroutines are built into pretty much every modern programming language, IoC is something you have to spin yourself using OO building blocks. In this post I'll show you how.

Even the most junior programmers are aware of the idea of splitting their code into separate functions or subroutines. Using functions gives us our most important tool to avoid repeating ourselves, because repeating code is very very bad.

Take this code as an example. In true pedagogical fashion it is overly simplified, but imagine that wherever I've simply put Console.WriteLine(something), there's actually real code to do that thing. Anyway, you get the idea.

[Test]
public void InlineReporterTest()
{
  // create some reports
  List<Report> reports = new List<Report>();
  for (int i = 0; i < 3; i++)
  {
      reports.Add(new Report(string.Format("Report {0}", i)));
  }

  // send reports by Email
  foreach (Report report in reports)
  {
      // pretend to send an email here
      Console.WriteLine("Sending by email: {0}", report.Title);

      // pretend to log here
      Console.WriteLine("[Log Message] Sent Report: {0}", report.Title);
  }

  // send reports by SMS
  foreach (Report report in reports)
  {
      // pretend to send an SMS message here
      Console.WriteLine("Sending by SMS: {0}", report.Title);

      // pretend to log here
      Console.WriteLine("[Log Message] Sent Report: {0}", report.Title);
  }
}

You can see that there's plenty of repeated code here. The two foreach loops are practically identical. Any sane developer would factor out the common code into a subroutine. Here's a possibility:

[Test]
public void ProceduralReporterTest()
{
  // create some reports
  List<Report> reports = BuildReports();

  // send reports by Email
  SendReports(reports, ReportSendType.Email);

  // send reports by SMS
  SendReports(reports, ReportSendType.Sms);
}

private static List<Report> BuildReports()
{
  List<Report> reports = new List<Report>();
  for (int i = 0; i < 3; i++)
  {
      reports.Add(new Report(string.Format("Report {0}", i)));
  }
  return reports;
}

private static void SendReports(List<Report> reports, ReportSendType reportSendType)
{
  foreach (Report report in reports)
  {
      switch (reportSendType)
      {
          case ReportSendType.Sms:
              // pretend to send an SMS message here
              Console.WriteLine("Sending by SMS: {0}", report.Title);
              break;
          case ReportSendType.Email:
              // pretend to send an email here
              Console.WriteLine("Sending by email: {0}", report.Title);
              break;
      }
      // pretend to log here
      Console.WriteLine("[Log Message] Sent Report: {0}", report.Title);
  }
}

Now we only have one copy of the foreach loop and one copy of the logging code. We've made the foreach loop a little more complex by inserting a switch statement, but it's probably worth it to remove the significant amount of duplication we had before. Also, if there are other places in our program that need to send reports, they can use that same SendReports subroutine. I've also factored out the creation of the reports list into a subroutine called BuildReports. What we are left with is a skeleton: some coordinating code that calls BuildReports and then passes the reports to SendReports twice, once to send them as emails and once to send them as SMS messages.
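The snippets so far lean on two types that are never shown: Report and ReportSendType. Here's a minimal sketch of what they might look like. The shapes are my assumption, inferred only from how they're used above: a Report constructed from a title string and exposing a Title property, and an enumeration with Email and Sms members.

```csharp
// Assumed shape of the Report type used throughout: it just carries a title.
public class Report
{
    private readonly string title;

    public Report(string title)
    {
        this.title = title;
    }

    public string Title
    {
        get { return title; }
    }
}

// Assumed enumeration used by SendReports to pick a sending method.
public enum ReportSendType
{
    Email,
    Sms
}
```

A real Report would obviously carry more than a title, but that's all the examples need.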

Let's take a little interlude now and talk about object orientation. In simple terms... no, very simple terms, this is the idea that we can split out separate concerns into separate classes or components that know how to do one thing and one thing only. They carry their data around with them, encapsulated from interference from the outside world. We're now going to refactor our example again, this time using classes:

[Test]
public void OOReporterTest()
{
  ReportBuilder reportBuilder = new ReportBuilder();
  List<Report> reports = reportBuilder.GetReports();

  ReportSender reportSender = new ReportSender();

  // send by email
  reportSender.Send(reports, ReportSendType.Email);

  // send by SMS
  reportSender.Send(reports, ReportSendType.Sms);
}

public class ReportBuilder
{
  public List<Report> GetReports()
  {
      List<Report> reports = new List<Report>();
      for (int i = 0; i < 3; i++)
      {
          reports.Add(new Report(string.Format("Report {0}", i)));
      }
      return reports;
  }
}

public class ReportSender
{
  public void Send(List<Report> reports, ReportSendType reportSendType)
  {
      foreach (Report report in reports)
      {
          switch (reportSendType)
          {
              case ReportSendType.Sms:
                  // pretend to send an SMS message here
                  Console.WriteLine("Sending by SMS: {0}", report.Title);
                  break;
              case ReportSendType.Email:
                  // pretend to send an email here
                  Console.WriteLine("Sending by email: {0}", report.Title);
                  break;
          }
          // pretend to log here
          Console.WriteLine("[Log Message] Sent Report: {0}", report.Title);
      }
  }
}

Now we've got two separate classes, one that's responsible for building reports and one that's responsible for sending them. In this simple example these classes have no state, so there's no benefit from encapsulation, but they can participate in inheritance hierarchies, which would allow us to extend them without having to alter them: the famous open-closed principle.
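To make the open-closed point concrete, here's a rough sketch of extending ReportBuilder through inheritance. It rests on assumptions: GetReports is marked virtual here (it isn't in the class above), UrgentReportBuilder is a hypothetical name, and Report is a minimal stand-in with just a title.

```csharp
using System.Collections.Generic;

// Minimal stand-in for the Report type (assumed: it just carries a title).
public class Report
{
    private readonly string title;
    public Report(string title) { this.title = title; }
    public string Title { get { return title; } }
}

// The ReportBuilder from above, except GetReports is marked virtual
// (an assumption made so that subclasses can extend it).
public class ReportBuilder
{
    public virtual List<Report> GetReports()
    {
        List<Report> reports = new List<Report>();
        for (int i = 0; i < 3; i++)
        {
            reports.Add(new Report(string.Format("Report {0}", i)));
        }
        return reports;
    }
}

// Hypothetical extension: adds one more report without altering ReportBuilder.
public class UrgentReportBuilder : ReportBuilder
{
    public override List<Report> GetReports()
    {
        List<Report> reports = base.GetReports();
        reports.Add(new Report("Urgent Report"));
        return reports;
    }
}
```

ReportBuilder itself is untouched; the new behaviour lives entirely in the subclass.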

But going back to the skeleton again, the client code: it is hard coded to take our ReportBuilder and call our ReportSender, once for emails and once for SMSs. Although we can reuse the ReportBuilder and the ReportSender, we can't reuse the coordinating code. Another problem worth noting is that the ReportSender has an intimate knowledge of sending emails and SMS messages. If we wanted to add a third type of sender, we would have to add an extra member to the ReportSendType enumeration and alter the switch statement inside the ReportSender. This is where IoC comes in. With it we can factor out the calling code and remove the tight coupling between the ReportSender and the different sending methods.

The basic technique of IoC is to factor out the public contracts of our classes into interfaces. By public contracts I mean the public properties and methods of our classes that the outside world interacts with. Once we've factored the public contracts into interfaces, we can make our client code rely on those interfaces rather than on concrete classes. We can then 'inject' the concrete instances in the constructor of our coordinating class; this is known as Dependency Injection. In the example below we've factored the coordinating code into the Reporter class and then injected a concrete IReportBuilder and IReportSender.

[Test]
public void IoCReporterTest()
{
  IReportBuilder reportBuilder = new ReportBuilder();

  // send by email
  IReportSender emailReportSender = new EmailReportSender();
  Reporter reporter = new Reporter(reportBuilder, emailReportSender);
  reporter.Send();

  // send by SMS
  IReportSender smsReportSender = new SmsReportSender();
  reporter = new Reporter(reportBuilder, smsReportSender);
  reporter.Send();
}

public interface IReportBuilder
{
  List<Report> GetReports();
}

public interface IReportSender
{
  void Send(Report report);
}

public class EmailReportSender : IReportSender
{
  public void Send(Report report)
  {
      Console.WriteLine("Sending by email: {0}", report.Title);
  }
}

public class SmsReportSender : IReportSender
{
  public void Send(Report report)
  {
      Console.WriteLine("Sending by SMS: {0}", report.Title);
  }
}

public class Reporter
{
  IReportBuilder reportBuilder;
  IReportSender messageSender;

  public Reporter(IReportBuilder reportBuilder, IReportSender messageSender)
  {
      this.reportBuilder = reportBuilder;
      this.messageSender = messageSender;
  }

  public void Send()
  {
      List<Report> reports = reportBuilder.GetReports();

      foreach (Report report in reports)
      {
          messageSender.Send(report);
      }
  }
}

Notice how we can reuse the Reporter to send first emails then SMSs without having to specify a ReportSendType. The Reporter code itself is much simpler because we don't need the switch statement any more. If we wanted to add a third sending method, we would simply implement IReportSender a third time and inject it into Reporter's constructor.
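Here's roughly what adding that third sender might look like. FaxReportSender and FaxExample are hypothetical names I've made up for illustration, and Report is a minimal stand-in; the interfaces and Reporter are repeated from above so the sketch stands on its own.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the Report type (assumed: it just carries a title).
public class Report
{
    private readonly string title;
    public Report(string title) { this.title = title; }
    public string Title { get { return title; } }
}

public interface IReportBuilder { List<Report> GetReports(); }
public interface IReportSender { void Send(Report report); }

public class ReportBuilder : IReportBuilder
{
    public List<Report> GetReports()
    {
        List<Report> reports = new List<Report>();
        for (int i = 0; i < 3; i++)
        {
            reports.Add(new Report(string.Format("Report {0}", i)));
        }
        return reports;
    }
}

// Hypothetical third sender: a new class, no enum member, no switch to edit.
public class FaxReportSender : IReportSender
{
    public void Send(Report report)
    {
        // pretend to send a fax here
        Console.WriteLine("Sending by fax: {0}", report.Title);
    }
}

// Reporter is exactly as before; it never learns that faxes exist.
public class Reporter
{
    private readonly IReportBuilder reportBuilder;
    private readonly IReportSender messageSender;

    public Reporter(IReportBuilder reportBuilder, IReportSender messageSender)
    {
        this.reportBuilder = reportBuilder;
        this.messageSender = messageSender;
    }

    public void Send()
    {
        foreach (Report report in reportBuilder.GetReports())
        {
            messageSender.Send(report);
        }
    }
}

public static class FaxExample
{
    public static void Run()
    {
        Reporter reporter = new Reporter(new ReportBuilder(), new FaxReportSender());
        reporter.Send();
    }
}
```

Calling FaxExample.Run() prints one "Sending by fax" line for each of the three reports. The only new code is FaxReportSender and the line that wires it in.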

But, wait a minute, I've forgotten about the logging somewhere along the line! Never fear! Without having to recode Reporter I can use the Decorator pattern to create a logger that implements IReportSender and gets injected with another IReportSender in its constructor:

class ReportSendLogger : IReportSender
{
  IReportSender reportSender;
  ILogger logger;

  public ReportSendLogger(IReportSender reportSender, ILogger logger)
  {
      this.reportSender = reportSender;
      this.logger = logger;
  }

  public void Send(Report report)
  {
      reportSender.Send(report);
      logger.Write(string.Format("Sent report: {0}", report.Title));
  }
}

Now I can simply string a logger and an IReportSender together and I have logging again without having to change a thing in Reporter.

[Test]
public void ReporterTestWithLogging()
{
  IReportBuilder reportBuilder = new ReportBuilder();
  ILogger logger = new Logger();

  // send by email
  IReportSender emailReportSender = new ReportSendLogger(new EmailReportSender(), logger);
  Reporter reporter = new Reporter(reportBuilder, emailReportSender);
  reporter.Send();

  // send by SMS
  IReportSender smsReportSender = new ReportSendLogger(new SmsReportSender(), logger);
  reporter = new Reporter(reportBuilder, smsReportSender);
  reporter.Send();
}
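The ILogger interface and the Logger class used above aren't shown anywhere in this post. A minimal sketch, assuming nothing more than the single Write(string) method that ReportSendLogger calls, might look like this. The console-backed implementation is my own stand-in, in keeping with the "pretend to log here" lines from the earlier examples.

```csharp
using System;

// Assumed logging contract: only the Write(string) call appears in the post.
public interface ILogger
{
    void Write(string message);
}

// Hypothetical console-backed logger; a real one might write to a file
// or hand off to a logging library instead.
public class Logger : ILogger
{
    public void Write(string message)
    {
        Console.WriteLine("[Log Message] {0}", message);
    }
}
```

Because ReportSendLogger depends only on ILogger, swapping this for a different logger means changing one line in the wiring code.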

So there we have it, the power of Inversion of Control. In my next post I'll show how IoC makes real unit testing possible: Inversion of Control, Unit Tests and Mocks.

7 comments:

Anonymous said...

Good post on IoC. I do believe, however, that you have mixed two different terms: Inversion of Control, which is often accomplished by using an IoC container, and the "Dependency Inversion Principle". The dependency inversion principle, as far as I understand it, is related to how we couple our types. According to Robert C. Martin: "Abstractions should not depend upon details. Details should depend upon abstractions."

Mike Hadlow said...

Hi Kim,

You've stolen my thunder a bit there, I was going to cover IoC containers in a future blog:)

Just to extend your Robert C. Martin quote:

"A. HIGH LEVEL MODULES SHOULD NOT DEPEND UPON LOW
LEVEL MODULES. BOTH SHOULD DEPEND UPON ABSTRACTIONS.

B. ABSTRACTIONS SHOULD NOT DEPEND UPON DETAILS. DETAILS
SHOULD DEPEND UPON ABSTRACTIONS.

One might question why I use the word “inversion”. Frankly, it is because more traditional software development methods, such as Structured Analysis and Design, tend to create software structures in which high level modules depend upon low level modules, and in which abstractions depend upon details. Indeed one of the goals of these methods is to define the subprogram hierarchy that describes how the high level modules make calls to the low level modules. Figure 1 is a good example of such a hierarchy. Thus, the dependency structure of a well designed object oriented program is “inverted” with respect to the dependency structure that normally results from traditional procedural methods.

Consider the implications of high level modules that depend upon low level modules. It is the high level modules that contain the important policy decisions and business models of an application. It is these models that contain the identity of the application. Yet, when these modules depend upon the lower level modules, then changes to the lower level modules can have direct effects upon them; and can force them to change.

This predicament is absurd! It is the high level modules that ought to be forcing the low level modules to change. It is the high level modules that should take precedence over the lower level modules. High level modules simply should not depend upon low level modules in any way."

Isn't this essentially the same as inversion of control? Using an IoC container usually involves a specific kind of IoC: Dependency Injection where the concrete instance is supplied either in the class's constructor or a public property. So how about this:

The Dependency Inversion Principle is the general principle.

Inversion of Control is the technique.

Dependency Injection is a subset of Inversion of Control.

BTW, I had a browse of your blog. Very nice. Our interests are pretty much the same.

Anonymous said...

"Isn't this essentially the same as inversion of control?"
I guess you could look at it that way. When I think of "control" and "Inversion of Control" I think more about what it is the calling method has to do. Will the caller create the objects needed to do its job, or will they come from the outside?
When thinking about the "Dependency Inversion Principle" I think mostly about the dependencies between modules.

I'm not too much into nitpicking, but the terms dependency, inversion, control etc. are being used in so many closely related principles that you can get dizzy from less.

"BTW, I had a browse of your blog. Very nice. Our interests are pretty much the same."

Ah, that explains why I enjoy reading your blog! :-)

Mike Hadlow said...

Kim, I see what you mean about module or package level dependency inversion vs. class level Inversion of Control. That's quite a nice distinction.

I also agree with you that both terms are bandied about without much distinction. I guess that's part of the fun of working with stuff as it emerges: the terminology takes a while to settle down into universally accepted meanings.

Many thanks, great points.

Anonymous said...

... or how to turn 9 lines of understandable code into 46 lines of obfuscation :-)

Mike Hadlow said...

slevdi,

That's a very good point. IoC does mean writing more code up front. You have to define interfaces, and defining and assigning the dependencies in the constructor fattens things up a bit too. There's also the initial construction of the class hierarchy where you create and assign the concrete instances.

Most good software practices are about coping with scalability, especially scaling complexity. That's hard to demonstrate in a simple example that can be presented in a short blog article. Unless you've had experience of the headaches inherent in building large scale software, you'll find it hard to appreciate the benefits of stuff like this. I plan to do a follow up post on how IoC enables testing, which is really its killer application. Also, Inversion of Control containers can do all the wiring up for you, which saves on a lot of the more complex code.

It's worth noting that you can do this stuff in dynamic languages (Ruby, Python, JavaScript) with much less code. Many people think that dynamic languages are the future of business applications, and they may well be right.

Rinat Abdullin said...

That's an impressive article, Mike. It was a bit of an inspiration to me when I tried to come up with a short definition of IoC.

http://abdullin.com/wiki/inversion-of-control-ioc.html