Tuesday, June 18, 2013

Guest Post: Working Around F#/NuGet Problems

A first for me. A guest post by my excellent colleague Michael Newton.

Michael normally blogs at http://blog.mavnn.co.uk and works at 15below. He’s the build manager at 15below and has developed various workarounds for the F# NuGet issues we’ve been experiencing. I’ll hand over and let him explain …

In his first post Mike did an excellent job explaining the bugs we found when trying to add or update NuGet references in fsproj files.

Unfortunately, the confusion doesn’t stop there. It turns out that if you examine the NuGet code, the logic for updating project files is not in NuGet.Core (the shared dll that drives core functionality like how to unpack a nupkg file) but is re-implemented in each client. This means that you get different results if you are running commands from the command line client than if you are using the Visual Studio plugin or the hosted PowerShell console. The reason for this starts to become obvious once you realise that the command line client has no parameters for specifying which project and/or solution you are working against, whilst that information is either already available in the Visual Studio plugin or required via little dropdown boxes in the hosted PowerShell console.

So, between everyday usage, preparing to move some of our project references to NuGet and the needs of our continuous integration we now had the following requirements:

  1. Reliable installation of NuGet references to C#, VB.NET and F# projects. Preferably with an option of doing it via the command line, to help script the move from project references to NuGet references.
  2. Reliable upgrades of NuGet references in all project types. Again, a command line option would be useful.
  3. Reliable downgrades of NuGet references in all project types. It’s painful to try a new release/pre-release of a NuGet package across a solution and then discover that you have to manually downgrade all of the projects separately if you decide not to take the new version.
  4. Reliable removal of NuGet references turns out to be a requirement of reliable downgrades.
  5. Sane solution-wide management of references. Due to the way project references work, we need an easy way to ensure that all of the projects in a solution use the same version of any particular NuGet reference, and to check that this will not cause any version conflicts. So ideally, upgrade and downgrade commands will run against a solution.
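As a rough illustration of the check in requirement 5, here is a minimal Python sketch that reads the packages.config files under a solution directory and reports any package that appears with more than one version. It is not part of NuGet or NuGetPlus; the function names are hypothetical, and it assumes each project folder holds a standard packages.config with `<package id="…" version="…" />` entries.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path


def read_packages(config_text):
    """Return {package id: version} from the text of one packages.config."""
    root = ET.fromstring(config_text)
    return {p.get("id"): p.get("version") for p in root.iter("package")}


def find_conflicts(configs):
    """configs maps a project name to its packages.config text.

    Returns {package id: {version: [projects]}} for every package that
    appears with more than one version across the solution.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for project, text in configs.items():
        for package_id, version in read_packages(text).items():
            seen[package_id][version].append(project)
    return {
        package_id: {v: sorted(projects) for v, projects in versions.items()}
        for package_id, versions in seen.items()
        if len(versions) > 1
    }


def load_solution(solution_dir):
    """Collect every packages.config under solution_dir, keyed by project folder.

    utf-8-sig strips the byte order mark that Visual Studio often writes.
    """
    return {
        path.parent.name: path.read_text(encoding="utf-8-sig")
        for path in Path(solution_dir).rglob("packages.config")
    }
```

Calling `find_conflicts(load_solution("src"))` on a solution folder returns an empty dictionary when every project agrees on every package version.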

Looking at our requirements in terms of what is already handled by NuGet and what is affected by the bugs that Mike discussed last time, we get:

  1. Very buggy for F# projects. A proper fix depends on a fix to the underlying F# project type, which is unlikely before the next version of Visual Studio. Also, installing from the command line in a way that adds project references is not supported in nuget.exe by design.
  2. Again, broken for F# projects. Otherwise works.
  3. Not supported by design in any of the NuGet clients.
  4. Appears to work.
  5. Not supported by NuGet.

As we looked at the list, it became apparent that we were unlikely to see the F# bug fixed any time soon, and even if we did there would still be several areas of functionality that we would be missing but would help us greatly. The number of options that we needed that the mainline NuGet project does not support by design swung the balance for us from a work-around or bug patch in NuGet’s Visual Studio project handling to a full blown wrapper library.

So, the NuGetPlus project was born. As always, the name is a misnomer. Because naming things is hard. But the idea is to build a NuGet wrapper that provides the functionality above, and as a bonus extra for both us and the F# community, does not exhibit the annoying F# bugs from the previous post. Because the command line exe in NuGetPlus is only a very thin wrapper around the dll, it also allows you to easily call into the process from code and achieve the same results as running the command line program without having to write hundreds of lines of supporting boilerplate. For those of you who have tried to use NuGet.Core directly, you’ll know that it’s a bit of an exercise in frustration actually mimicking the full behaviour of any of the clients.

It is very much still a work in progress. For example, it respects nuget.config files, but at the moment only makes use of the repository path and source list config options – we haven’t checked if we need to be supporting more. But it covers scenarios 1-4 above nicely, and we’re hoping to add 5 (solution level upgrade, downgrade and checking) fairly shortly. Although it has been developed on work time as functionality we desperately need in house, it is also a fully open source MIT licensed project that we are more than happy to receive pull requests for if there is functionality the community needs.

So whether you’re an F# NuGet user, or you just see the value of the additional functionality above, take it for a spin and let us know what you think.

Redis: Very Useful As a Distributed Lock

In a Service Oriented Architecture you sometimes need a distributed lock; an application lock across many servers to serialize access to some constrained resource. I’ve been looking at using Redis, via the excellent ServiceStack.Redis client library, for this.

It really is super simple. Here’s a little F# sample to show it in action:

   1: module Zorrillo.Runtime.ProxyAutomation.Tests.RedisSpike
   2:  
   3: open System
   4: open ServiceStack.Redis
   5:  
   6: let iTakeALock n = 
   7:     async {
   8:         use redis = new RedisClient("redis.local")
   9:         let lock = redis.AcquireLock("integration_test_lock", TimeSpan.FromSeconds(10.0))
  10:         printfn "Aquired lock for %i" n 
  11:         Threading.Thread.Sleep(100)
  12:         printfn "Disposing of lock for %i" n
  13:         lock.Dispose()
  14:     }
  15:  
  16: let ``should be able to save and retrieve from redis`` () =
  17:     
  18:     [for i in [0..9] -> iTakeALock i]
  19:         |> Async.Parallel
  20:         |> Async.RunSynchronously

The iTakeALock function creates an async task that uses the ServiceStack.Redis AcquireLock function. It then pretends to do some work (Thread.Sleep(100)), and then releases the lock (lock.Dispose()).

Running 10 iTakeALocks in parallel (line 16 onwards) gives the following result:

Aquired lock for 2
Disposing of lock for 2
Aquired lock for 6
Disposing of lock for 6
Aquired lock for 0
Disposing of lock for 0
Aquired lock for 7
Disposing of lock for 7
Aquired lock for 9
Disposing of lock for 9
Aquired lock for 5
Disposing of lock for 5
Aquired lock for 3
Disposing of lock for 3
Aquired lock for 8
Disposing of lock for 8
Aquired lock for 1
Disposing of lock for 1
Aquired lock for 4
Disposing of lock for 4

Beautifully serialized access from parallel processes. Very nice.
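For readers who want to see the general idea behind this kind of lock, here is a minimal Python sketch. It is not how ServiceStack.Redis implements AcquireLock; it assumes a shared store offering an atomic "set if absent" operation, and uses a hypothetical in-process `FakeKeyStore` in place of Redis so that it runs on its own. Each worker polls until it manages to set the key, does its work, then deletes the key.

```python
import threading
import time


class FakeKeyStore:
    """In-process stand-in for a shared key store (hypothetical, not Redis)."""

    def __init__(self):
        self._data = {}
        self._guard = threading.Lock()

    def set_if_absent(self, key, value):
        """Atomically store value only if key is unset; True on success."""
        with self._guard:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def delete(self, key):
        with self._guard:
            self._data.pop(key, None)


class StoreLock:
    """Context manager that polls set_if_absent until it wins or times out."""

    def __init__(self, store, key, timeout_seconds, poll_seconds=0.001):
        self.store = store
        self.key = key
        self.timeout_seconds = timeout_seconds
        self.poll_seconds = poll_seconds

    def __enter__(self):
        deadline = time.monotonic() + self.timeout_seconds
        while not self.store.set_if_absent(self.key, "held"):
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {self.key}")
            time.sleep(self.poll_seconds)
        return self

    def __exit__(self, *exc_info):
        self.store.delete(self.key)
        return False


def run_workers(count, work_seconds=0.005):
    """Start count threads that each take the lock, 'work', and release it."""
    store = FakeKeyStore()
    events = []

    def worker(n):
        with StoreLock(store, "integration_test_lock", timeout_seconds=10.0):
            events.append(("acquired", n))
            time.sleep(work_seconds)  # pretend to do some work
            events.append(("released", n))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return events
```

The order in which workers win the lock is not deterministic, but every "acquired" event is immediately followed by the matching "released" event. A lock against a real shared store would also need an expiry on the key so that a crashed holder cannot block everyone else; that is left out here.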

Monday, June 17, 2013

Automating Nginx Reverse Proxy Configuration


It’s really nice if you can decouple your external API from the details of application segregation and deployment.

In a previous post I explained some of the benefits of using a reverse proxy. On my current project we’re building a distributed service oriented architecture that also exposes an HTTP API, and we’re using a reverse proxy to route requests addressed to our API to individual components. We have chosen the excellent Nginx web server to serve as our reverse proxy; it’s fast, reliable and easy to configure. We use it to aggregate multiple services exposing HTTP APIs into a single URL space. So, for example, when you type:

http://api.example.com/product/pinstripe_suit

It gets routed to:

http://10.0.1.101:8001/product/pinstripe_suit

But when you go to:

http://api.example.com/customer/103474783

It gets routed to:

http://10.0.1.104:8003/customer/103474783

To the consumer of the API it appears that they are exploring a single URL space (http://api.example.com/blah/blah), but behind the scenes the different top level segments of the URL route to different back end servers. /product/… routes to 10.0.1.101:8001, but /customer/… routes to 10.0.1.104:8003.
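The mapping can be pictured with a few lines of Python. This is only an illustration of the routing rule, not something we run (Nginx does the real work); the `ROUTES` table and `route` function are hypothetical names, and paths are assumed to start with a slash.

```python
# Top level URL segment -> back end server, using the example addresses above.
ROUTES = {
    "product": "10.0.1.101:8001",
    "customer": "10.0.1.104:8003",
}


def route(path):
    """Map a public request path to its back end URL, or None if unclaimed."""
    segment = path.lstrip("/").split("/", 1)[0]
    backend = ROUTES.get(segment)
    return None if backend is None else f"http://{backend}{path}"
```

For example, `route("/product/pinstripe_suit")` gives the 10.0.1.101:8001 address, and a path whose first segment nobody has claimed gives `None`.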

We also want this to be self-configuring. So, say I want to create a new component of the system that records stock levels. Rather than extending an existing component, I want to be able to write a stand-alone executable or service that exposes an HTTP endpoint, have it be automatically deployed to one of the hosts in my cloud infrastructure, and have Nginx automatically route requests addressed to http://api.example.com/stock/whatever to my new component.

We also want to load balance these back end services. We might want to deploy several instances of our new stock API and have Nginx automatically round robin between them.

We call each top level segment ( /stock, /product, /customer ) a claim. A component publishes an ‘AddApiClaim’ message over RabbitMQ when it comes online. This message has 3 fields: ‘Claim’, ‘ipAddress’, and ‘PortNumber’. We have a special component, ProxyAutomation, that subscribes to these messages and rewrites the Nginx configuration as required. It uses SSH and SCP to log into the Nginx server, transfer the various configuration files, and instruct Nginx to reload its configuration. We use the excellent SSH.NET library to automate this.
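To make the shape of the message concrete, here is a small Python sketch of an ‘AddApiClaim’ with its three fields. The JSON wire format, the class and the `server` helper are assumptions made for illustration; the post only tells us the field names.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AddApiClaim:
    claim: str        # top level URL segment, e.g. "stock"
    ip_address: str   # host the component is listening on
    port_number: int  # port the component is listening on

    @classmethod
    def from_json(cls, body):
        """Build a message from a JSON body (assumed wire format)."""
        fields = json.loads(body)
        return cls(fields["Claim"], fields["ipAddress"], int(fields["PortNumber"]))

    @property
    def server(self):
        """The 'ip:port' pair that would end up in an Nginx upstream entry."""
        return f"{self.ip_address}:{self.port_number}"
```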

A really nice thing about Nginx configuration is wildcard includes. Take a look at our top level configuration file:

   1: ...
   2:  
   3: http {
   4:     include       /etc/nginx/mime.types;
   5:     default_type  application/octet-stream;
   6:  
   7:     log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
   8:                       '$status $body_bytes_sent "$http_referer" '
   9:                       '"$http_user_agent" "$http_x_forwarded_for"';
  10:  
  11:     access_log  /var/log/nginx/access.log  main;
  12:  
  13:     sendfile        on;
  14:     keepalive_timeout  65;
  15:  
  16:     include /etc/nginx/conf.d/*.conf;
  17: }

Line 16 says, take any *.conf file in the conf.d directory and add it here.

Inside conf.d is a single file for all api.example.com requests:

   1: include     /etc/nginx/conf.d/api.example.com.conf.d/upstream.*.conf;
   2:  
   3: server {
   4:     listen          80;
   5:     server_name     api.example.com;
   6:  
   7:     include         /etc/nginx/conf.d/api.example.com.conf.d/location.*.conf;
   8:  
   9:     location / {
  10:         root    /usr/share/nginx/api.example.com;
  11:         index   index.html index.htm;
  12:     }
  13: }

This is basically saying listen on port 80 for any requests with a host header ‘api.example.com’.

This has two includes. The first one at line 1, I’ll talk about later. At line 7 it says ‘take any file named location.*.conf in the subdirectory api.example.com.conf.d and add it to the configuration’. Our proxy automation component adds new components (AKA API claims) by dropping new location.*.conf files in this directory. For example, for our stock component it might create a file, ‘location.stock.conf’, like this:

location /stock/ {
    proxy_pass http://stock;
}

This simply tells Nginx to proxy all requests addressed to api.example.com/stock/… to the upstream servers defined at ‘stock’. This is where the other include mentioned above comes in, ‘upstream.*.conf’. The proxy automation component also drops in a file named upstream.stock.conf that looks something like this:

upstream stock {
    server 10.0.0.23:8001;
    server 10.0.0.23:8002;
}

This tells Nginx to round-robin all requests to api.example.com/stock/ to the given sockets. In this example it’s two components on the same machine (10.0.0.23), one on port 8001 and the other on port 8002.
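Round-robin simply means handing each new request to the next server in the list and wrapping around at the end. A tiny Python sketch of that rotation (the `assign` function is a hypothetical name; Nginx’s own scheduler also deals with weights and failed servers, which this ignores):

```python
from itertools import cycle


def assign(requests, servers):
    """Pair each request with the next server in rotation."""
    return list(zip(requests, cycle(servers)))
```

With the two servers above, the first request goes to port 8001, the second to port 8002, the third back to port 8001, and so on.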

As instances of the stock component get deployed, new entries are added to upstream.stock.conf. Similarly, when instances get uninstalled, their entries are removed. When the last entry is removed, the whole file is also deleted.
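The bookkeeping described here can be sketched in Python as follows. This is not the ProxyAutomation code; the `ClaimRegistry` class is hypothetical, it only produces file contents (the SSH/SCP transfer and the Nginx reload are left out), and removing the location file along with the upstream file is an assumption on my part.

```python
class ClaimRegistry:
    """Tracks the servers behind each claim and renders the Nginx fragments."""

    def __init__(self):
        self._servers = {}  # claim -> list of "ip:port" in arrival order

    def add(self, claim, server):
        entries = self._servers.setdefault(claim, [])
        if server not in entries:
            entries.append(server)

    def remove(self, claim, server):
        entries = self._servers.get(claim, [])
        if server in entries:
            entries.remove(server)
        if not entries:
            # Last entry gone: forget the claim so its files disappear.
            self._servers.pop(claim, None)

    def files(self):
        """Return {file name: content} for every claim that still has servers."""
        rendered = {}
        for claim, servers in self._servers.items():
            lines = [f"upstream {claim} {{"]
            lines += [f"    server {server};" for server in servers]
            lines.append("}")
            rendered[f"upstream.{claim}.conf"] = "\n".join(lines) + "\n"
            # Assumption: the location file lives and dies with the upstream file.
            rendered[f"location.{claim}.conf"] = (
                f"location /{claim}/ {{\n    proxy_pass http://{claim};\n}}\n"
            )
        return rendered
```

Adding 10.0.0.23:8001 and 10.0.0.23:8002 under the claim "stock" renders the upstream file shown earlier; removing both leaves no files for that claim.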

This infrastructure allows us to decouple infrastructure configuration from component deployment. We can scale the application up and down by simply adding new component instances as required. As a component developer, I don’t need to do any proxy configuration, just make sure my component publishes add and remove API claim messages and I’m good to go.

Thursday, June 13, 2013

NuGet Install Is Broken With F#

There’s a very nasty bug when you try and use NuGet to add a package reference to an F# project. It manifests itself when either the assembly that is being installed also has a version in the GAC or a different version already exists in the output directory.

First let’s reproduce the problem when a version of the assembly already exists in the GAC.

Create a new solution with an F# project.

Choose an assembly that you want to install from NuGet that also exists in the GAC on your machine. For ironic purposes I’m going to choose NuGet.Core for this example.

It’s in my GAC:

D:\>gacutil -l | find "NuGet.Core"
NuGet.Core, Version=1.0.11220.104, Culture=neutral, PublicKeyToken=31bf3856ad364e35, processorArchitecture=MSIL
NuGet.Core, Version=1.6.30117.9648, Culture=neutral, PublicKeyToken=31bf3856ad364e35, processorArchitecture=MSIL

You can see that the highest version in the GAC is version 1.6.30117.9648.

Now let’s install NuGet.Core version 2.5.0 from the official NuGet source:

PM> Install-Package NuGet.Core -Version 2.5.0
Installing 'Nuget.Core 2.5.0'.
Successfully installed 'Nuget.Core 2.5.0'.
Adding 'Nuget.Core 2.5.0' to Mike.NuGetExperiments.FsProject.
Successfully added 'Nuget.Core 2.5.0' to Mike.NuGetExperiments.FsProject.

It correctly creates a packages directory, downloads the NuGet.Core package and creates a packages.config file:

D:\Source\Mike.NuGetExperiments\src>tree /F
D:.
│   Mike.NuGetExperiments.sln
│
├───Mike.NuGetExperiments.FsProject
│   │   Mike.NuGetExperiments.FsProject.fsproj
│   │   packages.config
│   │   Spike.fs
│   │
│   ├───bin
│   │   └───Debug
│   │
│   └───obj
│       └───Debug
│
└───packages
    │   repositories.config
    │
    └───Nuget.Core.2.5.0
        │   Nuget.Core.2.5.0.nupkg
        │   Nuget.Core.2.5.0.nuspec
        │
        └───lib
            └───net40-Client
                    NuGet.Core.dll

But when I look at my fsproj file I see that it has incorrectly referenced the NuGet.Core version (1.6.30117.9648) from the GAC and there is no hint path pointing to the downloaded package.

<Reference Include="NuGet.Core, Version=1.6.30117.9648, Culture=neutral, PublicKeyToken=31bf3856ad364e35">
<Private>True</Private>
</Reference>

Next let’s reproduce the problem when a version of an assembly already exists in the output directory.

This time I’m going to use EasyNetQ as my example DLL. First I’m going to take a recent version of EasyNetQ.dll, 0.10.1.92, and drop it into the project’s output directory (bin\Debug).

Next use NuGet to install an earlier version of the assembly:

Install-Package EasyNetQ -Version 0.9.2.76
Attempting to resolve dependency 'RabbitMQ.Client (= 3.0.2.0)'.
Attempting to resolve dependency 'Newtonsoft.Json (≥ 4.5)'.
Installing 'RabbitMQ.Client 3.0.2'.
Successfully installed 'RabbitMQ.Client 3.0.2'.
Installing 'Newtonsoft.Json 4.5.11'.
Successfully installed 'Newtonsoft.Json 4.5.11'.
Installing 'EasyNetQ 0.9.2.76'.
Successfully installed 'EasyNetQ 0.9.2.76'.
Adding 'RabbitMQ.Client 3.0.2' to Mike.NuGetExperiments.FsProject.
Successfully added 'RabbitMQ.Client 3.0.2' to Mike.NuGetExperiments.FsProject.
Adding 'Newtonsoft.Json 4.5.11' to Mike.NuGetExperiments.FsProject.
Successfully added 'Newtonsoft.Json 4.5.11' to Mike.NuGetExperiments.FsProject.
Adding 'EasyNetQ 0.9.2.76' to Mike.NuGetExperiments.FsProject.
Successfully added 'EasyNetQ 0.9.2.76' to Mike.NuGetExperiments.FsProject.

NuGet reports that everything went according to plan and that EasyNetQ 0.9.2.76 has been successfully added to my project.

Once again the packages directory was successfully created and the correct version of EasyNetQ has been downloaded. The packages.config file also has the correct version of EasyNetQ. I won’t show you the output from ‘tree’ again, it’s much the same as before.

Again, when I look at my fsproj file the version of EasyNetQ is incorrect: it’s 0.10.1.92, and again there’s no hint path:

<Reference Include="EasyNetQ, Version=0.10.1.92, Culture=neutral, PublicKeyToken=null">
<Private>True</Private>
</Reference>

Yup, NuGet install is most definitely broken with F#.
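If you want to find out which of your own projects have been bitten, a small script can flag references that look like NuGet packages but have no hint path. The Python sketch below is one way to do it, under two assumptions: the project file uses the usual MSBuild 2003 XML namespace, and the assembly’s simple name is the same as the package id (true for the examples here, not for every package). The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

MSBUILD_NS = {"msb": "http://schemas.microsoft.com/developer/msbuild/2003"}


def unhinted_package_references(project_xml, package_ids):
    """Return the Include value of each <Reference> whose simple name matches
    one of package_ids but which has no <HintPath> child."""
    wanted = {package_id.lower() for package_id in package_ids}
    root = ET.fromstring(project_xml)
    suspects = []
    for ref in root.iterfind(".//msb:Reference", MSBUILD_NS):
        include = ref.get("Include", "")
        simple_name = include.split(",")[0].strip()
        has_hint = ref.find("msb:HintPath", MSBUILD_NS) is not None
        if simple_name.lower() in wanted and not has_hint:
            suspects.append(include)
    return suspects
```

Pass it the text of the fsproj file and the package ids from packages.config; framework references such as mscorlib are ignored because they are not in the package list.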

This bug makes using NuGet and F# together an exercise in frustration. Our team has wasted days attempting to get to the bottom of this.

It seems that it’s a well-known problem. Just take a look at this work item, reported over a year ago:

http://nuget.codeplex.com/workitem/2149

After much cursing of NuGet, the problem actually appears to be with the F# project system rather than with NuGet itself:

“F# knows about this behavior and they will release the fix”

Hmm, it hasn’t been fixed yet.

We had a dig around the NuGet code. The interesting piece is this file snippet (from NuGet.VisualStudio.VsProjectSystem):

   1: public virtual void AddReference(string referencePath, Stream stream)
   2: {
   3:     string name = Path.GetFileNameWithoutExtension(referencePath);
   4:     try
   5:     {
   6:         // Get the full path to the reference
   7:         string fullPath = PathUtility.GetAbsolutePath(Root, referencePath);
   8:         string assemblyPath = fullPath;
   9:  
  10:         ...
  11:  
  12:         // Add a reference to the project
  13:         dynamic reference = Project.Object.References.Add(assemblyPath);
  14:  
  15:         ...
  16:  
  17:         TrySetCopyLocal(reference);
  18:  
  19:         // This happens if the assembly appears in any of the search
  20:         // paths that VS uses to locate assembly references. Most commonly, 
  21:         // it happens if this assembly is in the GAC or in the output path.
  22:         if (!reference.Path.Equals(fullPath, StringComparison.OrdinalIgnoreCase))
  23:         {
  24:             // Get the msbuild project for this project
  25:             MsBuildProject buildProject = Project.AsMSBuildProject();
  26:  
  27:             if (buildProject != null)
  28:             {
  29:                 // Get the assembly name of the reference we are trying to add
  30:                 AssemblyName assemblyName = AssemblyName.GetAssemblyName(fullPath);
  31:  
  32:                 // Try to find the item for the assembly name
  33:                 MsBuildProjectItem item = 
  34:                     (from assemblyReferenceNode in buildProject.GetAssemblyReferences()
  35:                     where AssemblyNamesMatch(assemblyName, assemblyReferenceNode.Item2)
  36:                     select assemblyReferenceNode.Item1).FirstOrDefault();
  37:  
  38:                 if (item != null)
  39:                 {
  40:                     // Add the <HintPath> metadata item as a relative path
  41:                     item.SetMetadataValue("HintPath", referencePath);
  42:  
  43:                     // Save the project after we've modified it.
  44:                     Project.Save();
  45:                 }
  46:             }
  47:         }
  48:     }
  49:     catch (Exception e)
  50:     {
  51:         ...
  52:     }
  53: }

On line 13 NuGet calls out to the F# project system and asks it to add a reference to the assembly at the given path. We assume that the F# project system then does the wrong thing by searching for the assembly name anywhere in the GAC or the output directory rather than referencing the explicit assembly NuGet is asking it to reference.

Interestingly, it looks as if the NuGet team have attempted to code a workaround for this bug from line 22 onwards. Could this be why C# projects don’t exhibit this behaviour? Unfortunately the workaround doesn’t work in the F# case. We think it’s because F# doesn’t respect assembly versions and will happily replace any requested assembly with another one so long as it’s got the same simple name. At line 33, no assemblies are found in the fsproj file because the ‘AssemblyNamesMatch’ function does an exact match using all four elements of the full assembly name (simple name, version, culture, and key) and of course the assembly that the F# project system has found and added has a different version.
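The difference between the two kinds of match is easy to see in a few lines of Python. This is a stand-in written from the description above, not NuGet’s ‘AssemblyNamesMatch’ code; the function names are hypothetical and the parsing only handles the plain "Name, Key=Value, …" form.

```python
def parse_assembly_name(full_name):
    """Split 'Name, Version=..., Culture=..., PublicKeyToken=...' into a dict."""
    simple_name, *parts = [part.strip() for part in full_name.split(",")]
    fields = {"Name": simple_name}
    for part in parts:
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip()
    return fields


def exact_match(a, b):
    """True only if simple name, version, culture and key token all agree."""
    keys = ("Name", "Version", "Culture", "PublicKeyToken")
    left, right = parse_assembly_name(a), parse_assembly_name(b)
    return all(left.get(k, "").lower() == right.get(k, "").lower() for k in keys)


def simple_name_match(a, b):
    """True if the simple names agree, whatever the version."""
    left, right = parse_assembly_name(a), parse_assembly_name(b)
    return left["Name"].lower() == right["Name"].lower()
```

Taking the EasyNetQ example, the package NuGet asked for (version 0.9.2.76) and the reference that ended up in the fsproj (version 0.10.1.92) match on simple name but fail the exact match, so the lookup finds nothing and the hint path is never written.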

So, come on F# team, pull your finger out and fix the Visual Studio F# project system. In the meantime, in my next post I’ll talk about some of the things our team, and especially the excellent Michael Newton (@mavnn), have been doing to try and work around these problems.

Update: Michael Newton has written a guest post to explain some of the things we are doing to work around these problems.