Tuesday, August 07, 2012

Sprache – A Monadic Parser For C#

Recently I had a requirement to parse AMQP error messages. A typical message looks something like this:

The AMQP operation was interrupted: AMQP close-reason, initiated by Peer, 
code=406, text="PRECONDITION_FAILED - parameters for queue 'my.redeclare.queue'
in vhost '/' not equivalent"
, classId=50, methodId=10, cause=

It starts off with the high-level message text ‘The AMQP operation was interrupted’, then a colon, then some comma separated values some of which are key value pairs.

I wanted to parse these into a ‘semantic model’ – an object graph that represents the structure of the error. Now I could have done some pretty nasty string manipulation; looking for the first colon, separating the rest by commas, looking for ‘=’ and separating out the key-value pairs, but the code would have been rather ugly to say the least. I could have used regular expressions, but once again I doubt very much if I would have been able to read the resulting expression if I revisited the code in a couple of weeks time.

Then I remembered Sprache, a little monadic parser by Nicholas Blumhardt that I’d encountered last year when I was writing about Monads. The lovely thing about Sprache is that you write your parser in readable C# code and build the sematic model directly in the parser code. It’s very easy to use and very readable. Nicholas has an excellent step-by-step post here that I’d strongly recommend reading.

I found it on NuGet, but whoever had put it up there had since disappeared. After contacting Nicholas I decided to adopt it, first moving the source to GitHub and setting up continuous deployment to NuGet via TeamCity.CodeBetter.com.

If you go to the NuGet.org Sprache page, you’ll see that its owners are myself and Nicholas and each push to the GitHub repository results in a new package upload (the last three all done today while I was getting it working ;).

So if you need to do some parsing, give Sprache a try, it’s much easier than writing your own parser from scratch, and unlike regex you can actually read the code after you’ve written it.

You can see my AMQP parser experiment here.

1 comment:

Beltiras said...

The link to your AMQP experiment ends in a 404. Moved or re-moved?