Thursday, February 10, 2011

Monads in C#-7. Turning our parser into a Monad

This is the seventh part of my Monad series. See the introduction for a list of all the posts.

You can find the code for this post on GitHub here.

Do you remember our parser delegate from part 6? If you haven’t read it yet, I’d recommend reviewing it now before reading the rest of this post.

public delegate Maybe<Tuple<T, string>> Parser<T>(string input);

Remember the pain we had to go through to combine these parsers? Let’s try a different approach by turning our parser into a Monad. Up to now we’ve just looked at turning classes or interfaces with a single type parameter into Monads, but we can do just the same with a delegate.

First let’s write a ToParser method. It’s pretty easy we just return a parser that returns our item without consuming any of the input string:

public static Parser<T> ToParser<T>(this T value)
{
return s => new Just<Tuple<T, string>>(Tuple.Create(value, s));
}

Next we need a Bind method. This is a little harder to write, but if we ‘follow the types’ it appears relatively painlessly. Let’s think about the signature of bind, remember this is the same for any Monad:

public static Parser<B> Bind<A, B>(this Parser<A> a, Func<A, Parser<B>> func){}

Now we’ll digest the signature step by step. First let’s look at the return type: Parser<B>.  Remember Parser<B> is a function from a string to a Maybe<Tuple<B, string>>, so we need to return a lambda for that function:

public static Parser<B> Bind<A, B>(this Parser<A> a, Func<A, Parser<B>> func)
{
return s =>
{
// ...
};
}

That lambda needs to return a Maybe<Tuple<B, string>>, we can get one of those by invoking func, getting the returned Parser<B> and invoking that:

public static Parser<B> Bind<A, B>(this Parser<A> a, Func<A, Parser<B>> func)
{
return s =>
{
var bParser = func( ??? );
return bParser( ??? );
};
}

We need an A to pass to func and we can get one of those by invoking the a parser, getting it’s return value and grabbing the A from that:

public static Parser<B> Bind<A, B>(this Parser<A> a, Func<A, Parser<B>> func)
{
return s =>
{
var aMaybe = a(s);
var aResult = aMaybe as Just<Tuple<A, string>>;

// short circuit if parse fails
if (aResult == null) return new Nothing<Tuple<B, string>>();

var aValue = aResult.Value.Item1;

var bParser = func(aValue);
return bParser( ??? );
};
}

Notice how we short circuit the parse if the a parser fails. This means that any combination of parsers will return Nothing if any one of them can’t parse its part of the input string.

Lastly we need a string value to feed to our bParser, which we can get from the a parser result:

public static Parser<B> Bind<A, B>(this Parser<A> a, Func<A, Parser<B>> func)
{
return s =>
{
var aMaybe = a(s);
var aResult = aMaybe as Just<Tuple<A, string>>;

// short circuit if parse fails
if (aResult == null) return new Nothing<Tuple<B, string>>();

var aValue = aResult.Value.Item1;
var sString = aResult.Value.Item2;

var bParser = func(aValue);
return bParser(sString);
};
}

That’s it, we have built a Monadic parser. To use it with Linq syntax we need to write a SelectMany, but that’s a mechanical operation because we already know what it looks like, it’s exactly the same as the Ident<T> SelectMany and the Maybe<T> SelectMany that we’ve already written, except with Parser<T> substituted:

public static Parser<C> SelectMany<A, B, C>(this Parser<A> a, Func<A, Parser<B>> func, Func<A, B, C> select)
{
return a.Bind(aval => func(aval).Bind(bval => select(aval, bval).ToParser()));
}

Now we can write our little Hello World parser in three lines of code:

var helloWorldParser =
from hello in "Hello".Find()
from world in "World".Find()
select new {Hello = hello, World = world};

var result = helloWorldParser("HelloWorld");

Console.WriteLine(result.AsString(x => x.Hello));
Console.WriteLine(result.AsString(x => x.World));

// outputs
// Hello
// World

That is a thing of beauty.

Hopefully you can start to see how Monads can let you simply build some very complex systems.

If you want to dig deeper into Monadic parsers, Nicholas Blumhardt has a very nice implementation called Sprache.

Happy Parsing!

7 comments:

Michael L Perry said...

Based on this example, it doesn't seem that a monadic parser could be as powerful as shift/reduce. I can maybe see it handling LL(1) grammars, but can a monadic parser do LALR?

Still, a useful example. I might choose this technique for those cases where Antler or Yacc would be overkill.

Mike Hadlow said...

Hi Michael,

Don't mistake me for someone who actually knows anything about parsers :)

I don't really understand the difference between LL(1) and LALR, but looking at the documentation for Parsec (an industrial strength monadic parser in Haskell) suggests that it can support LALR (infinite look ahead grammars right?).

http://legacy.cs.uu.nl/daan/download/parsec/parsec.html

I'd agree with you though, I think this technique could be very useful where you need a simple DSL, but for anything serious you'd probably be better off with a parser toolkit like ANTLR.

Sgeo said...

Keep in mind that your monad is free to implement more than just the requirements to get LINQ syntax working. Parsec implements a thing that takes two parsers and combines them into one, where if the first fails, the second one is tried. (Well, by default, not really usable for lookahead, but there's another function that makes it usable for such)

Sgeo said...

ddarius in #haskell said to post the following code:


var r = from as in "a".Find().Many()
let n = as.Length
from bs in "b".Find().Count(n)
from cs in "c".Find().Count(n)
select { as, bs, cs }

It's possible to write Many() and Count() for the monad, and have them do backtracking. The internals of the monadic parser might need to be different though. The monad shown in Mike Hadlow's example is not the only way to do monadic parsing, just a simple one.

Justin Chase said...

I've been reading on monads in C# for 3 days now and this post finally put it together for me. I have a pattern matching project (http://metasharp.codeplex.com) which as almost monadic but not quite and those areas where it isn't monadic it makes things a little rough, so I've been trying to see how to redo it in a monadic fashion. After trying this out in a million different ways I think I just realized that a Pattern Matching monad is actually a combination of two different monads: Maybe and IO. Instead of just using string I have to use Stream<S> and Bind to that as well. Figuring out how to use linq and select to do productions though really was the piece of the puzzle I was missing. I still have a lot to go but I think you helped me break through a barrier here! Thanks!

Unknown said...

I'm late to the party, but great blog post, learned a lot, and I had no idea how powerful LINQ syntax was. Thanks for writing!

I don't know why SelectMany takes the two funcs, seems like an interesting design decision (opposed to having two separate extension methods you'd define), but I'm glad you explained the how. As an exercise I gave the Reader Monad for Depenency Injection a shot in C# and it worked out quite well:

https://gist.github.com/4640678

Your parser Monad looks a lot like the State Monad interestingly enough ...

kozue said...

Wow! That is the coolest technique I've ever seen in C#. TBH I feel slightly ashamed that I didn't know you could extend the LINQ query syntax: I always use the method chains because I think it looks a bit out of place and thought it was totally superfluous - but this has totally changed my opinion :)