Autofill home page. authored by Ladislav Lhotka's avatar Ladislav Lhotka
Comparse – Monadic Parsing Library Comparse – Monadic Parsing Library
================================== ==================================
We've been all taught that creating a parser boils down to writing a context free grammar and then using some library like [Bison](http://www.gnu.org/software/bison/ "Bison"). By the way, such a library exists for JavaScript, too, it is called [Jison](http://zaach.github.io/jison/ "Jison"), and CoffeeScript uses it for parsing the source code. We've been all taught that creating a parser boils down to writing a
context free grammar and then using some library like
[Bison](http://www.gnu.org/software/bison/ "Bison"). By the way, such
a library exists for JavaScript, too, it is called
[Jison](http://zaach.github.io/jison/ "Jison"), and CoffeeScript uses
it for parsing the source code.
On the other hand, I like the approach of functional parsers such as [Parsec](http://legacy.cs.uu.nl/daan/parsec.html) or [attoparsec](https://github.com/bos/attoparsec "attoparsec"), both written in Haskell. They allow for building complex parsing functions by combining simpler ones in various ways. A parsing function, once written, can be easily reused in other contexts. On the other hand, I like the approach of functional parsers such as
[Parsec](http://legacy.cs.uu.nl/daan/parsec.html) or
[attoparsec](https://github.com/bos/attoparsec "attoparsec"), both
written in Haskell. They allow for building complex parsing functions
by combining simpler ones in various ways. A parsing function, once
written, can be easily reused in other contexts.
Even though JavaScript and CoffeeScript are not pure functional languages, they do implement the most important functional paradigm: functions are first-class data that can, for instance, be passed as arguments to higher-order functions. Douglas Crockford also [demonstrated](https://www.youtube.com/watch?v=b0EF0VTs9Dc "Monads and Gonads") how the powerful concept of [monads](http://en.wikipedia.org/wiki/Monad_(functional_programming) "Monad") can be implemented in JavaScript. Even though JavaScript and CoffeeScript are not pure functional
languages, they do implement the most important functional paradigm:
functions are first-class data that can, for instance, be passed as
arguments to higher-order functions. Douglas Crockford also
[demonstrated](https://www.youtube.com/watch?v=b0EF0VTs9Dc "Monads and
Gonads") how the powerful concept of
[monads](http://en.wikipedia.org/wiki/Monad_(functional_programming)
"Monad") can be implemented in JavaScript.
So I was wondering whether a decent monadic parser could be written in CoffeeScript. The result is Comparse, a monadic parsing library, which is described in the rest of this text. Frankly, I didn't expect to get something nearly as elegant as the Haskell parsers mentioned above, primarily because Haskell has special syntactic sugar for monad programming (the “do” notation). Much to my surprise, monadic functions and their combinations can be written quite nicely in the CoffeeScript syntax (but maybe I am just weird and biased). So I was wondering whether a decent monadic parser could be written in
CoffeeScript. The result is Comparse, a monadic parsing library, which
is described in the rest of this text. Frankly, I didn't expect to get
something nearly as elegant as the Haskell parsers mentioned above,
primarily because Haskell has special syntactic sugar for monad
programming (the “do” notation). Much to my surprise, monadic
functions and their combinations can be written quite nicely in the
CoffeeScript syntax (but maybe I am just weird and biased).
The implementation takes into account the lack of tail recursion optimization in JavaScript, so recursive functions are mostly rewritten using loops. Also, since JavaScript is _not_ a pure functional language, side effects and non-local variables are used in a few places for improving efficiency. The implementation takes into account the lack of tail recursion
optimization in JavaScript, so recursive functions are mostly
rewritten using loops. Also, since JavaScript is _not_ a pure
functional language, side effects and non-local variables are used in
a few places for improving efficiency.
* [Annotated source code](comparse) * [Annotated source code](comparse)