|
|
Comparse – Monadic Parsing Library
|
|
|
==================================
|
|
|
|
|
|
We've been all taught that creating a parser boils down to writing a context free grammar and then using some library like [Bison](http://www.gnu.org/software/bison/ "Bison"). By the way, such a library exists for JavaScript, too, it is called [Jison](http://zaach.github.io/jison/ "Jison"), and CoffeeScript uses it for parsing the source code.
|
|
|
We've been all taught that creating a parser boils down to writing a
|
|
|
context free grammar and then using some library like
|
|
|
[Bison](http://www.gnu.org/software/bison/ "Bison"). By the way, such
|
|
|
a library exists for JavaScript, too, it is called
|
|
|
[Jison](http://zaach.github.io/jison/ "Jison"), and CoffeeScript uses
|
|
|
it for parsing the source code.
|
|
|
|
|
|
On the other hand, I like the approach of functional parsers such as [Parsec](http://legacy.cs.uu.nl/daan/parsec.html) or [attoparsec](https://github.com/bos/attoparsec "attoparsec"), both written in Haskell. They allow for building complex parsing functions by combining simpler ones in various ways. A parsing function, once written, can be easily reused in other contexts.
|
|
|
On the other hand, I like the approach of functional parsers such as
|
|
|
[Parsec](http://legacy.cs.uu.nl/daan/parsec.html) or
|
|
|
[attoparsec](https://github.com/bos/attoparsec "attoparsec"), both
|
|
|
written in Haskell. They allow for building complex parsing functions
|
|
|
by combining simpler ones in various ways. A parsing function, once
|
|
|
written, can be easily reused in other contexts.
|
|
|
|
|
|
Even though JavaScript and CoffeeScript are not pure functional languages, they do implement the most important functional paradigm: functions are first-class data that can, for instance, be passed as arguments to higher-order functions. Douglas Crockford also [demonstrated](https://www.youtube.com/watch?v=b0EF0VTs9Dc "Monads and Gonads") how the powerful concept of [monads](http://en.wikipedia.org/wiki/Monad_(functional_programming) "Monad") can be implemented in JavaScript.
|
|
|
Even though JavaScript and CoffeeScript are not pure functional
|
|
|
languages, they do implement the most important functional paradigm:
|
|
|
functions are first-class data that can, for instance, be passed as
|
|
|
arguments to higher-order functions. Douglas Crockford also
|
|
|
[demonstrated](https://www.youtube.com/watch?v=b0EF0VTs9Dc "Monads and
|
|
|
Gonads") how the powerful concept of
|
|
|
[monads](http://en.wikipedia.org/wiki/Monad_(functional_programming)
|
|
|
"Monad") can be implemented in JavaScript.
|
|
|
|
|
|
So I was wondering whether a decent monadic parser could be written in CoffeeScript. The result is Comparse, a monadic parsing library, which is described in the rest of this text. Frankly, I didn't expect to get something nearly as elegant as the Haskell parsers mentioned above, primarily because Haskell has special syntactic sugar for monad programming (the “do” notation). Much to my surprise, monadic functions and their combinations can be written quite nicely in the CoffeeScript syntax (but maybe I am just weird and biased).
|
|
|
So I was wondering whether a decent monadic parser could be written in
|
|
|
CoffeeScript. The result is Comparse, a monadic parsing library, which
|
|
|
is described in the rest of this text. Frankly, I didn't expect to get
|
|
|
something nearly as elegant as the Haskell parsers mentioned above,
|
|
|
primarily because Haskell has special syntactic sugar for monad
|
|
|
programming (the “do” notation). Much to my surprise, monadic
|
|
|
functions and their combinations can be written quite nicely in the
|
|
|
CoffeeScript syntax (but maybe I am just weird and biased).
|
|
|
|
|
|
The implementation takes into account the lack of tail recursion optimization in JavaScript, so recursive functions are mostly rewritten using loops. Also, since JavaScript is _not_ a pure functional language, side effects and non-local variables are used in a few places for improving efficiency.
|
|
|
The implementation takes into account the lack of tail recursion
|
|
|
optimization in JavaScript, so recursive functions are mostly
|
|
|
rewritten using loops. Also, since JavaScript is _not_ a pure
|
|
|
functional language, side effects and non-local variables are used in
|
|
|
a few places for improving efficiency.
|
|
|
|
|
|
* [Annotated source code](comparse) |