Saturday, August 25, 2007

Erlang based Javascript compiler

There was recently a long thread on the Erlang mailing list about compiling Javascript to Erlang. Curious how such an Erlang solution would compare to Rhino, which is a Javascript compiler for Java, I started to investigate. First I tried my luck the traditional way, with a lexer and parser in Erlang. Thanks to Denis Loutrein, who had already written an LR grammar file for yecc / leex supporting a subset of Javascript and was kind enough to share it with me, I did not have to start from scratch. Soon I realized that I did not like the complexity of that approach.

Fortunately there is another way, better suited to my needs, pioneered by Douglas Crockford: write the parser in Javascript, using the simple but extremely efficient "Top Down Operator Precedence" method, and then pass the parse result as a JSON object to Erlang. There the Javascript abstract syntax tree (AST) is translated to an Erlang-specific AST, which can then be compiled to bytecode and loaded into the VM (or stored in a .beam file). I got very excited when I had a prototype working that ran this simple JS function on the Erlang VM:
var foo = function (a) {
    var b = a + 1;
    return b;
};
The first step was to adapt Crockford's parser to dojo, my preferred Javascript framework. Feeding the parser the sample Javascript snippet from above results in an object that can be JSON-serialized to a pure AST representation:
{{"value",<<"=">>},
 {"arity",<<"binary">>},
 {"first",{{"value",<<"foo">>},{"arity",<<"name">>}}},
 {"second",
  {{"value",<<"function">>},
   {"arity",<<"function">>},
   {"first",[{{"value",<<"a">>},{"arity",<<"name">>}}]},
   {"second",
    [{{"value",<<"=">>},
      {"arity",<<"binary">>},
      {"first",
       {{"value",<<"b">>},{"arity",<<"name">>}}},
      {"second",
       {{"value",<<"+">>},
        {"arity",<<"binary">>},
        {"first",
         {{"value",<<"a">>},{"arity",<<"name">>}}},
        {"second",
         {{"value",1},{"arity",<<"literal">>}}}}}},
     {{"value",<<"return">>},
      {"arity",<<"statement">>},
      {"first",
       {{"value",<<"b">>},{"arity",<<"name">>}}}}]}}}}
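To give an idea of how a tree of this shape comes about, here is a minimal sketch of the Top Down Operator Precedence idea in plain Javascript. It is neither Crockford's parser nor the dojo adaptation: it handles only names, integer literals, parentheses and the operators + - * /, and every name in it (tokenize, parseExpression, BINDING_POWER) is made up for illustration. Only the node layout (value, arity, first, second) is taken from the tree above.

```javascript
// Minimal Top Down Operator Precedence sketch. All names are hypothetical.
// Higher binding power means the operator binds more tightly.
const BINDING_POWER = { "+": 50, "-": 50, "*": 60, "/": 60 };

// Split the source into integer literals, names and single-character operators.
function tokenize(source) {
    const tokens = [];
    const pattern = /\s*(?:(\d+)|([A-Za-z_$][\w$]*)|(\S))/g;
    let match;
    while ((match = pattern.exec(source)) !== null) {
        if (match[1] !== undefined) {
            tokens.push({ type: "literal", value: Number(match[1]) });
        } else if (match[2] !== undefined) {
            tokens.push({ type: "name", value: match[2] });
        } else {
            tokens.push({ type: "operator", value: match[3] });
        }
    }
    tokens.push({ type: "end", value: "(end)" });
    return tokens;
}

function parseExpression(source) {
    const tokens = tokenize(source);
    let position = 0;
    const peek = () => tokens[position];
    const advance = () => tokens[position++];

    // Left binding power: 0 for anything that is not a known binary operator.
    const lbp = (token) =>
        token.type === "operator" ? (BINDING_POWER[token.value] || 0) : 0;

    // "nud": what a token means at the start of an expression.
    function nud(token) {
        if (token.type === "name" || token.type === "literal") {
            return { value: token.value, arity: token.type };
        }
        if (token.value === "(") {
            const inner = expression(0);
            if (advance().value !== ")") {
                throw new SyntaxError("Expected ')'");
            }
            return inner;
        }
        throw new SyntaxError("Unexpected token: " + token.value);
    }

    // "led": what an operator means with an expression to its left.
    function led(token, left) {
        return {
            value: token.value,
            arity: "binary",
            first: left,
            second: expression(lbp(token)),
        };
    }

    // Core loop: keep absorbing operators that bind tighter than the caller.
    function expression(rightBindingPower) {
        let left = nud(advance());
        while (rightBindingPower < lbp(peek())) {
            left = led(advance(), left);
        }
        return left;
    }

    const tree = expression(0);
    if (peek().type !== "end") {
        throw new SyntaxError("Unexpected token: " + peek().value);
    }
    return tree;
}

// Serialize only the pure AST fields; the array acts as a property whitelist.
const json = JSON.stringify(parseExpression("a + 1"),
                            ["value", "arity", "first", "second"]);
console.log(json);
```

Here parseExpression("a + 1") gives the same subtree as the right-hand side of `var b = a + 1` in the dump above. The whitelist passed to JSON.stringify is one way to drop everything except the AST fields when the node objects carry additional properties.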
Now comes the tricky part, on the Erlang side. The JSON tuple shown above needs to be translated into a tuple that represents an Erlang AST. For the example above, my translator produces the following:
{function,1,
 foo,
 1,
 [{clause,1,
   [{var,1,'A'}],
   [],
   [{match,1,
     {var,1,'B'},
     {op,1,'+',{var,1,'A'},{integer,1,1}}},
    {var,1,'B'}]}]}
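The node-by-node mapping behind this output can be sketched roughly as follows. My actual translator runs in Erlang on the JSON tuple; the sketch below is not that code. It uses Javascript on the plain object form of the AST purely to illustrate the rules (name becomes var, assignment becomes match, return becomes the last expression) and writes the Erlang term out as text. The helper names and the small operator whitelist are assumptions, and the function name is assumed to be a valid unquoted Erlang atom.

```javascript
// Sketch of the AST mapping with made-up helper names. The result is the
// Erlang abstract form as text, using line number 1 throughout.
const LINE = 1;

// Binary operators spelled the same in both languages (illustrative subset).
const SHARED_OPERATORS = ["+", "-", "*"];

// Erlang variables must start with an uppercase letter: a -> 'A'.
function toErlangVariable(name) {
    const erlangName = name.charAt(0).toUpperCase() + name.slice(1);
    return `{var,${LINE},'${erlangName}'}`;
}

function translateNode(node) {
    if (node.arity === "name") {
        return toErlangVariable(node.value);
    }
    if (node.arity === "literal" && Number.isInteger(node.value)) {
        return `{integer,${LINE},${node.value}}`;
    }
    if (node.arity === "binary" && node.value === "=") {
        // Assignment becomes a pattern match.
        return `{match,${LINE},${translateNode(node.first)},` +
            `${translateNode(node.second)}}`;
    }
    if (node.arity === "binary" && SHARED_OPERATORS.includes(node.value)) {
        return `{op,${LINE},'${node.value}',${translateNode(node.first)},` +
            `${translateNode(node.second)}}`;
    }
    if (node.arity === "statement" && node.value === "return") {
        // Only correct for a trailing return: an Erlang function body
        // evaluates to its last expression.
        return translateNode(node.first);
    }
    throw new Error("Unsupported node: " + node.value);
}

// Expects the shape `name = function (params) { body }`.
function translateFunction(node) {
    const fn = node.second;
    if (node.value !== "=" || !fn || fn.arity !== "function") {
        throw new Error("Expected a function assignment");
    }
    const params = fn.first.map((p) => translateNode(p)).join(",");
    const body = fn.second.map((s) => translateNode(s)).join(",");
    return `{function,${LINE},${node.first.value},${fn.first.length},` +
        `[{clause,${LINE},[${params}],[],[${body}]}]}`;
}

// The sample AST from above, as a plain Javascript object.
const sampleAst = {
    value: "=", arity: "binary",
    first: { value: "foo", arity: "name" },
    second: {
        value: "function", arity: "function",
        first: [{ value: "a", arity: "name" }],
        second: [
            { value: "=", arity: "binary",
              first: { value: "b", arity: "name" },
              second: { value: "+", arity: "binary",
                        first: { value: "a", arity: "name" },
                        second: { value: 1, arity: "literal" } } },
            { value: "return", arity: "statement",
              first: { value: "b", arity: "name" } }
        ]
    }
};

console.log(translateFunction(sampleAst));
```

Apart from whitespace, the printed term matches the function tuple above. Anything outside the handful of node types listed here is rejected with an error, which mirrors the fact that only a subset of the language is covered.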
Next, the function tuple needs to be wrapped with some metadata attributes, such as the module name and function exports:
[{attribute,1,module,footest},
 {attribute,1,compile,export_all},
 {function,1,
  foo,
  1,
  [{clause,1,
    [{var,1,'A'}],
    [],
    [{match,1,
      {var,1,'B'},
      {op,1,'+',{var,1,'A'},{integer,1,1}}},
     {var,1,'B'}]}]},
 {eof,1}]
Now we have the complete module representation. With compile:forms/1 it can be turned into bytecode, which can be loaded directly into the VM (code:load_binary/3) or written to a .beam file. It is even possible to get the Erlang source code representation with erl_prettypr:format/1:

"-module(footest)."
"-compile(export_all)."
"foo(A) -> B = A + 1, B."

What's next?

The parser and translator currently support only a subset of the Javascript language, so there is a lot left to do. Another issue is the functional nature of Erlang: as long as I write functional Javascript, everything is fine, but somehow I also need to handle the non-functional aspects of Javascript. I also plan to set up a project on Google Code for this.

1 comment:

mac01021 said...

Hi. Nice work. Out of curiosity, did you ever put this up as a project on Google Code?

Thanks,

mac01021