Monday, October 03, 2005

Nicer parser notation

Continuing our parser improvements, we can follow Dominus's lead and use operator overloading to cut down the syntactic clutter:

Operators added to Parser

 public static Parser operator -(Parser a, Parser b)
 {
   return new Concatenate(a, b);
 }

 public static Parser operator |(Parser a, Parser b)
 {
   return new Alternate(a, b);
 }

 public static Parser operator >(Parser p, ValuesTransform map)
 {
   return new T(p, map);
 }

 // C# forces us to overload < and > in pairs; this one exists only to
 // satisfy the compiler and is not used in the grammar below
 public static Parser operator <(Parser p, ValuesTransform map)
 {
   return new T(p, map);
 }
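To see how C# groups these operators, here is a minimal sketch. The classes are hypothetical stand-ins for the real ones: they only record how they were combined, and a string label takes the place of the ValuesTransform delegate so the tree can be printed. Leaf, PrecedenceDemo and Describe are names invented for this illustration.

```csharp
using System;

// Hypothetical stand-in for Parser: records structure only, parses nothing.
abstract class Parser
{
    public static Parser operator -(Parser a, Parser b)
    {
        return new Concatenate(a, b);
    }

    public static Parser operator |(Parser a, Parser b)
    {
        return new Alternate(a, b);
    }

    // A string label replaces the ValuesTransform delegate (assumption).
    public static Parser operator >(Parser p, string map)
    {
        return new T(p, map);
    }

    // Required by C# because > is overloaded.
    public static Parser operator <(Parser p, string map)
    {
        return new T(p, map);
    }
}

class Leaf : Parser
{
    private readonly string name;
    public Leaf(string name) { this.name = name; }
    public override string ToString() { return name; }
}

class Concatenate : Parser
{
    private readonly Parser a, b;
    public Concatenate(Parser a, Parser b) { this.a = a; this.b = b; }
    public override string ToString() { return "Cat(" + a + ", " + b + ")"; }
}

class Alternate : Parser
{
    private readonly Parser a, b;
    public Alternate(Parser a, Parser b) { this.a = a; this.b = b; }
    public override string ToString() { return "Alt(" + a + ", " + b + ")"; }
}

class T : Parser
{
    private readonly Parser p;
    private readonly string map;
    public T(Parser p, string map) { this.p = p; this.map = map; }
    public override string ToString() { return "T(" + p + ", " + map + ")"; }
}

static class PrecedenceDemo
{
    // Builds one rule without parentheses and returns the resulting tree.
    public static string Describe()
    {
        Parser term = new Leaf("term");
        Parser plus = new Leaf("plus");
        Parser expr = new Leaf("expr");

        // - binds tighter than >, which binds tighter than |.
        Parser rule = term - plus - expr > "OpFirst" | term;
        return rule.ToString();
    }
}
```

PrecedenceDemo.Describe() returns "Alt(T(Cat(Cat(term, plus), expr), OpFirst), term)": the concatenations group first (and to the left), the transform wraps the whole sequence, and alternation is applied last. That ordering is what lets the grammar rules be written without parentheses.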

Add a few more static helpers (such as _, which builds a parser that matches a single token), and we have the following much nicer-looking generator.

 static Parser MakeParser()
 {
   ParserStub exprstub = new ParserStub(new StubToReal(GetExpression));
   ParserStub termstub = new ParserStub(new StubToReal(GetTerm));
   ParserStub factstub = new ParserStub(new StubToReal(GetFactor));

   expression =
     termstub - _(Tokens.Operator, "+") - exprstub > TOpFirst |
     termstub;

   term =
     factstub - _(Tokens.Operator, "*") - termstub > TOpFirst |
     factstub;

   Parser open  = _(Tokens.Operator, "(");
   Parser close = _(Tokens.Operator, ")");
   Parser comma = _(Tokens.Operator, ",");

   Parser arglist =
     open - exprstub - new Star(comma - exprstub) - close;

   factor =
     _(Tokens.Identifier) - (arglist | new Nothing()) > TTagVarOrFunction |
     open - exprstub - close > TStripParens |
     _(Tokens.Integer);

   return exprstub - new EndOfInput() > TOnlyExpression;
 }

In Higher-Order Perl, Dominus overloads right-shift (>>) to attach a transformation, rather than greater-than as above, but C# requires the second operand of an overloaded >> to be an int. Note also that C# requires < and > to be overloaded together: defining either one without the other is a compile error.

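Using binary operators changes the structure of the AST. Because - is left-associative, a - b - c becomes Concatenate(Concatenate(a, b), c), so the values reach the transform nested as [[a, b], c] instead of as a flat list. This has the effect of beating OpFirst with an ugly stick: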

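 private static ArrayList OpFirst(ArrayList values)
 {
   // values is [[left, op], right]; reorder it to [op, left, right]
   ArrayList result = new ArrayList(3);
   result.Add(((ArrayList)values[0])[1]);  // operator (ugh)
   result.Add(((ArrayList)values[0])[0]);  // left operand
   result.Add(values[1]);                  // right operand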
   return result;
 }
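As a quick check of the reordering, the sketch below feeds OpFirst a hand-built value list shaped like the one the nested concatenation is assumed to produce. The strings stand in for real AST nodes, and OpFirstDemo and Run are hypothetical names used only here.

```csharp
using System;
using System.Collections;

static class OpFirstDemo
{
    // Same body as OpFirst above, made public for the demonstration.
    public static ArrayList OpFirst(ArrayList values)
    {
        ArrayList result = new ArrayList(3);
        result.Add(((ArrayList)values[0])[1]);  // operator
        result.Add(((ArrayList)values[0])[0]);  // left operand
        result.Add(values[1]);                  // right operand
        return result;
    }

    // Builds [["2", "+"], "3"] and returns the reordered items as text.
    public static string Run()
    {
        ArrayList inner = new ArrayList();
        inner.Add("2");   // value of the left operand
        inner.Add("+");   // value of the operator token

        ArrayList values = new ArrayList();
        values.Add(inner);
        values.Add("3");  // value of the right operand

        ArrayList r = OpFirst(values);
        return r[0] + " " + r[1] + " " + r[2];
    }
}
```

OpFirstDemo.Run() returns "+ 2 3": the operator is pulled out of the inner list and placed in front of its two operands.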
