Syntax Macros

When designing a language, one has to choose between a compact base language and a rich one. Both choices have their pros and cons: a smaller core language is easier for a compiler writer to implement, whereas a richer language offers more expressiveness to its users.

Syntax macros provide the best of both worlds. The language accepted by a compiler can be extended with macros that provide a rich set of notations for the language user. The macros map new concrete syntax onto the original syntax of the core language. Hence, given an expressive core language, syntax macros provide the necessary syntactic sugar while leaving the core language unaltered.

Attribute Redefinitions

Syntax macros are essentially a one-way mechanism: they translate the concrete syntax of a domain-specific language into the abstract syntax of a core language. However, some scenarios require a kind of inverse transformation, from the core language back to the domain-specific language. Examples of such functionality are error reporting and pretty printing.

More generally, besides giving the translation, a macro could redefine parts of the semantic analysis performed by the compiler of the core language. This version of syntax macros takes as its starting point a description of the semantic analyser in the form of an attribute grammar. Besides defining new concrete syntax, macro rules can also redefine attributes to customize the behaviour of the compiler for the new syntax. This effectively makes a compiler scriptable.

The implementation of syntax macros with attribute redefinitions takes the following approach. The AttributeGrammarSystem generates the code for the semantic functions, plus meta information about the attributes and semantic functions. A macro consists of a mapping from a piece of new concrete syntax into the syntax of the core language and, optionally, attribute redefinitions. The new syntactic construct inherits the semantics of the core language. The attribute redefinitions override this default behaviour and adapt the attribute computations for the new construct. The meta information generated by the AttributeGrammarSystem provides the link between the redefinitions and the generated semantic functions.
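
The override behaviour described above can be sketched in plain Haskell. This is only an illustrative model and not the code the AttributeGrammarSystem generates: the table representation, the "code" attribute, and the "|>" macro are all assumptions made for the example. A table of semantic functions stands in for the generated code, and a macro's redefinitions replace entries with the same attribute name while all other attributes keep their default.

```haskell
-- Hypothetical model of attribute redefinition; not the code generated
-- by the AttributeGrammarSystem.

type Attrs    = [(String, String)]   -- attribute name -> computed value
type SemFun   = [Attrs] -> String    -- children's attributes -> one value
type SemTable = [(String, SemFun)]   -- attribute name -> semantic function

-- Look up an attribute of a child; "?" marks a missing attribute.
attr :: String -> Attrs -> String
attr name = maybe "?" id . lookup name

-- Lift a function over exactly two children to a semantic function.
binary :: (Attrs -> Attrs -> String) -> SemFun
binary g [f, a] = g f a
binary _ _      = error "expected exactly two children"

-- Default semantics of the core production "Apply f a".
-- The "code" attribute is an invented second attribute for illustration.
applyDefaults :: SemTable
applyDefaults =
  [ ("pp",   binary (\f a -> attr "pp" f ++ " " ++ attr "pp" a))
  , ("code", binary (\f a -> attr "code" a ++ attr "code" f ++ "call;"))
  ]

-- Redefinitions win; every attribute they do not mention keeps its default.
withRedefinitions :: SemTable -> SemTable -> SemTable
withRedefinitions redefs defaults =
  redefs ++ [ d | d@(name, _) <- defaults, name `notElem` map fst redefs ]

-- Hypothetical macro:  a "|>" f  =>  Apply f a,  with "pp" redefined.
pipeSemantics :: SemTable
pipeSemantics = withRedefinitions
  [ ("pp", binary (\f a -> attr "pp" a ++ " |> " ++ attr "pp" f)) ]
  applyDefaults

-- Compute all attributes of a node from its children's attributes.
evalNode :: SemTable -> [Attrs] -> Attrs
evalNode table children = [ (name, sem children) | (name, sem) <- table ]
```

With children whose "pp" values are "f" and "x", evalNode applyDefaults yields the "pp" value "f x", while evalNode pipeSemantics yields "x |> f". The "code" attribute is identical in both cases, because the macro does not redefine it.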

Examples

Our core language is a simple functional programming language with the following abstract syntax. The abstract syntax is defined using the AttributeGrammarSystem's DATA statements. They can be read as normal Haskell data definitions describing syntax trees for this language.

DATA Expr 
          | Literal Lit
          | Let String e:Expr body:Expr
          | Case Expr CaseAlts
          | Variable String
          | Lambda Pattern Expr
          | Apply f:Expr a:Expr
          
DATA Lit 
          | LitBool   Bool
          | LitInt    Int
          | LitString String

DATA Pattern 
          | Constructor String Patterns
          | PatVar  String
          | Underscore

DATA Patterns 
         | PCons Pattern Patterns
         | PNil

DATA CaseAlt | Alt Pattern Expr

DATA CaseAlts
          | CCons CaseAlt CaseAlts
          | CNil
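
Read as Haskell data definitions, the DATA statements above correspond roughly to the following types. This is a hand-written sketch rather than the output of the AttributeGrammarSystem: the child names (e, body, f, a) are kept only as comments, and the deriving clauses and the example value are additions for illustration.

```haskell
data Expr
  = Literal Lit
  | Let String Expr Expr      -- name, bound expression (e), body
  | Case Expr CaseAlts
  | Variable String
  | Lambda Pattern Expr
  | Apply Expr Expr           -- function (f), argument (a)
  deriving (Show, Eq)

data Lit
  = LitBool Bool
  | LitInt Int
  | LitString String
  deriving (Show, Eq)

data Pattern
  = Constructor String Patterns
  | PatVar String
  | Underscore
  deriving (Show, Eq)

data Patterns
  = PCons Pattern Patterns
  | PNil
  deriving (Show, Eq)

data CaseAlt = Alt Pattern Expr
  deriving (Show, Eq)

data CaseAlts
  = CCons CaseAlt CaseAlts
  | CNil
  deriving (Show, Eq)

-- Convert the explicit cons-list of alternatives to a Haskell list.
caseAlts :: CaseAlts -> [CaseAlt]
caseAlts (CCons alt rest) = alt : caseAlts rest
caseAlts CNil             = []

-- Syntax tree for the source text:  let x = 1 in x
example :: Expr
example = Let "x" (Literal (LitInt 1)) (Variable "x")
```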

After compiling the attribute grammar for this language, we can define its concrete syntax using syntax macros. A syntax macro defines a mapping from concrete syntax into the abstract syntax. The left part of a macro rule is a BNF rule with additional identifiers that can be used in the abstract syntax fragment on the right. The following macro defines syntax for the "let" construct; the concrete syntax is simply mapped onto a Let node from the abstract syntax. No attribute redefinitions are needed because the Let construct is defined in the core language.

        Expression ::= "let" x=Varid "=" e=Expression
                             "in"  b=Expression          => Let x e b ;

Suppose we want to add parenthesized expressions to our language, and we want to pretty print such expressions including the parentheses as the user typed them. The syntax macro for parenthesized expressions is the following.

        Factor     ::= "(" x=Expression ")"              => x ;

In the macro above, the semantics of a parenthesized expression is simply the semantics of the enclosed expression. Hence the pretty printer will not print the enclosing parentheses when they are not strictly needed. To make sure the parentheses are printed as the user typed them, we add an attribute redefinition:

        Factor     ::= "(" x=Expression ")"              => x ;
                        { lhs.pp   = Text "(" >#< syn.pp >#< Text ")" ;
                          inh.prec = TopPrec;
                        }

In the redefinition above, the "pp" attribute is redefined so that parentheses are always printed; additionally, the precedence level is reset to the top-level precedence.
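
The effect of this redefinition can be imitated in a small Haskell sketch. Plain Haskell data types have no attribute redefinitions, so the sketch uses an explicit Parens node as a stand-in for the redefined macro. That node, the FunPrec and ArgPrec levels, and the use of string concatenation instead of the pretty-printing combinators are assumptions made for the example; only TopPrec appears in the macro above.

```haskell
-- A fragment of the core language, plus a hypothetical Parens marker that
-- plays the role of the macro with its "pp" and "prec" redefinitions.
data Expr
  = Variable String
  | Apply Expr Expr
  | Parens Expr
  deriving (Show, Eq)

-- Precedence contexts, ordered from loosest to tightest.
data Prec = TopPrec | FunPrec | ArgPrec
  deriving (Show, Eq, Ord)

-- Add parentheses only when the flag says they are required.
wrapIf :: Bool -> String -> String
wrapIf True  s = "(" ++ s ++ ")"
wrapIf False s = s

-- The precedence plays the role of the inherited "prec" attribute,
-- the result that of the synthesized "pp" attribute.
pp :: Prec -> Expr -> String
pp _    (Variable x) = x
pp prec (Apply f a)  =
  -- An application needs parentheses only in argument position.
  wrapIf (prec > FunPrec) (pp FunPrec f ++ " " ++ pp ArgPrec a)
pp _    (Parens e)   =
  -- Redefined behaviour: always print the parentheses and reset the
  -- precedence, so the inner expression does not add a second pair.
  "(" ++ pp TopPrec e ++ ")"

-- The user typed "(f x) y". Without the marker the parentheses are lost.
withoutMarker, withMarker :: Expr
withoutMarker = Apply (Apply (Variable "f") (Variable "x")) (Variable "y")
withMarker    = Apply (Parens (Apply (Variable "f") (Variable "x"))) (Variable "y")
```

Here pp TopPrec withoutMarker gives "f x y", whereas pp TopPrec withMarker gives "(f x) y". Parentheses that are required, as in the argument of Apply f (Apply g x), are still inserted by the default rule and print as "f (g x)".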

Our core language does not contain an if-then-else construct. The if-then-else construct can be translated to a case-expression:

        if <cond> then <expr1> else <expr2>

translates to:

        case <cond> of { True -> <expr1> ; False -> <expr2> }

This translation can be written as the following syntax macro:

        Expression ::= "if" c=Expression
                            "then" t=Expression
                            "else" e=Expression => Case c
                                                              (CCons (Alt (Constructor "True"  PNil) t)
                                                              (CCons (Alt (Constructor "False" PNil) e)
                                                               CNil
                                                              ));
To prevent the if-then-else construct from being printed as a case expression, we supply the following attribute redefinitions. The redefinitions print the two case arms as a "then" branch and an "else" branch, respectively. The keyword "if" is printed before the pretty-printed condition and the redefined case arms.
            { lhs.pp = Text "if" >#< expr.pp
                        >-< caseAlts.pp ; 
               {}         
               {  
                {  lhs.pp = Text "  then" >#< expr.pp ;
                } 
                {  
                 { lhs.pp = Text "  else" >#< expr.pp ;
                 }
                } 
               }       
            }
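
The translation and the two ways of printing its result can be sketched in Haskell as follows. The function names (expandIf, ppCore, ppSugared) and the exact output layout are assumptions for illustration. Note one deliberate simplification: plain Haskell cannot attach a redefinition to the macro, so ppSugared recognizes the shape of the tree instead. With syntax macros the redefinition belongs to the macro itself, so a case over True and False written directly by the user would still be printed as a case.

```haskell
import Data.List (intercalate)

-- The part of the core language needed for this example.
data Expr     = Variable String | Case Expr CaseAlts   deriving (Show, Eq)
data Pattern  = Constructor String Patterns
              | PatVar String
              | Underscore                             deriving (Show, Eq)
data Patterns = PCons Pattern Patterns | PNil          deriving (Show, Eq)
data CaseAlt  = Alt Pattern Expr                       deriving (Show, Eq)
data CaseAlts = CCons CaseAlt CaseAlts | CNil          deriving (Show, Eq)

-- The right-hand side of the if-then-else macro as a function.
expandIf :: Expr -> Expr -> Expr -> Expr
expandIf c t e =
  Case c (CCons (Alt (Constructor "True"  PNil) t)
         (CCons (Alt (Constructor "False" PNil) e)
          CNil))

alts :: CaseAlts -> [CaseAlt]
alts (CCons a rest) = a : alts rest
alts CNil           = []

patterns :: Patterns -> [Pattern]
patterns (PCons p rest) = p : patterns rest
patterns PNil           = []

-- Simplified: nested constructor patterns are not parenthesized.
ppPattern :: Pattern -> String
ppPattern (Constructor c ps) = unwords (c : map ppPattern (patterns ps))
ppPattern (PatVar x)         = x
ppPattern Underscore         = "_"

-- Default semantics: the tree is printed in core-language syntax.
ppCore :: Expr -> String
ppCore (Variable x) = x
ppCore (Case c as)  =
  "case " ++ ppCore c ++ " of { " ++ intercalate " ; " (map ppAlt (alts as)) ++ " }"
  where ppAlt (Alt p e) = ppPattern p ++ " -> " ++ ppCore e

-- Stand-in for the redefinitions: trees of the shape built by expandIf
-- are printed as if-then-else, everything else uses the default printer.
ppSugared :: Expr -> String
ppSugared (Case c (CCons (Alt (Constructor "True"  PNil) t)
                  (CCons (Alt (Constructor "False" PNil) e) CNil))) =
  "if " ++ ppSugared c ++ "\n  then " ++ ppSugared t ++ "\n  else " ++ ppSugared e
ppSugared other = ppCore other
```

For expandIf (Variable "b") (Variable "x") (Variable "y"), ppCore produces the single line "case b of { True -> x ; False -> y }", while ppSugared produces three lines: "if b", "  then x" and "  else y".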

If you want to experiment with syntax macros, you can download sm_example.zip. This example contains many macros for the language defined above, including syntax macros for list comprehensions. The example requires the uust package, which can be found in the CVS repository.

You can also get the uust package using anonymous checkout:

cvs -d:pserver:anonymous@cvs.cs.uu.nl:/data/cvs-rep login
cvs -d:pserver:anonymous@cvs.cs.uu.nl:/data/cvs-rep checkout uust

Download the AttributeGrammarSystem if you want to build your own compiler with syntax macro support. Note that this feature of the AttributeGrammarSystem is experimental.

Thesis

A detailed description of syntax macros and their implementation can be found in the master's thesis of Joost Rommes.

Topic attachments
Attachment | Size | Date | Who | Comment
sm_example.zip | 25.8 K | 03 Oct 2003 - 16:23 | ArthurBaars | Simple compiler using syntax macros
thesis_rommes.pdf | 389.3 K | 03 Oct 2003 - 15:31 | ArthurBaars | Thesis of Joost Rommes