# Parsing Sheet

The module `$instantiation/src/ProxParser.hs` declares the parsers for all nonterminals (also called document nodes) that have a parsing presentation. The parsers are written using the uulib parsing-combinator library. The parsing algorithm in Proxima and the interaction between the scanner and the parser are explained in the paper: Beyond ASCII - Parsing programs with graphical presentations (pdf). Slides for a research presentation of this paper are available at Publications.

The parsing algorithm has undergone major changes that are not yet reflected in the Dazzle and Helium editors. In the parsers for these editors, structurally presented parts of the presentation are still parsed explicitly. The declaration form uses the new automatic structure recognition algorithm.

The top-level parser `recognizeEnrichedDoc` needs to be defined in `ProxParser`, as do all the parsers that are mentioned in the presentation sheet. If a node has a structural presentation, its parser is simply `pStructural`, but for a parsing presentation, a parser needs to be specified. These parsers are similar to ordinary combinator parsers, except that they operate on the Proxima `Token` datatype and must store the `IDP` of every token that is parsed.

As an example, consider a `UserToken` datatype that contains a token for keywords:

```
data UserToken = KeyTk String | ...
```

The `pToken` combinator constructs a parser for `KeyTk` tokens.

```
pKey str = pToken (KeyTk str)
```

An example parser that parses an if expression may look as follows:

```
parseIfExp = (\tk1 b tk2 t tk3 e -> IfExp (getTokenIDP tk1) (getTokenIDP tk2) (getTokenIDP tk3) b t e)
  <$> pKey "if" <*> parseExp <*> pKey "then" <*> parseExp <*> pKey "else" <*> parseExp
```

The difference from an ordinary combinator parser is that the returned value also receives the IDPs of the parsed tokens. The fields for the IDPs are specified in the declaration of the `IfExp` constructor in `$instantiation/src/DocumentType.prx`:

```
data Exp = IfExp exp1:Exp exp2:Exp exp3:Exp { idP0:IDP idP1:IDP idP2:IDP }
         | ...
```
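To make the role of the IDPs concrete, the following is a minimal, self-contained sketch of the same idea that does not depend on Proxima or uulib. Everything in it is a simplified stand-in and an assumption made for illustration: the `Parser` newtype, `orElse`, `pVar`, the `IdentTk` and `Var` constructors, and the `tokenIDP` field (playing the role of `getTokenIDP`) are hypothetical and are not the actual Proxima definitions.

```haskell
-- Hypothetical, simplified stand-ins for Proxima's types (not the real API).
newtype IDP = IDP Int deriving (Eq, Show)

data UserToken = KeyTk String | IdentTk String
  deriving (Eq, Show)

-- A token pairs its payload with the identity of its presentation.
data Token = Token { tokenIDP :: IDP, tokenValue :: UserToken }
  deriving (Eq, Show)

-- The three IDP fields come first, mirroring the lambda in parseIfExp.
data Exp
  = IfExp IDP IDP IDP Exp Exp Exp
  | Var IDP String
  deriving (Eq, Show)

-- A tiny parser over a token list, standing in for a uulib parser.
newtype Parser a = Parser { runParser :: [Token] -> Maybe (a, [Token]) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \ts -> case p ts of
    Nothing        -> Nothing
    Just (a, rest) -> Just (f a, rest)

instance Applicative Parser where
  pure a = Parser $ \ts -> Just (a, ts)
  Parser pf <*> Parser pa = Parser $ \ts -> case pf ts of
    Nothing        -> Nothing
    Just (f, rest) -> case pa rest of
      Nothing         -> Nothing
      Just (a, rest') -> Just (f a, rest')

-- Left-biased choice: try the first parser, fall back to the second.
orElse :: Parser a -> Parser a -> Parser a
orElse (Parser p) (Parser q) = Parser $ \ts -> case p ts of
  Nothing -> q ts
  result  -> result

-- Accept one token with the expected payload and return the whole token,
-- so that its IDP remains available to the caller.
pToken :: UserToken -> Parser Token
pToken expected = Parser $ \ts -> case ts of
  (t : rest) | tokenValue t == expected -> Just (t, rest)
  _                                     -> Nothing

pKey :: String -> Parser Token
pKey str = pToken (KeyTk str)

-- Hypothetical leaf parser: an identifier becomes a variable expression.
pVar :: Parser Exp
pVar = Parser $ \ts -> case ts of
  (Token i (IdentTk name) : rest) -> Just (Var i name, rest)
  _                               -> Nothing

parseExp :: Parser Exp
parseExp = parseIfExp `orElse` pVar

-- Same shape as the example parser: each keyword token contributes its IDP.
parseIfExp :: Parser Exp
parseIfExp =
  (\tk1 b tk2 t tk3 e -> IfExp (tokenIDP tk1) (tokenIDP tk2) (tokenIDP tk3) b t e)
    <$> pKey "if" <*> parseExp <*> pKey "then" <*> parseExp <*> pKey "else" <*> parseExp
```

Loading this in GHCi and applying `runParser parseExp` to a token list for `if c then a else b` yields an `IfExp` whose first three fields are the IDPs of the `if`, `then` and `else` tokens. The keyword strings themselves are discarded, but the identities of their presentations are retained in the result.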

A special mechanism is available to parse presentations that do not contain all the information necessary to construct the enriched document. For example, in a source editor, this occurs when parsing the right-hand side of a collapsed function definition. Such missing information (also called interpretation extra state) can be recovered by employing `reuse<TYPE>` functions, which are generated for each nonterminal in the enriched document type. The missing information is taken from the previous value of the enriched document. The reuse mechanism is explained in Sections 4.2 and 7.2.3 of the Proxima PhD thesis.
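The idea behind reuse can be illustrated with a small standalone sketch. It is not the generated Proxima code: the `Decl` type, its fields, and the signature of `reuseDecl` are hypothetical, and the real `reuse<TYPE>` functions may look quite different. The sketch only assumes that each field is either supplied by the parser (`Just`) or taken from the previous value of the node (`Nothing`).

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for Proxima's presentation identity.
newtype IDP = IDP Int deriving (Eq, Show)

-- Hypothetical document node: a function definition whose body may be
-- hidden (collapsed) in the presentation.
data Decl = Decl
  { declIDP      :: IDP
  , declName     :: String
  , declBody     :: String  -- stand-in for a real expression type
  , declExpanded :: Bool
  } deriving (Eq, Show)

-- Hypothetical analogue of a generated reuse function. A field given as
-- Nothing was not recoverable from the presentation, so it is copied
-- from the previous value of the node.
reuseDecl :: Decl -> Maybe IDP -> Maybe String -> Maybe String -> Maybe Bool -> Decl
reuseDecl old mIdp mName mBody mExpanded = Decl
  { declIDP      = fromMaybe (declIDP old) mIdp
  , declName     = fromMaybe (declName old) mName
  , declBody     = fromMaybe (declBody old) mBody
  , declExpanded = fromMaybe (declExpanded old) mExpanded
  }
```

With a collapsed definition, a parser in this sketch would see only the name, so it could call `reuseDecl old Nothing (Just newName) Nothing Nothing`. The body and the collapsed state then survive the edit even though neither was present in the parsed presentation.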

-- MartijnSchrage - 05 Mar 2010