Non-CanonicalLALR(k)Parsing

Stc
Date: 2008-09-29

Time: *14.00*

Room: *BBL 430*

Speaker: Mark Snyder

Title: Non-Canonical LALR(k) Parsing (thesis defense)

Abstract

There are two basic styles of parsing - bottom-up parsing and top-down parsing. LALR(k) parsing is a widely adopted bottom-up parsing technique which is fast, space efficient and deterministic. Its one weakness is its lack of expressivity. In order to generate an LALR(k) parser for a typical programming language, one has to compromise the desired grammar, due to limitations in its lookahead mechanism.

Non-canonical LALR(k) parsing is an overlooked variant of bottom-up parsing which allows fragments of the parse tree to be built out of order. The ability of NLALR(k) parsers to build parse tree fragments "out of order" produces parsers which are strictly more powerful than - as well as more expressive than - their LALR(k) counterparts.

We present a practical analysis for constructing efficient NLALR(k) parsers, where previous efforts to develop practical analyses have been limited to a lookahead of k=1. The focus of this analysis is on maximizing the number of shift and reduce actions that are performed while resolving parse conflicts, as this increases the power of the parser and reduces the performance penalties that lookahead techniques typically incur.