tip TWiki Tip of the Day
SpreadSheetPlugin for dynamic content
Over 60 formulae are available through the SpreadSheetPlugin. For Example: $AVERAGE() $IF ... Read on Read more

Content Juan Cardona

Main

Title

Structuring tree transformations

Introduction

Problem definition

Attribute grammars (Knuth, 1968), AG, were developed to specify the semantic of context-free grammars; they have been successfully used to define compilers for languages. For functional languages like Haskell (Hughes, 1989; Jones, 2002; Hudak et al., 2007) the Utrecht Haskell Compiler (UHC) is completely attribute grammar based (Dijkstra et al., 2009).

To implement a complete compiler for a functional language like Haskell is not an easy task (Jones, 1992; Grune et al., 2000; Dijkstra et al., 2008; Dijkstra et al., 2009): many analyses and transformations have to take place before the source code of a program has been converted to object code for some architecture or operating system. In the UHC, the process of compilation takes place in several stages, each with a different purpose.

Constructing the UHC from attribute grammar based descriptions is done by the Utrecht Attribute Grammar Compiler UUAGC (Swierstra et al., 1999; Swierstra et al., 2008). The AG based descriptions describe the analysis of code and the transformation performed on the code. In these descriptions common idioms are implemented over and over again, and some of them are supported by the system. Those AG idioms faciliate the work with attribute grammars. Some those AG idioms are the implementation of copy rules and generalized copy rules which avoid much of boilerplate code for transfering attributes downward or upward, the SELF attributes that enable to obtain a copy of actual node's subtree during a traversal of the tree, the USE rules that enables to implement the fold or reduce (Hughes, 1989) pattern, and an unique numering pattern that avoids boilerplate code for controlling the generation of unique numbers (any type that can have a prec or succ function).

But those current AG idoms for UUAGC aren't enought, because the actual system requires more idiomatic extensions that are not yet implemented; examples are for instance the calculations of fixed points in the case of circularly defined attributes (Farrow, 1986; Magnusson, 2009), and collection attributes (Magnusson, 2007; Magnusson 2009). Those idoms are common and can be used to facilitate the re-writing of some current UHC's transformations.

UHC is based, as we already said, on lot of transformations, most of which are static program analyses; they are a common way to verify and optimize compiled programming languages, for instance, Haskell. There exist severals ways to implement those static program analyses: data-flow analysis, constraint based analysis, abstract interpretation and type and effect systems (Nielson, 2005; Aho, 2006).

The type and effect analysis, also called type based program analysis, is an approach that takes the type resulting from the inference or checking of type system, and augments it with some computational effect that describes a single aspect of the execution of the program such as the manipulation of dynamically allocated memory or the checking of exceptions in the type system. Such analyses have been used recently to describe new ways to implement already existed analyses like strictness analysis (Holdermans, 2008) and usage analysis (Hage, 2007). The problem with those analyses is that they are defined for toy languages instead for full-fledged languages like Haskell. Although this facilitates the analysis of the correctness of those analyses but, when we are going to translated them into a language like Haskell, things are getting complicated because of the large collection of details to be taken into account. As a result of this there is need for a framework that facilitates to conversion of the program analysis described for toy languages into the corresponding analyses for a full-fledged language like Haskell. This framework will be like a new AG idiom that will be added to UUAGC System.

Due that this project depends a lot on the structure of the UUAGC System and UHC, and those projects are constantly being developing and changing, this plan must be adapted to the new structure and design changes.

Project description

Methodology

  1. To get familiar with UUAGC system by implementing two new idioms:
    1. Calculating circular attributes by fixed-point evaluator.
    2. Collection attributes.
  2. To get familar with type base-analysis in UHC and UHC structure, by implementing on UHC (starting from the work described in the thesis of Holdermans (2010):
    1. Strictness analysis.
    2. Generic usage analysis with subeffect qualifiers.
    3. Sharing analysis.
  3. At the same time, while we're developing the analyses described on the previous item, we start to find the common patterns and define the new idioms inside the UUAGC system.
  4. Once, we have identified the new idioms, to re-implement the previous analyses again with the new idioms, we can compare the previous implementation with the new ones, using different metrics like: number lines, compilation time, execution time, etc. To analyze the result and make the corresponding fixes to idioms.

Goals and objectives

The main propose of this project is to develop and implement a framework that expedites the construction of program analysis depend on type-based analysis using attribute grammars in particular UUAGC. The area of program analysis is more encompassing and covers all kinds of compilers, but in the particular area of compilers for functional languages.

In order to reach the main goal of the project, we have the following sub-goals:

  • To extends the UUAGC with new tools that faciliates the implementation of several program analysis.
    • To extends the UUAGC with a framework that enables to calculate circular attribute grammars.
    • To extends the UUAGC with a framework that enables to calculate collection attributes.
  • To understand and implement several program analysis on UHC
    • To implement on UHC a strictness analysis based on backward analysis.
    • To implement on UHC an generic usage analysis with subeffect qualifiers.
    • To implement on UHC an sharing analysis.
  • From the below analyses implementation, to research, design and develop a framework that faciliates the implementations of those a new analysis.

Timeline

  Description Dates
1 To implement a framework based for calculating Fixed-Point-Finding Evaluator for circular attribute grammars on UUAG 2010, Jan to Mar
2 To implement a framework based for calculating Collection Attributes on UUAG. 2010, Apr to Jun
3 To implement a Strictness Analysis based on backward analysis on UHC 2010, 2010, Jul to Sept
4 To develop and implement a framework for manage program analysis base on type info for UUAG. 2010, Jul to 2011, Jun
5 To implement a Generic Usage Analysis with Subeffect Qualifiers on UHC. 2010, Oct to Dec
6 To implement a Sharing Analysis on UHC. 2011, Jan to Mar
7 To end the thesis and thesis defense 2011, Jul to Dec

References

  • Aho, Alfred V., Lam, Monica S., Sethi, Ravi, Ullman, Jeffrey D. Compiler: Principles, Techniques, and Tools. 2006.
  • Dijkstra, Atze, Fokker, Jeroen and Swierstra, S. Doaitse. The Architecture of the Utrecht Haskell Compiler. 2009.
  • Dijkstra, Atze, Fokker, Jeroen and Swierstra, S. Doaitse. The Structure of the Essential Haskell Compiler, or Coping with Compiler Complexity. 2008.
  • Farrow, Roney: Automatic Generation of Fixed-Point-Finding Evaluators for Circular, but Well-Defined, Attribute Grammars. 1986.
  • Grune, D., Bal, H. Jacobs, C., Langendoen, K. Modern Compiler Design. 2002.
  • Hage, Jurriaan, Holdermans, Stefan, Middelkoop, Arie. A Generic Usage Analysis with Subeffect Qualifiers. 2007.
  • Hankin, Chris. An Introduction to Lambda Calculi for Computer Scientists. 2004.
  • Holdermans, Stefan, Hage, Jurriaan. A Substructural Type System for Backward Strictness Analysis. 2008.
  • Jones, Simon L. Peyton. Implementing lazy functional languages on stock hardware: the Spineless Tagless G-machine - Version 2.5. 1992.
  • Knuth, Donald. Semantics of context-free languages. 1968.
  • Lonkhorst, Tom. Tauphi: A generic effect system for a Haskell Compiler. Master's These Proposal. 2009.
  • Magnusson, Eva, Ekman, Torbjörn, and Helin, Görel. Demand-driven evaluation of collection attributes. 2009.
  • Magnusson, Eva, Ekman, Torbjörn, and Helin, Görel. Extending Attribute Grammars with Collection Attributes. Evaluation and Applications. 2007.
  • Nielson, Flemming, Nielson, Hanne Riis, Hankin, Chris. Principles of Program Analysis. 2005.
  • Pierce, Bejanmin C. Types and Programming Languages. 2002.
  • Tolmach, Andrew, Chevalier, Tim, The GHC Team. An External Representation for GHC Core Language (For GHC 6.10). 2009.
  • Swierstra, S. Doaitse. Utrecht University Attribute Grammar System. 2008.
  • Swierstra, S. Doaitse, Azero, Pablo, Saraiva, Joao. Designing and Implementing Combinator Languages. 1999.

-- JuanCardona - 02 Dec 2009