Wouter Swierstra
Uhc
--
WouterSwierstra - 09 Sep 2003
Use case
To illustrate the problem, look at the following datatype for binary trees:
DATA Tree
| Node left, right : Tree
| Leaf Int
Now suppose we want to compute the minimum of all the values stored in the leaves. One way of doing this is introducing a single chained attribute representing the sum of all the leaves encountered during the traversal.
ATTR Tree [ | min : Int | ]
SEM Tree
| Leaf lhs.min = @int `min` @lhs.min
The UUAG system generates Haskell code from this attribute definition. While this works fine, there are several optimisations possible. For instance, the resulting Haskell code will decorate each node with an intermediate result, even if we're not really interested in it. Could a clever attribute grammar system reduce the amount of spurious intermediate results?
While the answer to this question is clearly beyond the scope of the project at hand, we explore several straightforward approaches. One such solution would be to pass along a pointer rather than the actual value. Each node would destructively update the value stored in the pointer and pass it along. This could be implemented by adding a more monadic flavour to the UUAG system. Rather than merely defining catamorphisms, the resulting Haskell should describe the different attributes which should be computed, grouped in a (recursive)
do statement. Pointers would correspond to a Haskell
STRef.
Attribute grammars with a more monadic flavour
First of all, we introduce new notation for attributes that will be represented as pointers. The following statement declares a chained attribute
min with initial value 1000 that should be translated to a pointer:
ATTR Tree [ | min : *Int {1000} | ]
Inherited and synthesized attributes are declared analogously.
Such attributes can be passed around in exactly the same fashion as regular attributes. For instance, the UUAG system should generate the following copy rules:
SEM Tree
| Node ltree.min = @lhs.min
rtree.min = @lhs.min
lhs.min = @rtree.min
Of course, it should also be possible to modify the value stored in a
STRef.This can be done using the
modifySTRef function as follows:
| Leaf lhs.min = modifySTRef @lhs.min (min @int)
It should be clear that introducing STRefs to the UUAG system means we will have to leave the pure (i.e. non-monadic) setting to some extent. More importantly however, the current implementation of the UUAG system leans heavily on Haskell's lazy evaluation. Synthesized attributes are computed in a large
let statement and subsequently tupled and returned. Lazy evaluation takes care of computing the order in which to evaluate the individual attributes. Once we enter the monadic world, we can no longer easily describe mutually recursive computations. Fortunately, there have been recent proposals to extend Haskell with recursive monadic computations in the form of the
mdo statement.
For now it suffices to view
mdo as a regular
do statement that wraps an additional fixpoint computation around any recursive monadic bindings. Keeping that in mind, we propose the following translation from attribute grammars to Haskell.
ATTR Tree [ gmin : Int | min : *int | resTree : Tree]
SEM T
| Leaf lhs.min = updateUA @lhs.min (min @int)
lhs.resTree = Tree_Leaf @lhs.gmin
Should translate to:
type T_T = T_T (forall s . Int -> STRef s Int -> ST s (STRef s Int, Tree))
sem_T (Leaf int) = \gmin minRef ->
mdo
modifySTRef minRef (min int)
let lhsOResTree = Tree Leaf gmin
return (lhsOMin,lhsOResTree)
There are a few problems that still need to be dealt with however. First of all, if the programmer tries to refer to the the value stored in the
min pointer, he will get the following type error:
Couldn’t match ‘Int’ against ‘STRef s Int’
Expected type: Int
Inferred type: STRef s Int
The question now arises if this is really a problem. If you are only interested in the final value, one could simply perform a
runST at root level and obtain any values you are interested in. However, consider the situation where you want to number all the leaves in a tree uniquely. One way to do that is by passing a single pointer to an initially zero integer down through the tree. Every leaf reads the value of the pointer, increments it, and passes back up a new leaf storing the freshly read value. To do this, you really need access to the values stored in the pointer. This shouldn't be too hard to implement - it is more a design decision if you really want to allow this.
Conclusions and Further Work
It should be fairly straightforward to implement these changes in the current UUAG system. However, there are quite some issues I haven't dealt with. In the current proposal, the
ST monad propagates upwards. At root level, for instance, we need to create a new
STRef to pass down. In order to do this, we need to enter the ST monad. It might be nice to minimize the amount of computation done in the monad, and simply restrict it to those values implemented by pointers. Another solution would be to wrap a
runST around semantic functions to hide the fact that computations take place in the
ST monad.
More importantly however, it is not immediately clear how to extend this system to deal with multiple pointers. The
mdo notation solves a lot of problems, but it is all too easy to generate looping computations. Furthermore, the order of computation still matters - suppose one pointer should be initialized to the result stored in another pointer; the initialization should take place after the computation, so it becomes important to order the computations somehow. It might also be interesting to look at how such ordering could be used to write a more efficient attribute grammar. Rather than recompute all values when some slight changes are made, it would be nice if computations could be ordered. Evaluation would become incremental. Perhaps clever function caching could really improve the efficiency of resulting code.
Reviews
A brief summary of some of the papers I've taken a look at:
- Clean User Manual - I read through section 4.3 on Uniqueness Types in Clean. It's a good introduction to Uniqueness Types, giving all the information you need to work with them, without getting into technical detail. If you've never heard of Uniqueness Types, this is probably a good place to start. It touches on issues such as term graph rewriting, attributed datatypes and uniqueness polymorphism, that are dealt with more formally in other papers.
- UUAGDocumentation - I refreshed my memory and had a good look at the new UUAG documentation.
- A recursive do for Haskell, Levent Erkok and John Launchbury, In Haskell Workshop'02, pp. 29-37. 2002. - The paper proposes a straightforward extension to deal with recursive monadic computations. It introduces pleasant syntactic sugar in the form of the mdo expressions. It seems the ideas in the paper can be used to define a straightforward translation to more monadic AG target code.
--
WouterSwierstra - 09 Nov 2003