You are here: UUCS>Top Web>WebHome (23 Aug 2012, JurriaanHage)EditAttach

Project Description

The TOP project deals with the development of tool and library support for building compilers which deal with improving user feedback during program analysis. We have plenty of experience with type inferencing higher-order polymorphic functional language, which we have also built into the Helium compiler which we use for educational purposes. The compiler supports the Haskell 98 language almost fully. The accepted proposal for the NWO Open 2004 Spring edition.

Motivation from type inferencing

When confronted with the user feedback given by compilers for higher-order functional languages, one often has to stop and think what is meant. An expert programmer is used to getting this kind of feedback from his compiler, but the messages do form a large hurdle for novice programmers. This problem is even larger in programming languages with high level of abstractions, because error messages tend to be cast in these abstractions, and not necessarily with the same syntactic sugar as they are offered to the programmer. We advocate staying as close as possible to the input source code (notwithstanding the extra effort this may bring for the compiler builder).

We do not stop there: every human being has his own style of programming, a different way of arriving at his solution. Now, if his solution is type-erroneous, we would like to be able to adapt the type inferencing process to the programming style. At this point this takes the form of passing certain parameters to the compiler.

Going even further, a programmer can adapt the type inference for certain sets of expressions, and provide tailor-made error messages. This is especially useful for embedded domain specific language, to override the messages which usually only reflect the underlying encoding in the language, and do not highlight the concepts from the domain.

Constraint based type inferencing in Helium

We have used our library for the type inferencer which is part of the Helium compiler. It has the following properties:
  • We generate precise position information and preserve type synonyms in error messages.
  • The programmer can choose the type inference strategy of his liking (M and W and other greedy variants, and the unbiased type graph based implementations have been implemented).
  • The type graph implementation uses quite a number of heuristics to decide what is the most likely source of the error.
  • A logging facility is available in Helium which has given us a large amount of correct and erroneous Haskell programs which can be used to improve our type inferencer. In the future these programs can also be used for benchmarking optimizations and many other purposes. The programs have been anonimized, but the relation between programs by the same programmer has been kept intact. Various questions can then be answered: Do our hints help? Are they used? It is easy to come up with many interesting questions. Currently we have about 300 MB of sources from a single functional programming course.
  • A major innovation is the ability for a programmer to develop his domain specific type rules for a combinator library he might be writing. In addition, he may specify that his experiences are that certain functions are often mixed up. As a result, a compiler may give the hint that (++) should be used instead of (:), because (++) happens to fit in the context.

The domain specific type inference rules are automatically checked for soundness, and a programmer does not have to be familiar with the process of type inferencing as it currently takes place within the compiler.

An article on this facility can be found in the ICFP '03 proceedings (reference below).

An extension of our work which copes with type classes and introduces special type class directives is described in a paper at PADL 2005. Some aspects have been included in the Helium compiler.

Going beyond type inferencing

At its most general, Top is a framework for constructing abstract interpretations which focuses on giving good feedback on why an abstract interpretation fails. In a mathematical notation this is usually made explicit by returning the top element of a (complete) lattice. Our work can be understood as how to refine this single top element into a number of elements, each of which has information about why abstract interpretation failed. Additionally, we want to offer heuristics for ending up in the 'right' top element. These heuristics can be generally applicable (like a majority heuristic which blames that part of the program for which we have the least evidence that it is correct), or highly problem domain specific (since students tend to confuse : and ++ in Haskell, we have a heuristic that detects this mistake).
Topic revision: r15 - 23 Aug 2012, JurriaanHage

This site is powered by FoswikiCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding UUCS? Send feedback