Web Home

Top

Project Description

The TOP project deals with the development of tool and library support for building compilers which deal with improving user feedback during program analysis. We have plenty of experience with type inferencing higher-order polymorphic functional language, which we have also built into the Helium compiler which we use for educational purposes. The compiler supports the Haskell 98 language almost fully. The accepted proposal for the NWO Open 2004 Spring edition.

Motivation from type inferencing

When confronted with the user feedback given by compilers for higher-order functional languages, one often has to stop and think what is meant. An expert programmer is used to getting this kind of feedback from his compiler, but the messages do form a large hurdle for novice programmers. This problem is even larger in programming languages with high level of abstractions, because error messages tend to be cast in these abstractions, and not necessarily with the same syntactic sugar as they are offered to the programmer. We advocate staying as close as possible to the input source code (notwithstanding the extra effort this may bring for the compiler builder).

We do not stop there: every human being has his own style of programming, a different way of arriving at his solution. Now, if his solution is type-erroneous, we would like to be able to adapt the type inferencing process to the programming style. At this point this takes the form of passing certain parameters to the compiler.

Going even further, a programmer can adapt the type inference for certain sets of expressions, and provide tailor-made error messages. This is especially useful for embedded domain specific language, to override the messages which usually only reflect the underlying encoding in the language, and do not highlight the concepts from the domain.

Constraint based type inferencing in Helium

We have used our library for the type inferencer which is part of the Helium compiler. It has the following properties:

  • We generate precise position information and preserve type synonyms in error messages.
  • The programmer can choose the type inference strategy of his liking (M and W and other greedy variants, and the unbiased type graph based implementations have been implemented).
  • The type graph implementation uses quite a number of heuristics to decide what is the most likely source of the error.
  • A logging facility is available in Helium which has given us a large amount of correct and erroneous Haskell programs which can be used to improve our type inferencer. In the future these programs can also be used for benchmarking optimizations and many other purposes. The programs have been anonimized, but the relation between programs by the same programmer has been kept intact. Various questions can then be answered: Do our hints help? Are they used? It is easy to come up with many interesting questions. Currently we have about 300 MB of sources from a single functional programming course.
  • A major innovation is the ability for a programmer to develop his domain specific type rules for a combinator library he might be writing. In addition, he may specify that his experiences are that certain functions are often mixed up. As a result, a compiler may give the hint that (++) should be used instead of (:), because (++) happens to fit in the context.

The domain specific type inference rules are automatically checked for soundness, and a programmer does not have to be familiar with the process of type inferencing as it currently takes place within the compiler.

An article on this facility can be found in the ICFP '03 proceedings (reference below).

An extension of our work which copes with type classes and introduces special type class directives is described in a paper at PADL 2005. Some aspects have been included in the Helium compiler.

Going beyond type inferencing

At its most general, Top is a framework for constructing abstract interpretations which focuses on giving good feedback on why an abstract interpretation fails. In a mathematical notation this is usually made explicit by returning the top element of a (complete) lattice. Our work can be understood as how to refine this single top element into a number of elements, each of which has information about why abstract interpretation failed. Additionally, we want to offer heuristics for ending up in the 'right' top element. These heuristics can be generally applicable (like a majority heuristic which blames that part of the program for which we have the least evidence that it is correct), or highly problem domain specific (since students tend to confuse : and ++ in Haskell, we have a heuristic that detects this mistake).

We have tried to tackle other analyses besides type inferencing using the Top library. An example is the Master Thesis of Lukas Spee on constraint-based strictness analysis. Extending our work to a more complete strictness analysis and other analyses such as

  • Soft typing of untyped languages
  • Binding time analysis of polymorphic languages
  • Strictness analysis
  • Uniqueness typing
  • Security analysis in a suitable context

is currently a priority.

The properties shared by these analyses are the following

  • The use of constraints and the decoupling of generating, ordering and solving of constraints.
  • Programmer annotations (which may or may not override results of the analysis)
  • In case of inconsistencies, intelligible feedback about possible reasons
  • Support for scripting those parts of the analysis for which there is no obvious choice how to handle things, e.g., when it is likely that any two programmers may differ in their preferences.

Other future topics

  • Actually building a type system using specialized type rules: the compiler builder provides a minimal implementation of Hindley-Milner, all the other language constructs are implemented by a very high level using a domain specific language for type rules similar to the specialized type rules of our ICFP 2003 paper.
  • Building an integrated programing environment for Haskell which utilizes the Proxima editor, the WxHaskell graphical toolkit developed by Daan Leijen, and a compiler supporting the level of feedback we have grown used to by using the Helium compiler. The focus here is on ease of use by novice programmers and high expressivity when it comes to developing interactive applications. Good interactive help should be available.