EHCTo Clean Core
The goal of this assignment is to implement code generation for the EH compiler
to Clean's ABC machine
| week || date || what |
| 13-19 sep || Read part of Functional Programming and Parallel Graph Rewriting about the ABC machine |
| 20-26 sep || Thu 23 sep || |Present basic stuff about the ABC-machine
| || Fri 24 sep || Meeting at KUN 13:00, toernooiveld 1, room A-4008 |
| || Zat 25 sep || put online information gathered from the meeting Fri 24 sep. |
| || Sun 26 sep || look through .c files to better understand code generation and |
| || || ultimately extracting (simple/naive) compilation schemes |
| 27 sep - 3 okt || Tue 28 sep || Present our findings so far, report our activity (problems, etc.) |
The generation of ABC from a certain Clean program proceeds as follows;
- Clean program gets parsed etc into an abstract syntax tree (ast), representing the program in the Clean Core language. (this language is undocumented)
- Multiple Calls are made to a C-library that is responsible for the generation of the ABC-code.
So the frontend of the compiler is written in the Clean language itself, but the code generation is done by a clump of c-code.
Our first approach:
- Try to mimic what this code generation library is doing and formalize this in the form of compilation schemes for the EH language.
This proved a very difficult task since the only documentation about how this translation from Clean to ABC-code proceeds, is in the form of tenthousands lines of c-code. Even for the smallest program thinkable in EH (namely the program consisting of one integer) produced a significant file with abc-instructions. There was too little documentation available (even about these instructions since a lot of them aren't mentioned in the book (Plasmeijer, M.J. and van Eekelen, M.C.J.D. Functional Programming and Parallel Graph Rewriting
)) to be able to derive compilation schemes. Another thing that made this approach of deriving schemes even harder was the fact that the generated ABC-code by the clean compiler was highly optimized. There was no easy way of turning off this optimization and force the compiler to generate dumb code (only strictness analysis could be turned off partially
). We depended on the following workmethod;
- make an EH example program
- make an equivalent Clean program
- let the clean compiler produce ABC-code for this ABC example
- look at the ABC file to try to understand the translation process.
Looking at the c-files which do this translation turned out to be no
option since this generation was spread out over 4/5 files containing tenthousands lines of code (there was 'no' logical seperation in multiple files so a lot of code calling to different files, etc, etc.)
Our second approach:
Instead of mimicing the translation process hidden in the c-library, we decided to look at an alternative route. Clean actually uses foreign functions to call this c-library and generate the code. I should note that there is an intermediate language above the ABC-code (which are basically stack manipulating instructions), called the Clean Core. This probably resembles a core lambda calculus with sugar like lets etc. We are not sure, because this language (Clean Core) is totally undocumented. Of course there has to be some Clean source files that contain the data definition of this language, but that is the only 'documentation' that exist about this language. The Clean program has to be shipped to the library at some point (in this Clean Core format). So we had to look at a couple of points;
Interfacing with a c-library
- Interfacing with a c-library from Haskell.
- Find out how to instrument the code generating library.
- Find out the form of the abstract syntax tree (AST) this library expects (the Clean Core language)
- Make a translation from EH AST's to Clean Core AST's
When using foreign entities in Haskell there are a couple of scenario's;
- call a foreign function
- access a foreign datastructure
- make a haskell function available in the foreign context
- make haskell data available in the foreign context
When calling a foreign function and providing a function as argument for example, the foreign context could make a callback. This requires more work since the calling of functions are done in two directions.
Let's first look at the simple scenario where we just call a c-function (which is our foreign context) from Haskell assuming no callback takes place. This is what you have to do to make it work;
module Main where
foreign import ccall "math.h sin" foreign_sin :: CDouble -> CDouble
main = do (putStr . show . foreign_sin) 0.0
$ ghc -fffi -c Main.hs
$ ghc -fffi Main.o -lm
This shows how to call a c-function from Haskell. A couple of things are remarkable. There has to be a mapping from Haskell types to C-types. Thats why in the Foreign module there are numerous types like CInt, CDouble, which are Haskell counterparts of Int and Double in C.
Interfacing with the code generating library
Since we now know how to call c-functions from Haskell. We have to look at how to instrument this specific c-library that generates code. There exists a .h file which associates Clean function types with C function prototypes. Then there is an actual Clean file that contains the code for all these Clean functions that basically call the C equivalents. The only difference and difficulty is that the Clean equivalent functions have more arguments. This makes it somewhat more difficult to understand whats going on.
We found out that backendinterface.icl
contains a lot of calls to the foreign functions. The main things where the compilation is done is compiler.icl
which uses the function backEndInterface
. This function generates the code in a couple of steps.
We exported the EHC repository
and imported it into a new repository for this project. This way we can change source files and fully enjoy the benefits of a version management system, without having to commit to the original EHC repository
. If necessary we can always checkout the original EHC repository
and merge it with our code if there are really critical updates in the original repository.
Our repository URL
- Plasmeijer, M.J. and van Eekelen, M.C.J.D. Functional Programming and Parallel Graph Rewriting, Addison-Wesley Publishing Company, 1993.
- The Haskell 98 Foreign Function Interface 1.0; An Addendum to the Haskell 98 Report
- Daan Leijen. PhD Thesis, The λ Abroad – A Functional Approach to Software Components, November 2003.
-- Niels van der Velden
- 14 Sep 2004