John Van Schie

Students
Name: John van Schie
Email: jcschie@cs.uu.nl
Homepage:
Mentor:

Planning

2005-2006

Period 1

Period 2

Period 3

Period 4

2006-2007

Period 1

Thesis Project

Topic/Area

Code generation for a pure lazy functional language.

Project

Project title:
Advisor: Atze Dijkstra, Jeroen Fokker, Doaitse Swierstra
Start date:Mon Nov 26 2007
End date:
URL: Grin2LLVM

Description

short description (abstract) of the thesis project; should be accompanied with a 10 page paper outlining the state of the art (literature), objectives, research approach

Possible Optimisations

  • Inlining
    • It gives a huge (~10%) advantage in GHC. Although in GHC it is more the other optimisations that use the opportunity then that the saving of a call/return instruction. [1]
    • GRIN performs late inlining although it is not implemented yet. This is needed before CSE can perform any real work. [2]
    • LLVM also performs inlining and this is implemented. But does it inline enough to compensate for the missing inlining phase of GRIN?

  • Specializing
    • GRIN tries to reduce the amount of indirect jumps and calls as much as possible. But branch predictors are still predicting and thus conditional jumps are still sources of pipeline stalls. Many conditional jumps arise from the compilation of Higher order functions (HOF). For example, a much used HOF like map, can have many possible functions as first paramete. This leads to a big case expression with many alternatives. If we are able to create specialized version of map with a much used first parameter, then we are able to save on the conditional jumps.
    • Difficulties with this approach is that it is only worthwhile to specialize functions that are used often with the parameters. If we create a specialized function for all, or too much, possible arguments of map, then the code will blow up and the performance will decrease.
    • But if we can find the hot functions and specialize them then a significant speedup should be possible [3]
    • Not much is documented about specialisation of lazy functional languages, so investigation could be worthwile.

  • Cache optimisation
    • As GRIN addresses one big performance killer in the indirect jumps, it seems natural to address the cache misses that might occur. Tools like Cachegrind [4] can give insight in the cache performance of the generated code. With this tool, possible problems in the code can be pinpointed. These problems could be traced back and solved by for example reordering of data or prefetching. Seems like a nice subject, mostly because it is something I do not have much experience with. This is also a drawback, because it creates a risk that it will just become a large trial and error experiment which is not very nice.

  • Garbage collection / Allocation
    • Our initial back-end will have a simple mark-and-sweep garbage collector. But more efficient garbage collectors are know; generational garbage collectors, incremental collectors [5][6], non-paging collectors [7]. We could take a look which garbage collector performs best for the nofib suite and or try to combine it with explicit free statements [8] in the code. The same holds for the memory allocator. Haskell programs tend to allocate more memory then programs written in other languages. Can we exploit this fact by writing a custom memory allocator for fast allocation?

  • Unboxing
    • Not really a back-end optimisation but can gain quite a lot when a lot of primitive data is used. One problem is that automatic unboxing requires a strictness analysis, which is not present.

Optimisation Possible impact Possible problems
Inlining +/- +
Specializing ++ -
Cache optimisation + +/-
GC / Alloc +/- ++
Unboxing ++ --

Concluding, I think I should implement Specialisation and describe the other 4 optimisations. This because Specialisation can gain us much, seems a nice challenge to get right and is not thoroughly investigated in the lazy functional language.

[1] Secrets of the Glasgow Haskell Compiler inliner, Simon Peyton Jones and Simon Marlow
[2] Code Optimisation Techniques for Lazy Function Languages, Urban Boquist
[3] Haskell With Go Faster Stripes, Neil Mitchell
[4] Valgrind: A Framework for Heavyweight Dynamic Binary Instrumentation, Nicholas Nethercote and Julian Seward
[5] Non-Stop Haskell, A.M. Cheadle, A.J. Field, S. Marlow, S.L. Peyton Jones and R.L. White
[6] Exploring the Barrier to Entry - Incremental Generational Garbage Collection for Haskell, A.M. Cheadle, A.J. Field, S. Marlow, S.L. Peyton Jones and R.L. White
[7] Garbage Collection Without Paging, Matthew hertz, Yi Feng and Emergy D. Berger
[8] Free-Me: A Static Analysis for Automatic Individual Object Reclamation, Samuel Z. Guyer, Kathryn S. McKinley? and Daniel Frampton

Thesis points

Thesis interest:

  • LLVM back-end
    • GC
    • Wat kost tijd?
    • Aansluiting bij FP-code
      • Dubbel werk?
      • Taakverdeling GRIN/LLVM

Nice points, but not for this thesis:

  • Incrementele analyse
    • Grin open/closed world
      • JIT
    • Incrementele type-analyse
  • Quantitatieve analyse
    • Profiling
    • Profiling-info gebruiken bij hercompilatie

Topic attachments
I Attachment Action Size Date Who Comment
gifgif check_mark.gif manage 0.5 K 09 Feb 2006 - 22:15 JohnVanSchie Check mark image