0 Cubs

Eifl

Future Work

  • Because of pointer tagging, small datatypes (isSmallFamily) begin with constructor tag 1, the rest with tag 0. We may be able to make small constructors start with 0 too without runtime cost. In that case, case analysis of small data types would presumably get faster (because a 0-test is said to be faster than other tests).
  • Currently, the tag-bits of pointers to big datatypes are only used for evaluatedness. We could use them for the few most common constructors instead. Which constructors are most common is of course again a job for HPC.

Interesting flags

-ddump-to-file

POI's

Looks like we could actually implement everything as a source-to-source transformation... (although we'd also have to implement source-to-source case-desugaring)

Prelude GHC.Base GHC.Exts> I# (getTag GT)
2


Interesting point in code: Building Case Analysis, file ./compiler/codeGen/CgUtils.hs

mk_switch <-- emitSwitch <-- (emitAlgReturnTarget (from CgInfoTbls) | cgInlinePrimop (from CgCase))

There are also a couple of TODOs in that code.

Another interesting point, in compiler/simplStg/SimplStg.lhs:

stg2stg :: DynFlags       -- includes spec of what stg-to-stg passes to do
           -> Module         -- module name (profiling only)
           -> [StgBinding]  -- input...
           -> IO ([(StgBinding, [(Id, [Id])])]  -- output program...
                   ,CollectedCCs)                   -- cost centre information (declared and used)

People agreeing with us

Workflows

Alternative 1

  • Compile a program, using counter information (profile)
  • Source-to-source translation: add counter information in 'magic comments'
  • Compile the program again, process the 'magic comments' (recompile)
    • Build a tree using the counter information (build tree)
    • Pass the tree to the code generation part
    • Use the tree to generate better code

Alternative 2

  • Compile a program, using counter information (profile)
  • Analyze the counter information, build a tree, save to a special file (build tree)
  • Compile again, load the special file (recompile)
    • Use the tree to generate better code

Comments

I don't really get the (difference between the) Alternative 1 and Alternative 2: Perhaps it helps to say where which program is actually being run & profiled?

-- RemiTurk - 13 Jan 2008

Difference lies in when you build the tree, inside GHC or outside GHC. An advantage of building the tree outside, is that you don't have to "mess about" in GHC (being HPC group) and maybe we can even unsafePeformIO the tree into any place in GHC (0CUBS group). However, this means you have to save a tree to a file (serialization) and you cannot easily edit certain measurements (ticks). The first approach simply requires counting information to be "magic comments", these can be changed by the developer, but you have to process these comments in GHC to build that tree (every time you recompile).

I guess the first is better, but it's good to have alternatives. Maybe later we see that passing the count-tree down to the level we need to be on is too much work, then maybe the second approach would be easier.

-- MarkStobbe - 14 Jan 2008

Flags

After a couple weeks of hacking, we ended up with lots of flags.

-fcase-ubs
Read caseUBSmap.txt to pass tick information to mk_switch (code generation).
-fscase-ubs
Static caseUBS flag, used to turn on hpc ticks, but don't generate the hpc infotable and hpc initialization code (previously also used to remove tick information in the code generation step)
-ftick-note
Convert ticks to Note (TickNr Int) in Core and put the information in the StgAlt for Stg, no ticks in output.
-ftwo-case-zero
Use alternative code when case-of is used with 2 tags, one of which is 0 (fixing TODO found in GHC code)
-fcase-hack
Do case Unbalanced Binary Search probabilistic optimization in mk_switch, using the CaseST read from caseUBSmap.txt.
-fno-stg-ticks
Static flag, removes ticks from STG, in STG simplifier stage.
-fdump-switches
Static flag, dumps information about switches in the code generation phase.
...?
More?

Useful things you can do with these flags

  • Using -fscase-ubs and -ftick-note together will give you exactly the same binary when compiling, but you'll have tick numbers for the alternatives available up to Stg.
  • ...

We would like to bring this down to just a couple of flags, each for a specific use-case, proposal (please edit at will):

  • User compiles her code using -fhpc to generate .?ix files that can be used for optimization (currently possible)
  • Then, using -fcase-ubs filename the code will be recompiled using the filename with CaseST information. This flag is a combination of -fscase-ubs, -ftick-note and -fcase-hacks (-fno-stg-ticks is not necessary, since tick-note alread takes care of removing the ticks).
  • We don't need a special case to compile the base-case to measure against, since we managed to do that by removing the ticks as early as possible from Core (before optimiziations). (Hooray!)

-- EelcoLempsink - 22 Jan 2008

Ticks are now transformed into Note (TickNr Int) by pattern matching on the specific Hpc Core construct (case+default thing). Next, the notes are propagated through Core. Then, in CoreToStg the ticknrs are placed in every alternative of the StgCase. These are later during the conversion between Stg to Cg aggregated into the code generation for the case. Then they are propagated down to the emitSwitch, where they are printed (for now).

-- MarkStobbe - 22 Jan 2008

TODO

  • Combine -fscase-ubs, -ftick-note and -fcase-hacks into -fcase-ubs.
  • Let -fcase-ubs take a file argument (possible?).
  • Adapt the code such that ticks will be kept if -fhpc is given.
  • ...

Related pages

Hpc Profiling

Intermediate Language