Assignment Compilation To AStack Machine By Transformation

Pt04

Introduction

In this assignment, you will extend your compiler to use a simple stack-based machine for storing local variables and function arguments. This transformation involves the use of dynamic rules and traversal strategies.

You will implement this as a source to source transformation. The language of the resulting program will still be in the Tiger language and the programs can be executed by the Tiger interpreter.

Machine

First, we introduce the simple stack machine that you will use in this assignment.

The stack in our programs is a huge Tiger array. All local variables and function arguments will be stored on this stack and variables will be used by their index on this stack. So, there will be no more names for local variables and function arguments.

The stack is divided in stack frames, which contains the information required for a function invocation. This stack frame contains the values of the local variables of the function invocation. Every function introduces a new stack frame, thereby increasing the size of the stack with a fixed amount.

Stack Layout

It is important to use a convention for the layout of a stack, since programs written in different languages or compiled by different compilers will then be able to invoke each other. Of course, conventions are useful anyway wink .

We will use the following stack layout:


 |                |
 |   ----------   |
 |   argument n   |
 |      ...       |  incoming
 |   argument 2   |  arguments
 |   argument 1   |
 |   static link  |
 |----------------| 
 |                | <= frame pointer
 |     local      |  
 |   variables    |
 |                |
 |   ----------   |
 |   argument n   |
 |      ...       | outgoing
 |   argument 2   | arguments
 |   argument 1   |
 |   static link  |
 |----------------| 
 |                | <= stack pointer
 |                |
 |                |
 |                |

The stack pointer points to the current top of the stack: beyond the stack pointer all values are garbage. The space below (in the ASCII art below is actually above) the stack pointer is allocated and consists of stack frames.

The frame pointer points to the first index of the stack frame of the current function invocation. The frame pointer is related to the static link, which we will discuss later.

The size of a stack frame is the number of local variables plus the maximal number of outgoing arguments passed to function calls in this function.

The initial value of the frame pointer is 0. The initial value of the stack pointer 0 as well. We will pack the Tiger program in main function to workaround the lack of a function at the top-level. The top-level needs no stack space, since it will immediately invoke the main function.

Registers

The machine has three registers, which are represented as variables in Tiger.

  • sp: stack pointer
  • fp: frame pointer
  • rt: return value of a function

Using a frame pointer is optional: you can choose yourself if you would like to use a frame pointer or not.

Example

It's about time for an example:

let var x
    var y
    var z
 in x := 2
  ; y := 4
  ; z := 0
  ; let function mul(a : int, b : int) : int =
          let var return
              var c
           in c := a * b
            ; return := c
            ; return
          end

     in z := mul(x, y)
    end
end

If the control-flow is in the mul function, just after the assignment to the return variable, then the state of the program will be as follows:


   |-----| 
 0 |  2  | x               stack frame for main
 1 |  4  | y               
 2 |  0  | z
   |  -  |
 3 |  4  | b = y
 4 |  2  | a = x          
 5 |  0  | static link
   |-----|
 6 |  8  | c               stack frame for mul
   |-----|
 7 |     |
 8 |     |
 9 |     |

 sp = 7
 fp = 6
 rt = 8

In this example, the stack frame of main has size 6, since the number of local variables is 3 and the maximal number of outgoing function calls is 2 + 1 for the static link (we'll come back to that later). The size of the mul stack frame is only 1, since it only has a single variable. We will translate the return variable to the rt register, that's why we don't count that one.

The stack pointer is at this point of the execution 7, which would be the start of the next stack frame if mul would have invoked any functions.

The frame pointer is 6, which is the index of the first local variable in the mul stack frame.

Static Links

The implementation of static links is not required for this assignment.

Static links are used to find the location of non-local variables. Non-local variables reside in the stack frame of different function invocation, so it takes more effort to use these.

Let's take a look at an example:

let var x
    var y
    var z
 in x := 2
  ; y := 4
  ; z := 0
  ; let function mul(a : int, b : int) =
          z := a * b
     in mul(x, y)
    end
end

The assignment to z involves such a non-local variable: the location of z is in the stack frame of main. Therefore, the mul function needs a reference to this stack frame. This is called a static link.

The static link does not necessarily link to stack frame of the caller: the static link for a function g should link to the stack frame of the function in which g is lexically enclosed.

For example,

let function f() = ...
      let function g() = ...
          function h() = ...
            let function i() = g()

Now, the function g is invoked from function i, but function i should not pass its own frame pointer as static link to g, since g needs to lookup variables in the stack frame of f. So, function i should pass the frame pointer of f to g.

Primitive Functions

The interpreter does not accept primitive functions that are invoked by passing arguments on the stack. Therefore, you will have to write some simple wrappers that replace the stack arguments by a 'real' function of the primitive function.

let type tstack = array of int
    var stack := tstack[10000] of 0
    var sp := 0
    var rt := 0

    function main() = e

    function printint() =
      printint(stack[sp - 2])

    function print() =
      print(stack[sp - 2])

    function not() =
      rt := not(stack[sp - 2])

 in main()
end

Part 1: Pipeline

First of all, add the strategies for the phases that you are going to implement in this assignment:

  compile =
    for-with-while
    ; return-value
    ; mark-procedure-calls
    ; simple-expressions

    ; if <get-config> "--gotos" => "on" then 
        control-flow-to-goto 
      end

    ; collect-declarations
    ; use-return-register
    ; vars-on-stack
    ; add-stack-machine
    ; flatten-sequences
    ; unmark-procedure-calls

The add-stack-machine phase is required for several phases, so implement this first. The template for the basic Tiger program is given above. You can insert the template in your program by parsing it and then adding it to your program. The e in the main function should be replaced by the Tiger program. You can make the term more pretty be applying pp-aterm to it. Be prepared for possible updates of the template code. Maybe you have to add some primitives yourself to keep your tests working (i.e. don't indent everything by hand wink ).

Part 2: Use Return Register

Next, we have to change the previous implementation of explicit returns a bit. The declaration of the return variable should be removed and the new body is a plain sequence of an assignment and the variable:

(rt := e; rt)

Now we are using this global register, all function calls that occur as expressions (i.e. in an assignment) can be replaced by the rt variable, lifting the call itself to the statement level.

Example:

let function twice(x : int) : int = x + x
 in printint(twice(10))
end

let var lift
    function twice(x : int) : int =
      (rt := x + x;
       rt)
 in twice(10);
    lift := rt;
    printint(lift)
end

Unfortunately, some tests that use primitives with arguments and results (such as not) will not run anymore. To solve this, we also have to remove the function arguments, which we will do later. After that, the primitive wrappers can be invoked and the tests will work again.

Part 3: Maintain the Stack Pointer

In the next four parts we will be working on the vars-on-stack strategy. The vars-on-stack strategies must be a single traversal. You are not allowed to traverse the tree multiple times.

For these transformations, it is much easier if everything in the Tiger program is a function declaration, so we first move the declaration of the main function from the template to this method. Now, we have an function declaration at the top-level and all variables occur in a function declaration.

Next, add statements to a function declaration to increase the stack pointer. For determining the stack frame size, you can reuse the utils listed at the end of this part of the assignment.

Example:

let var x
    var y
    var z
 in x := 2
  ; y := 4
  ; z := 0
  ; let function mul(a : int, b : int) : int = a * b
     in z := mul(x, y)
    end
  ; printint(z)
end

function main() =
  (sp := sp + 6;
   x := 2;
   y := 4;
   z := 0;
   let function mul() : int =
         (sp := sp + 1;
          rt := a * b;
          rt;
          sp := sp - 1)
    in mul(x, y);
       z := rt
   end;
   printint(z);
   sp := sp - 6)

  /**
   * Calculates the frame size
   *
   * @type FunDec -> Int
   */
  framesize =
    ?FunDec(_, _, _, e);

    let collect-calls =
          <collect-all(?Call(_, _), conc, IgnoreFunDec)> e

        max-nr-of-args =
          collect-calls
          ; map(?Call(_, <length>))
          ; (list-max; inc <+ !0)

        nr-of-vars =
          <vars-of-funbody> e
          ; length

     in <add> (<max-nr-of-args> (), <nr-of-vars> ())
    end

  /**
   * Returns a list variables in this function body
   *
   * @type Exp -> List(Var)
   */
  vars-of-funbody = 
    if ?Let(decs, _) then
      <filter(?VarDecNoInit(<id>, _))> decs
      ; map(!Var(<id>))
    else
      ![]
    end

  IgnoreFunDec :
    FunDec(f, x*, ta, e) -> []

Part 4: Remove Function Arguments

In this part, extend the vars-on-stack strategy to remove function arguments in functions declarations and calls. For this, you have to change a function call to place function arguments in the stack frame. In the function declaration, variables that refer to function arguments must be replaced by locations on the stack as well. This can be implemented concisely by generating dynamic rewrite rules at a function declaration.

Remember that you have to implement these rewritings in a single traversal. A nice scheme is to distinguish all the possible cases. This style also makes it easy to add scope for dynamic rules in the specific cases of the traversal.

  vars-on-stack =
    <+ handle FunDec
       ; ...

    <+ handle Call
       ; ....

    <+ handle Var
       ; ....

    <+ all(vars-on-stack)

You don't have to support function arguments that are used by some nested function (i.e. the variables are non-local). This would require the implementation static links. However, prepare your stack layout for adding static links.

The location of the arguments can be expressed in terms of a distance to the stack or frame pointer. This is up to you.

Thanks to the return register, there is only a single call construct left in our abstract syntax tree. This statement-level call can now be translated into a series of statements to put the arguments on the stack, followed by a call without the actual arguments.

Example:

let var z := 0
    function mul(a : int, b : int) : int = a * b

 in z := mul(x, y)
  ; printint(z)
end

function main() =
  (sp := sp + 4;
   let var z
    in z := 0;
       let function mul() : int =
             (sp := sp + 0;
              rt := stack[(sp - 2)] * stack[(sp - 3)];
              rt;
              sp := sp - 0)
        in (stack[(sp - 2)] := x;
            stack[(sp - 3)] := y;
            mul();
            z := rt;
            stack[(sp - 2)] := z;
            printint())
       end
   end;
   sp := sp - 4)

Part 5: Remove Local Variables

Local variables must be stored on the stack as well. We have already collected all variable declarations at the top-level of a function declaration. Hence, similar to the function arguments, local variables can be replaced at the function definition by a location on the stack. At the use locations, dynamic rules can be applied to replace the variable with an element of the stack array.

Example

let var x
    var y
    var z
 in x := 2
  ; y := 4
  ; z := 0
  ; let function mul(a : int, b : int) : int = a * b
     in z := mul(x, y)
    end
  ; printint(z)
end

function main() =
  (sp := sp + 6;
   stack[(sp - 6)] := 2;
   stack[(sp - 5)] := 4;
   stack[(sp - 4)] := 0;
   let function mul() : int =
         (sp := sp + 0;
          rt := stack[(sp - 2)] * stack[(sp - 3)];
          rt;
          sp := sp - 0)
    in stack[(sp - 2)] := stack[(sp - 6)];
       stack[(sp - 3)] := stack[(sp - 5)];
       mul();
       stack[(sp - 4)] := rt
   end;
   stack[(sp - 2)] := stack[(sp - 4)];
   printint();
   sp := sp - 6)

Part 6: Support Non Local Variables

Support for non local variables is not required for this assigmment, but feel free to implement it if you enjoy this wink . Supporting non-local variables involves the implementation of static links.

Part 7: Collect Function Declarations

The nested structure of the functions is no longer relevant, since all variables are stored on a global stack, so you can now collect all function declarations at the top level in nice linear list.