Assignment Run Time Code Generation

The paper `Runtime code generation in JVM and CLR' by Peter Sestoft describes how run-time code generation can be achieved efficiently in the JVM and CLR. The approach builds on APIs for bytecode generation. While the approach is practical, algorithms written as sequences of bytecode API calls are hard to read. The goal of this assignment is to develop a syntactic embedding of Java in Java and to `assimilate' that embedding to sequences of calls to the GNU Bytecode API (gnu.bytecode).

Run-Time Code Generation with a Bytecode Library

Consider the following example from `Runtime code generation in JVM and CLR'. The Power method is defined as follows:

  public static int Power(int n, int x) { 
    int p;
    p = 1;
    while(n > 0)
    { 
      if(n % 2 == 0) { 
        x = x * x;
        n = n / 2;
      } else { 
        p = p * x;
        n = n - 1;
      }
    }
    return p;
  }

The PowerGen method generates a specialization of the Power method for fixed values of n using the gnu.bytecode library:

public static void PowerGen(Method mo, int n) {
  CodeAttr jvmg = mo.getCode();
  Scope scope = mo.pushScope();
  Variable varx = jvmg.addLocal(Type.int_type, "x");
  Variable varp = jvmg.addLocal(Type.int_type, "p");
  jvmg.emitPushInt(1);
  jvmg.emitStore(varp);         // p = 1;
  while (n > 0) {
    if (n % 2 == 0) { 
      jvmg.emitLoad(varx);      // x is arg_0
      jvmg.emitLoad(varx);
      jvmg.emitMul();
      jvmg.emitStore(varx);     // x = x * x
      n = n / 2; 
    } else { 
      jvmg.emitLoad(varp);      // load p
      jvmg.emitLoad(varx);      // load x (arg_0)
      jvmg.emitMul();
      jvmg.emitStore(varp);     // p = p * x
      n = n - 1; 
    }
  }
  jvmg.emitLoad(varp); 
  jvmg.emitReturn();           // return p;
  mo.popScope();
}

For n = 16, this produces the following sequence of bytecodes:

  0: iconst_1
  1: istore_1
  2: iload_0
  3: iload_0
  4: imul
  5: istore_0
  6: iload_0
  7: iload_0
  8: imul
  9: istore_0
 10: iload_0
 11: iload_0
 12: imul
 13: istore_0
 14: iload_0
 15: iload_0
 16: imul
 17: istore_0
 18: iload_1
 19: iload_0
 20: imul
 21: istore_1
 22: iload_1
 23: ireturn
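
Read back at the source level, the listing above is a straight-line method without any loop or test on n. The following is a minimal sketch of that reading; the class name PowerSpecializationDemo and the method name power16 are made up for illustration and do not occur in the paper or in the assignment files. The comments map each statement to the bytecode offsets of the listing.

```java
final class PowerSpecializationDemo {

    // The general method from the text (renamed to lower case).
    static int power(int n, int x) {
        int p = 1;
        while (n > 0) {
            if (n % 2 == 0) {
                x = x * x;
                n = n / 2;
            } else {
                p = p * x;
                n = n - 1;
            }
        }
        return p;
    }

    // Hand-written source equivalent of the bytecode generated for n = 16.
    static int power16(int x) {
        int p = 1;      // offsets 0-1
        x = x * x;      // offsets 2-5   (n: 16 -> 8)
        x = x * x;      // offsets 6-9   (n: 8 -> 4)
        x = x * x;      // offsets 10-13 (n: 4 -> 2)
        x = x * x;      // offsets 14-17 (n: 2 -> 1)
        p = p * x;      // offsets 18-21 (n: 1 -> 0)
        return p;       // offsets 22-23
    }

    public static void main(String[] args) {
        // Both versions must agree for every x.
        for (int x = -3; x <= 3; x++) {
            if (power(16, x) != power16(x)) {
                throw new AssertionError("mismatch for x = " + x);
            }
        }
        System.out.println(power16(2)); // prints 65536
    }
}
```

A comparison of this kind, general method against specialized code, is also a convenient shape for test cases later in the assignment.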

Syntactic Embedding

It would be attractive to be able to write concrete Java syntax instead of bytecode API calls. To achieve this we create an embedding of Java in Java that allows us to quote pieces of Java code. A basic syntactic embedding is defined by the following SDF definition:

module Java-15-in-Java-15
imports Java-15 Java-15-Prefixed
exports
  context-free start-symbols CompilationUnit
  context-free syntax
    "genclass" "|[" JavaClassDec "]|" -> Expr {cons("QuoteClassDec")}
    "genbstms" "|[" JavaBlockStm* "]|" ";" -> Stm {cons("QuoteBlockStms")}
    "#genbstms" "|[" BlockStm* "]|" ";" -> JavaStm {cons("EscapeFromStm")}
    "#var[" JavaId "]" -> Id {cons("MetaVar")}

A complete syntax definition and parse table can be created using the following Makefile:

JAVAFRONT    = $(HOME)/.nix-profile/
JAVAFRONTSDF = $(JAVAFRONT)/share/sdf/java-front
SDFINCLUDES  = -I $(JAVAFRONTSDF) -Idef $(JAVAFRONTSDF)/Java-15-Prefixed.def

all : Java-15-in-Java-15.tbl Java-15-in-Java-15.str

Java-15-in-Java-15.def : Java-15-in-Java-15.sdf
   pack-sdf -i $< -o $@ $(SDFINCLUDES)

Java-15-in-Java-15.tbl : Java-15-in-Java-15.def
   sdf2table -i $< -o $@ -m Java-15-in-Java-15

Java-15-in-Java-15.rtg : Java-15-in-Java-15.def
   sdf2rtg -i $< -o $@ --main Java-15-in-Java-15

Java-15-in-Java-15.str : Java-15-in-Java-15.rtg
   rtg2sig -i $< -o $@

An example of the use of the embedding is the following generator for the power function:

public static ClassType GenPowerClass(int n) {  
  return genclass |[
    public class MyClass {
      public static int MyPower(int x) {
        int p;
        p = 1;
        #genbstms|[
          while (n > 0) {
            if (n % 2 == 0) { 
              genbstms|[ x = x * x; ]|;
              n = n / 2; 
            } else { 
              genbstms|[ p = p * x; ]|;
              n = n - 1; 
            }
          }
        ]|;
        return p;
      }
    }
  ]|;
}

The quotation genclass|[ class declaration ]| creates a ClassType object and fills it with bytecodes according to the declaration. The anti-quotation #genbstms|[ statements ]| escapes to the meta-level to execute the statements at generation-time. In the example the anti-quotation has the effect of unrolling the loop.
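
The two levels can be illustrated without any bytecode library. The sketch below is only an analogy: it produces Java source text in a StringBuilder instead of bytecode, and the names StagingDemo and generateBody are hypothetical. The loop and the tests on n run at generation time (the role of the anti-quotation), while the appended strings play the role of the quoted fragments that end up in the generated method.

```java
final class StagingDemo {

    // Generation-time code: n is a known value here, whereas x and p are
    // merely names that occur in the generated text.
    static String generateBody(int n) {
        StringBuilder body = new StringBuilder("int p; p = 1; ");
        while (n > 0) {
            if (n % 2 == 0) {
                body.append("x = x * x; ");   // quoted fragment
                n = n / 2;                    // meta-level statement
            } else {
                body.append("p = p * x; ");   // quoted fragment
                n = n - 1;                    // meta-level statement
            }
        }
        return body.append("return p;").toString();
    }

    public static void main(String[] args) {
        // The loop has disappeared from the output; only its unrolled body remains.
        System.out.println(generateBody(5));
    }
}
```

For n = 5 the generated text is `int p; p = 1; p = p * x; x = x * x; x = x * x; p = p * x; return p;`, which contains no reference to n at all.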

Assimilation Scheme

Given a syntactic embedding, we need an assimilation that translates the embedded code to an appropriate implementation. The idea of the Java in Java embedding that we are considering here is that a quoted Java fragment should produce a sequence of calls to the gnu.bytecode API in order to create a class file implementing the Java code. A problem to consider is how different pieces of generating code can interact with each other. For instance, how do the quoted code fragments in the anti-quoted while loop above end up in the specialized Power method? In order to do this smoothly we adopt a couple of conventions, which are illustrated by the following assimilation of GenPowerClass (and discussed afterwards):

public static gnu.bytecode.ClassType GenPowerClass(int n) 
{ 
  ClassType thisClass;
  thisClass = new ClassType("MyClass");
  thisClass.setSuper("java.lang.Object");
  thisClass.setModifiers(Access.PUBLIC);
  Method thisMethod = thisClass.addMethod("MyPower");
  thisMethod.setSignature("(I)I");
  thisMethod.setModifiers(Access.PUBLIC | Access.STATIC);
  thisMethod.initCode();
  CodeAttr thisCode = thisMethod.getCode();
  Variable var_0 = thisCode.addLocal(Type.int_type, "x");
  thisCode.pushScope();
  Variable var_1 = thisCode.addLocal(Type.int_type, "p");
  thisCode.emitPushInt(1);
  thisCode.emitStore(var_1);
  while(n > 0)
  { 
    if(n % 2 == 0)
    { 
      { 
        thisCode.emitLoad(var_0);
        thisCode.emitLoad(var_0);
        thisCode.emitMul();
        thisCode.emitStore(var_0);
      }
      n = n / 2;
    }
    else
    { 
      { 
        thisCode.emitLoad(var_1);
        thisCode.emitLoad(var_0);
        thisCode.emitMul();
        thisCode.emitStore(var_1);
      }
      n = n - 1;
    }
  }
  thisCode.emitLoad(var_1);
  thisCode.emitReturn();
  thisCode.popScope();
  return thisClass;
}

The gnu.bytecode library provides an API for creating elements of a class file. A complete class is represented by a ClassType object and is a container for class member declarations. For instance, a method is added using the addMethod method, which returns a Method object. A Method has a CodeAttr, which stores the bytecodes for the method. New bytecode instructions can be added by applying emit methods to the CodeAttr.

The idea of the assimilation now is that a code fragment knows where to add code by referring to the appropriate this object. For instance, a method always lives in an environment where there exists a thisClass. Similarly, a statement always lives in the context of a thisCode. These simple conventions make it possible to assimilate quoted code fragments in isolation.
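
The convention can be tried out without the library. In the sketch below, RecordingCode is a hypothetical stand-in for gnu.bytecode.CodeAttr that merely records instruction names, and local variables are represented by plain slot numbers rather than Variable objects; none of these names belong to the real API. Each fragment method is written in isolation and relies only on the thisCode it is handed.

```java
import java.util.ArrayList;
import java.util.List;

final class ThisCodeConventionDemo {

    // Hypothetical stand-in for CodeAttr: records instructions as strings.
    static final class RecordingCode {
        final List<String> instructions = new ArrayList<>();
        private int nextSlot = 0;

        int addLocal(String name) { return nextSlot++; }  // name is ignored here
        void emitPushInt(int value) { instructions.add("push " + value); }
        void emitLoad(int slot)     { instructions.add("load " + slot); }
        void emitStore(int slot)    { instructions.add("store " + slot); }
        void emitMul()              { instructions.add("mul"); }
        void emitReturn()           { instructions.add("return"); }
    }

    // Fragment for the quoted statement "x = x * x;".
    static void emitSquare(RecordingCode thisCode, int x) {
        thisCode.emitLoad(x);
        thisCode.emitLoad(x);
        thisCode.emitMul();
        thisCode.emitStore(x);
    }

    // Fragment for the quoted statement "p = p * x;".
    static void emitMultiplyInto(RecordingCode thisCode, int p, int x) {
        thisCode.emitLoad(p);
        thisCode.emitLoad(x);
        thisCode.emitMul();
        thisCode.emitStore(p);
    }

    // The fragments compose because they all append to the same thisCode.
    static RecordingCode generatePower(int n) {
        RecordingCode thisCode = new RecordingCode();
        int x = thisCode.addLocal("x");
        int p = thisCode.addLocal("p");
        thisCode.emitPushInt(1);
        thisCode.emitStore(p);
        while (n > 0) {
            if (n % 2 == 0) {
                emitSquare(thisCode, x);
                n = n / 2;
            } else {
                emitMultiplyInto(thisCode, p, x);
                n = n - 1;
            }
        }
        thisCode.emitLoad(p);
        thisCode.emitReturn();
        return thisCode;
    }

    public static void main(String[] args) {
        RecordingCode code = generatePower(16);
        code.instructions.forEach(System.out::println);
        System.out.println(code.instructions.size() + " instructions"); // 24, as in the listing for n = 16
    }
}
```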

The programmer has to be aware of these conventions. For example, factoring out the generation of the body of a method requires that the appropriate environment is passed to the method generating the method body. In the following example, the PowerGen method is called with the thisCode environment. Furthermore, any variables that are in scope in the body should be passed as well. That is the purpose of the #var[identifier] quotation:

public static ClassType GenPowerClass(int n) {  
  return genclass |[
    public class MyClass // extends java.lang.Object
    {
      public static int MyPower(int x) {
        #genbstms|[ PowerGen(thisCode, n, #var[x]); ]|;
      }
    }
  ]|;
}

public static void PowerGen(CodeAttr thisCode, int n, Variable #var[x]) {
  genbstms|[
    int p;
    p = 1;
    #genbstms|[
      while (n > 0) {
        if (n % 2 == 0) { 
          genbstms|[ x = x * x; ]|;
          n = n / 2; 
        } else  { 
           genbstms|[ p = p * x; ]|;
           n = n - 1; 
         }
      }
    ]|;
    return p;
  ]|;
}

Consider the result of assimilating these methods:

  public static gnu.bytecode.ClassType GenPowerClass(int n)
  { 
    ClassType thisClass;
    thisClass = new ClassType("MyClass");
    thisClass.setSuper("java.lang.Object");
    thisClass.setModifiers(Access.PUBLIC);
    Method thisMethod = thisClass.addMethod("MyPower");
    thisMethod.setSignature("(I)I");
    thisMethod.setModifiers(Access.PUBLIC | Access.STATIC);
    thisMethod.initCode();
    CodeAttr thisCode = thisMethod.getCode();
    Variable var_0 = thisCode.addLocal(Type.int_type, "x");
    thisCode.pushScope();
    PowerGen(thisCode, n, var_0);
    ;
    thisCode.popScope();
    return thisClass;
  }

  public static void PowerGen(gnu.bytecode.CodeAttr thisCode, int n, gnu.bytecode.Variable var_1)
  { 
    { 
      Variable var_2 = thisCode.addLocal(Type.int_type, "p");
      thisCode.emitPushInt(1);
      thisCode.emitStore(var_2);
      while(n > 0)
      { 
        if(n % 2 == 0)
        { 
          { 
            thisCode.emitLoad(var_1);
            thisCode.emitLoad(var_1);
            thisCode.emitMul();
            thisCode.emitStore(var_1);
          }
          n = n / 2;
        }
        else
        { 
          { 
            thisCode.emitLoad(var_2);
            thisCode.emitLoad(var_1);
            thisCode.emitMul();
            thisCode.emitStore(var_2);
          }
          n = n - 1;
        }
      }
      thisCode.emitLoad(var_2);
      thisCode.emitReturn();
    }
  }
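
To check that passing thisCode and the variable handle really is all the factored-out generator needs, the same structure can be mimicked with a toy code buffer that can also execute what was emitted. This is a sketch under stated assumptions: MiniCode and its run method are hypothetical and have nothing to do with how gnu.bytecode loads or runs classes, and variables are again plain slot numbers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class PassedEnvironmentDemo {

    // Hypothetical miniature of CodeAttr with a tiny stack interpreter.
    static final class MiniCode {
        private final List<String[]> ops = new ArrayList<>();
        private int nextSlot = 0;

        int addLocal()           { return nextSlot++; }
        void emitPushInt(int v)  { ops.add(new String[] {"push", Integer.toString(v)}); }
        void emitLoad(int slot)  { ops.add(new String[] {"load", Integer.toString(slot)}); }
        void emitStore(int slot) { ops.add(new String[] {"store", Integer.toString(slot)}); }
        void emitMul()           { ops.add(new String[] {"mul"}); }
        void emitReturn()        { ops.add(new String[] {"return"}); }

        // Executes the recorded instructions; args fill the first local slots.
        int run(int... args) {
            int[] locals = new int[nextSlot];
            System.arraycopy(args, 0, locals, 0, args.length);
            Deque<Integer> stack = new ArrayDeque<>();
            for (String[] op : ops) {
                switch (op[0]) {
                    case "push":  stack.push(Integer.parseInt(op[1])); break;
                    case "load":  stack.push(locals[Integer.parseInt(op[1])]); break;
                    case "store": locals[Integer.parseInt(op[1])] = stack.pop(); break;
                    case "mul":   stack.push(stack.pop() * stack.pop()); break;
                    case "return": return stack.pop();
                    default: throw new IllegalStateException(op[0]);
                }
            }
            throw new IllegalStateException("no return emitted");
        }
    }

    // Counterpart of PowerGen: receives the environment and the variable x.
    static void powerGen(MiniCode thisCode, int n, int varX) {
        int varP = thisCode.addLocal();
        thisCode.emitPushInt(1);
        thisCode.emitStore(varP);
        while (n > 0) {
            if (n % 2 == 0) {
                thisCode.emitLoad(varX);
                thisCode.emitLoad(varX);
                thisCode.emitMul();
                thisCode.emitStore(varX);
                n = n / 2;
            } else {
                thisCode.emitLoad(varP);
                thisCode.emitLoad(varX);
                thisCode.emitMul();
                thisCode.emitStore(varP);
                n = n - 1;
            }
        }
        thisCode.emitLoad(varP);
        thisCode.emitReturn();
    }

    // Counterpart of GenPowerClass: sets up the environment, then delegates.
    static MiniCode genPower(int n) {
        MiniCode thisCode = new MiniCode();
        int varX = thisCode.addLocal();   // the parameter x occupies slot 0
        powerGen(thisCode, n, varX);
        return thisCode;
    }

    public static void main(String[] args) {
        System.out.println(genPower(10).run(3)); // prints 59049, i.e. 3^10
    }
}
```

The caller owns the declaration of x and only hands its handle to powerGen, while p is local to the generated body and is therefore declared inside powerGen itself.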

Assignment

The goal of the assignment is to create a fairly complete embedding and assimilation for Java in Java.

We measure completeness by means of the examples provided by Sestoft.

  • Create concrete syntax versions of the Sestoft examples

  • Extend the syntactic embedding if necessary

  • Develop an assimilator that translates embedded Java to gnu.bytecode API calls

  • Test the embedding and assimilator against the examples, and develop your own test suite

Experimental Setup

We cannot use the Java parser from the Dryad library, since that is for plain Java. Instead we use the parser for the embedding that we created above. The following Makefile defines a couple of actions to apply to Java-in-Java files:

JARS=../gnu-bytecode/kawa-1.8.jar
javainjava = ../embedding/Java-15-in-Java-15.tbl

# parse Java file with parse table for embedding
%.ajava : %.java
   sglri -p $(javainjava) -i $< | pp-aterm -o $@

# assimilate embedded Java code to bytecode API calls
%.assim.ajava : %.ajava ../src/java-bytecode-assimilation
   ../src/java-bytecode-assimilation -i $< | pp-aterm -o $@

# flatten uses of expression blocks
%.eflat.ajava : %.assim.ajava ../src/java-bytecode-assimilation
   core-lift-eblocks -i $< | pp-aterm -o $@

# pretty-print 
%.txt : %.ajava
   pp-java -i $< -o $@

# rename to proper Java file
# convention: FooGen.java uses concrete syntax, but defines
# a class named FooGenerated
%erated.java : %.eflat.txt
   cp $< $@

clean:
   rm -f *.ajava

# compile Java code
%.class : %.java
   javac -cp $(JARS) $<

# run the Java code with appropriate command-line arguments
%.run : %.class
   java -cp $(JARS) $* $(RUNARGS_$*)

RUNARGS_RTCG4Generated = 1024
RUNARGS_RTCG4bGenerated = 1024

test : RTCG4Generated.class

Outline of the Assimilator

The assimilator proper uses the Dryad library to reclassify ambiguous names. This stretches the intended use of that operation, but it seems to work if we assume that the generated code lives in the same (package/import) environment as the generator.

module java-bytecode-assimilation
imports libdryad Java-15-in-Java-15 Java-EBlock 

strategies

  main = 
    init-observables
    ; xtc-multi-io-wrap(
        observables-wrap(
          map(read-from)
          ; map(define-compilation-unit)
          ; dryad-reclassify
          ; map(get-ast)
          ; alltd(assimilate-in-method)
        )
      )

  init-observables =
    where(
      <set-config> (ObservableClasses(), [
        <xtc-find> "rt.classes", 
        "<yourpathhere>/gnu-bytecode/kawa-1.8.jar"
      ])
    )

rules 

  // find quoted code fragments in methods

  assimilate-in-method :
    MethodDec(head1, block1) -> MethodDec(head2, block2)
    where {| LocalVar
           : <alltd(declare-meta-param)> head1 => head2
           ; <alltd(assimilate-quotes)> block1 => block2
           |} 

  // declare parameters marked as meta-var; 
  // use dynamic rule to propagate to uses of variables

  declare-meta-param :
    Param([], t, MetaVar(Id(x))) -> Param([], t, var)
    where <newjavaid> "var" => var
   ; rules( LocalVar : x -> var )

  // assimilated quote fragments

  assimilate-quotes =
    assimilate-quote-block-stms
    <+ assimilate-quoted-class
    <+ assimilate-meta-var

  // a class fragment is an expression that produces a ClassType
 
  assimilate-quoted-class :
    QuoteClassDec(cdec) ->
    expr|[ {| ClassType thisClass; ~bstm*:<assimilate-class> cdec | thisClass |} ]|  
  
  assimilate-quote-block-stms :
   QuoteBlockStms(stms) -> Block(<mapconcat(assimilate-block-statement)>stms)

  assimilate-meta-var :
    MetaVar(Id(x)) -> var
    where <LocalVar> x => var

  assimilate-block-statement =
    fail

  assimilate-class =
    fail

strategies // utils

  // Create a new Java identifier given a string.

  newjavaid = 
    !Id(<newname>)

Note that in order to parse Stratego with embedded Java you need to declare a .meta file (java-bytecode-assimilation.meta), which declares the syntax to use, in this case the Java-EBlock in Stratego embedding.

Meta([Syntax("Stratego-Java-EBlock")])

# Makefile for compiling java-bytecode-assimilation.str

# declare path for pkg-config
export PKG_CONFIG_PATH := $(HOME)/.nix-profile/lib/pkgconfig:$(PKG_CONFIG_PATH)

JAVAFRONT = $(HOME)/.nix-profile
DRYAD = $(HOME)/.nix-profile

XTCFLAGS = `pkg-config --variable=strcxtcflags dryad`
STRCFLAGS = `pkg-config --variable=strcflags dryad`

# assume that syntax definition is in ../embedding

STRINCLUDES = \
   -I ../embedding \
   -I $(JAVAFRONT)/share/sdf/java-front/ \
   -I $(JAVAFRONT)/share/java-front/ \
   -I $(DRYAD)/share/dryad

STRRUNLIBS = -la stratego-lib 

all: java-bytecode-assimilation

java-bytecode-assimilation : java-bytecode-assimilation.str Makefile
   strc -i $< $(STRINCLUDES) $(STRCFLAGS) $(STRRUNLIBS) $(XTCFLAGS)

References

Paper describing run-time code generation with examples

The bytecode API to be used in the assignment

Another bytecode library

If you need to know how to assimilate some Java source code to Java bytecode, you can compile a sample Java source file with javac and disassemble the resulting class file with javap -c (add -p to include private members). You can also use the tool class2aterm, which is provided by Dryad. There are also many resources on Java bytecode basics available on the web; just search for something that suits you.
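
As a starting point, a deliberately tiny class such as the following can be compiled and disassembled. The file name Sample.java and its contents are only an illustration; the -c flag of javap prints the disassembled code of each method.

```java
// Sample.java: a small class to inspect with the JDK tools.
//
//   javac Sample.java
//   javap -c Sample
//
// For the body of square, the disassembly consists of
// iload_0, iload_0, imul, ireturn: the same instruction pattern
// that the quoted fragment "x * x" has to be assimilated to.
final class Sample {

    static int square(int x) {
        return x * x;
    }

    public static void main(String[] args) {
        System.out.println(square(7)); // prints 49
    }
}
```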

Resources

  • Makefile: makefile for syntactic embedding

  • maak: script to test the Java examples


Topic attachments

Attachment                            Size    Date                  Who          Comment
Java-15-in-Java-15.sdf                0.4 K   28 Mar 2006 - 09:43   EelcoVisser  syntactic embedding of Java in Java
Makefile                              0.6 K   28 Mar 2006 - 09:45   EelcoVisser  makefile for assimilator
RTCG4Gen.java                         2.1 K   28 Mar 2006 - 09:35   EelcoVisser  power example with concrete syntax templates
RTCG4Generated.java                   2.6 K   28 Mar 2006 - 09:35   EelcoVisser  assimilated from RTCG4Gen.java
java-bytecode-assimilation.meta       0.1 K   28 Mar 2006 - 09:37   EelcoVisser  .meta file for java-bytecode-assimilation.str
maak                                  2.8 K   28 Mar 2006 - 09:44   EelcoVisser  script to test the Java examples