Benchmarking Framework

Swe03
Customer: StrategoXT project

Contact: Eelco Visser visser+bench@cs.uu.nl

Description

Develop a framework for conducting various kinds of performance tests on programs, collecting performance measurements, and presenting the results in different views. The framework should be generic and applicable to different types of programs.

Technological Requirements

  • May have interactive mode, but should be executable automatically in batch mode

  • Should work on Linux

  • Data exchange should be in XML or ATerms

Customer Commitment

  • Customer will provide test material.

  • Questions about requirements will be answered within 48 hours.

  • Customer will attend team meetings and project reviews ;-)

Use Cases

Time vs Size

The basic use case for the benchmarking tool takes a single executable and a set of input files. The executable is applied to each input, and the execution time is measured using the time command. These times are then related to the size of each input file. The result is (1) a sorted table of size and time entries, (2) the average time per size unit, and (3) a graphical display of the measurements.

Parameters: name of executable, input files
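The measurement loop for this use case could be sketched roughly as follows. This is only an illustration under several assumptions: it measures wall-clock time with Python's time.perf_counter instead of the time command, it uses the raw file size in bytes, and it computes the average as total time divided by total size. The function names measure and average_time_per_unit are hypothetical.

```python
import os
import subprocess
import time


def measure(command, input_files):
    """Run `command` once per input file.

    `command` is a list such as ["./tool"]; the input path is appended as
    the last argument (an assumption about how the tool takes its input).
    Returns (size_in_bytes, seconds) rows sorted by size.
    """
    rows = []
    for path in input_files:
        size = os.path.getsize(path)
        start = time.perf_counter()
        subprocess.run(
            list(command) + [path],
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        rows.append((size, time.perf_counter() - start))
    return sorted(rows)


def average_time_per_unit(rows):
    """Total time divided by total size; 0.0 when there is nothing to average."""
    total_size = sum(size for size, _ in rows)
    total_time = sum(seconds for _, seconds in rows)
    return total_time / total_size if total_size else 0.0
```

A batch run would then be something like measure(["./tool"], ["a.in", "b.in"]), where the tool and file names are placeholders. Wall-clock time includes process start-up, so very small inputs may be dominated by that overhead.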

Size Function

It may be too simplistic to take the raw file size. The tool should be parameterizable with a function that computes the size of an input file. An example could be a function that computes the number of nodes of an abstract syntax tree.
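One way to make the size pluggable is to pass the size function as an ordinary parameter. The sketch below is an assumption about how that might look, not a prescribed design. The names byte_size, line_count, ast_node_count and measure_sizes are hypothetical, and the node-counting example parses Python source with the standard ast module purely as a stand-in for whatever tree representation (for instance ATerms) the measured tools actually use.

```python
import ast
import os


def byte_size(path):
    """Raw file size in bytes (the default, possibly too simplistic, measure)."""
    return os.path.getsize(path)


def line_count(path):
    """Number of lines in a text file."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def ast_node_count(path):
    """Number of nodes in the abstract syntax tree of a Python source file."""
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read())
    return sum(1 for _ in ast.walk(tree))


def measure_sizes(input_files, size_fn=byte_size):
    """Apply the chosen size function to every input file."""
    return {path: size_fn(path) for path in input_files}
```

Any callable that maps a file path to a number can be supplied as size_fn, so a project-specific node counter can be added without touching the rest of the tool.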

Separation of Data and Presentation

The collection of timing measurements should be separated from the presentation of the data. It should be possible to provide a new display component that works with an existing data collection, or, vice versa, to plug in a new collection method.
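A simple way to achieve this separation is to let the collector write its measurements to a file that any presentation component can read. The sketch below uses XML, one of the two exchange formats named in the requirements, through Python's standard xml.etree.ElementTree module. The element and attribute names (measurements, run, size, seconds) are assumptions made for illustration, not a format defined by the project.

```python
import xml.etree.ElementTree as ET


def write_measurements(rows, path):
    """Collector side: store (size, seconds) rows as an XML document."""
    root = ET.Element("measurements")
    for size, seconds in rows:
        # repr() keeps the full precision of the float so it round-trips.
        ET.SubElement(root, "run", size=str(size), seconds=repr(seconds))
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def read_measurements(path):
    """Presentation side: load the rows back without knowing how they were collected."""
    root = ET.parse(path).getroot()
    return [
        (int(run.get("size")), float(run.get("seconds")))
        for run in root.iter("run")
    ]
```

With this split, a new display component only needs read_measurements, and a new collection method only needs to produce a file in the same shape.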

Presentation Methods

A variety of methods can be used to present the results of measurements:

  • table : just listing input and output values (e.g., size vs time)
  • derived values : e.g., time per size unit, average speed, ...
  • charts : showing trends, comparisons
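The first two presentation methods, plus a very crude chart, could be sketched in plain text as follows. This is an illustrative assumption about the output layout; the function names format_table and format_chart are hypothetical, and a real chart component would more likely use a plotting tool.

```python
def format_table(rows):
    """Table with a derived column: seconds per size unit."""
    lines = [f"{'size':>10} {'seconds':>10} {'sec/unit':>12}"]
    for size, seconds in sorted(rows):
        per_unit = seconds / size if size else float("nan")
        lines.append(f"{size:>10} {seconds:>10.4f} {per_unit:>12.3e}")
    return "\n".join(lines)


def format_chart(rows, width=40):
    """Horizontal bar chart; the slowest run gets a bar of `width` characters."""
    rows = sorted(rows)
    longest = max((seconds for _, seconds in rows), default=0.0)
    lines = []
    for size, seconds in rows:
        bar_length = round(width * seconds / longest) if longest > 0 else 0
        lines.append(f"{size:>10} | {'#' * bar_length}")
    return "\n".join(lines)
```

Both functions take the same (size, seconds) rows, so further views can be added without changing how the data is collected.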

Comparing Tools

Rather than timing the performance of one tool, several variants of a tool should be tested against the same data. Examples include timing a compiler with different optimization levels, and timing different versions of a compiler.
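Comparing variants amounts to running every variant on every input and keeping the timings side by side. The following is a minimal sketch under the assumptions that each variant is a command line taking the input path as its last argument and that wall-clock time is an acceptable measure. The names time_command, compare and fastest are hypothetical.

```python
import subprocess
import time


def time_command(command, path):
    """Wall-clock seconds for one run of `command` on `path`."""
    start = time.perf_counter()
    subprocess.run(
        list(command) + [path],
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def compare(variants, input_files):
    """Time every variant on every input.

    `variants` maps a label to a command list.
    Returns {input_path: {label: seconds}}.
    """
    return {
        path: {label: time_command(cmd, path) for label, cmd in variants.items()}
        for path in input_files
    }


def fastest(results):
    """For each input, the label of the variant with the lowest time."""
    return {path: min(times, key=times.get) for path, times in results.items()}
```

A call might look like compare({"O0": ["./tool-O0"], "O2": ["./tool-O2"]}, inputs), where the labels and executables are placeholders. A single run per variant is noisy, so repeating each measurement and taking the minimum or median is probably advisable in practice.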

Multi-stage Timing

It may be necessary to generate different tools or versions of tools and test the performance of each of the derivatives. An example is measuring the effect that different compiler optimizations have on a program.

Etc.

The above is an initial set of use cases to get you started. Further extensions should be considered once a first prototype is up and running.