Benchmarking Framework
Swe03
Customer:
StrategoXT project
Contact:
Eelco Visser visser+bench@cs.uu.nl
Description
Develop a framework for conducting various kinds of performance tests on programs, collecting performance measurements, and presenting the results in different views. The framework should be generic and
applicable to different types of programs.
Technological Requirements
- May have an interactive mode, but should also run automatically in batch mode
- Data exchange should be in XML or ATerms
Customer Commitment
- Customer will provide test material.
- Questions about requirements will be answered within 48 hours.
- Customer will attend team meetings and project reviews ;-)
Use Cases
Time vs Size
The basic use case for the benchmarking tool is to take a single executable and a set
of input files. The executable is applied to each of the inputs and the execution time
is measured using the time command. These times are then set against the size of
the input file. The result is (1) a sorted table with size x time entries, (2) the average
time per size unit, and (3) a graphical display of the measurements.
Parameters: name of executable, input files
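The basic use case could be sketched roughly as follows. This is only an illustration under stated assumptions: it measures wall-clock time with Python's time.perf_counter instead of the time command, it uses the raw file size in bytes as the size, and it reads "average time per size unit" as total time divided by total size. The function names time_run, benchmark and average_time_per_unit are hypothetical.

```python
import os
import subprocess
import time


def time_run(command, input_path):
    """Return wall-clock seconds for one run of `command` on `input_path`.

    `command` is an argv list (executable plus fixed arguments); the input
    file is appended as the last argument. This is an assumption about how
    the tool under test takes its input.
    """
    start = time.perf_counter()
    subprocess.run([*command, input_path], check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def benchmark(command, input_paths):
    """Return (size_in_bytes, seconds) pairs, sorted by size."""
    rows = [(os.path.getsize(path), time_run(command, path))
            for path in input_paths]
    return sorted(rows)


def average_time_per_unit(rows):
    """Total time divided by total size; 0.0 when there is nothing to average."""
    total_size = sum(size for size, _ in rows)
    total_time = sum(seconds for _, seconds in rows)
    return total_time / total_size if total_size else 0.0
```

A batch run would call benchmark with the executable and the input files, print the rows as the table, and report average_time_per_unit(rows) underneath.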
Size Function
It may be too simplistic to take the raw file size. The tool should be parameterizable
with a function that computes the size of an input file. An example could be a function
that computes the number of nodes of an abstract syntax tree.
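One way to make the size pluggable is to pass the size function as a parameter. The sketch below is an assumption-laden illustration, not a prescribed design: Python's own ast module stands in for whatever parser the tool under test uses, and the names file_size, ast_node_count and measure_sizes are hypothetical.

```python
import ast
import os


def file_size(path):
    """Default size function: raw file size in bytes."""
    return os.path.getsize(path)


def ast_node_count(path):
    """Alternative size function: number of nodes in the abstract syntax tree.

    Python source files are used here only as a stand-in input language.
    """
    with open(path, encoding="utf-8") as handle:
        tree = ast.parse(handle.read())
    return sum(1 for _ in ast.walk(tree))


def measure_sizes(input_paths, size_of=file_size):
    """Apply the chosen size function to every input file."""
    return {path: size_of(path) for path in input_paths}
```

With this shape, switching from bytes to AST nodes is a matter of calling measure_sizes(paths, size_of=ast_node_count); the timing code does not need to change.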
Separation of Data and Presentation
The collection of timing measurements should be separated from the presentation
of the data. It should be possible to provide a new display component that works with the
existing data collection or, vice versa, to plug in a new collection method.
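A simple way to achieve this separation is to let the two sides communicate only through an exchange file. The following sketch uses XML, one of the two exchange formats named in the requirements, via Python's xml.etree.ElementTree. The element and attribute names (measurements, run, input, size, seconds) and the function names are made up for illustration and are not a fixed schema.

```python
import xml.etree.ElementTree as ET


def measurements_to_xml(rows):
    """Collection side: serialize (input_name, size, seconds) tuples to XML."""
    root = ET.Element("measurements")
    for name, size, seconds in rows:
        # repr() keeps enough digits for the float to round-trip exactly.
        ET.SubElement(root, "run", input=name, size=str(size),
                      seconds=repr(seconds))
    return ET.tostring(root, encoding="unicode")


def measurements_from_xml(text):
    """Read the exchange format back into (input_name, size, seconds) tuples."""
    root = ET.fromstring(text)
    return [(run.get("input"), int(run.get("size")), float(run.get("seconds")))
            for run in root.iter("run")]


def render_table(xml_text):
    """Presentation side: depends only on the XML, not on the collector."""
    rows = measurements_from_xml(xml_text)
    return "\n".join(f"{name}\t{size}\t{seconds:.6f}"
                     for name, size, seconds in rows)
```

Because render_table sees nothing but the XML text, a different collector or a different display component can be swapped in as long as it reads or writes the same format.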
Presentation Methods
A variety of methods can be used to present the results of measurements:
- table: just listing input and output values (e.g., size vs time)
- derived values: e.g., time per size unit, average speed, ...
- charts: showing trends, comparisons
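The three presentation methods listed above might look like this in a minimal, dependency-free form. It is a sketch only: the column layout, the text-based bar chart (standing in for a real graphical chart) and the names derived_values, format_table and ascii_chart are all assumptions.

```python
def derived_values(rows):
    """Append seconds per size unit to each (size, seconds) row.

    Assumes every size is positive.
    """
    return [(size, seconds, seconds / size) for size, seconds in rows]


def format_table(rows):
    """Render (size, seconds, seconds_per_unit) rows as a plain-text table."""
    lines = [f"{'size':>10} {'seconds':>12} {'sec/unit':>14}"]
    for size, seconds, per_unit in rows:
        lines.append(f"{size:>10} {seconds:>12.6f} {per_unit:>14.9f}")
    return "\n".join(lines)


def ascii_chart(rows, width=40):
    """Render one bar per (size, seconds) row, scaled to the slowest run."""
    slowest = max((seconds for _, seconds in rows), default=0.0)
    lines = []
    for size, seconds in rows:
        bar_length = round(seconds / slowest * width) if slowest else 0
        lines.append(f"{size:>10} | {'#' * bar_length}")
    return "\n".join(lines)
```

For example, print(format_table(derived_values(rows))) gives the table with the derived column, and print(ascii_chart(rows)) gives a rough view of the trend.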
Comparing Tools
Rather than timing the performance of one tool, several variants of a tool should be tested
against the same data. Examples include timing a compiler with different optimization
levels, and timing different versions of a compiler.
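Comparing variants can be sketched as running every variant on every input and collecting the timings side by side. The code below is an illustration under assumptions: each variant is given as a label plus an argv list, the input file is passed as the last argument, and the names time_command, compare_variants and fastest_per_input are hypothetical.

```python
import subprocess
import time


def time_command(command, input_path):
    """Return wall-clock seconds for one run of `command` on `input_path`."""
    start = time.perf_counter()
    subprocess.run([*command, input_path], check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def compare_variants(variants, input_paths):
    """Time every variant on every input.

    `variants` maps a label to an argv list.
    Returns {input_path: {label: seconds}}.
    """
    return {path: {label: time_command(command, path)
                   for label, command in variants.items()}
            for path in input_paths}


def fastest_per_input(results):
    """For each input, return the label of the variant with the lowest time."""
    return {path: min(timings, key=timings.get)
            for path, timings in results.items()}
```

The nested result maps directly onto a table with one row per input and one column per variant, which the presentation side can then display.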
Multi-stage Timing
It may be necessary to generate different tools or versions of tools and test the performance
of each of the derivatives. An example is measuring the effect that different compiler
optimizations have on a program.
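A two-stage run could be organised as a build stage that generates each derivative, followed by a run stage that times it. This is a rough sketch under assumptions: both stages are external commands, build_command and run_command are caller-supplied functions that turn a configuration into an argv list, and the names timed and multi_stage are hypothetical.

```python
import subprocess
import time


def timed(command):
    """Run an argv list and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def multi_stage(configs, build_command, run_command):
    """Generate and then time one derivative per configuration.

    Returns {config: {"build": seconds, "run": seconds}}.
    """
    results = {}
    for config in configs:
        # Stage 1: generate the derivative (e.g. compile with this config).
        build_seconds = timed(build_command(config))
        # Stage 2: time the generated derivative itself.
        run_seconds = timed(run_command(config))
        results[config] = {"build": build_seconds, "run": run_seconds}
    return results
```

For the compiler example, configs could be a list of optimization flags, build_command would produce the compiler invocation for one flag, and run_command would produce the invocation of the resulting program.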
Etc.
The above is an initial set of use cases to get you started. Further extensions should be
considered once a first prototype is up and running.