Computing Science Colloquium

Date: Monday March 17, 2003

Time: 3pm

Room: 508 BBL

Schema-Based Synthesis of Data Analysis Programs

Bernd Fischer

USRA/RIACS, NASA Ames Research Center

fisch@email.arc.nasa.gov

Abstract

Automatic program synthesis is a formal approach to software development in which efficiently executable programs are derived automatically from high-level specifications. It has been applied successfully to a number of domains, for example celestial mechanics, transportation scheduling, and option pricing. In this talk I will discuss its application to machine learning, or more precisely to statistical data analysis, and I will present the AutoBayes system currently under development at NASA Ames.

AutoBayes takes a specification in the form of a statistical model, extracts a graphical model (i.e., a Bayesian network) from it, and then derives code by a process called schema-based synthesis. Schemas are generic algorithms together with their applicability conditions. They come in different "flavors": some are derived from decomposition theorems for graphical models, others implement generic machine-learning algorithms such as expectation-maximization (EM). Schemas are applied recursively until irreducible subproblems remain, which are then solved by symbolic or numeric solvers. AutoBayes has been applied to a number of textbook and application problems, including clustering, image analysis, changepoint detection, and software reliability estimation.
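The recursive control structure described above can be illustrated with a small toy sketch in Python. It is not the AutoBayes implementation: the Problem type, the three schemas, and the numeric_optimize name that appears in the emitted code are all hypothetical and chosen only to show schemas as generic algorithms guarded by applicability conditions, applied recursively until a solver handles the remaining subproblem.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Code = List[str]  # synthesized program, one source line per entry


@dataclass(frozen=True)
class Problem:
    """Toy subproblem: fit each (distribution, data variable) factor."""
    factors: Tuple[Tuple[str, str], ...]


# A schema returns code if it applies to the problem, otherwise None.
Schema = Callable[[Problem], Optional[Code]]


def decompose_independent(p: Problem) -> Optional[Code]:
    # Applicability condition: several factors, assumed independent
    # in this toy model, so each can be solved on its own.
    if len(p.factors) < 2:
        return None
    code: Code = []
    for factor in p.factors:
        code += synthesize(Problem((factor,)))  # recursive schema application
    return code


def gaussian_closed_form(p: Problem) -> Optional[Code]:
    # Symbolic solver: closed-form maximum-likelihood estimates for a Gaussian.
    if len(p.factors) != 1 or p.factors[0][0] != "gauss":
        return None
    x = p.factors[0][1]
    return [
        f"mu_{x} = sum({x}) / len({x})",
        f"var_{x} = sum((v - mu_{x}) ** 2 for v in {x}) / len({x})",
    ]


def numeric_fallback(p: Problem) -> Optional[Code]:
    # Numeric solver: emits a call to a placeholder optimizer
    # (numeric_optimize is hypothetical and not defined here).
    if len(p.factors) != 1:
        return None
    dist, x = p.factors[0]
    return [f"theta_{x} = numeric_optimize('{dist}', {x})"]


SCHEMAS: List[Schema] = [decompose_independent, gaussian_closed_form, numeric_fallback]


def synthesize(p: Problem) -> Code:
    """Return the code produced by the first applicable schema."""
    for schema in SCHEMAS:
        code = schema(p)
        if code is not None:
            return code
    raise ValueError(f"no schema applies to {p}")


if __name__ == "__main__":
    problem = Problem((("gauss", "x"), ("poisson", "y")))
    print("\n".join(synthesize(problem)))
```

In this sketch the order of SCHEMAS doubles as a crude search strategy: decomposition is tried first, the closed-form solution second, and the numeric fallback last. A real synthesis system would have to check its applicability conditions against the graphical model rather than against a flat list of factors.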

In the talk, I will discuss some examples and their derivations in more detail and demonstrate the system "live". I will also discuss the role of term rewriting techniques in the symbolic system and in code optimization.

AutoBayes is joint work with W. Buntine, J. Schumann, and J. Whittle.