Experimentation Project

Title Mining Mice
Student 1 or 2 students
Supervisor Arno Siebes
ECTS 7.5 or 15
Related Course(s) Advanced Data Mining
Description

The relation between the phenotype (what you can see) and the genotype (the genetic make-up) of a species is one of the holy grails of life science research. For example, what is the relation between the genotype of a mouse and its behaviour such as fear and depression?

To study this, the research group "Emotion and Cognition" of the Faculty of Veterinary Sciences uses a so-called PhenoLab and different strains of mice. In the PhenoLab, a mouse is automatically and continuously observed. The measurements include, e.g., the position of the mouse; the experimental condition is registered as well. Since the different strains of mice differ in their genotype, these streams of measurements should make it possible to answer questions relating genotype and behaviour.

The amount of data generated is huge: many millions of data points. Moreover, the relation between the data and the questions is indirect. You don't measure whether the mouse is happy or not, you measure how it moves. In other words, answering the questions is a problem area in which data mining should be able to help.

The goal of this experimentation project is to see whether this is true or not: can data mining help? For this we will use real data from the experiments. It seems reasonable to start by looking for (frequent) patterns in the movements of the mouse. These patterns can then be used as features for further analysis.
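
As an illustration of what looking for frequent movement patterns could mean in practice, here is a minimal Python sketch. It is not the method used in the project. It assumes the position stream is a list of (x, y) coordinates, maps each position to a grid cell, and counts how often each short sequence of consecutive cells occurs. The function names, the grid discretisation and the parameters cell_size, length and min_support are all hypothetical choices made for this example.

```python
from collections import Counter


def to_zones(positions, cell_size):
    """Map (x, y) positions to grid cells, collapsing consecutive repeats.

    The grid discretisation is an assumption for this sketch; the real
    data may call for a different notion of "zone".
    """
    zones = []
    for x, y in positions:
        cell = (int(x // cell_size), int(y // cell_size))
        if not zones or zones[-1] != cell:
            zones.append(cell)
    return zones


def frequent_patterns(zones, length, min_support):
    """Return every run of `length` consecutive zones seen at least `min_support` times."""
    counts = Counter(
        tuple(zones[i:i + length]) for i in range(len(zones) - length + 1)
    )
    return {pattern: n for pattern, n in counts.items() if n >= min_support}


# Hypothetical toy track: the mouse moves back and forth between two cells.
track = [(1, 1), (2, 1), (12, 1), (13, 2), (1, 1), (11, 1), (2, 2), (12, 3)]
zones = to_zones(track, cell_size=10)
patterns = frequent_patterns(zones, length=2, min_support=3)
```

The counts of such patterns per mouse could then serve as features, for instance to compare strains. Real streams would likely need smarter discretisation and a proper frequent-pattern miner rather than this fixed-length count.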

This experimentation project is very much a real-world data miner's experience. The veterinarians do want answers to their questions and are therefore not only sharing their data, but will also invest their time in helping us along. Better than usual: they are also knowledgeable about data mining. In other words, the problems will be studied in a multi-disciplinary setting.

Special Note Clearly, the student who performs this experimentation project must have followed a data mining course. You don't have to know about genes, mice or behaviour. Some interest in the problem area, on the other hand, seems indispensable.