Exceptional Model Mining

In most databases, it is possible to identify small partitions of the data where the observed distribution is notably different from that of the database as a whole. In classical subgroup discovery, one considers the distribution of a single nominal attribute, and exceptional subgroups show a surprising increase in the occurrence of one of its values.
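As an illustration of the classical setting, the sketch below scores one candidate subgroup with weighted relative accuracy (WRAcc), a commonly used subgroup discovery quality measure. The text above does not name a specific measure, so the choice of WRAcc, the function name `wracc`, and the toy data are all assumptions made for this example.

```python
def wracc(rows, in_subgroup, is_target):
    """Weighted relative accuracy of a subgroup for one target value.

    WRAcc = P(subgroup) * (P(target | subgroup) - P(target)).
    Positive values mean the target value occurs more often inside
    the subgroup than in the database as a whole.
    """
    n = len(rows)
    covered = [r for r in rows if in_subgroup(r)]
    if n == 0 or not covered:
        return 0.0  # an empty subgroup is never interesting
    p_sub = len(covered) / n
    p_target = sum(1 for r in rows if is_target(r)) / n
    p_target_sub = sum(1 for r in covered if is_target(r)) / len(covered)
    return p_sub * (p_target_sub - p_target)


# Hypothetical toy database: (age_group, bought)
rows = [
    ("young", "yes"), ("young", "yes"), ("young", "yes"), ("young", "no"),
    ("old", "no"), ("old", "no"), ("old", "yes"), ("old", "no"),
]

# Subgroup "age_group = young", target value "bought = yes"
score = wracc(rows, lambda r: r[0] == "young", lambda r: r[1] == "yes")
print(score)
```

Here the subgroup covers half of the rows and raises the share of "yes" from 0.5 to 0.75, so its score is positive; a subgroup whose target distribution matches the whole database would score zero.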

We introduce Exceptional Model Mining (EMM), a framework that allows for more complicated target concepts. Rather than finding subgroups based on the distribution of a single target attribute, EMM fits a model over several target attributes and finds subgroups where the model fitted to the subgroup differs substantially from the model fitted to the rest of the data or to the database as a whole.

In the first paper (Leman et al., 2008), we discussed regression as well as classification models, and defined quality measures that quantify how exceptional the model fitted to a given subgroup is. The framework is general enough to be applied to many other types of models, including those from other paradigms such as association analysis and graphical modeling.
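To make the idea of a model-based quality measure concrete, the following minimal sketch fits a simple linear regression inside a subgroup and on its complement, and uses the absolute difference in slope as the score. This is an illustrative stand-in only, not one of the quality measures defined in the papers listed below; the names `ols_slope` and `slope_difference`, the `min_size` threshold, and the toy data are assumptions.

```python
def ols_slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("x has zero variance; slope is undefined")
    return sxy / sxx


def slope_difference(rows, in_subgroup, min_size=2):
    """Illustrative quality: |slope in subgroup - slope in complement|.

    Each row is (description, x, y). Only the description is used to
    decide subgroup membership; x and y are the model's target attributes.
    """
    inside = [(x, y) for d, x, y in rows if in_subgroup(d)]
    outside = [(x, y) for d, x, y in rows if not in_subgroup(d)]
    if len(inside) < min_size or len(outside) < min_size:
        return 0.0  # too few rows to fit a line on both sides
    return abs(ols_slope(inside) - ols_slope(outside))


# Hypothetical toy database: (region, x, y)
rows = [
    ("A", 1, 2), ("A", 2, 4), ("A", 3, 6),   # y rises with x
    ("B", 1, 9), ("B", 2, 8), ("B", 3, 7),   # y falls with x
]

quality = slope_difference(rows, lambda region: region == "A")
print(quality)
```

A raw slope difference ignores subgroup size and the uncertainty of each fit, so a practical measure would typically also account for those, for example through a statistical test.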

The project is a joint effort of LIACS at Universiteit Leiden and ADA at Universiteit Utrecht. The following people are currently involved:

Selected Refereed EMM Publications

2012
Duivesteijn, W., Loza Mencía, E., Fürnkranz, J., Knobbe, A. Multi-label LeGo - Enhancing Multi-label Classifiers with Local Patterns. In: Proceedings IDA 2012, 2012.
Duivesteijn, W., Feelders, A., Knobbe, A. Different Slopes for Different Folks - Mining for Exceptional Regression Models with Cook's Distance. In: Proceedings KDD 2012, 2012.
van Leeuwen, M. & Knobbe, A.J. Diverse Subgroup Set Discovery. In: Data Mining and Knowledge Discovery, special issue ECMLPKDD'12, pp 208-242, Springer, 2012.
2011
Duivesteijn, W., Knobbe, A. Exploiting False Discoveries - Statistical Validation of Patterns and Quality Measures in Subgroup Discovery. In: Proceedings ICDM 2011, 2011.
van Leeuwen, M. & Knobbe, A.J. Non-Redundant Subgroup Discovery in Large and Complex Data. In: Proceedings of the European Conference on Machine Learning and Principles and Practices of Knowledge Discovery in Data 2011 (ECML PKDD'11), 2011.
2010
Duivesteijn, W., Knobbe, A.J., Feelders, A. & van Leeuwen, M. Subgroup Discovery meets Bayesian networks – an Exceptional Model Mining approach. In: Proceedings of the 10th IEEE International Conference on Data Mining (ICDM'10), 2010.
van Leeuwen, M. Maximal Exceptions with Minimal Descriptions. In: Data Mining and Knowledge Discovery, special issue ECMLPKDD'10, vol.21(2), pp 259-276, Springer, 2010.
2008
Leman, D., Feelders, A. & Knobbe, A. Exceptional Model Mining. In: Proceedings of the European Conference on Machine Learning and Principles and Practices of Knowledge Discovery in Data 2008 Part II (ECML PKDD'08), pp 1-16, 2008.

Selected Unrefereed EMM Publications

2011
Knobbe, A., Feelders, A., Leman, D. Exceptional Model Mining. In: Data Mining: Foundations and Intelligent Paradigms 2, Holmes, D., Jain, L. (eds.), 2011.