Master Thesis Project
|Title||Data Mining with Monotonicity Constraints|
A common form of prior knowledge in data mining concerns the monotonicity of relations between the dependent and explanatory variables. For example, economic theory would state that, all else equal, people tend to buy less of a product if its price increases. The precise functional form of this relationship is however not specified by theory.
Monotonicity may also be an important model requirement with a view toward explaining and justifying decisions, such as acceptance/rejection decisions. Consider for example a university admission procedure where candidate A scores at least as good on all admission criteria as candidate B, but A is refused whereas B is admitted. Such a nonmonotone admission rule would clearly be unacceptable. Similar considerations apply to selection procedures for applicants for e.g. a job or a loan.
Although the monotonicity requirement is quite common in practice, most data mining algorithms are not able to enforce such a constraint. This is also true for common classification and regression tree algorithms such as C4.5/C5 and CART. We have developed an algorithm for learning monotone classification trees from monotone as well as nonmonotone data (see M. Pardoel: Data Mining with classification trees for problems with monotonicity constraints. INF/SCR-03-15, May 2003). We want to extend this work in several directions: