Advanced data mining

Website:website containing additional information
Course code:INFOADM
Credits:7.5 ECTS
Period:period 1 (weeks 36 through 45, i.e. 3-9-2009 through 6-11-2009; resit in week 52)
Timeslot:B
Participants:37 registrations so far
Schedule:Note: the schedule is now published in Osiris.
Teachers:This is an old schedule!
form  group  time  week  room  teacher
lecture          Ad Feelders
computer lab          Ad Feelders
Contents:Topics
  • The Knowledge Discovery Process
  • Classification Tree Algorithms
  • Frequent Pattern Mining
  • Graphical Models (including Bayesian Networks)
  • Subgroup Discovery
  • Clustering Algorithms
Literature:Lecture Notes "Advanced Data Mining".
Course form:Lectures and Computer Lab.
Exam form:Written exam and two practical assignments.
Minimum effort to qualify for 2nd chance exam:
Description:Note: the first lecture is on Tuesday, September 15.

The amount of data produced and stored by organisations continues to grow.
This data needs to be processed and analysed to turn it into information and knowledge.
Knowledge thus obtained can improve our understanding and support decision making.
Some problems that data mining can help to solve:

  • For an incoming e-mail message, determine whether it's spam or not.
  • Identify the risk factors for prostate cancer on the basis of clinical and demographic variables.
  • Segment customers into groups with similar characteristics and purchase behaviour.
  • Determine which products customers typically buy together in one transaction.
Learning models from data can be an important part of building an intelligent decision support system. In turn, the computer plays an increasingly important role in data analysis:
through the use of computers, computationally expensive data mining methods can be applied that were not even considered in the early days of statistical data analysis.

In this course we study a number of well-known data mining algorithms. We discuss which types of problems they are suited for, their computational complexity, and how to interpret
and apply the models they produce.
