Department of Information and Computing Sciences


Education in Computer Science and Information Science

Course information: Computer Science and Information Science

Data analytics

Course code: INFOB2DA
Credits: 7.5 ECTS
Period: period 2 (week 46 through week 5, i.e. 9-11-2020 through 5-2-2021; retake in week 16)
Timeslot: B
Participants: 107 enrollments so far
Schedule: The official schedules are available in MyTimetable
Lecturers:
Format    Group    Time  Week  Room  Lecturer
lecture                              Michael Behrisch
tutorial  group 1                    Saba Gholizadeh
                                     Diede van der Hoorn
          group 2                    Hessel Laman
          group 3                    Anneloes Meijer
          group 4                    Simardeep Singh
Content: In this data analytics course, you will learn to:
  1. Evaluate different Data Analysis (DA) processes and their differentiating key aspects.
  2. Apply selected techniques and algorithms to a data set from a task-oriented perspective.
  3. Analyze semi-structured and unstructured data, for example using text analysis.
  4. Use external data sources in analyses to derive new insights.
  5. Relate the potential negative impact of data quality problems.
  6. Use principles of human perception and cognition in visualization design.
  7. Conceptualize ideas and interaction techniques using sketching and prototyping.
  8. Apply methods for visualization of data from a variety of fields.
  9. Create web-based interactive visualizations using D3.
  10. Work constructively as a member of a team to carry out a complex project.
For an overview of the schedule, lecture topics, and assignments, please see the course documentation on our website https://viguu.gitlab.io/infob2da/ (overview, structure, schedule) and Blackboard (operational matters and assignments).
Literature:

Several textbooks are used in this course.

A central reference is Peng and Matsui (2016), which is available as a PDF, e-book, or paperback; you can also read the latest version online at https://bookdown.org/rdpeng/artofdatascience/. Further resources are:
  • Han J., Kamber M., Data Mining: Concepts and Techniques, 2006, Morgan Kaufmann Publishers, Second Edition
  • Berthold M., Borgelt C., Höppner F., Klawonn F., Guide to Intelligent Data Analysis: How to Intelligently Make Sense of Real Data: Making Practical Sense of Real Data (Texts in Computer Science), 2010 Springer
  • Hand D.J., Mannila H., Smyth P., Principles of Data Mining, 2001, MIT Press
  • Spence R., Information Visualization, 2007, ACM Press Books, Second Edition
  • Ward M., Grinstein G., Keim D. A., Interactive Data Visualization: Foundations, Techniques, and Applications, 2010, A.K. Peters, Ltd, ISBN: 978-1-56881-473-5, http://www.idvbook.com
  • Munzner T., Visualization Analysis and Design, 2014, CRC Press
  • Murray S., Interactive Data Visualization for the Web, 2017, O’Reilly, Second Edition (the second edition teaches D3 version 4, which we will be using in this course)
Many of the resources are available for free online; you can also buy copies.
Teaching format: There are nine assignments in total. These assignments will be graded and contribute significantly to the final grade. You must achieve at least 50% of the points to be eligible to take the final exam.
Assessment: The final grade will be determined based on the following course components:
  1. Assignments: 30%

  2. Final exam: 70%

Note that to be admitted to the final exam you must score at least 50% of the assignment points.
The minimum grade for the final exam is a 5. If you score between a 4 and a 5 on the final exam, you can repair that score with a high percentage of the assignment points (>85%).
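The rules above can be sketched as a small calculation. This is only an illustration under stated assumptions, not the official grading procedure: it assumes grades on a 1 to 10 scale, assumes the assignment percentage maps to a grade by dividing by 10, assumes the "between a 4 and a 5" range includes 4, and does not model how a repaired exam score enters the final grade, since the course description does not specify this. The names `Outcome` and `assess` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

ASSIGNMENT_WEIGHT = 0.30  # from the course description
EXAM_WEIGHT = 0.70        # from the course description


@dataclass
class Outcome:
    admitted_to_exam: bool            # at least 50% of assignment points
    exam_requirement_met: bool        # exam >= 5, or repaired via assignments
    weighted_grade: Optional[float]   # None when not admitted to the exam


def assess(assignment_pct: float, exam_grade: float) -> Outcome:
    """Apply the admission, minimum-grade, and weighting rules.

    assignment_pct: share of assignment points earned, 0 to 100.
    exam_grade: final exam grade (assumed 1 to 10 scale).
    """
    admitted = assignment_pct >= 50.0
    if not admitted:
        return Outcome(False, False, None)

    # Exam minimum is a 5; a score from 4 up to 5 can be repaired
    # with more than 85% of the assignment points.
    repaired = 4.0 <= exam_grade < 5.0 and assignment_pct > 85.0
    exam_ok = exam_grade >= 5.0 or repaired

    # Assumption: assignment percentage / 10 gives a grade on the same scale.
    weighted = ASSIGNMENT_WEIGHT * (assignment_pct / 10.0) + EXAM_WEIGHT * exam_grade
    return Outcome(True, exam_ok, round(weighted, 2))


if __name__ == "__main__":
    print(assess(90.0, 4.5))
    print(assess(60.0, 4.5))
    print(assess(40.0, 8.0))
```

For example, `assess(60.0, 4.5)` reports that the exam requirement is not met, because 60% of the assignment points is below the 85% needed to repair an exam grade between 4 and 5.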
Effort requirement for the supplementary test: To be allowed to take the supplementary test, the original result must be at least a 4.
Description:

Applied data analytics is a multidisciplinary field in which you will gain the insights needed to make sense of data, research, and observations from everyday life.

You will learn how to apply a data-driven approach to problem-solving. Beyond tools, methods, techniques, and the latest trends, you will also gain more generic insights: why certain approaches work, why the field is so popular, and what common mistakes are made.

The lectures will provide the theoretical background on how a data analytics process should be performed. We also give an overview of popular data analytics and visualization techniques to help match techniques to information needs, including applications of text mining and data enrichment.

Content:
  • Fundamental Data Mining Methods

  • Data Preparation and Preprocessing

  • Common Analysis Algorithms and Methods

  • Principles of Information Visualization

  • Human Perception and Visualization Design

  • Data Visualization Techniques for Particular Data Types

The lecture is divided into two parts. The first part covers the principal Data Mining methods, with the main focus on Data Preprocessing, Cluster & Outlier Analysis, Classification, and Association Rules. The second part covers the basics of Information Visualization: the foundations of Human Perception and Design Decisions are followed by examples of visualizations of different data sources (Non-Spatial, Temporal, Geo-Spatial, and 3D Spatial Data).

The course will be taught in English.
