Optimization & Vectorization

Universiteit Utrecht - Information and Computing Sciences

academic year 2015/16 – 1st period

News

Recent news

July 30:

  • Site created.

 

Course Overview

Course: INFOMOV is a practical course on optimization: the art of improving software performance without affecting functionality. We apply high-level and low-level optimizations in a structured manner. The low-level optimizations in particular require an intimate understanding of the hardware platform (CPU, GPU, memory, caches), so that code can be modified to use it efficiently.

Vectorization: Modern processors achieve their performance levels using parallel execution. This happens on the thread level, but also on the instruction level. Being able to produce efficient vectorized code is an important factor in achieving peak performance.

Context: Optimization is a vital skill for game engine developers, but also applies to other fields.

Lecturer: Jacco Bikker (j.bikker@uu.nl)

Venue:

  • Mondays, 11:00h - 12:45h,
    Room BBG-023 (starting week #37)
  • Wednesdays, 11:00h - 12:45h,
    Room BBG-083 (starting week #37)

The lecture will be given in English.


Lecture Slides & Recommended Readings

Below is a list of all lectures, each with a brief summary of its topic, slide downloads, and recommended readings to prepare for the lecture.

Lecture 01
Mon Sep 7

Topic: Introduction
This lecture serves as an introduction to the course.

Suggested readings:

TBD

Slides:

(will be available after lecture)

 
Lecture 02
Wed Sep 9

Topic: Profiling
With or without knowledge of optimization, it proves hard to 'guess' application performance bottlenecks. Profiling is a vital first (and often repeated) step in a structured approach to optimization.

Suggested readings:

TBD

Slides:

(will be available after lecture)
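
As a rough illustration of measuring rather than guessing, the sketch below times a piece of work with `std::chrono`. The helper `BestTimeMs` and the workload `SumVector` are hypothetical names introduced here, not course material, and a simple timer like this is only a first step next to a real profiler.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs `work` `runs` times and returns the fastest
// wall-clock time in milliseconds. Taking the minimum reduces noise caused
// by other processes; it does not replace a real profiler.
template <typename Work>
double BestTimeMs(Work work, int runs = 5)
{
    double best = -1.0;
    for (int i = 0; i < runs; i++)
    {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto end = std::chrono::steady_clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(end - start).count();
        if (best < 0.0 || ms < best) best = ms;
    }
    return best;
}

// Hypothetical workload used for illustration: sums a vector of integers.
long long SumVector(const std::vector<int>& data)
{
    long long sum = 0;
    for (int v : data) sum += v;
    return sum;
}
```

A call such as `BestTimeMs([&] { result = SumVector(data); })` returns the best time over five runs. Keeping the result of the work in a variable that is used afterwards helps prevent the compiler from removing the measured code.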

 
Lecture 03
Mon Sep 14

Topic: Low Level Optimization
In this lecture, we explore various low-level factors that determine application performance.

Suggested readings:

TBD

Slides:

(will be available after lecture)

 
Lecture 04
Wed Sep 16

Topic: Caching (1)
Considering the huge latencies involved in fetching data from RAM, caches play a crucial role in 'feeding the beast'. We explore various cache architectures and investigate their implications for software development.

Suggested readings:

TBD

Slides:

(will be available after lecture)

 
Lecture 05
Mon Sep 21

Topic: Caching (2)
Continuation of the topic of the previous lecture.

Suggested readings:

TBD

Slides:

(will be available after lecture)


Lecture 06
Wed Sep 23

Topic: High Level Optimization
Improving algorithmic complexity is often the most effective way to achieve a significant performance improvement. In this lecture we explore several examples.

Suggested readings:

TBD

Slides:

(will be available after lecture)

 
Lecture 07
Mon Sep 28

Topic: SIMD (1)
With CPU clock speeds reaching practical limits, parallelism becomes the main source of further advances in hardware performance. In this lecture, Intel's approach to SIMD programming (SSE) is introduced.

Suggested readings:

TBD

Slides:

(will be available after lecture)


Lecture 08
Wed Sep 30 

Topic: SIMD (2)
Building on the concepts of lecture 7, we investigate advanced SIMD topics such as gather / scatter and masking.

Suggested readings:

TBD

Slides:

(will be available after lecture)


Lecture 09
Mon Oct 5

Topic: Fixed Point Math
Floating point calculations can often be done with integer arithmetic, and there are good reasons for doing so. In this lecture, the 'lost art' of fixed point arithmetic is introduced.

Suggested readings:

TBD

Slides:

(will be available after lecture)
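
To make the idea of doing fractional math with integers concrete, here is a minimal sketch of 16.16 fixed point arithmetic. The format choice (16 integer bits, 16 fractional bits) and all names (`Fixed`, `FromFloat`, `Mul`, `Div`) are assumptions for illustration; the lecture may use a different representation.

```cpp
#include <cassert>
#include <cstdint>

// 16.16 fixed point: the upper 16 bits hold the integer part,
// the lower 16 bits hold the fraction. All names are illustrative.
using Fixed = std::int32_t;
constexpr int kFracBits = 16;
constexpr Fixed kOne = 1 << kFracBits; // the value 1.0 in 16.16

constexpr Fixed FromFloat(float f) { return static_cast<Fixed>(f * kOne); }
constexpr float ToFloat(Fixed x) { return static_cast<float>(x) / kOne; }

// Addition and subtraction work directly on the raw integers.
constexpr Fixed Add(Fixed a, Fixed b) { return a + b; }

// A product has 32 fractional bits: widen to 64 bits, then shift back.
// Note: right-shifting a negative value is arithmetic on common compilers,
// but is only guaranteed by the standard from C++20 onwards.
constexpr Fixed Mul(Fixed a, Fixed b)
{
    return static_cast<Fixed>((static_cast<std::int64_t>(a) * b) >> kFracBits);
}

// Division: scale the dividend first so the quotient keeps 16 fractional bits.
// The divisor b must be non-zero.
constexpr Fixed Div(Fixed a, Fixed b)
{
    return static_cast<Fixed>((static_cast<std::int64_t>(a) * kOne) / b);
}
```

For example, `Mul(FromFloat(1.5f), FromFloat(2.0f))` yields the same raw value as `FromFloat(3.0f)`. Range and precision are fixed by the chosen format, so overflow has to be considered by the programmer.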

 
Lecture 10
Wed Oct 7

Topic: GPGPU (1)
For certain problems, a streaming processor is a good (and powerful) alternative to the CPU. In this lecture, we briefly explore GPU architecture and the concept of GPGPU.

Suggested readings:

TBD

Slides:

(will be available after lecture)


Lecture 11
Mon Oct 12

Topic: GPGPU (2)
Building on the previous lecture, we investigate some GPGPU-specific algorithms for common problems.

Suggested readings:

TBD

Slides:

(will be available after lecture)

 
Lecture 12
Wed Oct 14

Topic: Presentations
In preparation for the final assignment, you are invited to (briefly) present an optimization problem, along with literature on the topic. Use this session for peer feedback and inspiration.

Suggested readings:

TBD

Slides:

(will be available after lecture)

 
Lecture 13
Mon Oct 19

Topic: Optimizing for GPU
Like CPU software, GPGPU code benefits from hardware-specific optimizations. Several examples for AMD and NVIDIA are explored.

Suggested readings:

TBD

Slides:

(will be available after lecture)

 
Lecture 14
Wed Oct 21

Topic: Presentations
In preparation for the final assignment, you are invited to (briefly) present an optimization problem, along with literature on the topic. Use this session for peer feedback and inspiration.

Suggested readings:

TBD

Slides:

(will be available after lecture)

 
Lecture 15
Mon Oct 26

Topic: Process & Grand Recap
Successful software optimization is the result of a structured and deliberate process. We review this process and recap the course as a whole.

Suggested readings:

TBD

Slides:

(will be available after lecture)

 
Lecture 16
Wed Oct 28

Topic: TBD
This lecture slot is reserved for topic requests.

Suggested readings:

TBD

Slides:

(will be available after lecture)

 
   

Course Schedule

Period 1 Schedule

Week  Date                     Lecture / Exams                      Practicum Deadlines                                  Literature
37    Mon Sep 7, 11:00-12:45   Lecture 1: Introduction
      Wed Sep 9, 11:00-12:45   Lecture 2: Profiling
38    Mon Sep 14, 11:00-12:45  Lecture 3: Low level optimization
      Wed Sep 16, 11:00-12:45  Lecture 4: Caching (1)
39    Mon Sep 21, 11:00-12:45  Lecture 5: Caching (2)
      Wed Sep 23, 11:00-12:45  Lecture 6: High level optimization
40    Mon Sep 28, 11:00-12:45  Lecture 7: SIMD (1)                  Tue Sep 29: Deadline Assignment 1 "Cache Simulator"
      Wed Sep 30, 11:00-12:45  Lecture 8: SIMD (2)
41    Mon Oct 5, 11:00-12:45   Lecture 9: Fixed Point Math
      Wed Oct 7, 11:00-12:45   Lecture 10: GPGPU (1)
42    Mon Oct 12, 11:00-12:45  Lecture 11: GPGPU (2)                Tue Oct 13: Deadline Assignment 2 "SIMD"
      Wed Oct 14, 11:00-12:45  Lecture 12: Presentations
43    Mon Oct 19, 11:00-12:45  Lecture 13: Optimizing for GPU
      Wed Oct 21, 11:00-12:45  Lecture 14: Presentations
44    Mon Oct 26, 11:00-12:45  Lecture 15: Process & Grand Recap
      Wed Oct 28, 11:00-12:45  Lecture 16: LAB (final assignment)
45                                                                  Thu Nov 5: Deadline Final Assignment



Assignment P1

For the first assignment, you will implement an accurate cache simulator for a current architecture (AMD and/or Intel). The simulator is configured by specifying sizes and latencies for L1, L2, L3 and RAM (and optionally, the eviction policy), and interfaces with an application using a READ and WRITE function for each valid datum size (8, 16, 32, 64 and 128 bit).
Deliver the cache simulator and a test application, along with a (brief) report that describes the simulator.
For this project, you may either work alone or together with one other student.

Deadline: Tuesday, September 29, 23.59.
Late delivery: Wednesday, September 30, 23.59 (1pt penalty).
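
As a starting point for thinking about the simulator's interface, the sketch below models a single direct-mapped cache level that tracks tags only and counts hits, misses and cycles. Everything in it is an assumption made for illustration: the class name `CacheLevel`, the `Read` / `Write` signatures taking an address and a size in bytes, the direct-mapped organization, and the requirement that an access does not cross a line boundary. The assignment itself asks for L1, L2, L3 and RAM, which this sketch does not cover.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-level, direct-mapped cache model. Only tags are
// tracked; no data is stored. Names and parameters are illustrative.
class CacheLevel
{
    struct Line { bool valid = false; std::uint64_t tag = 0; };

public:
    // lineBytes must be non-zero and sizeBytes at least one line.
    CacheLevel(std::size_t sizeBytes, std::size_t lineBytes,
               int hitCycles, int missCycles)
        : lineBytes_(lineBytes), hitCycles_(hitCycles), missCycles_(missCycles),
          lines_(sizeBytes / lineBytes)
    {
        assert(!lines_.empty());
    }

    // Reads and writes are treated identically here (write-allocate).
    void Read(std::uint64_t address, std::size_t bytes) { Access(address, bytes); }
    void Write(std::uint64_t address, std::size_t bytes) { Access(address, bytes); }

    std::uint64_t Hits() const { return hits_; }
    std::uint64_t Misses() const { return misses_; }
    std::uint64_t Cycles() const { return cycles_; }

private:
    void Access(std::uint64_t address, std::size_t bytes)
    {
        // Assumption: an access never straddles two cache lines.
        assert(bytes > 0 && address % lineBytes_ + bytes <= lineBytes_);
        const std::uint64_t lineAddr = address / lineBytes_;
        Line& slot = lines_[lineAddr % lines_.size()];
        const std::uint64_t tag = lineAddr / lines_.size();
        if (slot.valid && slot.tag == tag)
        {
            hits_++;
            cycles_ += hitCycles_;
            return;
        }
        // Miss: the new line replaces whatever occupied the slot.
        slot.valid = true;
        slot.tag = tag;
        misses_++;
        cycles_ += missCycles_;
    }

    std::size_t lineBytes_;
    int hitCycles_;
    int missCycles_;
    std::vector<Line> lines_;
    std::uint64_t hits_ = 0, misses_ = 0, cycles_ = 0;
};
```

With this model, two consecutive reads inside the same line produce one miss followed by one hit, and an access to an address that maps to the same slot with a different tag evicts the earlier line. Latency values passed to the constructor are placeholders to be replaced with figures for the architecture being simulated.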

Assignment P2

For this assignment, you will be provided with a working (and well-optimized) application: "particles". Optimize the particle simulation loop using SSE and/or AVX to improve performance.
For this project, you are required to work alone.

Deadline: Tuesday, October 13, 23.59.
Late delivery: Wednesday, October 14, 23.59 (1pt penalty).
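
The "particles" code is not shown here, so the following is only a sketch of what vectorizing a simulation loop with SSE can look like. It assumes a hypothetical layout in which positions and velocities are stored as separate float arrays, and the function names `IntegrateScalar` and `IntegrateSSE` are invented for this example. The intrinsics used (`_mm_set1_ps`, `_mm_loadu_ps`, `_mm_mul_ps`, `_mm_add_ps`, `_mm_storeu_ps`) are standard SSE and require an x86 or x86-64 target.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h> // SSE intrinsics (x86 / x86-64 only)

// Scalar reference: advance each position by velocity * dt.
void IntegrateScalar(float* pos, const float* vel, std::size_t count, float dt)
{
    for (std::size_t i = 0; i < count; i++) pos[i] += vel[i] * dt;
}

// SSE version: processes four floats per iteration and finishes the
// remaining elements (count not divisible by four) with scalar code.
void IntegrateSSE(float* pos, const float* vel, std::size_t count, float dt)
{
    const __m128 dt4 = _mm_set1_ps(dt); // broadcast dt to all four lanes
    std::size_t i = 0;
    for (; i + 4 <= count; i += 4)
    {
        const __m128 p = _mm_loadu_ps(pos + i); // unaligned loads are allowed
        const __m128 v = _mm_loadu_ps(vel + i);
        _mm_storeu_ps(pos + i, _mm_add_ps(p, _mm_mul_ps(v, dt4)));
    }
    for (; i < count; i++) pos[i] += vel[i] * dt;
}
```

Keeping the scalar version around makes it easy to check that the vectorized loop produces the same results. Unaligned loads are used so the sketch works for any array; aligned data and aligned loads may be preferable in practice.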
 

Assignment P3

To be announced and specified later in more detail.
Tentative description:
For this project you (and up to 1 other student) will choose an existing program (such as an open source project) and optimize it using all available means.

Deadline: Thursday, November 5, 23.59.
Late delivery: Friday, November 6, 23.59 (1pt penalty). 

Exam & Grading

GRADING

Programming assignments:

Final grade:

RETAKES AND REQUIREMENTS

Retake

 

Literature & Links

Overview of literature used during this course:

 

News Archive

Old posts

Nothing here yet.