In the third and final assignment, you will design and implement an image processing pipeline to recognize shapes of a particular object in images. You are free to decide which object you want to recognize, and under which circumstances. There are also some "examples" available, with some levels of additional difficulty (see below). Of course, you can use the functions that you have implemented in the previous two assignments.
Choose the object you'd like to detect wisely! Realize that factors such as the viewpoint, lighting, (partial) occlusions, noise and variations in the depiction of an object can make its detection much harder. Be realistic, but don't make it too easy for yourself. Stock footage or images on white backgrounds are too simple. When in doubt, please check on Slack.
When choosing your own object, please complete the table below.
|Aspect|Your choice|Example values|
|---|---|---|
|Minimum/maximum size||All/Minimum size/Maximum size|
|Lighting variations||Indoors/Outdoors/No direct sunlight|
|Rotation variations||Only upright/Somewhat rotated/Fully rotated|
|Occlusion||None/Partial/More than half|
|Other||No noise/Specific color/Object viewed frontally|
Your pipeline should be able to detect the specific object in a range of images in the given context. Detection does not have to be perfect, but you need to explain why your pipeline sometimes fails.
Importantly, do not choose a "sprite-style" object such as a specific Pokémon that you can only detect well with image comparison techniques. These techniques are only allowed in the refinement phase (see also under "image processing pipeline").
Collect at least 10 images containing your object, in the context specified. Make sure you have a varied set. E.g., if you are going to detect beer bottles with occlusions in a range of light settings, do not only include images of beer bottles with white backgrounds from a supermarket website. Also include similar images without the object of interest. E.g., include images with wine bottles. We will find and try images when grading your assignment, according to your specified context.
Images should be grayscale or colored, not binary. Images should be real pictures or drawings.
Image processing pipeline
Two (or three) "phases" are needed in your pipeline:
- Pre-processing deals with point operators and filtering to arrive at an abstracted image that can be used for further processing. For example, you can increase contrast, reduce noise and locate edges.
- Object recognition is the process of localizing objects of a particular type. You can use, for example, the Hough transform, Harris corners, shape labeling and image descriptors. E.g., you can detect round traffic signs from the edge map using the Hough transform, or calculate the circularity of a shape extracted using morphological filters.
- Refinement is an optional step to further filter your results or provide more details of the object. This is the only step where you are allowed to use image comparison techniques and color. E.g., you can differentiate between blue and red-white traffic signs, or recognize the speed limit.
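Two of the operations mentioned above can be sketched in a few lines. The following is a minimal illustration in plain Python, not the course's reference implementation: it assumes a grayscale image stored as a list of rows of integers, and it assumes one common definition of circularity, 4πA/P², which the assignment text does not prescribe. The function names `stretch_contrast` and `circularity` are hypothetical.

```python
import math


def stretch_contrast(image, low=0, high=255):
    """Pre-processing sketch: linearly rescale grey values so that the
    darkest pixel maps to `low` and the brightest pixel maps to `high`."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image has no contrast to stretch.
        return [[low for _ in row] for row in image]
    scale = (high - low) / (hi - lo)
    return [[round(low + (v - lo) * scale) for v in row] for row in image]


def circularity(area, perimeter):
    """Recognition sketch (assumed definition): 4*pi*A / P^2.
    Equals 1.0 for an ideal circle and is lower for less round shapes."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / (perimeter ** 2)
```

Under this definition a square with side 2 (area 4, perimeter 8) has circularity π/4, roughly 0.785. On real images the area and perimeter are estimated from pixels, so even round shapes will usually score somewhat below 1; any threshold you pick is a parameter worth discussing in your report.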
For each input image, the output of your image processing pipeline is a grayscale or color image where zero (black) means no object and a non-zero value means that the pixel belongs to a detected object. Each detected object must be represented by its own specific value, and these values should span the full range of grey or color values.
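One possible reading of this output format is sketched below, assuming the recognition phase produces a label image in which 0 is background and every detected object has its own positive integer label. The function name `labels_to_grey` and the list-of-rows image representation are assumptions for illustration only.

```python
def labels_to_grey(label_image, max_value=255):
    """Map object labels to distinct grey values spread over 1..max_value.
    Background (label 0) stays 0 (black)."""
    labels = sorted({v for row in label_image for v in row if v != 0})
    n = len(labels)
    if n > max_value:
        raise ValueError("more objects than available grey values")
    mapping = {0: 0}
    for rank, label in enumerate(labels, start=1):
        # Spread the objects evenly; the last object gets max_value.
        mapping[label] = round(rank * max_value / n)
    return [[mapping[v] for v in row] for row in label_image]
```

With three detected objects this assigns the grey values 85, 170 and 255, so the objects remain clearly distinguishable when the output image is viewed.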
Hand in a report that contains:
- A description of the object class you will detect, including the context table
- A brief explanation of the pipeline and the processes in it (around 2 pages, with images)
- A reflection on your choice of parameters, and how it affects the results
- Example output, with some good and some bad detections. Discuss these results
Your report should be around 4-5 pages.
Besides the quality of results, the originality of your application, ingeniousness of your solution and relative difficulty of the chosen problem will be the major grading factors. You can access the detailed grading system that will be used here.
Submission
Submit the following through http://www.cs.uu.nl/docs/submit (select Image Processing Assignment 3):
- your code (NO binaries/libs)
- your report
- at least 10 images with your object class. Some should not give the desired output, some should contain distractor objects
Deadline
The deadline is Sunday November 12, 23:00. One full point will be deducted for submissions handed in within one day after the deadline (you will have to email Ronald Poppe). No further extensions to the deadline will be given. There is NO re-take for the assignments.
Questions/Contact
The assistants for this course are available to answer your questions and to provide guidance about your project. Contact them through the INFOIBV2017 Slack team. There are walk-in sessions where the student assistants can help you.
Example object classes
Choosing one of the object classes below will reduce your score for originality.
Snooker balls - top and side shots of snooker tables. Detect all balls. Balls can be partially occluded, have shadows or reflections. Refinement by (A) identifying the striped and solid balls, (B) identifying the color of each.
Analogue clocks - clocks on walls and churches, viewed somewhat frontally. Clocks should contain a circle. No occlusions, but different types of hands, backgrounds and numbering on the clock. Refinement by (A) finding the large (minute) and small (hour) hand.
Hot air balloons - typically balloon-shaped, in the air or inflated but still on the ground. Backgrounds could contain clouds, trees and other objects. Balloons can have any color and print, and can be partially occluded by others.
Round traffic signs - blue and white-red traffic signs. (Remember, no object finding based on color, except for refinement.) Refinement by (A) differentiating between blue and white-red signs, (B) identifying what the sign is (e.g. speed limit, other).
Triangular traffic signs - triangular traffic signs, either point up or point down. (Remember, no object finding based on color, except for refinement.) Refinement by (A) identifying what the sign is.
Stop traffic signs - octagonal traffic signs with "stop" in white text. (Remember, no object finding based on color, except for refinement.)
Beer bottles - Common beer bottles with or without labels (not only frontal), with some occlusions, shadows and reflections. Refinement by (A) detecting specific brands.