CMPSCI 370: Introduction to Computer Vision
Instructor: Subhransu Maji
University of Massachusetts, Amherst
January 19, 2016
Administrivia
Lectures: Tuesday/Thursday, 11:30 - 12:45, Hasbrouck 113
Honors section: Tuesday, 4:00 - 5:00, CS 142
Instructor: Subhransu Maji
Office hours: Monday, 3:00 - 5:00, CS 274
Website: http://www-edlab.cs.umass.edu/~smaji/cmpsci370/
News, lecture slides, etc. (check regularly)
Homework submission via Moodle
Administrivia
Grading policy:
370: homework (60%), mid-term (15%), final (25%)
370HH: homework (45%), mid-term (10%), final (15%), project (20%)
Homework:
5 in total (expect one every two weeks)
First one will be posted this Thursday
Course textbooks (recommended):
Richard Szeliski, Computer Vision: Algorithms and Applications (available online as pdf) - readings will be from this
Necessary background: linear algebra, calculus, probability, programming in Matlab (image toolbox needed)
Homework #0
1. Figure out a way to run Matlab
Obtain a student copy (Matlab suite, $99)
Your lab machines might have it
2. Learn how to program in Matlab
Plenty of online resources (the course website lists some)
Alternatives: Python, Octave, Java, C++, ...
Question: How many of you are familiar with Matlab?
Before we start
Are there any questions?
Why Vision?
It is how we see other people, navigate our environment, communicate ideas, entertain, and measure the world around us.
Remote sensing, microscopy, surveillance, 3D analysis / navigation
Source: A. Berg
Light! Why is light good for measurement?
Plentiful, sometimes free
Interacts with many things, but not too many
Goes generally straight over distance
Very small → high spatial resolution
Fast, but not too fast → time of flight sensors
Easy to detect → cameras work, are cheap
Comes in many flavors (wavelengths)
Source: A. Berg
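The time-of-flight remark above can be made concrete with a small sketch. It assumes an idealized sensor that measures the round-trip time of a light pulse travelling in a straight line; the function name and the example timing are illustrative and do not come from the course material.

```python
# Speed of light in vacuum, in meters per second (exact by definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_distance_m(round_trip_s: float) -> float:
    """Distance to a surface from the round-trip time of a light pulse.

    Hypothetical helper: assumes an idealized sensor and straight-line travel.
    """
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse travels to the surface and back, so halve the total path.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


if __name__ == "__main__":
    # A pulse that returns after 20 nanoseconds hit a surface about 3 m away.
    print(f"{tof_distance_m(20e-9):.3f} m")
```

Light covers roughly 30 cm per nanosecond, so resolving depth to a centimeter needs timing on the order of tens of picoseconds. That is demanding but achievable, which is one way to read "fast, but not too fast".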
The goal of computer vision
Extract properties of the world from visual data (i.e., measurements of light)
We are remarkably good at this!
An experiment #1: animal or not?
An experiment #2: animal or not?
An experiment #3: animal or not?
An experiment #4: animal or not?
An experiment #5: animal or not?
An experiment #6: animal or not?
The images: #1 #2 #3 #4 #5 #6
Human vision
Amazingly good, fast and accurate
Sometimes wrong, but often not in doubt
Huge amount of bandwidth to the brain is visual data
Large amount of the brain seems to be for processing visual data
Vision is difficult!
Source: A. Berg
But we make mistakes
Checker shadow illusion - Edward H. Adelson
Other optical illusions
Are the horizontal lines parallel?
Are the purple lines straight?
Is this a spiral?
Is the left circle (in the center) bigger?
Are these failures of our vision system?
http://www.illusions.org
Vision as inverse of graphics
Many possibilities: how do we solve this ambiguity?
Images are confusing, but they also reveal the structure of the world through numerous cues
Our job is to interpret the cues!
(following slides from J. Koenderink)
Cues: Linear perspective
Parallel lines merge at the horizon
Analyzing parallel lines to estimate space
Cues: Aerial (Atmospheric) perspective
Scattering of skylight by particles in the air adds to the luminosity
As the distance of the object from the viewer increases, the contrast between the object and its background decreases.
Image credits: Photo by ole Wind; http://kalisdigitalphotos.blogspot.com
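The linear perspective cue can be illustrated with a short sketch. It assumes an ideal pinhole camera at the origin looking along +Z with focal length 1; the rail geometry and all function names are made up for illustration. Points on two parallel 3D lines are projected to the image, and the gap between the projections shrinks as the points move away from the camera.

```python
def project(point, focal=1.0):
    """Pinhole projection of a camera-frame point (x, y, z) with z > 0."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal * x / z, focal * y / z)


def point_on_line(start, direction, t):
    """Point start + t * direction on a 3D line."""
    return tuple(s + t * d for s, d in zip(start, direction))


def vanishing_point(direction, focal=1.0):
    """Image point approached by every 3D line with this direction."""
    dx, dy, dz = direction
    if dz == 0:
        raise ValueError("lines parallel to the image plane have no finite vanishing point")
    return (focal * dx / dz, focal * dy / dz)


# Two parallel rails on a ground plane one unit below the camera (y = -1),
# two units apart, both heading straight away from the camera.
DIRECTION = (0.0, 0.0, 1.0)
LEFT_START = (-1.0, -1.0, 1.0)
RIGHT_START = (1.0, -1.0, 1.0)


def rail_gap(t):
    """Horizontal image distance between the two rails at line parameter t."""
    left = project(point_on_line(LEFT_START, DIRECTION, t))
    right = project(point_on_line(RIGHT_START, DIRECTION, t))
    return right[0] - left[0]


if __name__ == "__main__":
    for t in (0.0, 9.0, 999.0):
        print(f"t = {t:6.1f}   image gap between rails = {rail_gap(t):.4f}")
    print("vanishing point:", vanishing_point(DIRECTION))
```

In this setup the vanishing point lies at image height 0, which is where the ground plane's horizon projects, matching the statement that parallel lines merge at the horizon.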
Cues: Occlusion ordering
Cues: Texture gradient
Gustave Caillebotte. Paris Street, Rainy Day, 1877, Art Institute of Chicago
Chicago loop, image source: wikipedia
Cues: Shading and Lighting
The four seasons sculpture set
Many other cues
Motion parallax: how things move relative to each other as we move. Objects near us move more than objects far away. Also provides grouping cues.
Familiar size: The size of known things, e.g., faces, gives us an estimate of the depth.
Defocus blur: Far-away objects are blurrier than nearer ones. Commonly used in photographs to create a perception of depth.
Elevation: Distance from the horizon. Objects closer to the horizon are perceived to be farther.
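Two of the cues above, familiar size and motion parallax, have simple closed forms under an ideal pinhole camera. The sketch below assumes that model, a focal length expressed in pixels, and a camera that translates sideways, parallel to the image plane. The function names and all numbers (focal length, face height, camera shift) are hypothetical values chosen for illustration.

```python
def depth_from_familiar_size(real_size_m, image_size_px, focal_px):
    """Depth of an object of known size: Z = f * H / h."""
    if image_size_px <= 0:
        raise ValueError("image size must be positive")
    return focal_px * real_size_m / image_size_px


def parallax_shift_px(depth_m, camera_shift_m, focal_px):
    """Image shift of a static point after a sideways camera move: f * T / Z."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * camera_shift_m / depth_m


if __name__ == "__main__":
    focal_px = 800.0        # assumed focal length in pixels
    face_height_m = 0.24    # assumed typical face height

    # Familiar size: the smaller a face appears, the farther away it is.
    for face_px in (96.0, 48.0, 24.0):
        depth = depth_from_familiar_size(face_height_m, face_px, focal_px)
        print(f"face spans {face_px:5.1f} px -> depth about {depth:.1f} m")

    # Motion parallax: near points shift more than far points.
    for depth_m in (2.0, 20.0):
        shift = parallax_shift_px(depth_m, camera_shift_m=0.1, focal_px=focal_px)
        print(f"point at {depth_m:4.1f} m shifts about {shift:.1f} px")
```

Both formulas come from the same similar-triangles relation, which is why image size and image motion each fall off in proportion to 1 / depth.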
The study of computer vision
Lots of tasks: detection, classification, segmentation, pose estimation, depth estimation, etc.
Problems are often ill-posed. Most of the hard work is in crisply defining the problem you wish to solve.
It is hard, ad-hoc. There are few theorems, but we rely on those from many other areas: optics, geometry, physics, etc.
You are in good company: Euclid, Alhazen, da Vinci, Kepler, Galileo, Descartes, Newton, Huygens, Maxwell, Helmholtz, Mach, Hering, Cajal, Minkowski, Hubel & Wiesel, Wald
If that is not enough, there are many applications
(following slides from Charless Fowlkes)
Optical character recognition (OCR)
Digit recognition: yann.lecun.com
License plate readers (google street view)
Sudoku grabber: http://sudokugrab.blogspot.com/
Automatic cheque readers (most bank ATMs)
Source: S. Seitz, N. Snavely
Biometrics
Fingerprint scanners are now on many new laptops and other devices
Face recognition systems are beginning to appear more widely: http://www.sensiblevision.com
Source: S. Seitz
Face detection
Face detection is on many cameras these days
Source: S. Seitz
Smile detection
Source: S. Seitz
Face recognition
http://www.apple.com/ilife/iphoto
Source: S. Seitz
Instance recognition
Source: S. Seitz
Automotive safety
Mobileye: Vision systems on high-end BMW, GM, Volvo models
Pedestrian collision warning
Forward collision warning
Lane departure warning
Headway monitoring and warning
Source: A. Shashua, S. Seitz
Self-driving cars
Source: L. Lazebnik
Interactive interfaces
Microsoft Kinect depth sensors
Source: L. Lazebnik
Large-scale 3D reconstruction
Photo Tourism: Exploring Photo Collections in 3D
YouTube link
Source: S. Seitz, N. Snavely
Vision for robotics, space exploration
NASA's Curiosity Rover has 17 cameras as a part of its sensing system
http://en.wikipedia.org/wiki/Curiosity_(rover)
What is this course about?
Course overview
I. Early vision: image formation, sensing, light and shading, filtering
II. Mid-level vision: grouping, perceptual organization
III. Multi-view geometry
IV. Recognition
V. Additional topics (time permitting)
Goal: To develop vision researchers. You can come up with a reasonable solution to various vision problems (and implement it yourself).
We are not going to cover:
Graphics: Physics of light transport, material properties, rendering
Computational photography: design of sensing devices, etc.
How the human vision system works
I. Early vision
Basic image formation and processing
Cameras and sensors
Light and color
Linear filtering
Edge detection
Feature extraction: key-point, corner, and blob detection
Source: L. Lazebnik
II. Mid-level vision
Model fitting and grouping
Alignment
Fitting: Least squares, Hough transform, RANSAC
Source: L. Lazebnik
III. Multi-view geometry
Stereo
Epipolar geometry
Structure from motion: Tomasi & Kanade (1993)
Affine structure from motion
Projective structure from motion: Here be dragons!
Source: L. Lazebnik
IV. Recognition
bag-of-words models
part-based models
learning
V. Additional topics
Deep learning
Human-centric vision
Optical flow
Tracking
For next class
Familiarize yourself with MATLAB (more information is on the course page)
Student copy is $99 from Matlab's page
UMASS IT (100% free): https://www.it.umass.edu/support/software
Readings:
The speed of processing in the human visual system, Thorpe et al., Letters to Nature, 1996
Chapter 1 in the RS (Richard Szeliski) textbook