CSE 3172: Foundations of Computer Vision
Instructor: P. C. S. Swamy
Department of Computer Science & Engineering
MIT, Manipal
What is (computer) vision?
What is it related to?
Why study computer vision?
Vision is useful: Images and video are everywhere!
• Personal photo albums
• Movies, news, sports
The goal of computer vision
• To extract “meaning” from pixels
What we see vs. what a computer sees
Source: S. Narasimhan
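To make “what a computer sees” concrete, here is a minimal Python sketch. The 5x5 pixel values and the helper names render and mean_brightness are made up for illustration and do not come from the slides. A real photograph has the same structure, only with far more numbers.

```python
# A tiny 5x5 grayscale "image": 0 is black, 255 is white.
# The values are invented for illustration; they draw a bright plus sign.
image = [
    [0,   0,   255, 0,   0],
    [0,   0,   255, 0,   0],
    [255, 255, 255, 255, 255],
    [0,   0,   255, 0,   0],
    [0,   0,   255, 0,   0],
]


def render(img, threshold=128):
    """Return one string per row: '#' for bright pixels, '.' for dark ones."""
    return ["".join("#" if v >= threshold else "." for v in row) for row in img]


def mean_brightness(img):
    """Average intensity over all pixels of the image."""
    values = [v for row in img for v in row]
    return sum(values) / len(values)


# What the computer "sees": rows of numbers.
for row in image:
    print(row)

# A crude picture that a person recognizes at a glance.
for line in render(image):
    print(line)

print("mean brightness:", mean_brightness(image))
```

The list of numbers and the rendered plus sign carry exactly the same information. Extracting “meaning” amounts to going from the first form to a statement such as “this is a plus sign”.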
The goal of computer vision
• To extract “meaning” from pixels
Humans are remarkably good at this…
Origins of computer vision: an MIT undergraduate summer project
What kind of information can be extracted
from an image?
• Metric 3D information
[Figure: example image annotated with the labels Sky, Person, Horse, Car, Road, Wheel, Shadow, White, and 1m, and with the relation “The car is in front of the pole”]
Computer Vision
• Low Level Vision
  • Measurements
  • Enhancements
  • Region segmentation
  • Features
• Mid Level Vision
  • Reconstruction
  • Depth
  • Motion Estimation
• High Level Vision
  • Category detection
  • Activity recognition
  • Deep understanding
Measurement: Brightness
Measurement: Length
http://www.michaelbach.de/ot/sze_muelue/index.html
Image Enhancement
Image Inpainting, M. Bertalmío et al.
http://www.iua.upf.es/~mbertalmio//restoration.html
Applications: 3D Scanning
Google’s 3D Maps
Structure estimation from tourist photos
Computer Vision
• Low Level Vision
  • Measurements
  • Enhancements
  • Region segmentation
  • Features
• Mid Level Vision
  • Reconstruction
  • Depth
  • Motion Estimation
• High Level Vision
  • Category detection
  • Activity recognition
  • Deep understanding
  • Pose estimation
Face detection
• Many new digital cameras now detect faces
• Canon, Sony, Fuji, …
Vision-based interaction: Xbox Kinect
Why is computer vision difficult?
Challenges: viewpoint variation
Michelangelo 1475-1564
Challenges: illumination
Challenges: scale
Challenges: deformation
Challenges: occlusion
Magritte, 1957
Challenges: background clutter
Challenges: motion
Challenges: object intra-class variation
slide credit: Fei-Fei, Fergus & Torralba
Challenges: local ambiguity
Challenges or opportunities?
• Images are confusing, but they also reveal the structure of the world through numerous cues
• Our job is to interpret the cues!
Image source: J. Koenderink
Optical character recognition (OCR)
• Digit recognition: yann.lecun.com
• License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
Sudoku grabber
http://sudokugrab.blogspot.com/
Automatic check processing
Ex. Object Recognition
• Problem: Given an image A, does A contain an image of a person?
Ex. Object localization
Ex. Human Detection
Ex. Face Detection: Apple iPhoto, Facebook, Google, …
Ex. Face Recognition
Open-Universe Face Identification
Ex. Facial expression
Ex. Fatigue detection
Ex. Lip-reading
Video Surveillance and Monitoring
UAVs: Unmanned Aerial Vehicles (drones)
Ex. Tracking (multi-object)
Ex. Human Action Recognition
Ex. Counting in Extremely Dense Crowd Images
Mobile visual search: Google Goggles
Biometrics
How the Afghan Girl was Identified by Her Iris Patterns
Automotive safety
• Mobileye: Vision systems in high-end BMW, GM, Volvo models
• Pedestrian collision warning
• Forward collision warning
• Lane departure warning
• Headway monitoring and warning
Mobile robots
• NASA’s Mars Spirit Rover: http://en.wikipedia.org/wiki/Spirit_rover
• RoboCup: http://www.robocup.org/
• STAIR at Stanford (Saxena et al. 2008)
Augmented Reality and Virtual Reality
MS HoloLens, Oculus, Magic Leap
Medical imaging
• 3D imaging: MRI, CT
• Image guided surgery: Grimson et al., MIT
Human shape capture
Topics
• Filtering, Edge Finding, Transformations
• Color, Texture
• Interest Points and Region Descriptors
• Segmentation: a few advanced methods
• Cameras, Stereo, Reconstruction
• Motion and Tracking
• Object Detection and Recognition
• Case study: content-based image retrieval (CBIR) / face detection and recognition
References
• Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, Third Edition, Pearson Education, 2012.
• Richard Szeliski, Computer Vision: Algorithms and Applications (available online: http://szeliski.org/Book/).
• David A. Forsyth and Jean Ponce, Computer Vision: A Modern Approach.
• Relevant research papers