Computer Vision: From 3D Reconstruction to Recognition

This document provides an overview of the CS231A Computer Vision course at Stanford University. The course covers two main areas: 1) Space/Geometry, which involves estimating spatial properties of objects and scenes from images using geometric methods, and 2) Semantics/Learning, which involves estimating semantic and dynamic properties through learning methods. The syllabus outlines 16 lectures covering topics like camera models, calibration, 3D reconstruction, tracking, SLAM, representation learning, and applications in areas like robotics, augmented reality, and autonomous systems. Prerequisites include knowledge of linear algebra, probability, statistics, machine learning and basic computer vision and programming skills.

CS231A

Computer Vision:
From 3D Reconstruction
to Recognition
Class Time
M‐W; 11:30—12:50PM

Silvio Savarese & Jeanette Bohg Lecture 1 11-Jan-21

CS231A
• Instructor: Silvio Savarese
o ssilvio@stanford.edu
o Office: Gates Building, room 154
o Office hour: Friday 2-3pm or by appointment (an overlapping text layer on the slide reads Thursday 2-3pm)
• Instructor: Jeanette Bohg
o bohg@stanford.edu
o Office: Gates Building, room 140
o Office hour: Friday 1-2pm or by appointment

CAs:
- Andrey Kurenkov, Kuan Fang, Brent Yi, Krishnan
Srinivasan

Lecture 1
Introduction

• An introduction to computer vision
• Course overview

AI is a propelling force of
today’s technology

Smart Agriculture

Courtesy of Agriculture Corner

Courtesy of D. Rubin, Stanford

Courtesy of Amazon.com
Health care

Courtesy of D. Rubin, Stanford
Retail

From Imagining the Retail Store of the Future ‐ The New York Times, April 12, 2017
Manufacturing

Courtesy of ITRI, 2017
Transportation and Logistics
Construction Management
Why is this acceleration
happening now?

Enabling factors
• Big data

ImageNet, 2009
ShapeNet, 2015

• Faster hardware

• New algorithms
– Representation learning
– Neural networks
– Injecting learning into deterministic reasoning

Computer vision

Sensing device, computational device

1. Information extraction: features, 3D structure, motion flows, etc.
2. Interpretation: recognize objects, scenes, actions, events
Major areas in Computer Vision

Space/Geometry
• Object shape recovery
• Depth estimation
• 3D scene reconstruction

Semantics/Learning
• Object detection and pose estimation
• Object tracking
• Scene understanding

Recovering 3D models of the environments

Armeni et al. 2016
Courtesy of Luminar

This is critical for autonomous
driving or navigation!

Detecting and tracking objects in the environments

[Figure labels: building, pedestrian, car, car]

3D Scene Parsing
Held, Thrun, Savarese, 2016‐206

CS 231A course overview

1. Space/Geometry
Estimating spatial properties of objects and scene
from images through geometrical methods

2. Semantics/Learning
Estimating semantic and dynamic properties of
scene elements from images through learning
methods
Camera systems
Establish a mapping from 3D to 2D
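
As a rough illustration of what "a mapping from 3D to 2D" means, here is a minimal sketch of an ideal pinhole projection. It is not taken from the course material: the function name `project_point`, the focal length in pixels, and the principal point are assumptions chosen for the example, and lens distortion and camera pose are ignored.

```python
def project_point(point_3d, focal_length, principal_point=(0.0, 0.0)):
    """Project a 3D point given in camera coordinates onto the image plane.

    Assumes an ideal pinhole camera looking down the +z axis, with the
    focal length and principal point expressed in pixels (hypothetical values).
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = principal_point
    # Perspective division: image coordinates shrink as depth z grows.
    u = focal_length * x / z + cx
    v = focal_length * y / z + cy
    return (u, v)


# The same lateral offset seen at two depths: the farther point lands
# closer to the principal point.
near = project_point((1.0, 2.0, 4.0), focal_length=800.0, principal_point=(320.0, 240.0))
far = project_point((1.0, 2.0, 8.0), focal_length=800.0, principal_point=(320.0, 240.0))
print(near)  # (520.0, 640.0)
print(far)   # (420.0, 440.0)
```

The division by z is what discards depth, which is why recovering 3D structure from a single image needs extra constraints.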
How to calibrate a camera
Estimate camera parameters such as pose or focal length
Single view metrology
Estimate 3D properties of the world from a single image

Multiple view geometry
Estimate 3D properties of the world from multiple views

Epipolar geometry
Structure from motion
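
For orientation, the central relation of epipolar geometry can be written in the standard textbook form below. The slide does not state it; the symbols are assumptions: $x$ and $x'$ are corresponding image points in homogeneous coordinates, $F$ is the $3 \times 3$ fundamental matrix, and $K$, $K'$ are the intrinsic matrices of the two cameras.

```latex
% Epipolar constraint between two views
x'^{\top} F \, x = 0, \qquad \operatorname{rank}(F) = 2

% Epipolar line in the second image on which x' must lie
l' = F x

% Essential matrix when both cameras are calibrated
E = K'^{\top} F \, K
```

Each correspondence gives one linear equation in the entries of $F$, which is why point matches across views can be used to estimate the relative geometry of the cameras.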

Courtesy of Oxford Visual Geometry Group

Panoramic Photography

kolor
3D Modeling of landmarks

Accurate 3D Object Prototyping

Scanning Michelangelo’s “The David”
• The Digital Michelangelo Project
‐ http://graphics.stanford.edu/projects/mich/
• 2 billion polygons, accuracy to 0.29 mm
Augmented Reality

• Magic Leap
• Daqri
• Meta
• Etc…

Representations and
Representation Learning

Example from Advances in Computer Vision – MIT – 6.869/6.819

Feature Tracking and Flow

J. J. Gibson, The Ecological Approach to Visual Perception.
Lucas-Kanade feature tracking over multiple frames. Picture adapted from the OpenCV webpage.
Object Pose Estimation and Tracking

Wang et al., DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion, CVPR.
Manuel Wüthrich et al., Probabilistic Object Tracking using a Depth Camera, IROS.
SLAM and Localization

Accumulated registered point cloud from lidar SLAM.
Autonomous navigation and
safety

Mobileye: Vision systems in high‐end BMW, GM, Volvo models
But also, Toyota, Google, Apple, Tesla, Nissan, Ford, etc….

Source: A. Shashua, S. Seitz

Personal robotics

More Applications

• Assistive technologies
• Surveillance
• Factory inspection
• Exploration and remote operations
Syllabus
The schedule runs from January through March; the lectures are grouped into a Geometry part and a Learning part.

Lecture   Topic
1         Introduction
2         Camera models
3         Camera calibration
4         Single view metrology
5         Epipolar geometry
6         Multi-view geometry
7         Structure from motion / SLAM
          Proposal due
8         Volumetric stereo
9         Fitting and matching
          Midterm
10        Low-level representations
11        Depth estimation, low-level tracking
12        Optical and scene flow
13        6D pose estimation, object tracking
14        Object tracking, continued
15        SLAM
16        Guest
          Project presentations, final projects

Prerequisites
• This course requires knowledge of linear algebra,
probability, statistics, machine learning and computer
vision, as well as decent programming skills.
• Though not an absolute requirement, it is encouraged and
preferred that you have at least taken either CS221 or
CS229 or CS131A or have equivalent knowledge.
• We will leverage concepts from low‐level image processing
(CS131A) (e.g., linear filters, edge detectors, corner
detectors, etc…) and machine learning (CS229) (e.g., SVM,
basic Bayesian inference, clustering, neural networks,
etc…) which we won’t cover in this class.
• We will provide links to background material related to
CS131A and CS229 (or discuss during TA sessions) so
students can refresh or study those topics if needed.
Textbooks

Required:
‐ [FP] D. A. Forsyth and J. Ponce. Computer Vision: A Modern Approach (2nd
Edition). Prentice Hall, 2011.
‐ [HZ] R. Hartley and A. Zisserman. Multiple View Geometry in Computer
Vision (2nd Edition). Cambridge University Press, 2004.

Recommended:
‐ R. Szeliski. Computer Vision: Algorithms and Applications. Springer, 2011.
‐ D. Hoiem and S. Savarese. Representations and Techniques for 3D Object
Recognition and Scene Interpretation. Synthesis Lectures on Artificial
Intelligence and Machine Learning. Morgan & Claypool Publishers, 2011.
‐ Learning OpenCV, by Gary Bradski & Adrian Kaehler, O'Reilly Media, 2008.
Course assignments
• 1 warm up problem set (HW‐0)
• 4 problem sets
• 1 mid‐term exam
• 1 project

• Look up class schedule for release and due dates.
• Problems will be released through the schedule page and must
be submitted through Gradescope.
Midterm Exam
• The exam will be on 02/22. It will be released on
Canvas and be available for 48 hours. You will have 2
hours to complete it once you start.

• You will be updated with more details, e.g., material
to be covered, review sessions etc., as we approach
the midterm.

Course Projects
• Replicate an interesting paper
• Compare different methods on a test bed
• A new approach to an existing problem
• Original research

• Write a 10‐page paper summarizing your results
• Release the final code
• Give a final in‐class presentation
• SCPD students can send videos instead.

• We will introduce projects in 1‐2 weeks
• Important dates: look up class schedule
Course Projects
• Form your team:
– 1‐3 people
– The larger the team, the more work we expect
from it
– Be nice to your partner: do you plan to drop the
course?
• Evaluation
– Quality of the project (including writing)
– Final project in‐class presentation (~ TBA minutes
spotlight presentations)
Grading policy
• Homeworks: 37%
– 1% for HW0
– 9% for HW1, HW2, HW3, HW4 (each)

• Mid term exam: 20%
• Course project: 38%
– Project proposal 1%
– mid term progress report 5%
– final report 25%
– presentation 7%

• Attendance and class participation: 5%
– Questions, answers, remarks, Piazza posts, ...
– Class participation is waived for SCPD students. For the project
presentation, SCPD students can send videos instead.
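
The weights above add up to 100%, which can be checked with a short sketch. The weights are copied from the slide; the component names, the `final_grade` helper, and the convention that each score is a fraction between 0 and 1 are assumptions made for illustration, not the course's actual grading script.

```python
# Weights in percent, copied from the grading policy slide.
# The dictionary keys are hypothetical labels chosen for this example.
WEIGHTS = {
    "hw0": 1, "hw1": 9, "hw2": 9, "hw3": 9, "hw4": 9,   # homeworks: 37
    "midterm": 20,
    "project_proposal": 1, "project_progress_report": 5,
    "project_final_report": 25, "project_presentation": 7,  # project: 38
    "participation": 5,
}

# Sanity check: the components account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def final_grade(scores):
    """Return the overall percentage given per-component scores in [0, 1].

    Components missing from `scores` are counted as 0 (an assumption).
    """
    return sum(weight * scores.get(name, 0.0) for name, weight in WEIGHTS.items())


# Full marks everywhere except half credit on the midterm.
example = {name: 1.0 for name in WEIGHTS}
example["midterm"] = 0.5
print(final_grade(example))  # 90.0
```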
Grading policy (HWs)

– 25% will be deducted per day late.
– Two 48-hour late submission "bonuses" are available: each bonus
lets you submit one HW up to 48 hours late without penalty.
Each bonus can be used only once; after you have used both,
you must adhere to the standard late submission policy.
– No exceptions will be made.
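
One possible reading of the homework late policy is sketched below. The slide leaves several details open, so the following are assumptions: lateness is measured in hours, every started 24-hour period counts as a full day, the 25% is taken from the earned score, and a bonus only applies when the submission is at most 48 hours late. The function name `homework_score` is hypothetical, and tracking that only two bonuses exist is left to the caller.

```python
import math


def homework_score(raw_score, hours_late, use_bonus=False):
    """Apply the late penalty to a homework score (assumed interpretation)."""
    if hours_late <= 0:
        return raw_score
    # A bonus covers a submission that is at most 48 hours late.
    if use_bonus and hours_late <= 48:
        return raw_score
    # Assumption: each started day counts as a full late day.
    days_late = math.ceil(hours_late / 24)
    # 25% per day, capped so the score never goes below zero.
    penalty = min(1.0, 0.25 * days_late)
    return raw_score * (1.0 - penalty)


print(homework_score(80, 10))                  # 60.0 (one day late)
print(homework_score(80, 30))                  # 40.0 (two days late)
print(homework_score(80, 30, use_bonus=True))  # 80 (covered by a bonus)
```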
Grading policy (project)

– If 1 day late, 25% off the grade for the
project
– If 2 days late, 50% off the grade for the
project
– Zero credit if more than 2 days late
– No "late submission bonus" is allowed when
submitting your progress report or project
report
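
The project rule is a step function rather than a per-day rate, which the sketch below makes explicit. As an assumption, lateness is given in hours and every started 24-hour period counts as a full day; the function name `project_score` is hypothetical.

```python
import math


def project_score(raw_score, hours_late):
    """Apply the project late policy: 25% off, 50% off, then zero credit."""
    if hours_late <= 0:
        return raw_score
    # Assumption: each started day counts as a full late day.
    days_late = math.ceil(hours_late / 24)
    if days_late == 1:
        return raw_score * 0.75
    if days_late == 2:
        return raw_score * 0.50
    # More than 2 days late: zero credit, and no bonus can be applied.
    return 0.0


print(project_score(100, 5))   # 75.0
print(project_score(100, 30))  # 50.0
print(project_score(100, 49))  # 0.0
```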
CS231
Introduction to
Computer Vision

Next lecture: Camera systems
