CS231n:
Deep Learning for Computer Vision
Lecture 1 - Overview
Fei-Fei Li, Yunzhu Li, Ruohan Gao Lecture 1 - 1 April 4, 2023
Instructors
Fei-Fei Li, Yunzhu Li, Ruohan Gao
Today’s agenda
● A brief history of computer vision
● CS231n overview
CS231n overview
● Deep Learning Basics
● Perceiving and Understanding the Visual World
● Generative and Interactive Visual Intelligence
● Human-Centered Applications and Implications
Deep Learning Basics
• Image Classification: A core task in Computer Vision
cat
This image by Nikita is licensed under CC-BY 2.0
Linear Classifier
Regularization & Optimization
Neural Networks
Perceiving and Understanding the Visual World
Tasks and Models
Tasks Beyond Image Classification
- Classification: CAT (no spatial extent)
- Semantic Segmentation: GRASS, CAT, TREE, SKY (no objects, just pixels)
- Object Detection: DOG, DOG, CAT (multiple objects)
- Instance Segmentation: DOG, DOG, CAT (multiple objects)
This image is CC0 public domain
Tasks Beyond Image Classification
- Video Classification (Running? Jumping?)
- Multimodal Video Understanding
- Visualization & Understanding
Models Beyond Multi-Layer Perceptron
Illustration of LeCun et al. 1998 from CS231n 2017 Lecture 1
Convolutional neural network
Recurrent neural network
Attention mechanism / Transformers
Beyond 2D Recognition
Beyond 2D Recognition: Self-supervised Learning
Beyond 2D Recognition: Generative Modeling
“Teddy bears working on new AI research underwater with 1990s technology” (DALL-E 2)
Style Transfer
Beyond 2D Recognition: 3D Vision
Choy et al., 3D-R2N2: Recurrent Reconstruction Neural Network (2016)
Zhou et al., 3D Shape Generation and Completion through Point-Voxel Diffusion (2021)
Gkioxari et al., Mesh R-CNN (ICCV 2019)
Beyond 2D Recognition: Embodied Intelligence
Li et al., BEHAVIOR-1K: A Benchmark for Embodied AI with 1,000 Everyday Activities and Realistic Simulation (2022)
Mandlekar and Xu et al., Learning to Generalize Across Long-Horizon Tasks from Human Demonstrations (2020)
2018 Turing Award for deep learning
The Turing Award, ACM's most prestigious technical award, is given for major contributions of lasting importance to computing.
Geoffrey Hinton, Yoshua Bengio, Yann LeCun
These images are CC0 public domain
IEEE PAMI Longuet-Higgins Prize
The award recognizes ONE computer vision paper from ten years earlier that has had significant impact on computer vision research.
At CVPR 2019, it was awarded to the original 2009 ImageNet paper.
That’s Fei-Fei
>9k submissions, 2,360 accepted papers
Logistics
Lectures
- Tuesdays and Thursdays from 12:00 PM to 1:20 PM at NVIDIA Auditorium
- Lectures will not be streamed on Zoom but will be broadcast live via Panopto
- Slides will be posted on the course website shortly before each lecture
- All lectures will be recorded and uploaded to Canvas after the lecture, under the “Panopto Course Videos” tab
Course website [http://cs231n.stanford.edu/] - Refresh!
Friday Discussion Sections
6 discussion sections, Fridays 1:30 PM - 2:20 PM at Thornton 102
04/07 Python / NumPy Review Session
04/14 Backprop Review Session
04/21 Final Project Overview and Guidelines
04/28 PyTorch / TensorFlow Review Session
05/05 RNNs & Transformers
05/12 Midterm Review Session
Hands-on tutorials, with more practical details than the main lecture
Check Canvas for the Zoom link to the discussion sessions!
This Friday: Python / NumPy / Colab
Ed
For questions about assignments, final project, midterm, logistics, etc., use Ed!
Access: Canvas -> Deep Learning for Computer Vision -> Ed Discussion
SCPD students: Use your @stanford.edu address to register for Ed; contact scpd-customerservice@stanford.edu for help.
Office Hours
We'll be hosting both in-person and remote office hours (starting week 2).
- Location
  - In-person: Huang basement, look for a CS231N sign
  - Remote: Zoom, with QueueStatus to set up queues
    - Please see Canvas or Ed for the QueueStatus link
    - When it’s your turn in QueueStatus, a TA will admit you to their Zoom meeting room for a 1-1 conversation
- Office hour schedule is on the course website
Overview on communication
Course Website: http://cs231n.stanford.edu/
- Syllabus, lecture slides, links to assignment downloads, etc.
Ed:
- Use this for most communication with course staff
- Ask questions about homework, grading, logistics, etc.
- Use private questions only if posting publicly would violate the honor code
Mailing list:
- cs231n-staff-spr23@cs.stanford.edu
Gradescope:
- For turning in homework and receiving grades
Canvas:
- For watching recorded lectures
- For watching recorded discussion sessions
Assignments
All assignments will be completed using Google Colab
Assignment 1: Will be out Friday 4/7, due 4/21 by 11:59 PM
- K-Nearest Neighbor
- Linear classifiers: SVM, Softmax
- Two-layer neural network
- Image features
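To make the first topic on this list concrete, here is a minimal sketch of a k-nearest-neighbor classifier. It is not the assignment's starter code or expected solution: the function name `knn_predict`, the use of squared L2 distance, the majority-vote rule, and the toy data are all assumptions chosen for illustration, and it uses only the Python standard library rather than the vectorized NumPy style the assignments call for.

```python
from collections import Counter


def knn_predict(train_points, train_labels, query, k=1):
    """Return the majority label among the k training points closest to query."""
    # Squared L2 distance from the query to every training point
    # (the square root is skipped because it does not change the ordering).
    dists = [
        (sum((a - b) ** 2 for a, b in zip(point, query)), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points and take a majority vote over their labels.
    nearest = sorted(dists, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2D data (hypothetical): two "cat" points near the origin, two "dog" points near (5, 5).
train_points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
train_labels = ["cat", "cat", "dog", "dog"]

print(knn_predict(train_points, train_labels, (0.05, 0.1), k=3))  # cat
print(knn_predict(train_points, train_labels, (4.8, 5.1), k=3))   # dog
```

There is no training step beyond storing the data; all the work happens at prediction time, which is one reason the course moves on to linear classifiers.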
Grading
All assignments, coding and written portions, will be submitted via Gradescope.
An auto-grading system:
- A consistent grading scheme
- Public tests:
  - Students see results of public tests immediately
- Private tests:
  - Generalizations of the public tests to thoroughly test your implementation
Grading
3 Assignments: 10% + 20% + 15% = 45%
In-Class Midterm Exam: 20%
Course Project: 35%
- Project Proposal: 1%
- Milestone: 2%
- Final Project Report: 29%
- Poster & Poster Session: 3%
Participation Extra Credit: up to 3%
Late policy
- 4 free late days – use up to 2 late days per assignment
- Afterwards, 25% off per day late
- No late days for project report
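The arithmetic above can be sketched in a few lines of Python. This is an illustration only, not the course's grading script. The mapping of 10% / 20% / 15% to assignments 1 / 2 / 3 in that order is an assumption, as are the names `WEIGHTS`, `course_percentage`, and `late_penalty`. The late-penalty function encodes one possible reading of the policy: free late days are applied first (at most 2 per assignment), and each remaining late day costs 25% of that assignment's credit. Check with the course staff for the authoritative rule.

```python
# Component weights in percent of the final grade, as listed on the slide.
# ASSUMPTION: 10/20/15 correspond to assignments 1/2/3 in that order.
WEIGHTS = {
    "assignment_1": 10,
    "assignment_2": 20,
    "assignment_3": 15,
    "midterm": 20,
    "project_proposal": 1,
    "project_milestone": 2,
    "project_report": 29,
    "project_poster": 3,
}
MAX_EXTRA_CREDIT = 3  # participation extra credit, in percent


def course_percentage(scores, extra_credit=0.0):
    """Combine per-component scores (fractions in [0, 1]) into a course percentage."""
    base = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return base + min(extra_credit, MAX_EXTRA_CREDIT)


def late_penalty(days_late, free_days_left):
    """Fraction of an assignment's credit lost (ASSUMED reading of the late policy)."""
    # Apply free late days first: at most 2 per assignment, and no more than remain.
    free_used = min(days_late, 2, free_days_left)
    # Every late day not covered by a free day costs 25%, capped at 100%.
    return min(1.0, 0.25 * (days_late - free_used))


# The weights add up to 100%; the four project parts add up to 35%.
print(sum(WEIGHTS.values()))
print(sum(v for k, v in WEIGHTS.items() if k.startswith("project_")))
# Three days late with all 4 free days available: 2 are free, 1 is penalized.
print(late_penalty(days_late=3, free_days_left=4))
```

Under this reading, the project report would always be called with `free_days_left=0`, since the slide states that no late days apply to it.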
AWS
We will have AWS Cloud credits available for projects
- Not for HWs (only for final projects)
We will be distributing credits to all enrolled students using your AWS account IDs
We will have a tutorial for walking through the AWS setup
Collaboration policy
We follow the Stanford Honor Code and the CS Department Honor Code – read them!
● Rule 1: Don’t look at solutions or code that are not your own; everything you submit should be your own work
● Rule 2: Don’t share your solution code with others; however, discussing ideas or general strategies is fine and encouraged
● Rule 3: Indicate in your submissions anyone you worked with
Turning in something late / incomplete is better than violating the honor code
Prerequisites
Proficiency in Python
- All class assignments will be in Python (and use NumPy)
- Later in the class, you will be using PyTorch and TensorFlow
- A Python tutorial is available on the course website
College Calculus, Linear Algebra
CS229 (Machine Learning) is no longer required
Optional textbook resources
- Deep Learning
  - by Goodfellow, Bengio, and Courville
  - Here is a free version
- Mathematics of deep learning
  - Chapters 5, 6, and 7 are useful for understanding vector calculus and continuous optimization
  - Free online version
- Dive into deep learning
  - An interactive deep learning book with code, math, and discussions, based on the NumPy interface
  - Free online version
Learning objectives
Formalize computer vision applications into tasks
- Formalize inputs and outputs for vision-related problems
- Understand what data and computational requirements you need to train a model
Develop and train vision models
- Learn to code, debug, and train convolutional neural networks.
- Learn how to use software frameworks like PyTorch and TensorFlow
Gain an understanding of where the field is and where it is headed
- What new research has come out in the last 0-5 years?
- What are open research challenges?
- What ethical and societal issues should we consider before deployment?
Why should you take this class?
Become a vision researcher (an incomplete list of conferences)
- Get involved with vision research at Stanford: apply using this form.
- CVPR 2022 conference
- ICCV 2021 conference
Become a vision engineer in industry (an incomplete list of industry teams)
- Perception team at Google AI, Vision at Google Cloud
- Vision at Meta AI
- Vision at Amazon AWS
- NVIDIA, Tesla, Apple, Salesforce, ...
General interest
CS231n: Deep Learning for Computer Vision
● Deep Learning Basics (Lecture 2 – 4)
● Perceiving and Understanding the Visual World (Lecture 5 – 12)
● Reconstructing and Interacting with the Visual World (Lecture 13 – 16)
● Human-Centered Artificial Intelligence (Lecture 17 – 18)
Syllabus
Deep Learning Basics
- Data-driven learning
- Linear classification & kNN
- Loss functions
- Optimization
- Backpropagation
- Multi-layer perceptrons
- Neural Networks
Convolutional Neural Networks
- Convolutions
- PyTorch / TensorFlow
- Activation functions
- Batch normalization
- Transfer learning
- Data augmentation
- Momentum / RMSProp / Adam
- Architecture design
Computer Vision Applications
- RNNs / Attention / Transformers
- Image captioning
- Object detection and segmentation
- Style transfer
- Video understanding
- Generative models
- Self-supervised learning
- 3D vision
- Robot learning
- Human-centered AI
- Fairness & ethics
Next time: Image classification with Linear Classifiers
k-nearest neighbor
Linear classification
Plot created using Wolfram Cloud