Lecture 1 AI Summary
Biological Vision
What is it?
● Biological vision is the ability of living beings to perceive their surroundings using
their eyes.
Steps:
1. Capture: Light reflected from objects enters the eye and is converted into neural signals by the retina.
2. Transmission: The captured signals are sent to the brain via the optic nerve.
3. Interpretation: The brain processes these signals to recognize objects and scenes.
● Biological vision serves as inspiration. If we can understand how humans see and
interpret the world, we can try to replicate this process in machines.
Key Research (Hubel and Wiesel's experiments on the visual cortex):
o Simple cells in the brain that detect edges or bars of light at specific
orientations.
o Complex cells that detect edges but are less sensitive to position.
Example:
● When you look at a flower, simple cells might detect its edges at particular orientations, while complex cells respond to the same edges regardless of their exact position in your field of view.
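The edge-selectivity of a simple cell can be loosely imitated in code. Below is a minimal sketch (assuming NumPy is available; the kernel and test image are made up for illustration) of an oriented filter that responds only where a vertical edge is present:

```python
import numpy as np

# A "simple cell" modeled as a small oriented filter: it fires strongly
# only where the image contains an edge at its preferred orientation.
# This toy 3x3 kernel prefers vertical edges (dark left, bright right).
vertical_edge_kernel = np.array([[-1, 0, 1],
                                 [-1, 0, 1],
                                 [-1, 0, 1]], dtype=float)

# A 5x7 test image: dark on the left, bright on the right.
image = np.zeros((5, 7))
image[:, 4:] = 255.0

def response(img, kernel, r, c):
    """Correlate the kernel with the 3x3 patch centered at (r, c)."""
    patch = img[r - 1:r + 2, c - 1:c + 2]
    return float(np.sum(patch * kernel))

print(response(image, vertical_edge_kernel, 2, 4))  # 765.0: strong response at the edge
print(response(image, vertical_edge_kernel, 2, 1))  # 0.0: no response in the flat dark region
```

A complex cell would pool such responses over many positions, so it keeps firing even as the edge shifts within its receptive field.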
2. Computer Vision
What is it?
● It’s not just about taking a photo (like in photography); it's about extracting
meaningful insights from it.
Key Idea:
● Machines need to "see" in order to interpret visual input and act on it, e.g., by recognizing objects, understanding scenes, and making decisions.
Example:
● A self-driving car uses cameras to identify road signs, pedestrians, and obstacles.
3. Why is Computer Vision Hard?
a. Inverse Problem
Why is it hard?
● The same 2D image could correspond to many different 3D scenes. For example, a small object close to the camera can project to the same image as a large object far away.
b. Challenges
1. Occlusion: Some objects may block others in the scene, making it hard to identify all
objects.
4. Image Formation
What is it?
● Light from the real world enters a camera (or eyes) and gets converted into a 2D image.
Steps:
1. Light Interaction: Light reflects off objects and enters the lens.
2. Projection: The camera lens projects the light onto an image sensor, forming a 2D
representation.
3. Digitization: The continuous light signals are converted into discrete pixel values.
Example:
● In an 8-bit grayscale image, each pixel stores one intensity value:
o 0: Black
o 255: White
5. Digitization
What is it?
● Digitization converts a continuous image into discrete pixels and intensities.
a. Sampling
● Sampling divides the continuous image plane into a discrete grid of pixels; a finer grid means higher resolution.
b. Quantization
● Quantization maps each pixel's continuous intensity to one of a fixed number of discrete levels.
● Example: Grayscale images usually have intensity values between 0 (black) and 255
(white).
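Quantization can be sketched in a few lines. This toy example (assuming NumPy is available; the level count and sample values are illustrative) maps 0-255 intensities down to 4 levels:

```python
import numpy as np

# Quantization maps continuous intensities to a fixed set of discrete levels.
# Here we reduce the usual 256 grayscale levels (0-255) to just 4 levels,
# a coarse quantization that would make visible banding in a real image.
def quantize(pixels, levels):
    """Map values in [0, 255] to `levels` evenly spaced representative values."""
    pixels = np.asarray(pixels, dtype=float)
    step = 256 / levels
    # Each value is replaced by the midpoint of the bin it falls into.
    return (np.floor(pixels / step) * step + step / 2).astype(int)

print(quantize([0, 90, 180, 255], levels=4))  # [ 32  96 160 224]
```

With more levels (e.g., 256) the quantized values track the originals closely; with fewer, detail is lost.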
Trade-off:
● Higher resolution and more intensity levels = better image quality but require more
storage.
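The storage side of this trade-off is simple arithmetic. A back-of-the-envelope sketch (the resolution and bit depths are illustrative):

```python
# Storage grows with both resolution (number of pixels) and bit depth
# (number of intensity levels), multiplied across color channels.
def storage_bytes(width, height, bits_per_pixel, channels=1):
    return width * height * channels * bits_per_pixel // 8

# 8-bit grayscale vs. 24-bit color at Full HD resolution (1920x1080).
gray = storage_bytes(1920, 1080, bits_per_pixel=8)
color = storage_bytes(1920, 1080, bits_per_pixel=8, channels=3)
print(gray)   # 2073600 bytes, about 2 MB
print(color)  # 6220800 bytes, about 6 MB
```

Tripling the channels triples the storage; doubling the resolution in both dimensions quadruples it.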
6. Types of Images
1. Binary Images: Each pixel is either 0 (black) or 1 (white).
2. Grayscale Images: Each pixel holds a single intensity value, typically from 0 (black) to 255 (white).
3. Color Images: Each pixel has three intensity values for red, green, and blue (RGB).
Example:
● A pixel in a color image might have RGB values (255, 0, 0), representing bright red.
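This RGB layout is easy to see as an array. A minimal sketch (assuming NumPy; the tiny 2x2 image is made up for illustration):

```python
import numpy as np

# A color image is a 3D array: height x width x 3 channels (R, G, B).
# Setting a pixel's red channel to 255 and the others to 0 makes it bright red.
image = np.zeros((2, 2, 3), dtype=np.uint8)  # a tiny 2x2 all-black image
image[0, 0] = (255, 0, 0)                    # top-left pixel becomes red

print(image[0, 0])  # [255   0   0]
print(image[1, 1])  # [0 0 0] -- still black
```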
Computer vision has a wide range of applications, enabling machines to perform tasks that
typically require human visual understanding. Below are some common and impactful
applications:
● Example:
How it works:
1. Preprocessing: The image is cleaned to enhance contrast and remove noise.
● Features like edge gradients and pixel distributions are compared using similarity
metrics (e.g., Euclidean distance).
2. Medical Imaging
● What it does: Assists doctors in diagnosing diseases from X-rays, MRIs, and CT scans.
● Example:
How it works:
● Techniques like edge detection and region segmentation help isolate areas of
interest (e.g., a tumor).
● Convolutional Neural Networks (CNNs) are often used for this task, as they excel at
pattern recognition in images.
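Edge detection for isolating regions can be sketched with a hand-rolled convolution. This is a minimal illustration (assuming NumPy; the synthetic "scan" is made up, and real pipelines would use library implementations such as OpenCV or SciPy, or CNNs):

```python
import numpy as np

# The horizontal Sobel kernel responds to vertical intensity boundaries,
# e.g., the left and right edges of a bright region on a dark background.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def convolve2d(img, kernel):
    """Naive 'valid' correlation: output shrinks by kernel size - 1."""
    h, w = img.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(img[r:r + kh, c:c + kw] * kernel)
    return out

# Synthetic "scan": dark background with one bright square region.
scan = np.zeros((8, 8))
scan[2:6, 2:6] = 255.0

edges = np.abs(convolve2d(scan, sobel_x))
print(edges.max())  # strongest responses sit on the square's vertical boundaries
```

Thresholding `edges` would then give a crude outline of the region of interest.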
3. Self-Driving Cars
● What it does: Allows vehicles to "see" their environment and navigate without
human input.
● Example:
How it works:
4. Surveillance
● Example:
How it works:
● Example: In a digital image, each pixel is stored as a numeric value.
Filtering
What is it?
● Filtering computes each output pixel from a neighborhood of input pixels, typically by sliding a small kernel over the image, in order to suppress noise or emphasize structure.
Types of Filters
● Smoothing filters (e.g., mean or Gaussian) average neighboring pixels, while sharpening filters emphasize intensity changes such as edges.
Applications of Filtering
1. Noise Removal: Smoothing filters suppress random variations in pixel values.
2. Enhancing Features: Sharpening filters make edges and fine details more visible.
3. Preprocessing: Filtering prepares images for later steps such as segmentation or recognition.
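Noise removal with a mean (box) filter can be sketched in a few lines (assuming NumPy; the flat image with one "salt" pixel is a made-up illustration):

```python
import numpy as np

# A 3x3 mean filter replaces each interior pixel with the average of its
# neighborhood, smoothing out isolated noisy pixels.
def mean_filter3(img):
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    out = img.copy()  # border pixels are left unchanged in this simple sketch
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r, c] = img[r - 1:r + 2, c - 1:c + 2].mean()
    return out

# A flat gray image with one bright "salt" noise pixel in the middle.
noisy = np.full((5, 5), 100.0)
noisy[2, 2] = 255.0

smoothed = mean_filter3(noisy)
print(noisy[2, 2], "->", smoothed[2, 2])  # 255.0 -> ~117.2: the spike is suppressed
```

The trade-off: the same averaging that removes noise also blurs genuine edges, which is why sharpening filters exist as a counterpart.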
1. Binary Images:
o Each pixel is either 0 (black) or 1 (white).
2. Grayscale Images:
o Each pixel represents a single intensity value, typically ranging from 0 (black)
to 255 (white).
3. Color Images:
o Each pixel has three intensity values, one each for red, green, and blue (RGB).
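The relationship between grayscale and binary images is a simple threshold. A minimal sketch (assuming NumPy; the sample values and the threshold of 128 are illustrative):

```python
import numpy as np

# A binary image can be produced from a grayscale one by thresholding:
# pixels above the threshold become 1 (white), the rest become 0 (black).
def to_binary(gray, threshold=128):
    return (np.asarray(gray) > threshold).astype(np.uint8)

gray = np.array([[0, 50, 200],
                 [130, 128, 255]])
print(to_binary(gray))
# [[0 0 1]
#  [1 0 1]]
```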
Applications of Digitization
1. Medical Imaging:
3. Image Analysis:
● In medical imaging (e.g., MRI or CT scans), high resolution is critical to capture fine
details for accurate diagnosis.
● Example: A low-resolution brain scan might miss small abnormalities (like a tumor)
that a high-resolution scan would reveal.
● High Resolution: Better image quality but higher storage and processing
requirements.
o Surveillance systems might use lower resolution to save storage but record at
higher resolutions during critical events.
Practical Solutions and Examples
● Deep Learning: Modern algorithms like YOLO or Mask R-CNN are designed to handle occlusion, variability, and illumination differences by learning from large datasets.
● Stereo Vision: Depth can be recovered from two views of the same scene.
● Workflow:
1. Take two images of the same scene from slightly different angles.
2. Match corresponding points between the two images.
3. Use the disparity between matched points to estimate depth.