UNIT 1 - COMPUTER VISION BASICS
COMPUTER VISION
Computer Vision is a branch of Artificial Intelligence (AI) that enables computers to
acquire, process, analyze, and understand images or videos, and make decisions or take actions
based on that information.
In short, Computer Vision is the technology that allows machines to gain understanding from
images and videos.
Key Goals of Computer Vision:
Detect and recognize objects
Classify and label images
Track motion in videos
Understand scenes and environments
Objective:
To simulate human vision by enabling machines to:
See (capture images)
Understand (interpret objects, scenes, motion)
Make decisions (based on visual input)
Key Tasks in Computer Vision:
Image classification – Identifying what is in an image
Object detection – Locating and identifying multiple objects
Image segmentation – Dividing an image into meaningful parts
Facial recognition – Identifying individuals from their facial features
Motion tracking – Analyzing movement in video sequences
Applications of Computer Vision:
Autonomous Vehicles – Object and lane detection
Medical Imaging / Healthcare – Analyzing X-rays, MRIs, and scans
Surveillance and Security – Face and activity recognition
Augmented and Virtual Reality
Manufacturing – Quality inspection using cameras
Retail & Marketing – Customer behavior analysis
Related Fields/Disciplines
Artificial Intelligence (AI)
Machine Learning (ML)
Computer Graphics
Image Processing
Robotics
Computer Vision aims to enable machines to perceive, interpret, and understand visual
information from the world. Below are its key goals along with purposes and examples.
Goals of Computer Vision
1. Image Understanding
→ Understand content in an image
Example: Google Photos grouping pictures by person or location
2. Object Recognition & Classification
→ Identify and classify objects
Example: Amazon Go stores recognizing items for checkout
3. Object Detection & Localization
→ Detect objects and their positions
Object Detection is a computer vision technique that identifies and locates objects
within an image or video.
Localization refers to identifying the position of the detected object in the image,
usually by drawing a bounding box around it.
Example: Face detection in mobile phone cameras
4. Scene Reconstruction
→ Create 3D models from 2D images
Example: Augmented Reality (AR) in interior design apps
5. Motion Analysis & Tracking
→ Track moving objects in video
Example: CCTV tracking a person’s movement in real time
6. Image Restoration & Enhancement
→ Improve image quality
Example: AI tools restoring old or blurred photographs
7. Automation & Robotics
→ Help machines interact with surroundings
Example: Self-driving cars detecting roads and obstacles
8. Face & Text Recognition
→ Identify faces or read text in images
Example: Passport scanners at airports, Google Lens for text
Advantages of Computer Vision:
1. Automation and Speed: Processes visual data much faster than humans (e.g., inspection
in factories) and enables real-time decisions in applications like self-driving cars.
2. Accuracy and Consistency: Reduces human error in tasks like medical image analysis
or quality control.
3. Handles Large Volumes of Data: Can process and analyze vast amounts of image or
video data that would be overwhelming for humans.
Disadvantages of Computer Vision:
1. High Initial Cost and Complexity: Requires expensive hardware and large datasets for
training.
2. Limited in Unstructured Environments: Performance may drop in poor lighting,
cluttered scenes, or unfamiliar situations.
3. Privacy Concerns: Widespread surveillance and facial recognition can raise ethical and
legal issues.
IMAGE FORMATION
Image formation is the process of capturing a visual representation of a scene using a
camera or sensor and converting it into a digital image that a computer can process.
Key Steps in Image Formation:
1. Light Reflection from Objects
o Light from a source (like the sun or a bulb) reflects off objects in the scene.
2. Camera Lens Captures Light
o The reflected light passes through a camera lens, which focuses it to form an
image.
3. Projection onto Image Sensor
o The focused light hits a sensor (like CCD or CMOS) in the camera, converting it
into electrical signals.
4. Conversion to Digital Image
o The signals are digitized into pixels — small units that represent brightness and
color.
Example:
When you take a photo of a tree using a smartphone:
o The tree reflects light.
o The phone's lens captures and focuses that light.
o The sensor records the light and produces a digital image.
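A minimal Python sketch of the underlying pinhole camera model, in which a 3D scene point
(X, Y, Z) is projected to a 2D pixel (u, v); the focal length f and image-center values used here
are illustrative assumptions, not values from the text:

# Pinhole projection: perspective division maps 3D points to 2D pixels.
def project_point(X, Y, Z, f=800.0, cx=320.0, cy=240.0):
    # Points farther away (larger Z) project closer to the image center.
    u = f * X / Z + cx
    v = f * Y / Z + cy
    return u, v

# A point 1 m to the right of and 10 m in front of the camera:
print(project_point(1.0, 0.0, 10.0))  # (400.0, 240.0)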
IMAGE CAPTURE
Definition:
Image capture refers to the process of recording the formed image using a sensor and converting
it to a digital format.
Steps/Process:
Analog Signal Generation: The sensor detects light intensity.
Analog-to-Digital Conversion (ADC): Converts analog signals to digital pixel values.
Image Storage: The digital image is stored in memory (as JPG, PNG, etc.).
CCD (Charge-Coupled Device) and CMOS (Complementary Metal-Oxide-Semiconductor)
are image sensors used in cameras to capture light and convert it into digital images.
Example:
A CCTV camera records a video stream in a store, capturing many frames per second and
storing them in a digital video format.
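A minimal NumPy sketch of the analog-to-digital conversion step, assuming (for illustration)
that the sensor's analog output is modeled as intensities between 0.0 and 1.0:

import numpy as np

# Simulated analog sensor readings (light intensities from 0.0 to 1.0).
analog = np.array([[0.00, 0.25],
                   [0.50, 1.00]])

# Quantize to 8-bit digital pixel values (0-255), as an ADC would.
digital = np.round(analog * 255).astype(np.uint8)
print(digital)
# [[  0  64]
#  [128 255]]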
IMAGE REPRESENTATION
Definition:
Image representation is how a digital image is stored and processed in a computer system.
Types:
Grayscale Image: Each pixel has one intensity value (0–255).
Color Image (RGB): Each pixel has three components – Red, Green, Blue.
Binary Image: Pixels are either 0 (black) or 1 (white).
Image as Matrix:
An image is represented as a 2D (grayscale) or 3D (color) matrix of pixels.
Example:
Face recognition systems convert captured facial images into pixel matrices to compare and
identify people.
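A short NumPy sketch showing all three representations as matrices (the pixel values are made
up for illustration):

import numpy as np

# Grayscale image: 2D matrix, one intensity value (0-255) per pixel.
gray = np.array([[0, 128],
                 [255, 64]], dtype=np.uint8)

# Binary image: each pixel becomes 0 (black) or 1 (white) after thresholding.
binary = (gray > 127).astype(np.uint8)

# Color (RGB) image: 3D matrix, three values (R, G, B) per pixel.
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = [255, 0, 0]   # top-left pixel is pure red

print(gray.shape, rgb.shape)  # (2, 2) (2, 2, 3)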
Summary Table:
Concept              | Meaning                                               | Real-Time Example
Image Formation      | Converting a scene into an image using optics         | Taking a photo using a camera
Image Capture        | Digitizing and storing the image                      | CCTV recording a video
Image Representation | Storing the image as a pixel matrix in digital format | Face recognition software processing an image
LINEAR FILTERING, CORRELATION, AND CONVOLUTION
Linear filtering, correlation, and convolution are fundamental operations in image
processing and computer vision. They are used to manipulate or extract features from images.
What is a Kernel?
A kernel (or filter or mask) is a small matrix used in image processing. It is applied to each pixel
of an image to change its value based on its neighbors. Common sizes are 3×3, 5×5, etc.
It moves over each pixel in the image (this is called convolution or correlation).
At each position, it performs a calculation using neighboring pixel values to produce a
new value.
Example: In photo editing apps, when you apply blur or sharpen, the app is using different
kernels behind the scenes.
Example 3×3 Kernel: This kernel averages the pixel values in a 3×3 neighborhood, which is
useful for blurring or smoothing an image.
[1/9 1/9 1/9]
[1/9 1/9 1/9]
[1/9 1/9 1/9]
If you use a 5×5 kernel (25 pixels), each element will be 1/25.
If you use a 2×2 kernel, each element will be 1/4, and so on.
Kernel Size | Number of Pixels | Value of Each Cell
3×3         | 9                | 1/9
5×5         | 25               | 1/25
2×2         | 4                | 1/4
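The pattern in this table can be captured in one line of NumPy; mean_kernel is a hypothetical
helper name used here for illustration:

import numpy as np

# Build an n x n mean (averaging) kernel: every cell equals 1/(n*n).
def mean_kernel(n):
    return np.ones((n, n)) / (n * n)

print(mean_kernel(3))  # each of the 9 cells is 1/9
print(mean_kernel(5))  # each of the 25 cells is 1/25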
Center Pixel Concept
When a kernel is placed over a patch of an image, the pixel at the center of the patch is called the
center pixel. The kernel calculates a new value for this center pixel using its neighbors.
Example image patch:
[10 20 30]
[40 50 60]
[70 80 90]
Here, 50 is the center pixel.
Why only the middle? Because when the kernel slides across the image, we assign the calculated
result to the position of the center pixel in the output image. This ensures the output image size
remains the same.
LINEAR FILTERING
Definition: A process of applying a filter (or kernel) to an image to enhance certain features (like
edges) or reduce noise.
The image is processed by sliding a kernel (small matrix) across it.
Each pixel is updated based on the weighted sum of neighboring pixels.
Use cases:
Noise reduction (e.g., Gaussian blur)
Edge detection
Smoothing
Common Filters:
Mean Filter: Averages surrounding pixels – smoothing
Gaussian Filter: Weighted average – less blurring than mean
Laplacian Filter: Edge enhancement
Formula:
Output(x, y) = Σ Σ [ Image(x+i, y+j) × Kernel(i, j) ]
Example: Mean Blur Kernel (3×3):
[1/9 1/9 1/9]
[1/9 1/9 1/9]
[1/9 1/9 1/9]
Applied to:
[10 20 30]
[40 50 60]
[70 80 90]
Take all 9 pixel values, add them, and divide by 9 – this gives the average, which is why it
blurs or smooths the image:
10 + 20 + 30 + 40 + 50 + 60 + 70 + 80 + 90 = 450
450 / 9 = 50 → new value for the center pixel
So, the center pixel (50) stays the same in this case,
but in real images, this would smooth sharp edges and reduce noise.
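The same arithmetic can be checked with a few lines of NumPy (a minimal sketch of a single
kernel position, not a full filtering pass over an image):

import numpy as np

patch = np.array([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]])
kernel = np.ones((3, 3)) / 9   # 3x3 mean blur kernel

# Weighted sum: multiply element-wise, then add everything up.
new_center = np.sum(patch * kernel)
print(new_center)  # 50.0 -> new value for the center pixel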
CORRELATION
Correlation measures similarity between the kernel and the image patch. We slide the kernel over
the image, multiply corresponding values, and sum them. The kernel is NOT flipped.
Formula:
Output(x, y) = Σ Σ [ Image(x+i, y+j) × Kernel(i, j) ]
Example Kernel:
[0 1 0]
[1 -4 1]
[0 1 0]
Image Patch:
[1 2 3]
[4 5 6]
[7 8 9]
Calculation: (1×0)+(2×1)+(3×0)+(4×1)+(5×-4)+(6×1)+(7×0)+(8×1)+(9×0) = 0
In a face detection system, correlation helps to:
Detect eyes or nose by comparing parts of the image with a known pattern.
If a part of the image matches the kernel (like the shape of an eye), the correlation result
is high.
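The hand calculation above can be reproduced with SciPy, assuming SciPy is available;
scipy.ndimage.correlate slides the kernel over the image without flipping it:

import numpy as np
from scipy.ndimage import correlate

image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
kernel = np.array([[0,  1, 0],
                   [1, -4, 1],
                   [0,  1, 0]])

# mode='constant' pads the border with zeros so the output size matches.
result = correlate(image, kernel, mode='constant')
print(result[1, 1])  # 0 -> matches the hand calculation above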
CONVOLUTION
Convolution is similar to correlation but the kernel is flipped horizontally and vertically before
applying. In many cases, if the kernel is symmetric, flipping has no effect.
What Does “Kernel Flipped” Mean?
When we say "flipping a kernel", we mean reversing the kernel in both directions:
1. Flip Horizontally (Left–Right)
Switch columns from left to right.
2. Flip Vertically (Top–Bottom)
Switch rows from top to bottom.
In mobile photo editing apps, convolution is behind effects like sharpen, emboss, or blur.
Formula:
Output(x, y) = Σ Σ [ Image(x+i, y+j) × Kernel(-i, -j) ]
Example:
Kernel before flip:
[0 1 0]
[1 -4 1]
[0 1 0]
After flipping (same in this symmetric case), applying to:
[1 2 3]
[4 5 6]
[7 8 9]
Result = 0.
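A short sketch (assuming SciPy is available) showing both the flip and the fact that it changes
nothing for this symmetric kernel:

import numpy as np
from scipy.signal import convolve2d

kernel = np.array([[0,  1, 0],
                   [1, -4, 1],
                   [0,  1, 0]])
image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

# Flipping = reversing rows and columns; a symmetric kernel is unchanged.
print(np.array_equal(kernel, np.flip(kernel)))  # True

# convolve2d flips the kernel internally before the weighted sum.
result = convolve2d(image, kernel, mode='same')
print(result[1, 1])  # 0 -> same as correlation for a symmetric kernel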
General Formula for Correlation / Linear Filtering
The general formula for applying a kernel (size: (2m+1) × (2n+1)) to an image is:
g(x, y) = Σ (i = -m to m) Σ (j = -n to n) [ f(x + i, y + j) × h(i, j) ]
where:
- g(x, y) is the output image pixel value at (x, y)
- f(x + i, y + j) is the input image pixel value at the corresponding position
- h(i, j) is the kernel value at position (i, j)
- m, n define the kernel size (for 3×3 kernel, m = 1, n = 1)
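A direct, unoptimized Python translation of this formula (apply_kernel is a hypothetical name;
border pixels are simply left at zero for brevity):

import numpy as np

def apply_kernel(f, h):
    # For a (2m+1) x (2n+1) kernel, e.g. m = n = 1 for a 3x3 kernel.
    m, n = h.shape[0] // 2, h.shape[1] // 2
    g = np.zeros(f.shape, dtype=float)
    for x in range(m, f.shape[0] - m):
        for y in range(n, f.shape[1] - n):
            # Weighted sum of the neighborhood centered at (x, y).
            g[x, y] = np.sum(f[x - m:x + m + 1, y - n:y + n + 1] * h)
    return g

image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]], dtype=float)
kernel = np.array([[0,  1, 0],
                   [1, -4, 1],
                   [0,  1, 0]], dtype=float)
print(apply_kernel(image, kernel)[1, 1])  # 0.0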
EDGE DETECTION
What is an Edge?
An edge in an image is a point where the brightness (intensity) of the image changes sharply. It
marks the boundary between two regions, such as between an object and the background.
Why Detect Edges?
Edge detection helps:
Identify object boundaries
Reduce the amount of data
Extract important features for further processing (like object recognition, segmentation,
etc.)
Types of Edges
1. Step Edge – sudden change in intensity.
2. Ramp Edge – gradual change in intensity.
3. Line Edge – bright line on a dark background.
4. Roof Edge – sharp peak (similar to ramp but thinner).
Common Edge Detection Operators
Operator | Description
Sobel    | Uses gradient magnitude in horizontal and vertical directions.
Prewitt  | Similar to Sobel but with simpler masks.
Canny    | Advanced method with noise reduction and edge thinning.
Example: Lane Detection in Self-Driving Cars
The car's camera captures road images.
Edge detection highlights white lane markings.
Helps the car stay in the correct lane.
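A minimal OpenCV sketch of this pipeline, assuming OpenCV (cv2) is installed; the file name
'road.jpg' and the Canny thresholds used here are illustrative assumptions:

import cv2

# Load a road image in grayscale ('road.jpg' is a placeholder path).
image = cv2.imread('road.jpg', cv2.IMREAD_GRAYSCALE)

# Reduce noise first, then detect edges (Canny thins edges internally).
blurred = cv2.GaussianBlur(image, (5, 5), 0)
edges = cv2.Canny(blurred, 50, 150)

# Lane markings now appear as bright edge lines in the output.
cv2.imwrite('edges.jpg', edges)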