Unit 2
COMPUTER VISION
INDEX
🞭 Introduction
🞭 Basic techniques of Computer Vision
🞭 Applications of Computer Vision
🞭 Computer Vision Libraries and Tools
🞭 Ethical Considerations in Computer Vision
INTRODUCTION
🞭 Computer Vision (CV) is a branch of
Artificial Intelligence (AI) that helps
computers to interpret and understand
visual information much like humans.
🞭 This unit is intended for both beginners and
experienced professionals and covers key
concepts such as Image Processing, Feature
Extraction, Object Detection, Image
Segmentation and other core techniques in CV.
BASIC TECHNIQUES OF COMPUTER
VISION
🞭 Although the basics of computer vision seem easy,
processing and understanding an image via
machine vision is quite difficult. Here’s why:
🞭 An image consists of many pixels, a pixel being
the smallest unit into which the image can be
divided.
🞭 Computers process images in the form of an array
of pixels, where each pixel has a set of values,
representing the presence and intensity of the
three primary colors: red, green, and blue.
🞭 All pixels come together to form a digital image.
THIS IS HOW A COMPUTER “SEES” AN IMAGE
🞭 Each value is the pixel intensity at the
corresponding coordinates in the image,
with 255 representing pure white and 0
representing pure black. Values in between
are shades of gray.
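As a minimal sketch of this idea, the snippet below stores a tiny grayscale image as a grid of pixel values and a single color pixel as a (red, green, blue) triple. Plain Python lists are used so that it runs without extra libraries; in practice an array library would normally be used. The names `image`, `brightest_pixel`, and `invert` are illustrative and do not come from any library.

```python
# A 3x4 grayscale "image": each number is one pixel's intensity,
# from 0 (black) to 255 (white).
image = [
    [0,   64, 128, 255],
    [32,  96, 160, 224],
    [255, 192, 128, 0],
]

# A color pixel stores three intensities instead of one: (red, green, blue).
red_pixel = (255, 0, 0)


def brightest_pixel(img):
    """Return (row, col, value) of the first pixel with the highest intensity."""
    best = (0, 0, img[0][0])
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value > best[2]:
                best = (r, c, value)
    return best


def invert(img):
    """Produce the negative: dark pixels become bright and vice versa."""
    return [[255 - value for value in row] for row in img]


height, width = len(image), len(image[0])
print(f"{height} rows x {width} columns")
print(brightest_pixel(image))
print(invert(image)[0])
```

Everything a vision algorithm does, from filtering to object detection, ultimately operates on grids of numbers like this one.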
🞭 Some operations commonly used in deep-learning-based computer
vision include:
🞭 Convolution: Convolution in computer vision is an operation in which
a learnable kernel is “convolved” with the image. In other words, the
kernel is slid across the image pixel by pixel, and at every position
the kernel and the underlying pixel group are multiplied element-wise
and the products are summed to give one output value.
🞭 Pooling: Pooling is an operation used to reduce the dimensions of an
image by summarizing small groups of pixels. A pooling kernel slides
across the image, and each pixel group is replaced by a single value,
thus reducing the image size. Examples: Max Pooling (keep the largest
value in the group) and Average Pooling (take the mean of the group).
🞭 Non-Linear Activations: Non-linear activations introduce
non-linearity to the neural network. Without them, a stack of
convolutions would collapse into a single linear operation; with them,
multiple convolution and pooling blocks can be stacked to increase
model depth.
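The three operations above can be sketched in a few lines of plain Python. This is a teaching sketch under simplifying assumptions (single channel, no padding, stride 1, a hand-picked kernel instead of a learned one), not how a deep learning framework implements them. The function names `conv2d`, `relu`, and `max_pool` are chosen for illustration.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    At each position the overlapping values are multiplied element-wise
    and summed. As in most deep learning libraries, the kernel is not
    flipped, so strictly speaking this is cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Non-linear activation: negative responses are clipped to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            block = [feature_map[r + i][c + j]
                     for i in range(size) for j in range(size)]
            row.append(max(block))
        pooled.append(row)
    return pooled


# 5x5 image with a bright vertical stripe on a dark background.
image = [[0, 0, 255, 255, 0] for _ in range(5)]

# Hand-picked kernel that responds to dark-to-bright transitions.
# In a real network these weights would be learned during training.
kernel = [[-1, 1],
          [-1, 1]]

features = conv2d(image, kernel)   # 4x4 feature map
activated = relu(features)         # negative responses removed
pooled = max_pool(activated)       # 2x2 map after pooling

print(features[0])
print(activated[0])
print(pooled)
```

The kernel responds positively at the left edge of the stripe and negatively at the right edge; the activation discards the negative response, and pooling shrinks the 4x4 map to 2x2 while keeping the strongest response in each block.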
FACE AND PERSON RECOGNITION
🞭 Facial recognition is a specialized form of
object detection in which the primary object
being detected is the human face.
🞭 As in object detection, features are detected
and localized. Facial recognition goes one
step further: it not only detects the face but
also recognizes it, that is, determines whose
face it is.
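One common way to picture the recognition step is as a comparison between a numeric description of the detected face (an embedding) and stored embeddings of known people. The sketch below assumes the face has already been detected and converted to a vector by some model. The vectors, the names, and the 0.8 threshold are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 means same direction, 0 or less means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(face_embedding, known_faces, threshold=0.8):
    """Return the best-matching name, or None if nobody is similar enough."""
    best_name, best_score = None, threshold
    for name, reference in known_faces.items():
        score = cosine_similarity(face_embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical gallery of enrolled people (made-up 3-number embeddings;
# real embeddings typically have many more dimensions).
known_faces = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}

print(identify([0.85, 0.15, 0.35], known_faces))  # close to alice's vector
print(identify([-0.5, 0.2, -0.7], known_faces))   # unlike anyone enrolled
```

The threshold is what lets the system answer "unknown person" instead of always returning the closest enrolled face.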
IMAGE RESTORATION
🞭 Image restoration refers to the restoration
or reconstruction of old, faded hard-copy
images that were captured or stored
improperly, leading to a loss of image
quality.
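Fading can be thought of as a loss of contrast: the pixel values crowd into a narrow band instead of spanning 0 to 255. The sketch below shows one simple remedy, linear contrast stretching, on made-up pixel values. It addresses fading only; real restoration pipelines also deal with noise, scratches, and missing regions, often with learned models. The function name `stretch_contrast` is illustrative.

```python
def stretch_contrast(image):
    """Linearly rescale intensities so the darkest pixel maps to 0
    and the brightest maps to 255."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in image]
    scale = 255 / (hi - lo)
    return [[round((v - lo) * scale) for v in row] for row in image]


# A "faded" grayscale image: all values sit between 100 and 150.
faded = [
    [100, 110, 120],
    [130, 140, 150],
]

restored = stretch_contrast(faded)
print(restored)
```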
FEATURE MATCHING
🞭 The applications of feature matching are found in computer vision
tasks like object identification and camera calibration. The task of
feature matching is generally performed in the following order:
🞭 Detection of features: Regions of interest (keypoints) are generally
detected by image processing algorithms such as Harris Corner
Detection.
🞭 Formation of local descriptors: After features are detected, the
region surrounding each keypoint is captured and the local
descriptors of these regions of interest are obtained. A local
descriptor is the representation of a point’s local neighborhood and
thus can be helpful for feature matching.
🞭 Feature matching: The features and their local descriptors are
matched in the corresponding images to complete the feature
matching step.
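The final matching step can be sketched as a nearest-neighbor search over descriptors. The snippet below assumes that keypoints have already been detected and that each one has been turned into a descriptor vector; the descriptor values are invented for illustration. It uses brute-force matching with a cross-check, which is one common strategy among several.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(descriptor, candidates):
    """Index of the candidate descriptor closest to `descriptor`."""
    return min(range(len(candidates)),
               key=lambda i: euclidean(descriptor, candidates[i]))


def match_features(desc_a, desc_b):
    """Brute-force matching with a cross-check.

    A pair (i, j) is kept only if j is the nearest neighbor of i in
    image B and i is the nearest neighbor of j in image A.
    """
    matches = []
    for i, d in enumerate(desc_a):
        j = nearest(d, desc_b)
        if nearest(desc_b[j], desc_a) == i:
            matches.append((i, j))
    return matches


# Hypothetical descriptors for keypoints found in two images.
desc_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.4, 0.4, 0.9]]
desc_b = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0]]

print(match_features(desc_a, desc_b))
```

The third keypoint of image A has no counterpart in image B, so the cross-check rejects it rather than forcing a poor match.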
APPLICATIONS OF COMPUTER VISION
1. Healthcare
● Medical Imaging Analysis: Detecting diseases in X-rays, MRIs, and CT scans (e.g., tumors,
fractures).
● Surgical Assistance: Real-time guidance during surgery using visual data.
● Skin Cancer Detection: Using image classification to identify malignancies from skin images.
Figure: A chest X-ray of a pneumothorax case. The AI
overlays a red-yellow heatmap identifying the
air-pocket region, which corresponds with the
physician-confirmed abnormality.
🞭 Autonomous Vehicles
🞭 Self-Driving Cars: Computer vision is used
for object detection, lane detection, traffic
sign recognition, pedestrian tracking, and
obstacle avoidance.
🞭 Drone Navigation: Drones use CV to
detect and avoid obstacles in real-time
while navigating.
🞭 Figure: counting cars on a road
🞭 Retail and E-commerce
🞭 Application: Enhances the shopping
experience through image-based search,
recommendation systems, and even
checkout-less stores.
🞭 Example: Amazon Go stores use
computer vision to track what customers
pick up, allowing them to leave without
manually checking out.
Computer Vision Libraries and Tools
🞭 1. OpenCV (Open Source Computer Vision
Library)
🞭 Description: OpenCV is one of the most popular
and comprehensive open-source libraries for
computer vision tasks. It provides tools for image
processing, object detection, face recognition, and
real-time video processing.
🞭 Languages Supported: C++, Python, Java, and
others.
🞭 Key Features: Image filtering, feature detection,
image transformations, machine learning
integration, real-time video analysis.
🞭 2. TensorFlow & TensorFlow.js
🞭 Description: TensorFlow, developed by
Google, is a popular machine learning
framework that also has strong support for
computer vision tasks. TensorFlow.js brings
machine learning to JavaScript for real-time
computer vision in the browser.
🞭 Languages Supported: Python, JavaScript.
🞭 Key Features: Object detection, image
segmentation, neural networks for visual
tasks, support for deep learning.
🞭 3. PyTorch
🞭 Description: PyTorch is a deep learning
library that is widely used for computer vision
tasks. It’s known for its flexibility, ease of use,
and support for dynamic computation
graphs.
🞭 Languages Supported: Python.
🞭 Key Features: Deep learning for vision tasks
like image classification, segmentation, and
object detection. Popular model architectures
include ResNet.
🞭 4. Keras
🞭 Description: Keras is a high-level neural
networks API that runs on top of TensorFlow,
making it easier to develop deep learning
models for computer vision tasks.
🞭 Languages Supported: Python.
🞭 Key Features: Simplified implementation of
deep learning models for image
classification, object detection, and
segmentation.
Ethical Considerations in Computer Vision
| Ethical Concern | Description | Example |
|---|---|---|
| 1. Privacy Invasion | CV often captures images/video in public/private spaces without consent. | CCTV systems in public spaces or facial recognition in retail stores. |
| 2. Bias and Discrimination | Training datasets may lack diversity, leading to biased outputs. | Face recognition works better on lighter skin tones than darker ones. |
| 3. Consent and Data Use | Individuals are often unaware their images are being used or analyzed. | Using social media photos to train facial recognition algorithms. |
| 4. Misuse and Dual-Use | CV can be used for unethical or harmful purposes. | Military drones using CV for autonomous targeting. |
| 5. Lack of Transparency | CV systems (especially deep learning) are often “black boxes.” | Inability to explain why an algorithm flagged a person as suspicious. |
| 6. Accountability | It’s unclear who is liable when CV systems fail or cause harm. | Who is responsible if a self-driving car hits a pedestrian? |
| 7. Deepfakes & Misinformation | CV enables creation of fake videos/images that can deceive and manipulate. | Political deepfakes spreading misinformation during elections. |