COMPUTER VISION
The Blossoms School, Aligarh Class X Mohd Suhail Athar
Basics of Images
• What are images made of?
• How do computers store images?
• How do they display images on the screen?
PIXEL
The word “pixel” means a picture element.
Every photograph, in digital form, is made up of pixels.
They are the smallest units of information that make up a picture.
Usually round or square, they are typically arranged in a 2-dimensional grid.
The more pixels you have, the more closely the image resembles the original.
Resolution
The number of pixels in an image is called the resolution.
Resolution is typically described using two numbers: width and
height.
• Width tells you how many pixels there are from left to right.
• Height tells you how many pixels there are from top to
bottom.
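The idea of width and height can be tried out with a small sketch. It assumes an image is stored as a list of rows (a simplification; the pixel values below are made up for illustration):

```python
# A tiny "image": each inner list is one row of pixels.
# The pixel values are made up for illustration.
image = [
    [0, 50, 100, 150],
    [25, 75, 125, 175],
    [50, 100, 150, 200],
]

height = len(image)      # number of rows, counted top to bottom
width = len(image[0])    # number of pixels in one row, counted left to right
total_pixels = width * height

print(f"Resolution: {width} x {height} = {total_pixels} pixels")
```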
Camera Megapixels
• A megapixel is a million pixels.
• Images taken by a 5-megapixel camera contain about 5 million
pixels, so they follow this formula:
Width x Height ≈ 5,000,000 pixels
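As a quick check of this formula, the sketch below computes the megapixel count from a width and a height. The function name `megapixels` is hypothetical, and 2592 x 1944 is used only as one example of a size that comes out close to 5 megapixels:

```python
def megapixels(width, height):
    """Return the pixel count of a width x height image, in millions."""
    return width * height / 1_000_000

# 2592 x 1944 is an assumed example size, not taken from the slides.
mp = megapixels(2592, 1944)
print(round(mp, 2))  # 5.04
```

The result is slightly above 5 because real image sizes rarely multiply out to exactly 5,000,000.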
Pixel Value
Each of the pixels that represents an image stored inside a
computer has a pixel value which describes how bright that
pixel is, and/or what color it should be.
The most common pixel format is the 8-bit image, in which each
pixel value lies in the range 0-255.
Why do we have a value of 255?
Each bit in a computer system can be either a zero or a
one.
Generally, the value of a pixel is stored as an 8-bit integer.
Eight bits can form 2⁸ = 256 different combinations. Counting
starts at 0, so a pixel value can range from 0 to 255.
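The same arithmetic can be checked in a few lines. This is only a small sketch; the variable names are chosen for illustration:

```python
BITS = 8
num_values = 2 ** BITS       # 256 different bit patterns
max_value = num_values - 1   # counting starts at 0, so the largest value is 255

print(num_values, max_value)   # 256 255
print(format(0, "08b"))        # 00000000 is the smallest value
print(format(255, "08b"))      # 11111111 is the largest value
```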
Grayscale Images
Grayscale images, often loosely called black and white images, are
images in which each pixel is a shade of gray, ranging from black
(the darkest) to white (the lightest). A grayscale image contains no
color information.
The pixel value 0 corresponds to black (completely dark), while 255
corresponds to white (completely bright).
A pixel value of 128 would represent a shade of gray that is roughly
halfway between black and white.
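To see how a pixel value maps to a shade, here is a minimal sketch. The helper `describe_gray` is hypothetical and the labels are only rough descriptions:

```python
def describe_gray(value):
    """Give a rough label for an 8-bit grayscale pixel value."""
    if not 0 <= value <= 255:
        raise ValueError("8-bit pixel values must be between 0 and 255")
    if value == 0:
        return "black"
    if value == 255:
        return "white"
    # Express the value as a share of the maximum brightness (255).
    return f"gray ({value / 255:.0%} of full brightness)"

for v in (0, 128, 255):
    print(v, "->", describe_gray(v))
```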
RGB Images
All the color images that we see on a screen are made up of three
primary colors: Red, Green and Blue.
How do computers store RGB images?
Every RGB image is stored in the form of three different channels
called the R channel, G channel and the B channel.
The pixel value in each channel ranges from 0 to 255.
Each pixel in the image is represented by a set of three values, one
for each channel, such as (R, G, B). These values determine the
color of the pixel.
• RED (255, 0, 0)
• GREEN (0, 255, 0)
• BLUE (0, 0, 255)
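The three-channel idea can be sketched as follows. This assumes a tiny image stored as rows of (R, G, B) tuples; the function `split_channels` is hypothetical and is not how any particular image library stores its data:

```python
# A 2x2 RGB image: every pixel is an (R, G, B) tuple with values 0-255.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 0)],  # full red + full green gives yellow
]

def split_channels(img):
    """Return three grids (R, G, B), each the same size as the image."""
    red = [[pixel[0] for pixel in row] for row in img]
    green = [[pixel[1] for pixel in row] for row in img]
    blue = [[pixel[2] for pixel in row] for row in img]
    return red, green, blue

r, g, b = split_channels(image)
print(r)  # [[255, 0], [0, 255]]
print(g)  # [[0, 255], [0, 255]]
print(b)  # [[0, 0], [255, 0]]
```

Each channel on its own looks like a grayscale grid; the color appears only when the three values of a pixel are combined.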
RGB Online:
https://www.csfieldguide.org.nz/en/interactives/rgb-mixer/
Pixel Art Online:
https://www.piskelapp.com/p/create/sprite
Pixel Values:
https://www.csfieldguide.org.nz/en/interactives/pixel-viewer/
Computer Vision
Computer Vision is a domain of Artificial
Intelligence that enables machines to see
images or other visual data, and to process and analyze
them using algorithms.
Applications of Computer Vision
Facial Recognition
Face Filters
Google Search by Images
Computer Vision in Retail
Self Driving Cars
Medical Imaging
Google Translate App
How Does Computer Vision Work?
Computer vision processes and analyzes images and
videos through a series of steps, which are performed
to get certain information from the input image.
Classification
Image classification in computer
vision is the process of categorizing
or labeling an input image based on
its contents.
Classification + Localization
This task combines two steps:
identifying what object is present
in the image and, at the same time,
finding where that object is
located in the image.
It is used only for single objects.
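Real localization systems rely on trained models, but the idea of "where is the object" can be shown with a toy sketch. It assumes a grayscale grid with one bright object on a dark background; the function `bounding_box` and the threshold of 128 are hypothetical choices for illustration:

```python
def bounding_box(image, threshold=128):
    """Return (left, top, right, bottom) of pixels >= threshold, or None."""
    rows = [y for y, row in enumerate(image) if any(v >= threshold for v in row)]
    cols = [x for row in image for x, v in enumerate(row) if v >= threshold]
    if not rows:
        return None  # no bright pixel found, so there is nothing to locate
    return (min(cols), min(rows), max(cols), max(rows))

image = [
    [0,   0,   0, 0, 0],
    [0, 200, 220, 0, 0],
    [0, 210, 255, 0, 0],
    [0,   0,   0, 0, 0],
]
print(bounding_box(image))  # (1, 1, 2, 2)
```

The four numbers describe a rectangle around the object, which is the usual way a location is reported.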
Object Detection
Object detection aims to locate
and identify multiple objects
within an image, such as faces,
buildings or toys, and to provide
information about their
positions.
Instance Segmentation
Instance Segmentation provides a
detailed pixel-level mask for each
object instance in the image.
A segmentation algorithm takes an
image as input and outputs a
collection of regions (or segments).
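The "image in, regions out" idea can be illustrated with a toy sketch. It is not a real instance segmentation model: it assumes the image has already been reduced to a grid of 0s (background) and 1s (object), and it simply groups touching 1s into numbered regions. The function `label_regions` is hypothetical:

```python
from collections import deque

def label_regions(mask):
    """Label groups of touching 1s (up/down/left/right) with 1, 2, 3, ..."""
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if mask[y][x] == 1 and labels[y][x] == 0:
                count += 1                  # found a new region
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:                # spread the label to neighbors
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
labels, count = label_regions(mask)
for row in labels:
    print(row)
print("regions:", count)
```

Every pixel ends up with the number of the region it belongs to, which is what a pixel-level mask per object looks like in its simplest form.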
Object Detection vs. Image Classification vs. Localization
Image classification is used when you want to categorize
an entire image into a single class.
Localization aims to determine the location of a single
object within an image.
Object detection is suitable when you need to locate
and identify multiple objects within an image.
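One way to compare the three tasks is to look at the shape of their results. The sketch below uses made-up labels and coordinates, and the `Box` class is hypothetical; it only shows what kind of answer each task gives:

```python
from dataclasses import dataclass

@dataclass
class Box:
    """A rectangle given by its edges, in pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int

# Classification: one label for the whole image.
classification_result = "cat"

# Localization: one label plus one box for the single object.
localization_result = ("cat", Box(40, 30, 200, 180))

# Object detection: a list of (label, box) pairs, one per object found.
detection_result = [
    ("cat", Box(40, 30, 200, 180)),
    ("dog", Box(220, 50, 380, 200)),
]

print(classification_result)
print(localization_result)
print(len(detection_result), "objects detected")
```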