Machine learning is an application of artificial intelligence (AI) that provides systems the ability
to automatically learn and improve from experience without being explicitly
programmed. Machine learning focuses on the development of computer programs that can
access data and use it to learn for themselves.
Machine learning is the science of getting computers to act without being explicitly
programmed.
Reinforcement learning algorithms manage the sequential process of taking an action,
evaluating the result, and selecting the next best action.
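The act, evaluate, select cycle can be sketched as a small trial-and-error loop. This is a minimal illustration, not a full reinforcement learning algorithm: the toy environment (one fixed reward per action), the epsilon-greedy selection rule, and the name run_trial_and_error are all assumptions made for the example.

```python
import random

def run_trial_and_error(rewards, steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy loop over a toy environment with one fixed reward per action."""
    rng = random.Random(seed)
    n_actions = len(rewards)
    estimates = [0.0] * n_actions  # current value estimate for each action
    counts = [0] * n_actions       # how many times each action was tried

    for _ in range(steps):
        # Select: usually the best-looking action, occasionally a random one.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])

        # Take the action and evaluate the result.
        reward = rewards[action]

        # Update the running-average estimate for the action just tried.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts

estimates, counts = run_trial_and_error([0.2, 1.0, 0.5])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimates:", estimates, "best action:", best_action)
```

The occasional random choice is what lets the loop discover that an untried action is better than the one it currently prefers.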
Deep Reinforcement Learning is the combination of
Reinforcement Learning and Deep Learning.
RL is one of the three branches into which ML techniques are
generally categorized:
Supervised Learning is the task of learning from labeled
data and its goal is to generalize.
Unsupervised Learning is the task of learning from
unlabeled data and its goal is to compress.
Reinforcement Learning is the task of learning through
trial and error and its goal is to act.
Deep reinforcement learning is the combination of Reinforcement
Learning and Deep Learning that enables software-defined agents to
learn the best actions possible in a virtual environment in order to attain
their goals.
Reinforcement learning is “teaching by experience.”
Humans appear to learn to walk through “very few examples” of
trial and error.
Computer vision is a field of artificial intelligence that trains computers
to interpret and understand the visual world. Using digital images from
cameras and videos and deep learning models, machines can accurately
identify and classify objects and then react to what they “see.”
Computer vision is the ability of a computer to see, identify, and process
images or videos in the same way that human vision does, and to provide
useful results based on the observation.
Image or video -> Sensing Device -> Interpreting Device -> Interpretations
The goal of Computer Vision is to make useful decisions about real
physical objects and scenes based on sensed images.
What is real and what is a painting?
Are people playing? Are they fighting?
Why Is Computer Vision Important?
At Kairos we use computer vision for face recognition, identification, verification, emotion
analysis, and crowd analytics. Without it our business would not exist so it is extremely
important to us.
Computer vision is also great for:
Optical Character Recognition (OCR): Recognizing and identifying
text in documents; a scanner does this.
Vision Biometrics: Recognizing missing people through their
iris patterns.
Object Recognition: Great for retail and fashion to find products in
real-time based off of an image or scan.
Special Effects: Motion capture and shape capture, any movie with
CGI.
3-D Printing and Image Capture: Used in movies, architectural
structures, and more.
Sports: In a game, when they draw additional lines on the field, yup,
that's computer vision.
Social Media: Anything with a story that allows you to wear something
on your face.
Smart Cars: Through computer vision they can identify objects and
humans.
Medical Imaging: 3D imaging and image guided surgery.
Really the list goes on and on here too. We use computer vision in space, in video games, in
mobile and industrial robots, and in so many other industries.
• Extracting information
• Beyond human capacity
• Solution
• Let's give computers an eye
The Complexity of Perception
Humans interpret different things
• Brain is complex
• Given any image
• Extracts only relevant information
• Humans are good at “interpreting”
• Similarly powerful in perception
• Machines are good at “accuracy”
• Numerical processing
2nd Lecture
The pixel (a word invented from "picture element") is the basic unit of programmable
color on a computer display or in a computer image. Think of it as a logical - rather than
a physical - unit. The physical size of a pixel depends on how you've set the resolution
for the display screen.
A px (pixel) is the smallest portion of an image or display that a computer is capable of
printing or displaying.
Each pixel has RGB (red, green, and blue) color components. The brightness of each
component is increased or decreased to produce the variance of colors you see on the
screen.
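As a rough illustration of how changing component brightness changes the color, the sketch below stores a pixel as an (R, G, B) tuple and scales each component. The 8-bit range of 0 to 255 per component and the helper name scale_brightness are assumptions for the example; the notes do not fix a bit depth.

```python
def scale_brightness(pixel, factor):
    """Scale each RGB component by `factor`, clamped to the assumed 8-bit range 0..255."""
    return tuple(max(0, min(255, int(c * factor))) for c in pixel)

orange = (255, 128, 0)                     # strong red, medium green, no blue
darker = scale_brightness(orange, 0.5)     # every component halved
brighter = scale_brightness(orange, 1.5)   # red is already at the maximum, so it clamps

print(darker)    # (127, 64, 0)
print(brighter)  # (255, 192, 0)
```

Clamping is why brightening an already saturated component shifts the hue: only the components with headroom can still increase.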
A digital image is a two-dimensional array of vectors.
The array (or matrix) of numbers represents a 2D
projection of a scene.
The number in each cell in a digital image represents the
spectral response of a particular band.
The numbers are quantized to a finite number of bits for
representation.
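The array-of-numbers view and the quantization step can be sketched as follows. The responses are assumed to be normalized to the range 0.0 to 1.0 for a single band, and the function name quantize and the sample values are hypothetical.

```python
def quantize(value, bits):
    """Map a response in [0.0, 1.0] to an integer level in [0, 2**bits - 1]."""
    levels = 2 ** bits
    clamped = max(0.0, min(1.0, value))
    # int() truncates; the min() keeps a response of exactly 1.0 in range.
    return min(levels - 1, int(clamped * levels))

# A tiny 2 x 3 single-band "image" of continuous sensor responses.
responses = [
    [0.0, 0.25, 0.5],
    [0.75, 0.9, 1.0],
]

image_8bit = [[quantize(v, 8) for v in row] for row in responses]
image_2bit = [[quantize(v, 2) for v in row] for row in responses]

print(image_8bit)  # [[0, 64, 128], [192, 230, 255]]
print(image_2bit)  # [[0, 1, 2], [3, 3, 3]]
```

With only 2 bits, the responses 0.75, 0.9, and 1.0 all collapse to the same level, which shows how fewer bits lose detail.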
Pinhole Camera
A pinhole camera is a simple camera without a lens but with a tiny aperture (the so-
called pinhole) – effectively a light-proof box with a small hole in one side. Light from a
scene passes through the aperture and projects an inverted image on the opposite side
of the box, which is known as the camera obscura effect.
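The inversion follows from similar triangles: a scene point (X, Y, Z), with Z measured from the pinhole along the viewing direction, lands at (-f X / Z, -f Y / Z) on a back wall at distance f. The sketch below assumes this ideal, infinitely small pinhole; the function name project_pinhole and the symbol f are introduced only for the example.

```python
def project_pinhole(point, f):
    """Project a 3D scene point through an ideal pinhole onto the back wall.

    point: (X, Y, Z) with Z > 0 the distance in front of the pinhole.
    f:     distance from the pinhole to the image wall (same units).
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("the point must be in front of the pinhole (Z > 0)")
    # The minus signs are the inversion: up maps to down, left to right.
    return (-f * X / Z, -f * Y / Z)

top_of_tree = (1.0, 2.0, 4.0)   # 1 m right, 2 m up, 4 m away
print(project_pinhole(top_of_tree, 0.5))  # (-0.125, -0.25)
```

Doubling the distance Z halves the image coordinates, which is why distant objects appear smaller.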
An aperture is a hole or an opening through which light travels.
Effect of aperture size
Large aperture: light from the source spreads across the image (i.e., not properly
focused), making it blurry!
Small aperture: reduces blurring but (i) it limits the amount of light entering the camera
and (ii) causes light diffraction.
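The trade-off can be made concrete with two rough estimates of the blur spot for a single scene point. The geometric term comes from similar triangles through a hole of diameter d; the diffraction term uses the common Airy-disk approximation 2.44 * wavelength * distance / d. That approximation, the 550 nm wavelength, and the chosen distances are assumptions for illustration and do not come from these notes.

```python
def geometric_blur(aperture_d, image_dist, object_dist):
    """Blur diameter from ray geometry alone (similar triangles through the hole)."""
    return aperture_d * (object_dist + image_dist) / object_dist

def diffraction_blur(aperture_d, image_dist, wavelength=550e-9):
    """Approximate Airy-disk diameter; grows as the aperture shrinks."""
    return 2.44 * wavelength * image_dist / aperture_d

# All lengths in meters: wall 0.1 m behind the hole, point source 10 m away.
for d_mm in (0.05, 0.1, 0.2, 0.5, 1.0, 2.0):
    d = d_mm / 1000.0
    g = geometric_blur(d, 0.1, 10.0)
    w = diffraction_blur(d, 0.1)
    print(f"{d_mm:4.2f} mm hole: geometric {g * 1000:.3f} mm, diffraction {w * 1000:.3f} mm")
```

The two terms move in opposite directions as the hole changes size, so the sharpest image comes from an intermediate aperture rather than the smallest possible one.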
Refraction
Refraction occurs, as in a lens, when a wave passes from one medium into
another and deviates from the straight path it otherwise would have taken. The amount of
deviation or bending depends on the indices of refraction of the two media, which are
determined by the relative speed of the wave in each medium.
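The dependence on the two indices is usually written as Snell's law, n1 sin(theta1) = n2 sin(theta2), with angles measured from the surface normal. The sketch below applies it; the function name refraction_angle and the example indices (about 1.0 for air, 1.5 for glass) are assumptions for illustration.

```python
import math

def refraction_angle(theta1_deg, n1, n2):
    """Return the refracted angle in degrees, or None for total internal reflection.

    theta1_deg: incidence angle from the surface normal, in degrees.
    n1, n2:     refractive indices of the first and second medium.
    """
    s = n1 * math.sin(math.radians(theta1_deg)) / n2
    if abs(s) > 1.0:
        return None  # no refracted ray exists; the wave is reflected instead
    return math.degrees(math.asin(s))

# Air into glass: the ray bends toward the normal.
print(round(refraction_angle(30.0, 1.0, 1.5), 2))  # 19.47

# Glass into air at a steep angle: total internal reflection.
print(refraction_angle(60.0, 1.5, 1.0))  # None
```

A ray entering the slower medium (higher index) bends toward the normal, and a ray entering the faster medium bends away from it.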