UNIT 3 COMPUTER VISION
Structure
3.0 Introduction
3.1 Objectives
3.2 What is Computer Vision
3.3 Basic Terminology
3.4 Goals of Computer Vision
3.5 Technical Challenges
3.6 Applications of Computer Vision
3.7 Advantages of Computer Vision
3.8 Examples
3.9 Summary
3.10 Answers/Solutions
3.11 Further Readings and References
3.0 INTRODUCTION
Computer Vision is the branch of Computer Science whose goal is to model the real
world or to recognize objects from digital images. These images can be acquired using
video or infrared cameras, radars or specialized sensors such as those used by doctors,
scientists, geologists etc.
Vision is a very powerful interface for computers. Vision-based systems have the
potential to sense body movements and position, head orientation, direction of the
body, and gestures.
When vision-based interaction is established with a machine, it can make the
interaction between the user and the machine more enjoyable or safer. The algorithms
for computer vision should be reliable, work for different people, and work in adverse
conditions. The response time of these algorithms should be very short: the user
should not sense any delay between making a movement or gesture and the computer's
response.
In this unit, you will learn the basic techniques of the field of Computer Vision.
You will also learn about the technical challenges and applications of this
approach, with examples, and about the use of computer vision in computer
graphics to make it interactive and effective.
State of the Art Practices in
Information Technology
3.1 OBJECTIVES
After going through this unit, you should be able to:
• define the term computer vision;
• list the goals, applications & advantages of computer vision; and
• describe the technical challenges of computer vision.
3.2 WHAT IS COMPUTER VISION
The goal of computer vision is to process images acquired with cameras in order to
produce a representation of objects in the world. A number of working systems
already perform parts of this task in specialized domains. For example, a
map of a city or a mountain range can be produced semi-automatically from a set of
aerial images. A robot can use the several image frames per second produced by one
or two video cameras to produce a map of its surroundings for path planning and
obstacle avoidance. A printed circuit inspection system may take one picture per
board on a conveyor belt and produce a binary image flagging possible faulty
soldering points on the board. A zip code reader takes single snapshots of envelopes
and translates a handwritten number into an ASCII string. A security system can
match one or a few pictures of a face with a database of known employees for
recognition.
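To make the idea of a "binary image flagging possible faulty points" concrete, here is a minimal sketch in Python. It assumes a grayscale image stored as a list of rows with intensities from 0 to 255, and it assumes that suspicious points simply appear brighter than a chosen cutoff. The image values, the cutoff and the function names are hypothetical; a real inspection system would use far more elaborate tests.

```python
def threshold(image, cutoff):
    """Return a binary image: 1 where a pixel is brighter than cutoff, else 0."""
    return [[1 if pixel > cutoff else 0 for pixel in row] for row in image]


def flagged_points(binary):
    """List the (row, column) positions of all pixels set to 1."""
    return [(r, c)
            for r, row in enumerate(binary)
            for c, value in enumerate(row)
            if value == 1]


# Hypothetical 4x4 grayscale picture of a board (0 = black, 255 = white).
board = [
    [10, 12, 11, 10],
    [12, 240, 13, 11],
    [10, 11, 12, 250],
    [11, 10, 12, 10],
]

binary = threshold(board, 128)
print(flagged_points(binary))  # [(1, 1), (2, 3)]
```

The output of the first function is itself an image, while the second function turns that image into a short list of positions that a later stage could examine.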
Vision is the task of “seeing”. It is seeing with understanding, rather than merely
seeing with a camera. When we “see” things, our eyes (the sensing device) capture the
image and pass the information to the brain (the interpreting device). The brain
interprets the image and gives us the meaning of what we see. Similarly, in computer
vision, the camera serves as the sensing device, and the computer acts as the
interpreting device that interprets the image the camera captures.
“Computer vision is the science that develops the theoretical and algorithmic basis by
which useful information about the world can be automatically extracted and
analyzed from an observed image”.
Computer vision is related to many areas, including biology, psychology,
information engineering, physics, mathematics, and of course computer science. The
following sections focus on the methods adopted in computer vision, specifically
object recognition, which is one of the hardest problems in computer vision research.
3.3 BASIC TERMINOLOGY
Some basic terms used in computer vision are explained below:
Point: A precise location or place on a plane, usually represented by a dot.
Light: Light is everywhere in our world. We need it to see: it carries information from
the world to our eyes and brains.
Ray: A line which starts at a point and goes off in a particular direction to infinity.
Image: “An optical or other representation of a real object; a graphic; a picture.”
Pixel: “In digital imaging, a pixel (or picture element) is the smallest item of
information in an image. Pixels are normally arranged in a 2-dimensional grid, and are
often represented using dots, squares, or rectangles”.
Intensity: “The intensity of each pixel is variable; in color systems, each pixel has
typically three or four components such as red, green, and blue, or cyan, magenta,
yellow, and black”.
Range: The line-of-sight distance from the sensor to a point in the scene.
Focus: Also known as an image point, it is the point where light rays originating
from a point on the object converge.
Image Processing: The study of the properties of operators that produce images from
other images; we will touch on image filtering and related operators from image
processing.
Pattern Recognition: Typically refers to the recognition of structures in 2D images
(usually without reference to any underlying 3D information).
Diffused Light Source: A light source may be a point light source as in Figure A, a
parallel beam of light as depicted in Figure B, or a diffused light source
similar to ambient light.
Figure A: Point light source. Figure B: A parallel beam of light source.
Photogrammetry: The science of measurement through non-contact sensing, e.g.
terrain maps from satellite images. It is usually more concerned with accuracy than
with interpretation.
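The terms pixel, intensity and image processing defined above can be illustrated with a short Python sketch. It assumes a color image stored as a grid of (red, green, blue) tuples with values from 0 to 255. Averaging the three components to get one intensity, and using a 3x3 median filter as the example operator, are choices made here purely for illustration; the function names are hypothetical.

```python
from statistics import median


def to_gray(rgb_image):
    """Convert each (red, green, blue) pixel to a single intensity by averaging."""
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in rgb_image]


def median_filter(gray):
    """Produce a new image in which each interior pixel is the median of its
    3x3 neighbourhood. Border pixels are copied unchanged to keep the sketch short."""
    height, width = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [gray[y + dy][x + dx]
                      for dy in (-1, 0, 1)
                      for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


# One colored pixel becomes one intensity value.
print(to_gray([[(30, 60, 90)]]))  # [[60]]

# A single very bright "noise" pixel in an otherwise uniform patch is removed.
noisy = [
    [50, 50, 50],
    [50, 255, 50],
    [50, 50, 50],
]
print(median_filter(noisy)[1][1])  # 50
```

Both functions take an image and return an image, which is exactly what the definition of image processing above describes.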
Check Your Progress 1
1) What do you understand by “Computer Vision”?
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
2) Explain the term image processing with a suitable example.
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
3) What is photogrammetry?
………………………………………………………………………………………
……….……………………………………………………………………………..
…….………………………………………………………………………………..
4) Define the terms intensity, focus and pixel.
………………………………………………………………….…………………...
……………………………………………………………………….……………...
………………………………………………………………………………………
3.4 GOALS OF COMPUTER VISION
Computer vision technology is used to generate intelligent and useful descriptions of
scenes and visual sequences by performing operations on the signals that are received
from video cameras.
The goal of computer vision is to produce a representation of objects by processing
images captured through cameras. The main goals behind this technology are:
• To make useful and intelligent decisions based on sensed images.
• To construct 3D structure from 2D images.
• To compress videos or images for content delivery.
The subdomains of computer vision include:
• scene reconstruction
• event detection
• tracking
• object recognition
• motion estimation
• image restoration
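One common way to pursue the goal of constructing 3D structure from 2D images is to use two cameras side by side. The sketch below is written in Python and assumes an idealized setup that is not described in this unit: two identical, perfectly aligned cameras a known distance apart (the baseline), and a point that appears shifted by a known number of pixels (the disparity) between the two images. Under those assumptions depth is focal length times baseline divided by disparity. The function and parameter names are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth of a point seen by two aligned cameras: Z = f * B / d.

    focal_length_px: focal length expressed in pixels
    baseline_m:      distance between the two cameras in metres
    disparity_px:    horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Illustrative numbers only: a nearby point shifts more than a distant one.
print(depth_from_disparity(800, 0.5, 40))  # 10.0
print(depth_from_disparity(800, 0.5, 80))  # 5.0
```

Doubling the disparity halves the computed depth, which matches everyday experience: nearby objects appear to jump more than distant ones when you alternate between your left and right eye.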
3.5 TECHNICAL CHALLENGES
Every technology faces challenges, either technical or in manufacturing. Computer
vision also has many technical problems to overcome in constructing
an intelligent system from video sequences or images. The technical
problems related to computer vision are explained below:
• Inversion of the 3D-to-2D projection: every image is a two-dimensional
projection of the world, but people perceive the world in three dimensions. To
recover properties of the world from an image, the projection from 3D to 2D
has to be inverted. In this sense, vision is inverse optics. In strict
mathematical terms this inversion is impossible, because many different 3D
scenes produce the same 2D image.
• Face detection: A classical and central problem in computer vision is face
recognition. Humans perform this task reliably and without any effort.
• Recognition: recognition is the classical problem in computer vision. It
means determining whether or not an image contains some specific
object, feature, or activity. Humans do this without effort, but the problem
has still not been solved satisfactorily for the general case of “arbitrary
objects in arbitrary situations”.
• Image restoration: The main aim of image restoration is to remove noise
(sensor noise, motion blur, etc.) from images. Various types of filters are used
to reduce these problems.
• View point variation: view point variation refers to the different view points
of the same object. Whenever we look at a sculpture or an object from different
view points, we get quite different pictures of the same object.
• Illumination: illumination refers to the light falling on the scene. When we
take one photo of a person in dim light and another in bright light, we get two
different images of the same person. Hence light affects images.
• Occlusion: occlusion occurs when part of an object is hidden from view by
another object, so that the image of the object is incomplete. Human beings
complete such incomplete images automatically: when we see an image of only
the top half of a person, we still perceive a whole person. This is difficult
for a computer.
• Background clutter: separating an object from the background of the same
image is not an easy task for computers.
• Loss of Depth: Camera images of a scene are formed by projecting 3D space
onto a 2D plane. During this process the distance traveled by light between
scene and camera (i.e. depth) is lost.
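The loss of depth, and the reason the projection cannot be inverted from a single image, can be demonstrated with a short Python sketch. It assumes the simplest camera model (an ideal pinhole camera at the origin looking along the z axis), which this unit does not define; the function name and the sample points are hypothetical.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (x, y, z) onto the image plane of an ideal
    pinhole camera: image_x = f * x / z, image_y = f * y / z."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_length * x / z, focal_length * y / z)


near = (1.0, 2.0, 4.0)   # a point 4 units from the camera
far = (2.0, 4.0, 8.0)    # a different point, twice as far away

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

Two different points in space land on exactly the same image position, so the image alone cannot tell us which one was photographed. Extra information, such as a second camera or knowledge about object sizes, is needed to recover the depth.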
3.6 APPLICATIONS OF COMPUTER VISION
There are several applications of computer vision. We will look at some of them:
Medical: One of the most prominent application fields is medical computer vision or
medical image processing. This area is characterized by the extraction of information
from image data for the purpose of making a medical diagnosis of a patient.
Generally, image data is in the form of microscopy images, X-ray images,
angiography images, ultrasonic images, and tomography images.
Figure 1: MRI image of the brain
Examples of information which can be extracted from such image data are the detection
of tumors, arteriosclerosis or other malignant changes. It can also be measurements of
organ dimensions, blood flow, etc. This application area also supports medical
research by providing new information, e.g., about the structure of the brain, or about
the quality of medical treatments.
Industry: A second application area in computer vision is in industry. Here,
information is extracted for the purpose of supporting a manufacturing process. One
example is quality control, where parts or final products are automatically
inspected in order to find defects. Another example is measurement of the position and
orientation of parts to be picked up by a robot arm.
Military: Military applications are probably one of the largest areas for computer
vision. The obvious examples are detection of enemy soldiers or vehicles and missile
guidance. More advanced systems for missile guidance send the missile to an area
rather than a specific target, and target selection is made when the missile reaches the
area based on locally acquired image data. Modern military concepts, such as
"battlefield awareness", imply that various sensors, including image sensors, provide a
rich set of information about a combat scene which can be used to support strategic
decisions. In this case, automatic processing of the data is used to reduce complexity
and to fuse information from multiple sensors to increase reliability.
Automobiles: One of the newer application areas is autonomous vehicles, which
include submersibles, land-based vehicles (small robots with wheels, cars or trucks),
aerial vehicles, and unmanned aerial vehicles (UAV). The level of autonomy ranges
from fully autonomous (unmanned) vehicles to vehicles where computer-vision-based
systems support a driver or a pilot in various situations. Fully autonomous vehicles
typically use computer vision for navigation, i.e. for knowing where they are, or for
producing a map of their environment (SLAM), and for detecting obstacles. It can also
be used for detecting certain task-specific events, e.g., a UAV looking for forest fires.
Figure 2: NASA's Mars Exploration Rover
Examples of supporting systems are obstacle warning systems in cars, and systems for
autonomous landing of aircraft. Several car manufacturers have demonstrated
systems for autonomous driving of cars, but this technology has still not reached a
level where it can be put on the market. There are ample examples of military
autonomous vehicles ranging from advanced missiles to UAVs for reconnaissance
missions or missile guidance. Space exploration is already being carried out with
autonomous vehicles using computer vision, e.g., NASA's Mars Exploration Rover.
Video surveillance: Perhaps the most developed modern application of computer
vision is video surveillance. Long gone are the days when video surveillance meant
low-resolution, black-and-white, analog closed-circuit television. Nowadays, computer
vision enables the integration of views from many cameras into a single, consistent
“super image.” Such a system automatically detects scenes with people and/or vehicles
or other targets of interest, classifies them in categories such as people, cars,
bicycles, or buses, extracts their trajectories, recognizes limb and arm positions, and
provides some form of behavior analysis.
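A very simple way to notice that something has moved in front of a fixed camera is to compare two consecutive frames pixel by pixel. The Python sketch below shows this idea (often called frame differencing) under strong assumptions: grayscale frames of equal size, a camera that does not move, and a hand-picked threshold. It is not how a full surveillance system works; classification, trajectories and behavior analysis need much more. All names and values are hypothetical.

```python
def changed_pixels(previous, current, threshold=30):
    """Return (row, column) positions whose intensity changed by more than threshold."""
    return [
        (r, c)
        for r, (row_prev, row_cur) in enumerate(zip(previous, current))
        for c, (p, q) in enumerate(zip(row_prev, row_cur))
        if abs(p - q) > threshold
    ]


def bounding_box(points):
    """Smallest (top, left, bottom, right) box containing all points, or None."""
    if not points:
        return None
    rows = [r for r, _ in points]
    cols = [c for _, c in points]
    return (min(rows), min(cols), max(rows), max(cols))


# Two hypothetical 4x4 frames: an object appears in the second one.
frame_a = [[20, 20, 20, 20] for _ in range(4)]
frame_b = [
    [20, 20, 20, 20],
    [20, 200, 200, 20],
    [20, 20, 200, 20],
    [20, 20, 20, 20],
]

moved = changed_pixels(frame_a, frame_b)
print(moved)                # [(1, 1), (1, 2), (2, 2)]
print(bounding_box(moved))  # (1, 1, 2, 2)
```

The bounding box gives a rough location for the target of interest, which later stages could then try to classify or track from frame to frame.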
The Voice: The Voice provides a simple yet effective means of augmented perception
for people with partially impaired vision. In the virtual demonstration, the camera
accompanies you in your wanderings. The camera periodically scans the scene in
front of you and turns images into sounds, using different pitches and lengths to
encode objects’ position and size.
Check Your Progress 2
1) What is the goal of using computer vision?
……………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
2) Write down two technical challenges faced by computer vision and explain them.
…………………………………………………………………………………
……………………………………………………………………………………
……………………………………………………………………………………
3) Write three application areas where computer vision is used.
...................................................................................................................
…………………………………………………………………………………..
……………………………………………………………………………………
4) What is occlusion?
..................................................................................................................
…………………………………………………………………………………..
……………………………………………………………………………………
3.7 ADVANTAGES OF COMPUTER VISION
Computer vision has applications in almost every field. For example, in the medical
field it is used for microscopy images, X-ray images, angiography images,
ultrasonic images etc.; in military applications, it is used for missile guidance and
for detecting soldiers or vehicles. The major advantages of computer vision with
respect to different kinds of applications are:
• Automatically detect face and recognize facial expression.
• Object based compression of video streams.
• Track down the moving objects and avoid collision.
• Automatically analyse the medical image, then interpret the image and finally
provide diagnosis.
• Recognition of handwritten and printed materials.
• Slow and fast motion detection.
• Small and large object tracking.
3.8 EXAMPLES
Computer vision can be seen as the complement (but not necessarily the opposite) of
biological vision. Biological vision studies the visual perception of humans and
animals and creates models of how these systems operate in terms of
physiological processes. Computer vision, on the other hand, studies and
describes artificial vision systems that are implemented in software and
hardware.
Some examples of computer vision are given below:
• Image processing: in image processing, the input is an image, such as an image
of a person, and the output can be either an image or a set of objects (face,
eyes etc.) related to the image. Feature detection is a low-level image processing
operation and is usually the first operation performed on an image: every pixel
of the image is examined to see whether a feature is present at that pixel.
Figure 3 shows how a final image is created by digitally assembling multiple
images.
Figure 3: Image processing
• Examining the internal structure of a human body: Magnetic Resonance Imaging
(MRI) is a medical imaging technique most commonly used in radiology to
visualize the internal structure and function of the body for medical and
clinical purposes. It is especially useful in neurological (brain),
musculoskeletal, cardiovascular, and oncological (cancer) imaging. An MRI
machine (Figure 4) is used to obtain images of internal body parts in order to
diagnose a patient.
Figure 4: MRI (Magnetic Resonance Imaging) Machine
• Optical Character Recognition (OCR): an OCR system is an electronic or
mechanical device equipped with OCR software that converts characters in an
image to their respective ASCII codes. It is used to translate images of
handwritten, typewritten or printed text into machine-editable text. These
devices work reliably for printed text. Two standard fonts are used: OCR-A
(American standard) and OCR-B (European standard).
Figure 5: Optical Character Recognition (OCR)
• Analyze Satellite Image: Computer vision is also used to obtain satellite images
of climatic conditions, the environment etc. and to analyse these images to
extract useful information. These satellite images are very helpful for weather
forecasting, weather reports etc. Satellite images (Figure 6) have been used in
several areas such as geology, agriculture, forestry, education etc.
Figure 6: Satellite images
• Unmanned Aerial Vehicle (UAV): It is a remotely piloted aircraft. There are
two varieties of UAVs: some are controlled from a remote location, and
others fly autonomously following preprogrammed flight plans. UAVs are
used to perform a wide variety of functions. A major function of UAVs
(Figure 7) is remote sensing.
Figure 7: Unmanned Aerial Vehicles (UAV)
• Smart offices: computer vision is used to gather information about office
staff. It is used to track staff members and office items. It can also record
the gestures of a person.
• Biometric based visual identification of persons: there are many tools
available for the identification and authorization of persons. In
terms of computer vision, there is a wide range of tools such as finger
printing, face detection cameras, visual biometric speakers, signature tracking
etc.
Figure 8: Biometric based finger printing
• Compression of videos: compressing a video means reducing the quantity of
data used to represent the digital video images. Videos can be compressed by
using a model-based system in computer vision. Compressed video sequences
are easier and more convenient to send anywhere.
• Human Face Identification: it is used to detect human facial expressions. It is
also used to authenticate a person before permitting entry to a high-security
zone.
Figure 9: Face detection by face images
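The feature detection step described under image processing above, in which every pixel is examined to see whether a feature is present, can be sketched in a few lines of Python. The sketch assumes that the feature of interest is an edge, that an edge is any pixel whose intensity differs sharply from its right or lower neighbour, and that the image is a grayscale grid of numbers. The threshold and the function name are hypothetical, and practical detectors are considerably more refined.

```python
def edge_pixels(gray, threshold=50):
    """Visit every pixel and mark it with 1 if the intensity changes sharply
    towards its right or lower neighbour, otherwise with 0."""
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            here = gray[y][x]
            # At the image border, compare the pixel with itself (no change).
            right = gray[y][x + 1] if x + 1 < width else here
            below = gray[y + 1][x] if y + 1 < height else here
            gradient = abs(right - here) + abs(below - here)
            if gradient > threshold:
                edges[y][x] = 1
    return edges


# Hypothetical image: dark on the left, bright on the right.
picture = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

for row in edge_pixels(picture):
    print(row)
# [0, 1, 0, 0]
# [0, 1, 0, 0]
# [0, 1, 0, 0]
```

The column of ones marks the boundary between the dark and the bright region. Higher-level steps such as face detection build on low-level features of this kind.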
Check Your Progress 3
1) Write down three advantages of computer vision.
………………………………………………………………………………………
………………………………………………………………………………………
………………………………………………………………………………………
2) Write down a few examples of computer vision and explain two of them.
………………………………………………………………………………………
………………………………………………………………………………………
……………………………………………………………………………………
3) What is OCR?
………………………………………………………………………………………
………………………………………………………………………………………
…………………………………………………………………………………
4) Write down the uses of an MRI machine.
………………………………………………………………………………………
………………………………………………………………………………………
………………………………………………………………………………………
3.9 SUMMARY
In this unit we learnt about computer vision and its application fields. Starting with
an introduction to computer vision, we moved on to the different goals of computer
vision in today’s era. We covered the application areas in which computer vision
is used. These areas are very wide, and each uses this technology according to its
needs: medical, industry, military, automobiles etc. Computer vision also has some
issues which are very difficult to solve. The major technical problems related to
computer vision are face detection, occlusion, illumination and recognition. In
this unit we also covered the advantages of using computer vision.
3.10 ANSWERS/SOLUTIONS
Check Your Progress 1
1) Computer Vision is the branch of Computer Science whose goal is to model the
real world or to recognize objects from digital images. These images can be
acquired using video or infrared cameras, radars or specialized sensors.
2) Image processing is a form of signal processing in which the input is an image,
such as a photograph or a frame of a video sequence, and the output can be
either an image or a set of parameters related to the image. Most
image-processing techniques treat the image as a two-dimensional signal and
apply standard signal-processing techniques to it. For example, a filter that
removes noise from a photograph takes an image as input and produces a cleaner
image as output.
3) The science of measurement through non-contact sensing, e.g. terrain maps from
satellite images. It is usually more concerned with accuracy than with
interpretation.
4) Intensity: “The intensity of each pixel is variable; in color systems, each pixel has
typically three or four components such as red, green, and blue, or cyan, magenta,
yellow, and black”.
Focus: Also known as an image point, it is the point where light rays originating
from a point on the object converge.
Pixel: “In digital imaging, a pixel (or picture element) is the smallest item of
information in an image”.
Check Your Progress 2
1) The main goals behind this technology are:
• To make useful and intelligent decisions based on sensed images.
• To construct 3D structure from 2D images.
• Compression of videos or images for content delivery
2) The technical problems related to computer vision are explained below:
i) Inversion of the 3D-to-2D projection: every image is a two-dimensional
projection of the world, but people perceive the world in three dimensions.
To recover properties of the world from an image, the projection from 3D to
2D has to be inverted. In this sense, vision is inverse optics. In strict
mathematical terms this inversion is impossible, because many different 3D
scenes produce the same 2D image.
ii) Recognition: recognition is the classical problem in computer vision.
It means determining whether or not an image contains some
specific object, feature, or activity. Humans do this without effort, but the
problem has still not been solved satisfactorily for the general case of
“arbitrary objects in arbitrary situations”.
3) The three application areas of computer vision are:
i) Military
ii) Medical
iii) Industry
4) Occlusion occurs when part of an object is hidden from view by another object,
so that the image of the object is incomplete. Human beings complete such
incomplete images automatically: when we see an image of only the top half of
a person, we still perceive a whole person. This is difficult for a computer.
Check Your Progress 3
1) The three advantages of computer vision are:
• Automatically detect face and recognize facial expression.
• Object based compression of video streams.
• Track down the moving objects and avoid collision
2) Some examples of computer vision are:
• Image processing
• MRI
• OCR
• UAV
• Analyze Satellite Image
Analyze Satellite Image: Computer vision is also used to obtain satellite images of
climatic conditions, the environment etc. and to analyse these images to extract
useful information. These satellite images are very helpful for weather forecasting,
weather reports etc. Satellite images have been used in several areas such as
geology, agriculture, forestry, education etc.
Unmanned Aerial Vehicle (UAV): It is a remotely piloted aircraft. There are two
varieties of UAVs: some are controlled from a remote location, and others
fly autonomously following preprogrammed flight plans. UAVs are used to perform
a wide variety of functions. A major function of UAVs is remote sensing. They are
also commonly used for interaction and transport.
3) Optical Character Recognition (OCR): An OCR system is an electronic or
mechanical device equipped with OCR software that converts characters in an
image to their respective ASCII codes. It is used to translate images of
handwritten, typewritten or printed text into machine-editable text. These devices
work reliably for printed text. Two standard fonts are used: OCR-A (American
standard) and OCR-B (European standard).
4) MRI is especially useful in neurological (brain), musculoskeletal, cardiovascular,
and oncological (cancer) imaging.
3.11 FURTHER READINGS AND REFERENCES
1) Computer Vision, Wikipedia: en.wikipedia.org/wiki/Computer_vision
2) Image Processing, Wikipedia: en.wikipedia.org/wiki/Image_processing