Chapter: Making Machines See
1. What is Computer Vision?
Computer Vision is a branch of artificial intelligence (AI) that enables computers
and systems to derive meaningful information from digital images, videos, and other
visual inputs, and to take action or make recommendations based on that information.
2. Applications of Computer Vision
• Face Detection – Detecting faces in photos/videos (e.g., face unlock)
• Object Recognition – Identifying objects (e.g., vehicles, animals)
• Autonomous Vehicles – Interpreting surroundings (e.g., self-driving cars)
• Retail – Monitoring customer behavior, automatic checkout
• Medical Imaging – Detecting tumors, analyzing scans
• Augmented Reality – AR filters and overlays on real-world views
3. How Computer Vision Works
1. Image Acquisition – Capturing the image with a camera or scanner.
2. Image Preprocessing – Enhancing the image (resizing, grayscale conversion, etc.).
3. Feature Extraction – Identifying shapes, edges, and textures.
4. Classification/Detection – Using an AI model to recognize patterns in those features.
4. Types of Computer Vision Tasks
• Image Classification – Assign one label to the entire image (e.g., dog or cat)
• Object Detection – Locate one or more objects and report their positions
• Pose Estimation – Estimate human key points (joints)
• Image Segmentation – Divide the image into regions or objects, pixel by pixel
• Color Detection – Identify specific or dominant colors
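The tasks differ mainly in what they output. The sketch below shows the shape of those outputs on a synthetic image, using NumPy only. It is not how real models compute them: the "segmentation" here is a simple non-black test, and the image, threshold, and label strings are assumptions made for the example. Pose estimation is left out because it needs a trained model.

```python
import numpy as np

# Synthetic 100x100 BGR image: black background with one red square.
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[30:70, 20:60] = (0, 0, 255)  # rows 30-69, columns 20-59; BGR order, so red

# Image segmentation: one True/False value per pixel (True = part of the object).
mask = img.any(axis=2)

# Object detection: a position, here a bounding box (x, y, width, height).
ys, xs = np.nonzero(mask)
box = (int(xs.min()), int(ys.min()),
       int(xs.max() - xs.min() + 1), int(ys.max() - ys.min() + 1))

# Color detection: the most frequent color among the object's pixels.
colors, counts = np.unique(img[mask], axis=0, return_counts=True)
dominant = tuple(int(c) for c in colors[counts.argmax()])

# Image classification: a single label for the whole image (toy rule).
label = "object present" if mask.mean() > 0.05 else "empty"

print(box)       # (20, 30, 40, 40)
print(dominant)  # (0, 0, 255)
print(label)     # object present
```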
5. Tools and Libraries
• OpenCV – For image processing and computer vision
• TensorFlow/Keras – For training deep learning models
• Python – Programming language used for the code examples in this chapter
• Matplotlib – For visualization of images and results
6. Introduction to OpenCV
OpenCV (Open Source Computer Vision Library) is an open-source library for image
processing and computer vision tasks. In Python it is imported as cv2.
Example: Reading and displaying an image
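import cv2
img = cv2.imread("image.jpg")  # returns None if the file cannot be read
if img is None:
    raise FileNotFoundError("Could not read image.jpg")
cv2.imshow("My Image", img)  # open a window titled "My Image"
cv2.waitKey(0)  # wait until any key is pressed
cv2.destroyAllWindows()  # close the window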
Common OpenCV Functions
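• cv2.imread() – Reads an image from a file into an array (BGR channel order)
• cv2.imshow() – Displays the image in a window
• cv2.cvtColor() – Converts between color formats (e.g., BGR to grayscale)
• cv2.resize() – Resizes the image
• cv2.GaussianBlur() – Applies a Gaussian blur to the image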
7. Image Processing Techniques
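• Grayscale Conversion – Converts a color image to a single channel of gray shades (not pure black and white).
cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
• Edge Detection – Highlights object boundaries; 100 and 200 are the lower and upper thresholds.
cv2.Canny(img, 100, 200)
• Blurring – Smooths the image to reduce noise; (5,5) is the kernel size.
cv2.GaussianBlur(img, (5,5), 0)
• Resizing – Changes the dimensions; the target size is given as (width, height).
cv2.resize(img, (200, 300))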
8. Deep Learning in Computer Vision
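Modern CV relies on deep learning, especially Convolutional Neural Networks (CNNs),
for tasks like classification, detection, and face recognition. A CNN learns its
feature-extracting filters from labeled training images instead of relying on
hand-written rules.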
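To make the word "convolutional" concrete, here is a minimal NumPy sketch of the operation a single convolutional filter performs. It is an illustration only, not a trained network: the kernel values are fixed by hand (a Sobel-style vertical-edge filter), whereas a CNN built with TensorFlow/Keras would learn such values during training. The function name convolve2d is hypothetical.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for y in range(out_h):
        for x in range(out_w):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# 6x6 image: left half dark (0), right half bright (1).
image = np.zeros((6, 6), dtype=np.float32)
image[:, 3:] = 1.0

# Hand-picked filter that responds to dark-to-bright vertical edges.
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=np.float32)

feature_map = convolve2d(image, kernel)   # shape (4, 4)
activated = np.maximum(feature_map, 0)    # ReLU: keep positive responses only

print(feature_map[0])  # [0. 4. 4. 0.]
```

The response is largest where the window straddles the dark-to-bright boundary and zero in flat regions, which is why such filters act as feature extractors.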
9. Challenges in Computer Vision
• Lighting variations
• Camera quality differences
• Occlusions (partially hidden objects)
• Distorted or blurry images
• Need for real-time processing
10. Ethical Concerns
• Privacy issues in surveillance
• Biased data affecting fairness
• Misuse for deepfakes and covert surveillance tools
11. Project Ideas Based on CV
• Face Mask Detection
• Handwritten Digit Recognition (using MNIST dataset)
• Traffic Sign Recognition
• Color Detection Tool (live webcam)
Summary
Computer Vision helps machines understand visual data. Key tasks include
classification, detection, and segmentation. Tools like OpenCV simplify image
processing. CV is widely applied in healthcare, security, transport, and more.
Ethics and responsible use are essential.