CERTIFICATE FOR AI PROJECT

This is to certify that the project work titled Computer Vision has been completed by Avyakt Malhotra of class 10th C under my guidance.

This project fulfills the requirements put forth by the CBSE in terms of content, presentation, and explanation.
(Mrs. Nakashi Bajaj)
GUIDANCE

With a deep sense of gratitude, I acknowledge the invaluable guidance, motivation, and constructive criticism rendered to me by my AI teacher, Mrs. Nakashi Bajaj. Her constant inspiration made the completion of this project titled Computer Vision possible.
Name: Avyakt Malhotra
Class & Section: 10th C
Roll no: 06
INTRODUCTION
◊ What is computer vision?

Computer Vision is a field of Artificial Intelligence (AI) that enables computers and systems to interpret and understand visual data from the world, such as images or videos. The primary goal of computer vision is to replicate and enhance human visual perception through machines. In other words, it's the technology that allows
computers to "see" and interpret visual data in a
meaningful and useful way, similar to how humans
perceive and understand their surroundings.
Unlike traditional software, which relies solely on
numerical data, computer vision systems analyze
pixel-based information from images and videos,
transforming this raw data into structured insights,
predictions, or actions. As technology advances,
computer vision has become integral to many
applications, ranging from autonomous vehicles
and medical diagnostics to social media filters
and augmented reality (AR).
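
To see what this "pixel-based information" means in practice, here is a small Python sketch (illustrative only; it assumes the OpenCV library is installed and uses a placeholder filename) that loads a picture and reads a few numbers out of it:

```python
# A minimal sketch: an image is just a grid of numbers that a program can inspect.
# Assumes OpenCV (cv2) is installed and that "sample.jpg" (a placeholder filename)
# exists in the working directory.
import cv2

image = cv2.imread("sample.jpg")            # load the picture as a NumPy array of pixels
if image is None:
    raise FileNotFoundError("sample.jpg not found")

height, width, channels = image.shape       # rows, columns, colour channels (Blue, Green, Red)
print(f"Size: {width}x{height} pixels, {channels} colour channels")
print("Top-left pixel (B, G, R):", image[0, 0])
print("Average brightness:", image.mean())  # one simple 'structured insight' from raw pixels
```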
◊ Core Concepts of Computer Vision
To understand computer vision, it's helpful to break it
down into some core tasks that define the field. These
tasks are used to build systems that can perform visual
recognition, analysis, and decision-making:
1. Image Classification
Image classification is the task of categorizing an image
into a predefined category or class. For example, an
image might be classified as either "cat" or "dog." A
machine learning model trained on large datasets of
images can learn to recognize patterns and classify new
images accordingly. This process involves detecting
important features of an image, like colour, texture, and
shape, and mapping those features to known categories.
       Use Case Example: Automatically identifying
        whether a picture contains a dog, cat, or person,
        which is widely used in social media platforms for
        tagging or sorting images.
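
As a rough illustration of image classification, the short Python sketch below asks a network pre-trained on the ImageNet dataset to label a photo. It is only a sketch: it assumes TensorFlow/Keras is installed and "pet.jpg" is a placeholder filename.

```python
# Image classification with a pre-trained network (illustrative sketch).
# Assumes TensorFlow/Keras is installed; "pet.jpg" is a placeholder filename.
import numpy as np
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights="imagenet")   # already trained on ImageNet classes

img = tf.keras.utils.load_img("pet.jpg", target_size=(224, 224))   # resize to the model's input size
x = tf.keras.utils.img_to_array(img)[np.newaxis, ...]              # shape (1, 224, 224, 3)
x = tf.keras.applications.mobilenet_v2.preprocess_input(x)         # scale pixels as the model expects

preds = model.predict(x)
for _, label, score in tf.keras.applications.mobilenet_v2.decode_predictions(preds, top=3)[0]:
    print(f"{label}: {score:.2f}")   # e.g. "golden_retriever: 0.87"
```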
2. Object Detection
Object detection goes beyond simple classification by not
only identifying what is in an image but also locating
where it is. This process typically involves drawing
bounding boxes around the detected objects in an image.
Object detection requires the system to consider both
spatial information and the characteristics of the objects.
       Use Case Example: Self-driving cars use object
        detection to identify pedestrians, traffic signs, other
        vehicles, and obstacles, helping them navigate safely
        on the road.
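
A hedged sketch of object detection is shown below. It assumes the third-party ultralytics package is installed and uses "street.jpg" as a placeholder filename; the pre-trained detector returns a label, a confidence score, and a bounding box for each object it finds.

```python
# Object detection with bounding boxes (illustrative sketch).
# Assumes the third-party "ultralytics" package is installed; the small pre-trained
# model file is downloaded automatically on first use. "street.jpg" is a placeholder.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")           # small, general-purpose detector trained on the COCO dataset
results = model("street.jpg")        # run detection on one image

for box in results[0].boxes:         # each detection: class, confidence, bounding box
    label = model.names[int(box.cls)]
    x1, y1, x2, y2 = box.xyxy[0].tolist()
    print(f"{label} ({float(box.conf):.2f}) at [{x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f}]")
```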
3. Image Segmentation
Image segmentation refers to the process of dividing an
image into meaningful sections or segments, often based
on different features, like colour or texture. This task is
important in scenarios where the boundaries of objects
need to be determined very precisely.
       Use Case Example: In medical imaging,
        segmentation helps identify tumours or other
        abnormalities by outlining the specific regions that
        are affected, aiding in diagnosis and treatment
        planning.
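
A very simplified segmentation sketch is shown below using OpenCV thresholding. Real medical systems use trained models such as U-Net, but the idea of splitting an image into outlined regions is the same ("scan.png" is a placeholder filename).

```python
# Segmentation by thresholding (illustrative sketch). Assumes OpenCV is installed;
# "scan.png" is a placeholder filename.
import cv2

gray = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)

# Otsu's method picks a brightness threshold automatically, separating
# bright regions from the darker background.
_, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Each connected white region becomes one segment with a precise outline.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(f"Found {len(contours)} segmented regions")
for c in contours:
    print("Region area (pixels):", cv2.contourArea(c))
```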
4. Facial Recognition
Facial recognition is a specific type of object recognition
where the system is designed to identify human faces. It
involves analyzing facial features such as the distance
between the eyes, nose, mouth, and chin, then
comparing these features against a database of known
faces to find matches.
       Use Case Example: Used in security systems for
        surveillance or unlocking smartphones by recognizing
        the user's face.
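
The sketch below shows the first step of facial recognition, locating faces in an image, using a face detector that ships with OpenCV. Matching the detected face against a database would need an additional model; "group_photo.jpg" is a placeholder filename.

```python
# Face detection with OpenCV's bundled Haar cascade (illustrative sketch).
# This only locates faces; recognising *whose* face it is needs a further model.
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

image = cv2.imread("group_photo.jpg")          # placeholder filename
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print(f"Detected {len(faces)} face(s)")
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)  # draw a box around each face

cv2.imwrite("faces_marked.jpg", image)
```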
◊ How Does Computer Vision Work?
Computer vision systems rely heavily on
machine learning, especially deep learning
techniques, to process and understand visual
data. These systems work through several key
steps:
    1.   Image Acquisition: The first step in
      computer vision involves capturing an
      image or video, often using cameras or
      sensors. The data from this capture is
      processed by the system.
    2.   Preprocessing: Before the image can
      be analyzed, preprocessing steps may be
      applied to enhance the quality or normalize
      the data. This might include reducing noise,
      resizing images, or adjusting contrast to
      make key features more distinguishable.
    3.    Feature Extraction: During this step,
      relevant information or features (such as
      edges, corners, and textures) are extracted
      from the image to help the machine
      recognize objects or patterns. Traditional
      computer vision techniques used filters like
      Sobel filters and Harris corner
     detection to extract these features.
     However, with deep learning, convolutional
     neural networks (CNNs) are commonly used
     for automated feature extraction.
    4.   Model Training: Machine learning
      models, particularly deep neural networks,
      are trained on large datasets of labelled
      images. For instance, a neural network
      might be trained to recognize faces by
      showing it thousands of images of faces,
      each labelled with the identity of the
      person. The model learns the patterns and
      features that are important for making
      accurate predictions.
    5.    Prediction/Recognition: After training,
      the model can predict or recognize objects
      in new, unseen images by applying the
      learned patterns and features to classify or
      locate objects.
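
The short Python sketch below illustrates steps 2 and 3 with classic techniques, including the Sobel filter mentioned above. It assumes OpenCV is installed and uses "photo.jpg" as a placeholder filename.

```python
# Preprocessing and feature extraction with classic techniques (illustrative sketch).
import cv2

# Steps 1-2: acquire the image and preprocess it (grayscale, resize, denoise).
image = cv2.imread("photo.jpg")                 # placeholder filename
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.resize(gray, (256, 256))
gray = cv2.GaussianBlur(gray, (5, 5), 0)        # reduce noise before analysis

# Step 3: extract features - here, edges found with the Sobel filter.
edges_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)   # horizontal changes in brightness
edges_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)   # vertical changes in brightness
edge_strength = cv2.magnitude(edges_x, edges_y)

print("Strongest edge response:", edge_strength.max())
# In a deep-learning pipeline, a CNN would learn such features automatically
# during model training (step 4) and use them for prediction (step 5).
```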
◊ Applications of Computer Vision
Computer vision has a wide range of practical
applications across various industries:
    1.   Autonomous Vehicles: In self-driving
      cars, computer vision systems process data
     from cameras and sensors to detect other
     vehicles, pedestrians, and road signs. This
     enables the car to navigate safely without
     human intervention.
    2.    Healthcare: In the medical field,
      computer vision is used to analyze medical
      images such as X-rays, MRIs, and CT scans.
      It can identify tumours, fractures, or other
      abnormalities more quickly and accurately
      than human doctors in some cases. It also
      aids in robotic surgeries and real-time
      diagnostics.
    3.    Retail: Retailers are using computer
      vision for inventory management, customer
      tracking, and even cashier-less stores.
      Systems can detect when products are out
      of stock or when customers are looking at
      specific items, helping to optimize
      operations.
    4.   Surveillance and Security: In
      security, computer vision systems can
      monitor video feeds to detect unusual
      activities or identify individuals using facial
      recognition technology. This is becoming
      common in public places like airports,
      stadiums, and shopping centres.
     5.   Entertainment and Social Media:
       Apps like Instagram and Snapchat use
       computer vision to apply filters that track
       facial movements and overlay effects. It
       also enables features like automatic image
     tagging by recognizing the content of the
     photo.
     6.    Manufacturing and Quality Control:
       In industries, computer vision can be used
       to inspect products on assembly lines. It
       can detect defects or quality issues in real-
       time, improving efficiency and reducing
       human error.
COMPUTER VISION APPLICATIONS

◊ Healthcare and Medical Imaging

Computer vision in healthcare is a transformative tool with the potential to enhance diagnostic accuracy, speed up the medical workflow, and offer personalized care for patients. One of the most prominent applications of computer vision in healthcare is diagnosis through medical imaging. Technologies like Convolutional Neural Networks (CNNs) are trained to analyze medical images—such as X-rays, CT scans, or MRIs—and detect abnormalities that may indicate diseases like cancer, fractures, or even subtle organ changes. For instance, CNN-based models can detect early-stage breast cancer by analyzing mammograms with remarkable accuracy, often outperforming human radiologists in spotting microscopic signs that the naked eye may miss.

Beyond diagnostics, computer vision plays a critical role in robot-assisted surgeries. Integrating high-resolution cameras with real-time image processing allows surgeons to operate with enhanced precision,
       minimizing human error. A notable example
       is the da Vinci Surgical System, which
       utilizes computer vision to offer surgeons a
       3D, magnified view of the surgery site. It
       enables more precise movements by
       filtering out hand tremors and providing
       clearer visual feedback, thus improving
       surgical outcomes and reducing recovery
       times.
      Another growing application is in
       telemedicine and remote patient
       monitoring. By analyzing video footage,
       computer vision systems can monitor
       patients with chronic conditions or elderly
       individuals at home. For example, patients
       with Parkinson’s disease can be observed
       for changes in their motor functions, such as
       tremors or gait disturbances, and doctors
       can adjust treatment plans based on these
       observations without the patient needing to
       visit the clinic frequently. In dermatology,
        apps like SkinVision use computer vision
       algorithms to analyze images of skin lesions
       and provide users with information about
       the likelihood of skin cancer, encouraging
       earlier detection and treatment.
      In addition to benefiting patients, computer
       vision helps reduce hospital workload.
       Tasks like organizing medical records,
       tracking patient flow, and even disinfecting
       hospital spaces are increasingly automated
         with AI and computer vision, allowing
         medical staff to focus on more critical tasks.
        Computer vision in healthcare is still
         evolving, with exciting future possibilities,
         such as personalized medicine, where AI
         systems analyze genetic and physical data
         to recommend tailored treatments.
         However, challenges remain, particularly
         regarding data privacy and ethical
         concerns surrounding AI decision-making in
         life-or-death scenarios. As computer vision
         continues to develop, its role in
         revolutionizing healthcare is undeniable.
     ◊ Autonomous Vehicles
        Autonomous vehicles, or self-driving cars,
         represent one of the most futuristic and exciting
         applications of computer vision. These cars rely
         heavily on visual data to understand and
         navigate their environment, making computer
         vision an essential technology for achieving fully
         autonomous transportation. At the core of self-
         driving systems are cameras, lidar sensors, and
         radar systems that continuously feed data into
         machine learning models to interpret the
       surroundings, make decisions, and take actions
       in real time.
      One of the key applications of computer vision in
       autonomous vehicles is object detection. Cars
       must detect and classify objects in their
       environment—pedestrians, cyclists, other
       vehicles, road signs, and traffic lights. Using
       algorithms like YOLO (You Only Look Once) or R-
       CNN (Region-Based Convolutional Neural
       Networks), these systems can identify objects in
       real time and predict their movement, enabling
       the car to react appropriately. For example, when
       approaching a crosswalk, the car detects
       pedestrians and calculates whether they will
       cross the street. Based on this information, it can
       either slow down or continue driving.
      In addition to object detection, environment
       mapping is crucial for self-driving cars.
       Autonomous vehicles use a combination of
       cameras and lidar (light detection and ranging)
       systems to generate 3D maps of their
       surroundings. These maps help the car
       understand the spatial layout, including the
       location of buildings, lanes, and other vehicles.
       For instance, Tesla’s Autopilot system uses a
       combination of cameras and ultrasonic sensors to
       create a comprehensive understanding of the
       road environment, allowing it to follow lanes and
       avoid collisions.
      Collision avoidance is another critical feature
       made possible by computer vision. By constantly
       analyzing the environment, the vehicle’s
       computer vision system can predict potential
       accidents and take preventative actions, such as
       emergency braking or steering to avoid
       obstacles. Systems like Mobileye’s ADAS
       (Advanced Driver Assistance System) use
       cameras to monitor road conditions and detect
       hazards, warning the driver and, in some cases,
       taking corrective actions automatically.
      The future of autonomous vehicles also includes
       vehicle-to-vehicle (V2V) and vehicle-to-
       infrastructure (V2I) communication, where
       cars and traffic systems share information. For
       example, if a car ahead suddenly brakes, nearby
       vehicles receive this information instantly,
       preventing accidents. This integration of
       computer vision with other AI technologies
       promises to create smarter, safer roads.
      However, the challenges to the widespread
       adoption of self-driving cars are significant. There
       are technical issues related to unpredictable
       human behaviour, poor weather conditions (like
       fog or snow), and incomplete data for edge cases
       that autonomous vehicles still struggle with.
       Moreover, there are legal and ethical questions
       surrounding accountability in case of accidents.
       Despite these hurdles, advancements in
       computer vision and AI will eventually lead to
       safer, more efficient transportation.
◊ Security and Surveillance
      In the realm of security and surveillance,
       computer vision plays a pivotal role in enhancing
       safety measures, reducing crime rates, and
       ensuring public safety. Security systems that rely
       on computer vision are far more efficient and
       accurate than traditional methods, allowing for
       real-time monitoring, facial recognition, and
       anomaly detection.
      One of the most widespread applications of
       computer vision in security is facial
       recognition. This technology can be seen in
       airports, public venues, and even smartphones. It
       works by analyzing facial features from video or
       still images and matching them to stored
       databases to identify individuals. For example, in
       airports, facial recognition is used for identity
       verification at security checkpoints, reducing the
       need for manual passport checks and speeding
       up the boarding process. It’s also used in law
       enforcement to track down suspects or missing
       persons by scanning CCTV footage. However, this
       technology raises privacy concerns, as it can
       be used to track individuals without their
       consent, leading to ethical debates about
       surveillance overreach.
      In addition to facial recognition, computer vision
       is integral to anomaly detection. This involves
       the real-time analysis of video feeds to identify
       unusual or suspicious activities, such as loitering,
       trespassing, or vandalism. AI-powered systems
       can flag these activities and alert security
       personnel, making surveillance much more
       proactive. For example, in a shopping mall, an AI
       system can detect a person behaving unusually,
       such as repeatedly circling a particular store or
       acting nervously, and alert security before an
       incident occurs. This kind of predictive
       surveillance can prevent crimes before they
       happen.
      Crowd management is another application of
       computer vision in security. In large public
       gatherings, such as concerts or sporting events,
       monitoring the flow of people and detecting
       overcrowded areas is crucial to avoid accidents
       or stampedes. Computer vision systems can
       analyze video footage from multiple cameras to
       understand crowd dynamics and send alerts
       when areas become too congested. This can help
       authorities redirect the crowd and prevent
       dangerous situations.
      In the context of smart cities, computer vision
       enhances public safety by monitoring traffic,
       enforcing road rules, and detecting hazards. For
       example, cameras equipped with vision
       algorithms can automatically detect vehicles
       running red lights or illegal parking, and issue
       tickets without human intervention. Furthermore,
       these systems can identify accidents, debris, or
     other obstructions on roads, ensuring faster
     emergency response times.
     ◊ Retail and E-commerce
        Computer vision is revolutionizing the retail
         and e-commerce industries, enabling
         businesses to offer more personalized,
         efficient, and immersive shopping
         experiences. As consumer demands
         continue to shift toward digital-first
         interactions, computer vision helps retailers
         streamline operations, understand customer
         behaviour, and enhance the online shopping
         journey.
        One of the most prominent applications of
         computer vision in retail is visual search.
         Traditional keyword-based searches often
         fail to capture the exact product a consumer
         is looking for, especially when it's based on
         appearance. With visual search, consumers
         can upload an image of a product they like—
         such as a pair of shoes, a jacket, or a piece
         of furniture—and the computer vision
         system will analyze the image to find similar
          products in the retailer’s catalogue. This is
         especially useful in fashion and lifestyle
         categories, where visual appeal is a major
         factor in purchasing decisions. Large
         retailers like Amazon and ASOS have
         already integrated visual search features
         into their apps, allowing shoppers to find
         products by simply taking a picture.
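
One common way such a visual search can work behind the scenes is sketched below: a pre-trained network turns every product photo into a vector of numbers, and the products whose vectors are closest to the shopper's photo are shown first. This is only an illustration (TensorFlow/Keras assumed; all filenames are placeholders).

```python
# Visual search via image embeddings and cosine similarity (illustrative sketch).
import numpy as np
import tensorflow as tf

# Use the network as a feature extractor rather than a classifier.
extractor = tf.keras.applications.MobileNetV2(weights="imagenet", include_top=False, pooling="avg")

def embed(path):
    img = tf.keras.utils.load_img(path, target_size=(224, 224))
    x = tf.keras.utils.img_to_array(img)[np.newaxis, ...]
    x = tf.keras.applications.mobilenet_v2.preprocess_input(x)
    vec = extractor.predict(x)[0]
    return vec / np.linalg.norm(vec)          # normalise so a dot product equals cosine similarity

# Placeholder catalogue images.
catalogue = {"shoe_a.jpg": None, "shoe_b.jpg": None, "jacket.jpg": None}
for name in catalogue:
    catalogue[name] = embed(name)

query = embed("customer_photo.jpg")           # the shopper's uploaded picture
ranked = sorted(catalogue, key=lambda name: float(query @ catalogue[name]), reverse=True)
print("Most similar products:", ranked)
```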
        Inventory management is another area
         where computer vision is making a
       significant impact. Managing stock levels
       accurately is crucial for retailers and manual
       methods can be time-consuming and error-
       prone. Computer vision systems can
       automatically track inventory by using
       cameras placed throughout a store or
       warehouse. These systems can scan
       barcodes or recognize products on the
       shelves, notifying staff when stock is
       running low or when items are misplaced.
       For example, Walmart has experimented
       with robots that use computer vision to
       monitor inventory levels and restock shelves
       as needed. This not only increases efficiency
       but also reduces human error and labour
       costs.
      In e-commerce, computer vision is essential
       for improving the online customer
       experience through features like virtual
       try-ons. Shoppers can upload a photo of
       themselves or use their webcam to virtually
       "try on" clothing, accessories, or makeup,
       giving them a better sense of how the
       product will look on them before making a
       purchase. Companies like Warby Parker and
       Sephora have successfully implemented
       these virtual try-on systems, allowing
       customers to visualize products like glasses
       or lipstick on their faces without visiting a
        physical store. This technology proved
        particularly useful during the pandemic, when
        consumers shifted to online shopping and
        in-store visits were limited.
      Computer vision is also being used to
       analyze customer behaviour in physical
       stores. Cameras equipped with AI can track
       how customers move through a store, what
       products they spend the most time looking
       at, and even their emotional reactions to
       certain items. By analyzing this data,
       retailers can optimize their store layouts,
       product placements, and marketing
       strategies. For example, if a certain product
       display attracts a lot of attention but results
       in few sales, retailers can adjust pricing,
       positioning, or promotions to increase
       conversion rates.
      Checkout automation is another area
       where computer vision is having a
       transformative impact. Amazon Go stores
       are leading the way with their "just walk
       out" technology, where customers can pick
       up items and leave the store without having
       to go through a traditional checkout
       process. Cameras and sensors track the
       items as customers place them in their cart,
       and the system automatically charges their
       account when they exit the store. This
       frictionless experience reduces wait times,
       eliminates the need for cashiers, and
       provides a seamless shopping experience
       for consumers.
      Moreover, personalization in e-commerce
       is driven by computer vision as well.
       Algorithms analyze visual data to
       recommend products based on a user’s past
       purchases or browsing history. This allows
       retailers to create tailored marketing
       campaigns and recommend relevant
       products. For instance, if a shopper
       frequently browses for athletic wear,
       computer vision-powered recommendation
       engines can suggest similar items, thus
       increasing the likelihood of a sale.
      While the benefits of computer vision in
       retail and e-commerce are clear, there are
       challenges to overcome, particularly
       concerning data privacy and consumer
       trust. The collection of vast amounts of
       visual data, especially in physical stores,
       raises questions about how this data is
       stored, used, and shared. Retailers must
       ensure that their computer vision systems
       comply with privacy regulations, such as
       GDPR, and are transparent with customers
       about data usage.
      In conclusion, computer vision is
       transforming retail and e-commerce by
       enhancing both operational efficiency and
       customer experiences. As technology
       advances, it will continue to reshape how
       consumers shop, interact with brands, and
       receive personalized recommendations.
     ◊ Agriculture and Farming
       The agricultural industry is undergoing a
        digital transformation, with computer
        vision playing a critical role in making
        farming more efficient, sustainable, and
        profitable. By combining machine learning
        and computer vision, farmers are now able
        to monitor crop health, optimize resource
        use, and even automate labour-intensive
        tasks. These advancements are
        particularly important as the global
        population grows, putting more pressure
        on food production systems.
       One of the most impactful applications of
        computer vision in agriculture is crop
        monitoring. Traditional methods of
        assessing crop health—such as manual
        inspections—are time-consuming, labor-
        intensive, and prone to human error. With
        computer vision, drones and satellite
        imagery can capture high-resolution
        images of large agricultural fields. By
        analyzing these images, AI models can
        detect patterns in plant health, identify
        areas affected by diseases, pests, or
        nutrient deficiencies, and predict yields.
        For example, early detection of fungal
        infections or insect infestations allows
       farmers to take preventative measures
       before the problem spreads, minimizing
       crop loss. These systems can also monitor
       the growth rate of crops, helping farmers
       optimize planting and harvesting times.
      In addition to disease detection, computer
       vision is used for precision agriculture,
       which involves the efficient use of
       resources like water, fertilizer, and
       pesticides. Drones equipped with
       multispectral cameras can monitor soil
       moisture levels and determine which areas
       of the field need irrigation. This ensures
       that water is distributed where it’s needed
       most, reducing waste and promoting more
       sustainable farming practices. Similarly,
       AI-powered vision systems can analyze
       leaf colour and texture to identify nutrient
       deficiencies and recommend targeted
       fertilizer applications, preventing
       the overuse of chemicals that could harm
       the environment.
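
A very simplified sketch of such leaf-colour analysis is shown below: it measures how much of a leaf photo falls within a healthy-green colour range. Real systems use calibrated multispectral data and trained models, so the colour range and threshold here are only illustrative (OpenCV assumed; "leaf.jpg" is a placeholder filename).

```python
# Leaf-colour analysis with a colour mask (illustrative sketch).
import cv2

image = cv2.imread("leaf.jpg")                   # placeholder filename
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)     # HSV makes colour ranges easier to express

# Rough hue range for green foliage (illustrative values, not calibrated).
green_mask = cv2.inRange(hsv, (35, 40, 40), (85, 255, 255))

green_fraction = cv2.countNonZero(green_mask) / green_mask.size
print(f"Green coverage: {green_fraction:.0%}")
if green_fraction < 0.5:                          # arbitrary example threshold
    print("Large yellow/brown areas - possible nutrient deficiency, flag for inspection")
```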
      Automated harvesting is another area
       where computer vision is making
       significant strides. Harvesting crops,
       especially fruits and vegetables, is labour-
       intensive and time-sensitive, as produce
       needs to be picked at the right stage of
       ripeness to maximize yield and quality.
       Vision-guided robots are now being used
       to pick crops with high accuracy, ensuring
       only ripe produce is harvested. These
       robots use cameras to assess the colour,
       size, and texture of fruits, such as
       strawberries or tomatoes, and use robotic
       arms to gently pluck them from the plant.
       This automation reduces labour costs,
       speeds up the harvesting process, and
       ensures consistency in the quality of the
       produce.
      Beyond field crops, computer vision is
       being used in livestock monitoring.
       Farmers can monitor the health and
        behaviour of their animals using cameras
       and AI models that analyze video footage.
       For example, vision systems can detect
       signs of illness, stress, or injury in cattle by
       observing their posture, gait, and eating
       patterns. Early detection of health issues
       allows farmers to intervene quickly,
       reducing the need for costly veterinary
       treatments and improving animal welfare.
       These systems can also track the
       movement of animals to prevent
       overcrowding in certain areas and ensure
       that all animals have access to food and
       water.
      Weed control is another area where
       computer vision is proving beneficial.
       Traditionally, herbicides are applied
       uniformly across entire fields, which can
       be wasteful and environmentally
       damaging. Computer vision systems,
       however, can identify and target individual
       weeds, allowing for precision herbicide
       application. This not only reduces the
       amount of chemicals used but also
       minimizes the impact on the surrounding
       crops and soil.
      While the potential benefits of computer
       vision in agriculture are vast, there are
       challenges to widespread adoption. Cost
       remains a significant barrier, particularly
       for small-scale farmers who may not have
       the resources to invest in high-tech
       equipment. Moreover, the complexity of
       setting up and maintaining these systems
       requires specialized knowledge, which
       may be lacking in some rural areas.
      However, as computer vision technology
       becomes more affordable and accessible,
       its adoption in agriculture is expected to
       grow. By enabling more precise, efficient,
       and sustainable farming practices,
       computer vision will play a crucial role in
       ensuring food security and meeting the
       demands of a growing global population.
◊ Tools Used in Computer Vision
Computer vision models rely on a range of tools
and techniques to process, interpret, and
understand visual data. Here are some key tools
and libraries often used:
1. Programming Languages
        Python: Most common for computer vision due
         to the variety of libraries.
        C++: Often used for performance-critical tasks.
2. Libraries and Frameworks
        OpenCV: A popular library for real-time
         computer vision applications.
        TensorFlow & Keras: Deep learning
         frameworks widely used for building and
         training vision models.
        PyTorch: Another deep learning framework
         popular for image classification, segmentation,
         and object detection.
        scikit-image: A Python library focused on
         basic image processing.
        Dlib: Used for facial recognition, object
         detection, and more.
3. Deep Learning Models
        Convolutional Neural Networks (CNNs):
         Specialized for processing images (like VGGNet,
          ResNet, EfficientNet).
        YOLO (You Only Look Once): Real-time
         object detection system.
        Mask R-CNN: Used for instance segmentation
         (detecting objects and their boundaries).
        UNet: Often used in biomedical image segmentation.
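
To tie these tools together, the sketch below defines a very small CNN in Keras of the kind listed above. The layer sizes and the assumption of 10 output classes are illustrative only.

```python
# A small CNN in Keras (illustrative sketch): convolution layers extract visual
# features, pooling shrinks the image, and dense layers make the final classification.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 3)),           # small colour images (assumed size)
    tf.keras.layers.Conv2D(16, 3, activation="relu"),   # learn simple features (edges, textures)
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),   # learn more complex patterns
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),    # one score per class (10 classes assumed)
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```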
Thanks