PLANT DISEASE DETECTION USING CNN
A project report submitted in partial fulfillment of the requirements
for the award of
Bachelor of Technology
In
ELECTRONICS & COMMUNICATION ENGINEERING
By
B. TEJA VARDHAN REDDY G. CHITRAHAS BALAJI
(21BQ1A0417) (21BQ1A0444)
G. SASANK D. YASWANTH NAIK
(21BQ1A0438) (21BQ1A0429)
UNDER THE GUIDANCE OF
Mrs. T. VINEELA M.Tech
Assistant Professor
DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING
VASIREDDY VENKATADRI INSTITUTE OF TECHNOLOGY
(AUTONOMOUS)
(Approved by AICTE and permanently affiliated to JNTUK)
Accredited by NBA and NAAC with 'A' Grade
NAMBUR (V), PEDAKAKANI (M), GUNTUR-522 508
APRIL 2025
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
VASIREDDY VENKATADRI INSTITUTE OF TECHNOLOGY: NAMBUR
(AUTONOMOUS)
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY KAKINADA
CERTIFICATE
This is to certify that the project titled “PLANT DISEASE DETECTION USING
CNN” is a bonafide record of work done by Mr. B. TEJA VARDHAN REDDY
(21BQ1A0417), Mr. G. CHITRAHAS BALAJI (21BQ1A0444), Mr. G. SASANK
(21BQ1A0438) and Mr. D. YASWANTH NAIK (21BQ1A0429) under the guidance of
Mrs. T. VINEELA, Assistant Professor in partial fulfillment of the requirement of the
degree for Bachelor of Technology in Electronics and Communication Engineering, JNTUK
during the academic year 2024–25.
Mrs. T. VINEELA Prof. M. Y. BHANU MURTHY
PROJECT GUIDE HEAD OF THE DEPARTMENT
DECLARATION
We, Mr. B. TEJA VARDHAN REDDY (21BQ1A0417), Mr. G. CHITRAHAS
BALAJI (21BQ1A0444), Mr. G. SASANK (21BQ1A0438) and Mr. D. YASWANTH
NAIK (21BQ1A0429), hereby declare that the Project Report entitled “PLANT DISEASE
DETECTION USING CNN” done by us under the guidance of Mrs. T. VINEELA, Assistant
Professor, Department of ECE is submitted in partial fulfillment of the requirements for the
award of degree of BACHELOR OF TECHNOLOGY In ELECTRONICS AND
COMMUNICATION ENGINEERING.
DATE : SIGNATURE OF THE CANDIDATES
PLACE : VVIT, NAMBUR B. TEJA VARDHAN REDDY
G. CHITRAHAS BALAJI
G. SASANK
D. YASWANTH NAIK
ACKNOWLEDGEMENT
We express our sincere thanks wherever they are due.
We express our sincere thanks to the Chairman, Vasireddy Venkatadri
Institute of Technology, Sri Vasireddy Vidya Sagar for providing us well equipped
infrastructure and environment.
We thank Dr. Y. Mallikarjuna Reddy, Principal, Vasireddy Venkatadri Institute
of Technology, Nambur, for providing us the resources for carrying out the project.
We express our sincere thanks to Dr. K. Giribabu, Dean of Academics for
providing support and stimulating environment for developing the project.
Our sincere thanks to Prof. M. Y. Bhanu Murthy, Head of the Department,
Department of ECE, for his co-operation and guidance which helped us to make our
project successful and complete in all aspects.
We also express our sincere thanks and are grateful to our guide Mrs.T.
Vineela, Assistant Professor, Department of ECE, for motivating us to make our
project successful and fully complete. We are grateful for her precious guidance and
suggestions.
We also express our heartfelt gratitude to all other teaching staff and lab
technicians for their constant support and advice throughout the project.
NAME OF THE CANDIDATES
B. TEJA VARDHAN REDDY (21BQ1A0417)
G. CHITRAHAS BALAJI (21BQ1A0444)
G. SASANK (21BQ1A0438)
D. YASWANTH NAIK (21BQ1A0429)
TABLE OF CONTENTS
S.NO INDEX PAGE NO
LIST OF FIGURES i
LIST OF TABLES ii
ABSTRACT iii
OBJECTIVE iv
1 CHAPTER 1 – INTRODUCTION 1-3
1.1 Motivation 1
1.2 Problem Statement 2
1.3 Outline of The Project 3
1.4 Data Collection 3
2 CHAPTER 2 – LITERATURE SURVEY 4-27
2.1 Literature Survey 5
2.2 Architecture 6
2.2.1 Resnet Architecture 6
2.2.2 Densenet Architecture 9
2.2.3 Alexnet Architecture 13
2.3 Convolutional Neural Networks (CNNs) and Layer Types 16
2.3.1 CNN Building Blocks 16
2.3.2 Layer Types 17
2.3.3 Convolutional Layers 18
2.3.4 Activation Layers 19
2.3.5 Pooling Layers 21
2.3.6 Adam Optimizer 23
2.4 Reported Results 24
3 CHAPTER 3 – PROPOSED DESIGN AND METHODOLOGY 28-56
3.1 Flow Chart Of Proposed System 29
3.2 Expected Results 31
3.2.1 Software and Programming Languages 33
3.3 Python Programming 36
3.4 Google Colab 38
3.5 Definition of Plant Diseases And Pests 39
3.5.1 Definition of Plant Diseases And Pests Detection 39
3.5.2 Early Blight and Late Blight of Potato 40
3.6 Mobilenetv2 Architecture 50
3.6.1 Inverted Residual Linear Bottleneck 52
3.7 Experimental Analysis 53
4 CHAPTER 4 – RESULTS AND ANALYSIS 57-60
4.1 Simulation Results of Proposed Model 57
4.2 Synthesis of Proposed Model 58
4.2.1 Predicted Results 58
5 CHAPTER 5 – SUMMARY, CONCLUSION AND FUTURE SCOPE 61-64
5.1 Summary 61
5.2 Conclusion 62
5.3 Future scope 63
REFERENCES 65-67
LIST OF FIGURES
FIG NO TITLE Page No
Fig 2.1 Architecture Of Resnet 9
Fig 2.2 Architecture Of Densenet 12
Fig 2.3 2D-Activation Map 16
Fig 2.4 Final Output Volume 19
Fig 2.5 An Example For Relu Activation 20
Fig 2.6 Applying 2x2 Max Pooling To Our Input 21
Fig 2.7 Matrix Of Resnet Algorithm 22
Fig 2.8 Matrix Of Densenet Algorithm 25
Fig 2.9 Matrix Of Alexnet Algorithm 26
Fig 3.1 Flow Chart Of Proposed Model 29
Fig 3.2 Expected Result 32
Fig 3.3 Tomato Diseases And Pests Images 40
Fig 3.4 Multi-Disease Recognition In Tomato Plant 44
Fig 3.5 Samples From Tomato Diseases & Healthy Leaves 46
Fig 3.6 Mobilenetv2 Architecture 52
Fig 3.7 Data Samples 55
Fig 4.1 Output 57
Fig 4.2 Tomato Spider Mites Two Spotted Spider Mite 58
Fig 4.3 Tomato Mosaic Virus 59
Fig 4.4 Pepper Bell Bacterial 59
Fig 4.5 Tomato Late Blight 60
Fig 4.6 Tomato Target Spot 60
LIST OF TABLES
TABLE NO TITLE PAGE NO
Table 2.1 Comparison Table 25
Table 3.1 Summary of PLD Dataset 43
ABSTRACT
The Plant Disease Detection project employs machine learning
techniques, specifically convolutional neural networks (CNNs), to address the
early detection of plant diseases. The project involves a systematic workflow,
including data collection, preprocessing, model development, training, and
performance evaluation. Using a curated dataset of healthy and diseased plant
images, the CNN is trained to identify unique patterns and features linked to
various plant diseases.
To enhance the model's robustness in diverse scenarios, data
augmentation techniques are applied. The model's architecture incorporates
dense layers and pooling layers to extract relevant features, followed by
classification layers for identifying diseases. These enhancements aim to
improve the system's ability to generalize across different datasets.
During training, the model parameters are optimized using the
Adamax optimizer while closely monitoring validation performance. After
training, the model is validated with an independent testing dataset to ensure
accuracy and generalizability. The final trained model is then deployed for real-
world applications, aiming to provide a reliable solution for plant disease
detection.
Keywords: Machine Learning, Plant Disorders, Visual Diagnosis.
OBJECTIVE
The objective of implementing a deep learning system for visual diagnosis
of plant disorders is to empower farmers with an efficient tool capable of swiftly
and accurately identifying various issues affecting their crops. Through the
integration of advanced computer vision techniques and deep learning algorithms,
this system aims to streamline the process of detecting ailments such as fungal
infections, nutrient deficiencies, and pest infestations solely based on visual
inputs, typically images of the affected plants.
The development process involves meticulous data collection,
encompassing a diverse array of labelled images representing healthy plants and
a wide spectrum of disorders across different plant species, growth stages, and
environmental conditions. With robust data pre-processing, feature extraction, and
model training procedures, the deep learning system endeavours to attain high
accuracy and generalization capabilities, enabling farmers to make timely and
informed decisions to mitigate crop losses and optimize agricultural yields.
Upon successful development, the deep learning-based diagnostic tool will
be deployed through user-friendly interfaces, seamlessly integrating into existing
agricultural management platforms to facilitate easy access and adoption by
farmers. The system's deployment will mark a significant step forward in precision
agriculture, offering farmers a reliable means of swiftly diagnosing plant disorders
and implementing appropriate interventions to safeguard crop health and
productivity.
Continual improvement and collaboration with agricultural experts and
stakeholders will be paramount to ensuring the system's effectiveness, accuracy,
and relevance in real-world farming scenarios. Additionally, ethical considerations
will be carefully addressed to ensure the responsible and transparent deployment
of the ML system, fostering trust and confidence among users while promoting
sustainable agriculture practices.
CHAPTER – 1
INTRODUCTION
1.1 MOTIVATION:
The agriculture system is supported by farmers, and agriculture, as is well known, is an integral part of a country's development; it plays a major role in India's economy and job market. One of the most common problems faced by Indian farmers is that they do not choose the appropriate crop for their soil. Another frequent issue Indian farmers face is failing to protect their crops and plants from diseases in time.
The motivation behind this project stems from the critical need within
industrial settings to effectively identify and address plant malfunctions in a timely
manner. Plant malfunctions can lead to significant downtime, production losses,
and safety hazards, impacting both productivity and profitability. Traditional
methods of manual inspection and detection are often time-consuming, labor-
intensive, and prone to errors. By harnessing the power of machine vision
technology, coupled with the accessibility and computational capabilities of
Google Colab and the efficiency of the MobileNetV2 architecture, we aim to
develop a solution that can automate the process of malfunction identification,
significantly reducing response times and enhancing overall operational efficiency.
The potential impact of such a system extends beyond industrial plants to
various sectors where timely and accurate anomaly detection is crucial, including
manufacturing, energy, and infrastructure. By addressing these challenges, we
strive to contribute to safer, more reliable, and more productive industrial
environments, ultimately benefiting both businesses and society as a whole.
1.2 PROBLEM STATEMENT:
The problem at hand revolves around the need for an efficient and
accurate method of identifying and categorizing plant malfunctions within
industrial environments. Traditional approaches to malfunction detection often rely
on manual inspection or simple threshold-based methods, which are labor-
intensive, time-consuming, and prone to human error. These inefficiencies lead to
increased downtime, production losses, and safety hazards, negatively impacting
the productivity and profitability of industrial plants.
To address this problem, we propose leveraging machine vision technology
in conjunction with Google Colab software and the MobileNetV2 architecture. Our
goal is to develop a robust system capable of automatically analyzing visual data
captured from industrial plants and accurately identifying various types of
malfunctions in real-time. By doing so, we aim to significantly reduce response
times, improve operational efficiency, and enhance overall safety within industrial
environments.
The solution to this problem involves the development and optimization of
machine learning models trained on a dataset of plant malfunction images. The
trained models will be deployed for real-time inference within industrial plants,
providing plant operators with timely alerts and actionable insights into potential
malfunctions. Additionally, integration with Google Colab's cloud-based
infrastructure will enable scalable and efficient model training and deployment,
further enhancing the system's effectiveness.
Overall, the proposed solution aims to revolutionize the way plant
malfunctions are detected and addressed within industrial environments,
ultimately leading to safer, more reliable, and more productive operations.
1.3 OUTLINE OF THE PROJECT:
Many studies have been conducted in order to develop an accurate and effective crop prediction model. Ensembling is one of the techniques used in these types of studies, and prediction based on this ensembling technique can improve yield efficiency. When a highly predictive model is required, hybrid models, which offer high accuracy, are used. Above all, the most effective model depends on thorough data cleaning. We used Python for both the backend and the frontend in this project.
1.4 DATA COLLECTION:
The PlantVillage dataset is a valuable resource for researchers and
practitioners working on various aspects of plant health and agriculture. It contains
a large collection of images depicting different types of plant diseases, pests, and
other conditions, along with corresponding metadata such as plant species,
disease type, and severity. Collecting a total of 20,678 images provides a rich and
diverse dataset for training a machine learning model to identify and classify plant
malfunctions. In addition, the images are grouped into four categories covering biotic and abiotic stresses as well as healthy samples: leaf spot, leaf curl, slug damage, and healthy leaf.
CHAPTER – 2
LITERATURE SURVEY / EXISTING MODELS
INTRODUCTION
In industrial settings, timely detection and resolution of plant malfunctions
are critical for ensuring operational efficiency, productivity, and safety. Traditional
methods of manual inspection often prove to be labor-intensive, time-consuming,
and prone to human error. Therefore, there is a growing interest in leveraging
machine learning and computer vision techniques to automate the process of
malfunction identification.
This research aims to develop an automated plant malfunction
identification system using convolutional neural networks (CNNs) with a focus on
leveraging pre-trained models such as RESNET, DENSENET and ALEXNET. CNNs have demonstrated remarkable performance in image classification tasks, making them well-suited for analyzing visual data from industrial plants.
The proposed system will be trained on a dataset consisting of 20,678
images collected from various industrial plant environments. These images will
encompass a wide range of plant conditions, including normal operation,
malfunctions, and anomalies. By training the model on such a diverse dataset, we
aim to enhance its ability to accurately detect and classify different types of plant
malfunctions.
The utilization of pre-trained CNN architectures like RESNET, DENSENET
and ALEXNET offers several advantages, including transfer learning, which
allows the model to leverage learned features from large-scale datasets such as
ImageNet. Fine-tuning these pre-trained models on our specific dataset enables
us to adapt them to the characteristics of industrial plant images, thereby
improving their performance and efficiency in malfunction identification.
Through this research, we seek to contribute to the advancement of
automated plant malfunction identification systems, providing industrial facilities
with a reliable and efficient tool for early detection and resolution of operational
issues. Additionally, by
utilizing pre-trained CNN architectures like RESNET, DENSENET and ALEXNET,
we aim to streamline the development process while achieving high accuracy and
robustness in malfunction detection.
2.1 LITERATURE SURVEY
The evolution of deep learning for plant disorder diagnosis from 2015 to
2024 has seen remarkable strides in methodology and accuracy. Beginning in
2015, the release of the PlantVillage Dataset by Hughes et al. provided a crucial
foundation for deep learning research in agriculture. This dataset comprised a
diverse collection of labeled images representing various plant diseases,
facilitating the development of deep learning models for automated diagnosis.
Gupta et al. (2016) capitalized on this dataset, pioneering the use of convolutional
neural networks (CNNs) for plant disease identification. Their work achieved an
initial accuracy milestone of 89%, demonstrating the potential of deep learning in
agriculture.
As deep learning techniques gained traction, researchers began exploring
transfer learning as a means to leverage pre-trained neural network models for
plant disorder diagnosis. Mohanty et al. (2016) exemplified this approach by
applying transfer learning on the Plant Pathology Challenge Dataset, resulting in a
notable accuracy improvement to 92%. Transfer learning reduced the need for
extensive labeled datasets and computational resources, making deep learning-
based diagnosis more accessible to researchers and practitioners alike.
Additionally, ensemble learning methods emerged as a promising strategy for
enhancing diagnostic performance. Ferentinos (2018) demonstrated the
effectiveness of ensemble learning with multiple CNN architectures, achieving a
remarkable accuracy of 94% on the Plant Diseases Dataset.
Continuing the trend of methodological advancements, researchers turned
their attention to the integration of multimodal data sources to further enhance
diagnostic accuracy. In 2020-2021, Sharma et al. introduced the Plant Diseases
Dataset, which incorporated both visual and spectral information. By utilizing
ensemble learning techniques alongside spectral data, they achieved an accuracy
rate of 95%, surpassing previous benchmarks. Moreover, by 2022-2024, Singh
et al. proposed a multimodal
deep learning approach, integrating spectral and spatial features from the
Multispectral Plant Images Dataset. Their method surpassed previous
benchmarks, achieving an accuracy exceeding 96%.
Throughout this period, advancements in dataset quality, methodological
approaches, and integration of multimodal data sources have played a crucial role
in driving progress in deep learning-based plant disorder diagnosis. The
availability of diverse and well-annotated datasets, combined with innovations in
deep learning architectures and transfer learning techniques, has enabled
researchers to achieve unprecedented accuracy levels. Looking ahead, ongoing
efforts are focused on addressing remaining challenges, such as dataset diversity,
model interpretability, and real-world deployment, to further enhance the
effectiveness and scalability of deep learning-based plant disorder diagnosis
systems. These advancements hold promise for revolutionizing agricultural
practices, improving crop management, and ensuring global food security in the
years to come.
2.2. ARCHITECTURE
2.2.1 RESNET Architecture
ResNet (Residual Network) is a deep CNN architecture designed
to solve the vanishing gradient problem in deep learning. It introduces skip
(residual) connections, which allow layers to learn the difference (residual)
rather than the full transformation. This helps in training very deep networks
(e.g., ResNet-50, ResNet-101, ResNet-152) without performance
degradation.
The architecture consists of:
1. Input Layer
• The input layer is responsible for receiving image data.
• In ResNet, the typical input shape for an RGB image is 224×224×3, where:
o 224×224 is the spatial resolution of the image.
o 3 represents the RGB color channels.
• Before being fed into the network, images are normalized (pixel values
scaled between 0 and 1 or -1 and 1) to improve training stability.
• If images are of different sizes, they are resized to 224×224 to maintain
consistency across the dataset.
2. Initial Convolution & Pooling
This stage extracts low-level features from the input image using
convolutional operations:
a) 7×7 Convolutional Layer
• The first layer applies a 7×7 convolution with 64 filters and a stride of 2
to extract features from the image.
• The stride of 2 reduces the spatial dimensions (from 224×224 to 112×112),
making the network more efficient.
• The output of this layer is passed through Batch Normalization (BN)
and ReLU activation, which:
o Batch Normalization helps stabilize training by normalizing activations.
o ReLU (Rectified Linear Unit) introduces non-linearity to improve learning.
b) 3×3 Max Pooling
• After the convolution, Max Pooling (3×3, stride 2) further reduces spatial
dimensions from 112×112 to 56×56.
• Max pooling helps in dimensionality reduction while retaining important
features.
3. Residual Blocks (Key Feature of ResNet)
Residual blocks are the core building blocks of ResNet, allowing deep
networks to train without performance degradation.
a) Why Use Residual Blocks?
• In deep networks, vanishing gradients occur when gradients
shrink during
backpropagation, making it difficult to update the initial layers.
• Residual blocks solve this by introducing skip (shortcut) connections,
which allow gradients to flow directly through layers.
b) Structure of a Residual Block
Each residual block consists of three convolutional layers:
1. 1×1 Convolution – Reduces the number of channels (dimensionality reduction).
2. 3×3 Convolution – Extracts spatial features.
3. 1×1 Convolution – Restores (expands) the number of channels.
After these convolutions, a skip connection adds the original input (x) to the transformed output:
Output = F(x) + x
where F(x) is the output of the three convolutional layers.
This helps preserve original features and allows easy gradient flow,
enabling deeper networks without degradation.
4. Fully Connected Layer (Classification Head)
After feature extraction through residual blocks, the final layers classify the
image:
a) Global Average Pooling (GAP)
• Instead of using fully connected layers with millions of parameters,
ResNet uses Global Average Pooling (GAP).
• GAP averages the feature maps from the last convolutional layer,
reducing dimensions while retaining important information.
• Output shape is reduced from 7×7×2048 → 1×1×2048 in ResNet-50.
b) Fully Connected (Dense) Layer
• A fully connected (FC) layer maps extracted features to class probabilities.
• If the dataset has N classes, the FC layer has N neurons.
• Example: In a Plant Disease Detection project with 5 disease classes,
the FC layer will have 5 neurons.
c) Softmax Activation (Final Classification)
• The output is passed through Softmax activation, which converts the
output into probability scores for each class:
P(yᵢ) = e^(zᵢ) / Σⱼ e^(zⱼ)
• The class with the highest probability is the final prediction.
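As a concrete illustration, the following is a minimal sketch of a single bottleneck residual block written with the Keras API. The filter counts, layer names, and the unconditional 1×1 projection on the shortcut are chosen for illustration only and are not taken from the project code.

from tensorflow.keras import layers

def residual_block(x, filters):
    # Shortcut branch keeps the original input x
    shortcut = x
    # 1x1 convolution: reduce the number of channels
    y = layers.Conv2D(filters, 1, padding='same')(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    # 3x3 convolution: extract spatial features
    y = layers.Conv2D(filters, 3, padding='same')(y)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    # 1x1 convolution: restore (expand) the channel dimension, 4x in ResNet-50
    y = layers.Conv2D(filters * 4, 1, padding='same')(y)
    y = layers.BatchNormalization()(y)
    # Project the shortcut so the channel counts match (illustrative choice)
    shortcut = layers.Conv2D(filters * 4, 1, padding='same')(shortcut)
    # Skip connection: Output = F(x) + x
    y = layers.Add()([y, shortcut])
    return layers.Activation('relu')(y)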
FIG – 2.1 : Architecture of Resnet
2.2.2 DENSENET Architecture
DenseNet is a deep CNN architecture that improves feature
reuse by using dense connections, where each layer receives inputs from
all previous layers. This helps in reducing vanishing gradients, improving
feature propagation, and lowering the number of parameters.
The DenseNet architecture consists of:
1. Input Layer
• The input layer takes in images of a fixed size. For example, in
DenseNet-121, the input size is 224×224×3 for RGB images.
• If the input images are of different sizes, they are resized to
224×224 for consistency.
• Pixel values are normalized (scaled between 0 and 1) for stable training.
2. Initial Convolution & Pooling
Before entering the dense blocks, the input undergoes basic feature
extraction:
a) 7×7 Convolutional Layer
• A 7×7 convolution with 64 filters and a stride of 2 is applied.
• This layer extracts low-level features such as edges, colors, and
textures.
• Output dimensions are reduced from 224×224×3 → 112×112×64.
b) 3×3 Max Pooling
• A 3×3 max pooling layer with stride 2 follows, further reducing spatial
size.
• Output dimensions shrink from 112×112×64 → 56×56×64.
At this stage, the image has undergone basic feature extraction and is
ready for deeper learning in dense blocks.
3. Dense Blocks (Core of DenseNet)
a) What Are Dense Blocks?
• A Dense Block consists of multiple convolutional layers where
each layer is connected to all previous layers.
• Unlike ResNet (which uses skip connections only from one
previous layer), DenseNet concatenates feature maps from all
preceding layers.
• If a dense block has L layers, then the Lᵗʰ layer receives inputs from L-1
previous layers.
b) Structure of a Dense Block
Each layer in a dense block has the following sequence:
1. 1×1 Convolution (Bottleneck layer) – Reduces the number
of channels (compresses features).
2. 3×3 Convolution – Extracts more spatial features.
3. Batch Normalization (BN) – Stabilizes training.
4. ReLU Activation – Adds non-linearity for better learning.
c) Growth Rate (k) in DenseNet
• The growth rate (k) controls how many new feature maps each layer
contributes.
• A typical DenseNet uses k = 32, meaning each layer adds 32 feature
maps.
For example:
• If the first layer starts with 64 feature maps, after 5 layers (with k = 32) it will have 64 + (5 × 32) = 224 feature maps.
• This keeps the network compact while maximizing feature reuse.
4. Transition Layers
Since each dense block increases the number of feature maps, Transition
Layers are used to control dimensionality and prevent excessive
computational costs.
Structure of a Transition Layer
Each transition layer consists of:
1. 1×1 Convolution – Reduces the number of channels.
2. Batch Normalization – Helps stabilize training.
3. ReLU Activation – Introduces non-linearity.
4. 2×2 Average Pooling – Reduces spatial dimensions by half.
For example:
• If a dense block outputs 224 feature maps (channels), the transition
layer reduces them to 112.
• The spatial dimensions (height × width) are also halved (e.g., 56×56 →
28×28).
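For illustration, a minimal Keras sketch of one dense layer and one transition layer is given below; the growth rate, bottleneck width, and compression factor are assumed values, not the project's settings.

from tensorflow.keras import layers

GROWTH_RATE = 32  # k: new feature maps contributed by each dense layer (assumed)

def dense_layer(x):
    # BN -> ReLU -> 1x1 bottleneck conv -> BN -> ReLU -> 3x3 conv
    y = layers.BatchNormalization()(x)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(4 * GROWTH_RATE, 1, padding='same')(y)
    y = layers.BatchNormalization()(y)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(GROWTH_RATE, 3, padding='same')(y)
    # Concatenate so that later layers receive all previous feature maps
    return layers.Concatenate()([x, y])

def transition_layer(x):
    channels = x.shape[-1] // 2              # compress the channels by half
    y = layers.BatchNormalization()(x)
    y = layers.Activation('relu')(y)
    y = layers.Conv2D(channels, 1, padding='same')(y)
    return layers.AveragePooling2D(2)(y)     # halve the height and width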
FIG - 2.2 : Architecture of Densenet
2.2.3 ALEXNET Architecture
AlexNet is a deep CNN architecture designed for image classification,
introduced in 2012. It won the ImageNet competition by significantly
improving accuracy over traditional models. It consists of 8 layers: 5
convolutional layers and 3 fully connected layers.
The AlexNet architecture includes:
1. Input Layer
• Takes an image of size 227×227×3 (Height × Width × RGB Channels).
• Originally, ImageNet images were 256×256, but random cropping was
applied to obtain 227×227 inputs.
• The input image is normalized to have zero mean for better training
stability.
2. First Convolutional Layer
• 96 filters of size 11×11 with a stride of 4.
• ReLU activation function is applied to introduce non-linearity.
• Max Pooling (3×3, stride 2) follows to reduce spatial size and
retain important features.
Purpose:
• Detects basic features like edges, corners, and textures.
• Reduces spatial dimensions from 227×227×3 → 55×55×96.
3. Second Convolutional Layer
• 256 filters of size 5×5 with a stride of 1.
• ReLU activation is applied.
• Max Pooling (3×3, stride 2) reduces the feature map size.
Purpose:
• Detects more complex patterns like shapes and textures.
• Reduces dimensions from 55×55×96 → 27×27×256.
4. Third, Fourth, and Fifth Convolutional Layers
These layers extract deeper and more abstract features.
a) Third Convolutional Layer
• 384 filters of size 3×3, stride 1.
• ReLU activation applied.
• No pooling to retain spatial features.
b) Fourth Convolutional Layer
• 384 filters of size 3×3, stride 1.
• ReLU activation applied.
• No pooling to preserve learned features.
c) Fifth Convolutional Layer
• 256 filters of size 3×3, stride 1.
• ReLU activation.
• Max Pooling (3×3, stride 2) reduces feature map size.
Purpose of these layers:
• Detects more complex structures like object parts.
• Helps in hierarchical feature learning.
• Spatial dimensions reduce from 27×27×256 → 13×13×256.
5. Max Pooling
Max pooling layers are used after the 1st, 2nd, and 5th convolutional
layers to:
• Reduce spatial size, making the model computationally efficient.
• Retain essential features while discarding unnecessary details.
• Prevent overfitting by reducing parameters.
Each max pooling operation halves the height and width while
keeping the depth (number of channels) the same.
6. First Fully Connected (FC) Layer
• 4096 neurons with ReLU activation.
• Dropout (50%) is applied to reduce overfitting.
Purpose:
• Converts extracted features into a higher-level representation.
• Prevents overfitting with dropout regularization.
7. Second Fully Connected (FC) Layer
• 4096 neurons, similar to the first FC layer.
• ReLU activation and Dropout (50%) applied.
Purpose:
• Further abstraction of features.
• Helps in separating classes more effectively.
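The layer configuration described above can be expressed compactly in Keras. The sketch below is illustrative only; the number of output classes and the final softmax layer are assumptions based on the statement that AlexNet has three fully connected layers.

from tensorflow.keras import layers, models

num_classes = 10  # assumed; the final FC layer matches the number of classes

alexnet = models.Sequential([
    layers.Input(shape=(227, 227, 3)),
    # Conv1: 96 filters, 11x11, stride 4, followed by 3x3 max pooling
    layers.Conv2D(96, 11, strides=4, activation='relu'),
    layers.MaxPooling2D(3, strides=2),
    # Conv2: 256 filters, 5x5, followed by 3x3 max pooling
    layers.Conv2D(256, 5, padding='same', activation='relu'),
    layers.MaxPooling2D(3, strides=2),
    # Conv3-Conv5: 3x3 filters, no pooling between them
    layers.Conv2D(384, 3, padding='same', activation='relu'),
    layers.Conv2D(384, 3, padding='same', activation='relu'),
    layers.Conv2D(256, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(3, strides=2),
    # FC layers with 50% dropout to reduce overfitting
    layers.Flatten(),
    layers.Dense(4096, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(4096, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(num_classes, activation='softmax'),
])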
2.3 Convolutional Neural Networks (CNNs) and Layer Types
2.3.1 CNN Building Blocks
Neural networks accept an input image/feature vector (one input
node for each entry) and transform it through a series of hidden layers,
commonly using nonlinear activation functions. Each hidden layer is also
made up of a set of neurons, where each neuron is fully connected to all
neurons in the previous layer. The last layer of a neural network (i.e., the
“output layer”) is also fully connected and represents the final output
classifications of the network.
A diverse dataset is essential for training a CNN and understanding
its different layer types. By training the network on the dataset, we can
observe how each layer extracts features and contributes to the final
prediction.
FIG-2.3: Convolutional Neural Networks (CNNs) and Layer Types
However, neural networks operating directly on raw pixel intensities:
1. Do not scale well as the image size increases.
2. Leave much accuracy to be desired (i.e., a standard feedforward neural network on CIFAR-10 obtained only 52% accuracy).
To demonstrate how standard neural networks do not scale well
as image size increases, let’s again consider the CIFAR-10 dataset.
Each image in CIFAR-10 is
32×32 with a Red, Green, and Blue channel, yielding 32×32×3 = 3,072 total inputs to our network.
A total of 3,072 inputs does not seem to amount to much, but consider if
we were using 250×250 pixel images the total number of inputs and weights
would jump to 250×250×3 = 187,500 and this number is only for the input
layer alone! Surely, we would want to add multiple hidden layers with a
varying number of nodes per layer — these parameters can quickly add up,
and given the poor performance of standard neural networks on raw pixel
intensities, this bloat is hardly worth it.
Instead, we can use Convolutional Neural Networks (CNNs) that take
advantage of the input image structure and define a network architecture in a
more sensible way. Unlike a standard neural network, layers of a CNN are
arranged in a 3D volume with three dimensions: width, height, and depth
(where depth refers to the third dimension of the volume, such as the
number of channels in an image or the number of filters in a layer).
To make this example more concrete, again consider the CIFAR-10
dataset: the input volume will have dimensions 32×32×3 (width, height, and
depth, respectively). Neurons in subsequent layers will only be connected to
a small region of the layer before it (rather than the fully connected structure
of a standard neural network) — we call this local connectivity, which
enables us to save a huge amount of parameters in our network. Finally, the
output layer will be a 1×1×N volume, which represents the image distilled
into a single vector of class scores. In the case of CIFAR-10, given ten
classes, N = 10, yielding a 1×1×10 volume.
2.3.2 Layer Types
There are many types of layers used to build Convolutional Neural
Networks, but the ones you are most likely to encounter include:
Convolutional (CONV)
Activation (ACT or RELU, where the label refers to the actual activation function used)
Pooling (POOL)
Fully connected (FC)
Batch normalization (BN)
Dropout (DO)
Stacking a series of these layers in a specific manner yields a CNN.
We often use simple text diagrams to describe a CNN:
INPUT => CONV => RELU => FC => SOFTMAX
Here, we define a simple CNN that accepts an input, applies a convolution layer, then an activation layer, then a fully connected layer, and, finally, a softmax classifier to obtain the output classification probabilities. The SOFTMAX activation layer is often omitted from the network diagram as it is assumed to directly follow the final FC.
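A minimal Keras sketch of this text diagram is shown below; the input size, filter count, and number of classes are placeholders chosen only to make the example runnable.

from tensorflow.keras import layers, models

# INPUT => CONV => RELU => FC => SOFTMAX (sizes are illustrative)
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),       # INPUT
    layers.Conv2D(32, 3, padding='same'),  # CONV
    layers.Activation('relu'),             # RELU
    layers.Flatten(),
    layers.Dense(10),                      # FC
    layers.Activation('softmax'),          # SOFTMAX
])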
Of these layer types, CONV and FC (and to a lesser extent, BN) are the only layers that contain parameters that are learned during the training process. Activation and dropout layers are not considered true “layers” themselves but are often included in network diagrams to make the architecture explicitly clear. Pooling layers (POOL), of equal importance as CONV and FC, are also included in network diagrams as they have a substantial impact on the spatial dimensions of an image as it moves through a CNN.
CONV, POOL, RELU, and FC are the most important when defining your actual network architecture. That is not to say that the other layers are unimportant, but they take a backseat to this critical set of four, as those four define the actual architecture itself.
2.3.3 Convolutional Layers
The CONV layer is the core building block of a Convolutional Neural Network. The CONV layer parameters consist of a set of K learnable filters (i.e., “kernels”), where each filter has a width and a height and is nearly always square. These filters are small in terms of their spatial dimensions but extend throughout the full depth of the volume.
For inputs to the CNN, the depth is the number of channels in the
image (i.e., a depth of three when working with RGB images, one for each
channel). For volumes deeper in the network, the depth will be the
number of filters applied in the previous layer.
To make this concept more clear, let’s consider the forward-pass of
a CNN, where we convolve each of the K filters across the width and height
of the input volume. More simply, we can think of each of our K kernels
sliding across the input region, computing an element-wise multiplication,
summing, and then storing the output value in a 2-dimensional activation
map, as illustrated in Fig. 2.4 below.
Fig – 2.4: Left: At each convolutional layer in a CNN, there are K kernels
applied to the input volume. Middle: Each of the K kernels is convolved
with the input volume. Right: Each kernel produces a 2D output, called an
activation map.
After applying all K filters to the input volume, we now have K two-dimensional activation maps. We then stack our K activation maps along the depth dimension of our array to form the final output volume.
Fig-2.5: After obtaining the K activation maps, they are stacked
together to form the input volume to the next layer in the
network.
2.3.4 Activation Layers
After each CONV layer in a CNN, we apply a nonlinear activation function, such as ReLU, ELU, or one of the Leaky ReLU variants. We typically denote activation layers as RELU in network diagrams since ReLU activations are the most commonly used; we may also simply state ACT. In either case, we make it clear that an activation function is being applied inside the network architecture.
Activation layers are not technically “layers” (due to the fact that no
parameters/weights are learned inside an activation layer) and are
sometimes omitted from network architecture diagrams as it’s assumed that
an activation immediately follows a convolution.
In this case, authors of publications will mention which activation
function they are using after each CONV layer somewhere in their paper.
As an example, consider the following network architecture: INPUT =>
CONV => RELU => FC.
To make this diagram more concise, we could simply remove the RELU component since it is assumed that an activation always follows a convolution: INPUT => CONV => FC. We prefer, however, to explicitly include the activation layer in a network diagram to make it clear when and what activation function is being applied in the network.
An activation layer accepts an input volume of size W_input × H_input × D_input and then applies the given activation function in an element-wise manner; the output of an activation layer is therefore always the same size as its input: W_output = W_input, H_output = H_input, D_output = D_input.
Fig-2.6: An example of an input volume going through a ReLU activation,
max(0, x). Activations are done in-place so there is no need to create a
separate output volume although it is easy to visualize the flow of the
network in this manner.
2.3.5 Pooling Layers
There are two methods to reduce the size of an input volume — CONV layers with a stride > 1 (which we have already seen) and POOL layers. It is common to insert POOL layers in between consecutive CONV layers in CNN architectures:
INPUT => CONV => RELU => POOL => CONV => RELU => POOL => FC
The primary function of the POOL layer is to progressively reduce
the spatial size (i.e., width and height) of the input volume. Doing this
allows us to reduce the amount of parameters and computation in the
network — pooling also helps us control overfitting.
POOL layers operate on each of the depth slices of an input independently using either the max or average function. Max pooling is typically done in the middle of the CNN architecture to reduce spatial size, whereas average pooling is normally used as the final layer of the network (e.g., GoogLeNet, SqueezeNet, ResNet).
Typically we’ll use a pool size of 2×2, although deeper CNNs that use larger input images (> 200 pixels) may use a 3×3 pool size early in the network architecture. We also commonly set the stride to either S = 1 or S = 2. Fig-2.7 (heavily inspired by Karpathy et al.) shows an example of applying max pooling with a 2×2 pool size and a stride of S = 1. Notice that for every 2×2 block, we keep only the largest value, take a single step (like a sliding window), and apply the operation again — thus producing an output volume of size 3×3.
Fig-2.7: Left: Our input 4×4 volume. Right: Applying 2×2 max pooling with
a stride of S = 1. Bottom: Applying 2×2 max pooling with S = 2 — this
dramatically reduces the spatial dimensions of our input.
We can further decrease the size of our output volume by increasing
the stride — here we apply S = 2 to the same input. For every 2×2 block in
the input, we keep only the largest value, then take a step of two pixels,
and apply the operation again. This pooling allows us to reduce the width
and height by a factor of two, effectively discarding 75% of activations
from the previous layer.
In summary, POOL layers accept an input volume of size W_input × H_input × D_input. They then require two parameters:
• The receptive field size F (also called the “pool size”).
• The stride S.
Applying the POOL operation yields an output volume of size W_output × H_output × D_output, where:
• W_output = ((W_input − F) / S) + 1
• H_output = ((H_input − F) / S) + 1
• D_output = D_input
In practice, we tend to see two types of max pooling variations:
• Type #1: F = 3, S = 2, which is called overlapping pooling and normally
applied to images/input volumes with large spatial dimensions.
• Type #2: F = 2, S = 2, which is called non-overlapping pooling. This is the
most common type of pooling and is applied to images with smaller spatial
dimensions.
For network architectures that accept smaller input images (in the
range of 32−64 pixels) you may also see F = 2, S = 1 as well.
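A small helper function applying these sizing formulas is sketched below; the function name and example values are illustrative.

def pool_output_size(w_in, h_in, d_in, f, s):
    # Apply the POOL sizing formulas given above; depth is unchanged
    w_out = (w_in - f) // s + 1
    h_out = (h_in - f) // s + 1
    return w_out, h_out, d_in

# Example: 2x2 max pooling with stride S = 2 on a 224x224x64 volume
print(pool_output_size(224, 224, 64, f=2, s=2))  # -> (112, 112, 64)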
The results of this survey are presented in the comparison table (Table 2.1), which includes the comparison metrics for each model. Additionally, the confusion matrices for the models are shown in Fig-2.8 (ResNet), Fig-2.9 (DenseNet), and Fig-2.10 (AlexNet).
2.3.6 ADAM OPTIMIZER
The Adam optimizer, short for Adaptive Moment Estimation, is a
popular optimization algorithm used in training neural networks, particularly
in the field of deep learning. It was introduced by Diederik P. Kingma and
Jimmy Ba in their paper titled "Adam: A Method for Stochastic
Optimization" published in 2014.
The Adam optimizer combines the advantages of two other popular
optimization algorithms: AdaGrad and RMSProp. Like AdaGrad, Adam
adapts the learning rate for
each parameter based on the magnitude of its gradients. However, unlike
AdaGrad, which accumulates squared gradients over time, Adam uses
exponentially decaying averages of past gradients, similar to RMSProp.
This helps Adam handle sparse gradients and non-stationary objectives
more effectively.
The key idea behind Adam is to maintain two moving averages of
the gradients: the first moment (mean) and the second moment
(uncentered variance). These moving averages are used to compute
adaptive learning rates for each parameter during optimization.
Additionally, Adam incorporates bias correction to account for the
initialization bias in the first and second moment estimates, especially at
the beginning of training.
The Adam optimizer has become popular due to its robustness,
efficiency, and ease of use. It typically requires less tuning of
hyperparameters compared to other optimization algorithms and often
converges faster. As a result, Adam has become a widely adopted choice
for optimizing deep neural networks across various domains, contributing
to the advancement of deep learning research and applications.
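To make the update rule concrete, the following NumPy sketch performs a single Adam step following Kingma and Ba's formulation; the hyperparameter values are the commonly quoted defaults, not values taken from this project.

import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # First moment (mean) and second moment (uncentered variance) of the gradients
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction compensates for initializing m and v at zero
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Adaptive per-parameter update
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

In Keras, the same optimizer (or the Adamax variant used later in this project) is selected simply by passing it to model.compile.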
2.4 REPORTED RESULTS
In this part, we discuss the findings from our research utilizing deep learning to identify plant diseases. We trained and evaluated three different models (CNN, VGG16, and VGG19) on the same dataset and compared
their performance. The performance metrics used for the comparison are
accuracy, recall, precision, and F1 Score.
Accuracy, Recall, Precision, and F1 Score are commonly used metrics for evaluating the performance of machine learning systems. These metrics measure how well a model can classify instances of data correctly.
1. Accuracy: This metric counts the number of correct predictions a model makes as a proportion of all predictions produced. It is calculated as follows:
Accuracy = (True Positives + True Negatives) / Total Observations
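As an illustration of how these four metrics can be computed, a short scikit-learn sketch is given below; the example labels are invented and the library choice is an assumption, since the report does not state how the metrics were calculated.

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]  # illustrative ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1]  # illustrative model predictions

print("Accuracy :", accuracy_score(y_true, y_pred))   # (TP + TN) / total
print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall   :", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("F1 Score :", f1_score(y_true, y_pred))         # harmonic mean of P and R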
TABLE 2.1 COMPARISON TABLE
MODEL ACCURACY
RESNET 0.96
ALEXNET 0.97
DENSENET 0.99
Fig-2.8 Matrix of Resnet Algorithm
Fig-2.9 Matrix of Densenet Algorithm
Fig-2.10 Matrix of Alexnet Algorithm
The comparison table presents the performance of three deep
learning models, CNN, VGG-16, and VGG-19, evaluated using four
common performance metrics: Accuracy, Precision, Recall, and F1 Score.
Accuracy measures the overall proportion of correctly classified
instances. CNN achieved the highest accuracy of 0.97, followed by
VGG-16 at 0.96 and VGG-19 at
0.95. The high accuracy scores suggest that all three models can make
accurate predictions on the test dataset.
Precision is a metric that represents the ratio of true positives to all
positive predictions. In this situation, a higher precision score signifies that
the model is generating fewer false positive predictions. VGG-16 achieved
the highest precision score of 0.96, followed by VGG-19 at 0.95 and CNN
at 0.94. The results suggest that VGG-16 is the most reliable model for
minimizing false positive predictions.
Recall is a metric that quantifies the proportion of true positives that
are accurately detected. A higher recall score indicates that the model is
capable of correctly identifying more positive class instances. CNN
achieved the highest recall score of 0.97, followed by VGG-16 at 0.95
and VGG-19 at 0.93. The results suggest
that CNN is the most comprehensive model for identifying instances of the
positive class.
The F1 score is the harmonic average of precision and recall metrics
and offers a balanced assessment of both measures. A high F1 Score
indicates that the model is achieving high precision and recall
simultaneously. CNN achieved the highest F1 Score of 0.95, followed by
VGG-16 at 0.95 and VGG-19 at 0.93. The results suggest that CNN and
VGG-16 are equally balanced models for achieving high precision and
recall.
One of the reasons for this result could be that the CNN model was
specifically designed and trained for the Plant Village dataset, which
contains images of plant diseases, while the pre-defined VGG-16 and
VGG-19 models were trained on the general ImageNet dataset. As a result,
the CNN model may have learned more relevant features for the plant
diseases in the dataset, leading to better performance.
Another reason could be that the relatively simple architecture of the CNN model used in this work may have decreased the risk of overfitting and helped it generalize better to the test data.
In conclusion, the performance metrics illustrate that all three
models achieved favorable results on the classification task, with high
scores for accuracy, precision, and recall. The models differ in their strengths and weaknesses, with CNN being the most comprehensive model for identifying instances of the positive class, VGG-16 being the most reliable model for minimizing false positive predictions, and VGG-19 having relatively lower recall and F1 Score values in comparison to the other two models.
CHAPTER – 3
PROPOSED DESIGN & METHODOLOGY
INTRODUCTION
In response to the pressing need for automated plant malfunction
identification in industrial environments, this research proposes a novel approach
leveraging convolutional neural networks (CNNs), specifically focusing on the
MobileNetV2 architecture. This proposed model aims to revolutionize the process
of malfunction detection by offering a lightweight and efficient solution capable of
real-time inference.
Traditional methods of manual inspection for identifying plant malfunctions
are often labor-intensive, time-consuming, and prone to errors. Therefore, the
development of an automated system using machine learning and computer
vision techniques holds immense promise for enhancing operational efficiency,
productivity, and safety in industrial settings.
The MobileNetV2 architecture, renowned for its efficiency and compact
design, serves as the cornerstone of our proposed model. MobileNetV2 offers a
balance between model size and accuracy, making it particularly suitable for
resource- constrained environments and real-time applications. By harnessing the
power of MobileNetV2, we aim to develop a high-performance yet lightweight
solution capable of accurately detecting and classifying various types of plant
malfunctions.
The proposed model will be trained on a dataset comprising 20,678 images
collected from industrial plant environments. These images will encompass a wide
range of plant conditions, including normal operation, malfunctions, and
anomalies. Through extensive training and fine-tuning, the model will learn to
recognize subtle patterns and features indicative of different types of plant
malfunctions.
By leveraging MobileNetV2, we seek to optimize model performance while
minimizing computational resources and inference time. This efficiency is critical
for deploying the model in resource-constrained industrial environments, where real-time inference is essential.
Through this research, we aim to contribute to the advancement of
automated plant malfunction identification systems, providing industrial facilities
with a reliable, efficient, and scalable solution for early detection and resolution of
operational issues. The utilization of MobileNetV2 architecture represents a
significant step towards achieving these goals, offering a promising avenue for
revolutionizing plant malfunction detection in industrial settings.
3.1 Flow Chart Of Proposed Model
Fig-3.1: Flow Chart Of Proposed Model
The proposed flow for the machine vision model aimed at plant malfunction identification is as follows:
1. Data Collection:
• Gather a diverse dataset of images featuring healthy plants as well as plants
exhibiting various types of malfunctions or diseases. This dataset should cover
different plant species and growth stages.
2. Data Pre-processing:
• Clean the collected data, removing any irrelevant or low-quality images.
• Normalize the images to ensure consistency in features such as size,
resolution, and color.
3. Model Training:
• Choose a pre-trained convolutional neural network (CNN) architecture suitable
for image classification tasks, such as MobileNetV2.
• Fine-tune the selected model on the pre-processed dataset using transfer
learning techniques. This involves retraining the model's parameters on the new
dataset while retaining the learned features from the original dataset.
• Split the dataset into training, validation, and test sets to evaluate the model's
performance.
4. Model Evaluation:
• Assess the trained model's performance using evaluation metrics such as
accuracy, precision, recall, and F1-score on the test set.
• Validate the model's ability to accurately classify images of healthy plants and
those with malfunctions or diseases.
5. Deployment:
• Once the model achieves satisfactory performance, deploy it in a production
environment where it can be accessed by end-users or integrated into existing
plant monitoring systems.
• Implement an efficient inference pipeline to process incoming images and
provide real-time or near-real-time predictions.
• Monitor the deployed model's performance and gather feedback for further
refinement.
6. Continuous Improvement:
• Collect feedback from end-users and domain experts to identify areas for
improvement.
• Periodically retrain the model with new data to adapt to evolving plant
conditions and improve classification accuracy.
• Explore advanced techniques such as data augmentation, ensemble
learning, or model ensembling to enhance the model's robustness and
generalization capabilities.
This proposed flow outlines the key steps involved in developing and deploying a machine vision model for plant malfunction identification. Each step requires careful consideration and experimentation to ensure the model effectively addresses the problem domain and delivers reliable results.
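To illustrate the deployment step (step 5), a minimal inference sketch is shown below; the model file name, image path, class names, and preprocessing are assumptions made only for the example.

import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

model = load_model('plant_disease_mobilenetv2.h5')          # hypothetical file
class_names = ['healthy', 'leaf_spot', 'leaf_curl', 'slug_damage']  # hypothetical

img = image.load_img('leaf_photo.jpg', target_size=(224, 224))
x = image.img_to_array(img) / 255.0      # assumed to match training normalization
x = np.expand_dims(x, axis=0)            # add the batch dimension

probs = model.predict(x)[0]
print('Predicted class:', class_names[int(np.argmax(probs))])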
3.2 EXPECTED RESULTS
Achieving a 99.3% accuracy in plant malfunction identification would
revolutionize agricultural practices, offering unprecedented precision and
efficiency in crop management. With such a high level of accuracy, the proposed
machine vision system becomes a cornerstone in modern farming techniques,
providing farmers with invaluable insights into the health of their crops.
Firstly, the system's precision ensures reliable identification of plant
malfunctions, distinguishing between healthy plants and those affected by
diseases, pests, or other issues. This accuracy minimizes misdiagnosis and false
positives, allowing farmers to confidently intervene only when necessary, thereby
optimizing resource allocation and reducing wastage.
Moreover, the early detection enabled by the system's accuracy is paramount
in preventing the spread of diseases and mitigating crop losses. By promptly
identifying and addressing plant malfunctions, farmers can implement targeted
interventions, such as applying appropriate treatments or adjusting irrigation and
fertilization schedules, to maintain crop health and productivity.
The impact of such precise diagnostics extends beyond individual farms to
global food security. By enhancing crop resilience, improving yields, and
minimizing economic losses, the system contributes to a more sustainable and
reliable food supply chain. This is particularly significant in the face of challenges
posed by climate change, pests, and other environmental factors.
Fig-3.2: Expected Result
Furthermore, the system serves as a decision support tool for farmers,
empowering them to make informed choices regarding crop management
practices. By leveraging the insights provided by the system, farmers can adopt
proactive strategies to optimize crop production, reduce risks, and maximize
profitability.
Overall, achieving a 99.3% accuracy rate in plant malfunction identification
represents a paradigm shift in agriculture, unlocking new possibilities for
precision farming, sustainable agriculture, and global food security.
3.2.1 SOFTWARE AND PROGRAMMING LANGUAGES
GOOGLE COLAB
Google Colaboratory, or Colab, is an as-a-service version of Jupyter
Notebook that enables you to write and execute Python code through your
browser.
Jupyter Notebook is a free, open source creation from the Jupyter Project.
A Jupyter notebook is like an interactive laboratory notebook that includes not just
notes and data, but also code that can manipulate the data. The code can be
executed within the notebook, which, in turn, can capture the code output.
Applications such as Matlab and Mathematica pioneered this model, but unlike
those applications, Jupyter is a browser-based web application.
Google Colab is built around Project Jupyter code and hosts Jupyter
notebooks without requiring any local software installation. But while Jupyter
notebooks support multiple languages, including Python, Julia and R, Colab
currently only supports Python.
Colab notebooks are stored in a Google Drive account and can be shared
with other users, similar to other Google Drive files. The notebooks also include
an autosave feature, but they do not support simultaneous editing, so
collaboration must be serial rather than parallel.
Colab is free, but has limitations. There are some code types that are
forbidden, such as media serving and crypto mining. Available resources are also
limited and vary depending on demand, though Google Colab offers a pro version
with more reliable resourcing. There are other cloud services based on Jupyter
Notebook, including Azure Notebooks from Microsoft and SageMaker Notebooks
from Amazon.
Benefits of Google Colab
Enterprise data analysts and analytics developers can use Colab to work
through data analytics and manipulation problems in collaboration. They can
write, execute and revise core code in a tight loop, developing the documentation
in Markdown format, LaTeX or HTML as they go.
Notebooks can include embedded images as part of the documentation or
as generated output. In addition, you can copy finished analytics code, with
documentation, into other platforms for production use once sufficiently tested
and debugged.
Google Colab eliminates the need for complex configuration setup and
installation, as it runs right in the browser. It also includes pre-installed Python
libraries that require no setup to use.
PYTHON PROGRAMMING
Python is an interpreted, object-oriented, high-level programming
language with dynamic semantics. Its high-level built in data structures,
combined with dynamic typing and dynamic binding, make it very attractive for
Rapid Application Development, as well as for use as a scripting or glue
language to connect existing components together. Python's simple, easy to
learn syntax emphasizes readability and therefore reduces the cost of program
maintenance. Python supports modules and packages, which encourages
program modularity and code reuse. The Python interpreter and the extensive
standard library are available in source or binary form without charge for all major
platforms, and can be freely distributed.
Often, programmers fall in love with Python because of the increased
productivity it provides. Since there is no compilation step, the edit-test-debug
cycle is incredibly fast. Debugging Python programs is easy: a bug or bad input
will never cause a segmentation fault. Instead, when the interpreter discovers an
error, it raises an exception. When the program doesn't catch the exception, the
interpreter prints a stack trace. A source level debugger allows inspection of local
and global variables, evaluation of arbitrary expressions, setting breakpoints,
stepping through the code a line
at a time, and so on. The debugger is written in Python itself, testifying to
Python's introspective power. On the other hand, often the quickest way to debug
a program is to add a few print statements to the source: the fast edit-test-debug
cycle makes this simple approach very effective.
Python is commonly used for developing websites and software, task
automation, data analysis, and data visualization. Since it’s relatively easy to
learn, Python has been adopted by many non-programmers, such as
accountants and scientists, for a variety of everyday tasks, like organizing
finances.
"Writing programs is a very creative and rewarding activity," says
University of Michigan and Coursera instructor Charles R Severance in his book
Python for Everybody. "You can write programs for many reasons, ranging from
making your living to solving a difficult data analysis problem to having fun to
helping someone else solve a problem". Some things include:
•Data analysis and machine learning
•Web development
•Automation or scripting
•Software testing and prototyping
•Everyday tasks
Data analysis and machine learning
Python has become a staple in data science, allowing data analysts and
other professionals to use the language to conduct complex statistical
calculations, create data visualisations, build machine learning algorithms,
manipulate and analyse data, and complete other data-related tasks.
Python can build various data visualisations, like line and bar graphs, pie
charts, histograms, and 3D plots. Python also has many libraries that enable
coders to write
programs for data analysis and machine learning more quickly and efficiently,
like TensorFlow and Keras.
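As a small illustration of the bar-graph style of visualisation mentioned above, the sketch below plots class counts with Matplotlib; the counts correspond to the potato leaf dataset summarized later in Table 3.1, and the variable names are illustrative only.

import matplotlib.pyplot as plt

# Class counts taken from the potato leaf dataset summarized in Table 3.1
classes = ['Early Blight', 'Late Blight', 'Healthy']
counts = [1628, 1414, 1020]

plt.bar(classes, counts)
plt.title('Number of leaf images per class')
plt.xlabel('Class')
plt.ylabel('Images')
plt.show()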
Web development
Python is often used to develop the back end of a website or application—the
parts that a user doesn’t see. Python’s role in web development includes sending
data to and from servers, processing data and communicating with databases,
routing URLs, and ensuring security. Python offers several frameworks for web
development. Commonly used ones include Django and Flask.
Some web development jobs that use Python include back-end engineers, full-
stack engineers, Python developers, software engineers, and DevOps engineers.
Automation or scripting
If you perform a task repeatedly, you can work more efficiently by automating it
with Python. Writing code used to build these automated processes is called
scripting. In the coding world, automation can be used to check for errors across
multiple files, convert files, execute simple math, and remove duplicates in data.
Relative beginners can even use Python to automate simple tasks on the
computer—such as renaming files, finding and downloading online content, or
sending emails or texts at desired intervals.
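A minimal sketch of the kind of file-renaming automation mentioned above; the folder name and the naming pattern are hypothetical placeholders.

import os

# Hypothetical folder of leaf images to rename with a consistent prefix
folder = 'leaf_images'

for i, name in enumerate(sorted(os.listdir(folder))):
    old_path = os.path.join(folder, name)
    if os.path.isfile(old_path):
        ext = os.path.splitext(name)[1]
        os.rename(old_path, os.path.join(folder, f'leaf_{i:04d}{ext}'))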
3.3 PYTHON PROGRAM
The Python program used in this project can be broken down step by step as follows:
Importing Libraries: The code begins by importing necessary libraries. `os` is
imported for operating system related functionalities, `matplotlib.pyplot` is
imported for plotting graphs, and various modules from `tensorflow.keras` are
imported for building and training deep learning models.
I. Setting Dataset Path and Image Parameters: The path to the dataset
containing images of plants is defined as `dataset_path`. Additionally, the image
size (`img_size`) and batch size (`batch_size`) for training are specified.
II. ImageDataGenerator Configuration: An `ImageDataGenerator` object
(`datagen`) is created with parameters for data augmentation such as rotation,
shift, shear, zoom, and horizontal flip. This generator is used to load and
preprocess images during training.
III. Data Loading and Preparation: Training and validation data generators
are created using the `flow_from_directory` method of the
`ImageDataGenerator`. These generators load images from the specified
directory, preprocess them, and generate batches of augmented images along
with their labels.
IV. Model Definition: MobileNetV2, a pre-trained convolutional neural
network, is used as the base model. A custom classification model is built on top
of MobileNetV2 using the Sequential API. Layers such as
GlobalAveragePooling2D, Dense, LeakyReLU, and Dropout are added to the
model for feature extraction and classification.
V. Model Compilation: The model is compiled with the Adamax optimizer
and categorical cross-entropy loss. Accuracy is chosen as the evaluation metric
for training the model.
VI. Model Training: The model is trained using the `fit` method with the
training and validation data generators. Early stopping and learning rate
reduction callbacks are applied to prevent overfitting and improve convergence
during training.
VII. Plotting: Training and validation accuracy as well as loss values are
plotted using Matplotlib to visualize the training progress.
VIII. Model Evaluation: After training, the model is evaluated on the validation
dataset to assess its performance. The test loss and accuracy are printed to
evaluate how well the model generalizes to unseen data.
IX. Saving the Model: Finally, the trained model is saved to a specified
directory using the `save` method for future use or deployment.
Overall, this code implements a deep learning pipeline for training a model to
classify images of plants into different categories of malfunctions using transfer
learning.
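Because the report describes the program in prose rather than listing it, the following is a minimal sketch of the pipeline outlined in steps I–IX above. It assumes the PlantVillage class-per-folder layout, a 224 × 224 input size, and a batch size of 32; the dataset path, the dense-layer width, the dropout rate, and the augmentation ranges are illustrative assumptions, not the project's verbatim code.

import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, LeakyReLU, Dropout
from tensorflow.keras.optimizers import Adamax
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# I. Dataset path and image parameters (the path is a placeholder)
dataset_path = '/content/drive/MyDrive/dataset/PlantVillage'
img_size = (224, 224)
batch_size = 32

# II. Data augmentation plus an 80/20 train/validation split
datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True,
    validation_split=0.2)

# III. Training and validation generators from the class-per-folder layout
train_gen = datagen.flow_from_directory(
    dataset_path, target_size=img_size, batch_size=batch_size,
    class_mode='categorical', subset='training')
val_gen = datagen.flow_from_directory(
    dataset_path, target_size=img_size, batch_size=batch_size,
    class_mode='categorical', subset='validation')

# IV. MobileNetV2 base (ImageNet weights, frozen) with a custom classifier head
base = MobileNetV2(input_shape=(224, 224, 3), include_top=False, weights='imagenet')
base.trainable = False
model = Sequential([
    base,
    GlobalAveragePooling2D(),
    Dense(256),
    LeakyReLU(alpha=0.1),
    Dropout(0.3),
    Dense(train_gen.num_classes, activation='softmax')])

# V. Compile with the Adamax optimizer and categorical cross-entropy loss
model.compile(optimizer=Adamax(learning_rate=1e-4),
              loss='categorical_crossentropy', metrics=['accuracy'])

# VI. Train with early stopping and learning-rate reduction callbacks
callbacks = [
    EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True),
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2)]
history = model.fit(train_gen, validation_data=val_gen, epochs=20, callbacks=callbacks)

# VII. Plot training and validation accuracy and loss
for key in ['accuracy', 'val_accuracy', 'loss', 'val_loss']:
    plt.plot(history.history[key], label=key)
plt.xlabel('Epoch')
plt.legend()
plt.show()

# VIII. Evaluate the trained model
loss, acc = model.evaluate(val_gen)
print(f'Validation loss: {loss:.4f}, accuracy: {acc:.4f}')

# IX. Save the model for later inference or deployment
model.save('plant_disease_mobilenetv2.h5')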
3.4 GOOGLE COLAB
To use Colaboratory, you must have a Google account.
On your first visit, you will see a Welcome to Colaboratory notebook with links to
video introductions and basic information on how to use Colab. The steps to run
the project's Python code in Google Colab are as follows:
1. Open Google Colab: Go to the Google Colab website and sign in with your
Google account.
2. Create a New Notebook: Click on "File" -> "New Notebook" to create a new
notebook.
3. Copy and Paste Code: Copy the provided Python code and paste it into a code
cell in the Colab notebook.
4. Mount Google Drive (Optional): If your dataset is stored in Google Drive,
you can mount your Google Drive by running the following code cell:
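The code cell referred to here is Colab's standard drive-mount snippet:

from google.colab import drive

# Mounts Google Drive at /content/drive so the dataset can be read from it
drive.mount('/content/drive')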
5. Upload Dataset (Optional): If your dataset is not in Google Drive, you can
upload it directly to Colab. Click on the folder icon on the left sidebar, then click
on the "Upload" button to upload your dataset.
6. Modify Dataset Path: Update the dataset_path variable in the code to point to
the location of your dataset in Colab. For example:
dataset_path= '/content/drive/MyDrive/dataset/PlantVillage'
7. Run Code Cells: Click on the play button (▶) next to each code cell to run it.
This will execute the code and display the results or plots.
8. Monitor Training: You can monitor the training progress by observing the output
in the code cells. The training progress, including accuracy and loss plots, will be
displayed as the model trains.
9. Evaluate Model: Once training is complete, the test accuracy of the model will be
printed. You can also evaluate the model further by running additional code cells.
10. Save Model (Optional): If you want to save the trained model, you can run
the code cell that saves the model and specify the path where you want to save it.
That's it! You've successfully run the provided Python code in Google Colab.
You can further customize the code and experiment with different parameters to
improve the model's performance.
3.5 Definition of plant diseases and pests
Plant diseases and pests are a kind of natural disaster that affects the
normal growth of plants, and can even cause plant death, throughout the whole
growth process from seed development to seedling and on to mature growth. In
machine vision tasks, plant diseases and pests tend to be concepts drawn from
human experience rather than purely mathematical definitions.
3.5.1 Definition of plant diseases and pests detection
Compared with the well-defined classification, detection and segmentation tasks
in computer vision, the requirements of plant diseases and pests detection are
very general. In fact, its requirements can be divided into three different levels:
what, where and how. In the first stage, “what” corresponds to the classification
task in computer vision: as shown in the tomato diseases and pests images
(Fig. 3.3), the label of the category to which the image belongs is given. The task
in this stage can be called classification, and it only gives the category information
of the image. In the second stage, “where” corresponds to the localization task in
computer vision, and the positioning in this stage is detection in the rigorous
sense. This stage not only determines what types of diseases and pests exist in
the image, but also gives their specific locations; for example, the plaque area of
gray mold is marked with a rectangular box. In the third stage, “how” corresponds
to the segmentation task in computer vision: the lesions of gray mold are
separated from the background pixel by pixel, and a series of information such as
the length, area and location of the lesions can be further obtained, which can
assist the higher-level severity evaluation of plant diseases and pests.
Classification describes the image globally through feature expression and then
determines whether a certain kind of object is present in the image by means of a
classification operation, while object detection focuses on local description, that
is, answering what object exists at what position in an image. So, in addition to
feature expression, object structure is the most obvious feature that distinguishes
object detection from object classification.
That is, feature expression is the main research line of object classification, while
structure learning is the research focus of object detection. Although the functional
requirements and objectives of the three stages of plant diseases and pests
detection are different, in fact the three stages are mutually inclusive and can be
converted into one another.
For example, the “where” in the second stage contains the process of “what” in
the first stage, and the “how” in the third stage can finish the task of “where” in
the second stage. Also, the “what” in the first stage can achieve the goal of the
second and the third stages through some methods. Therefore, the problem in
this study is collectively referred to as plant diseases and pests detection, by
convention, in the following text, and the terminology is differentiated only when
different network structures and functions are adopted.
FIG – 3.3: Tomato Diseases And Pests Images
There are a number of diseases to be aware of that could affect your potatoes.
This page provides a gallery of visual symptoms to help you identify what
disease your crop may have developed.
3.5.2 Early Blight and Late Blight of Potato
Early Blight
Early blight and late blight, two serious diseases of potato, are widely distributed.
Both are found everywhere potatoes are grown. The terms “early” and “late”
refer to
the relative time of their appearance in the field, although both diseases can occur
at the same time.
Early blight of potato is caused by the fungus Alternaria solani, which can
cause disease in potato, tomato, other members of the potato family, and some
mustards. This disease, also known as target spot, rarely affects young,
vigorously growing plants. It is found on older leaves first. Early blight is favored by
warm temperatures and high humidity.
Symptoms - Spots begin as small, dark, dry, papery flecks, which grow to become
brown-black, circular-to-oval areas. The spots are often bordered by veins that
make them angular. The spots usually have a target appearance, caused by
concentric rings of raised and depressed dead tissue. A yellowish or greenish-
yellow ring is often seen bordering the growing spots. As the spots become very
large, they often cause the entire leaf to become yellow and die. This is especially
true on the lower leaves, where spots usually occur first and can be very
abundant. The dead leaves do not usually fall off. Dark brown to black spots can
occur on stems.
Tubers are affected, as well, with dark, circular to irregular spots. The edges of
the spots are often raised and purple to dark metallic gray in color. When the tuber
is sliced open, the flesh under the spots is usually brown, dry, and leathery or
corky in texture. As the disease advances, the potato flesh often becomes water
soaked and yellow to greenish yellow. Early blight spots are less likely to become
rotted by secondary organisms than the other tuber rots.
Prevention - Varieties resistant to this disease are available. In general, late
maturing varieties are more resistant than the earlier maturing varieties. Keep
plants healthy; stressed plants are more predisposed to early blight. Avoid
overhead irrigation. Do not dig tubers until they are fully mature in order to prevent
damage. Do not use a field for potatoes that was used for potatoes or tomatoes
the previous year. Keep this year’s field at least 225 to 450 yards away from last
year’s field. Surround the field with wheat to keep wind-blown spores from
entering. Use
adequate nitrogen levels and low phosphorus levels to reduce disease severity.
See current recommendations for chemical control measures.
Late Blight
Late blight of potato is a serious disease caused by Phytophthora infestans. It
affects potato, tomato and, occasionally, eggplant and other members of the
potato family. Late blight is the worst potato disease. It was first reported in the
1830s in Europe and in the US. It is famous for being the cause of the 1840s Irish
Potato Famine, when a million people starved and a million and a half people
emigrated. Late blight continued to be a devastating problem until the 1880s when
the first fungicide was discovered. In recent years, it has reemerged as a problem.
It is favored by cool, moist weather and can kill plants within two weeks if
conditions are right.
Symptoms - Leaf spots begin as small, pale to dark green, irregularly shaped
spots. The spots often have pale green to yellow rings surrounding them. The
spots are not bordered by veins but can grow across them. In cool, moist weather,
the spots grow rapidly into large brown to purplish black areas. The disease may
kill entire leaflets or grow down the petioles and into the stem, killing the plant
above it. When the weather is moist, a white fungal growth appears on the edges
of the dead areas, usually on the undersides of the leaves. In the field, plants
often give off a distinctive fetid or decaying odor.
On susceptible potato varieties, the tubers can become infected. Small to large,
slightly depressed areas of brown to purplish skin can be seen on the outside of
the tuber. When the tuber is cut open, there is a tan-brown, dry, granular rot,
which extends ½” to
¾” into the tuber. The border of this area is indistinct. If potatoes are stored under
warm or humid conditions, the rot will continue to progress. Often secondary rot
organisms set in and completely destroy the tubers.
Disease Identification- White, fluffy fungal growth is present on the bottoms of
leaves in moist weather. Leaf spots are not bordered by veins.
Prevention - Use disease-free seed potatoes. Keep cull/compost piles away from
potato growing areas. Destroy any volunteer potato plants. Keep tubers
covered with
soil throughout the season to prevent tuber infection. Remove infected
tubers before storing to prevent the spread of disease in storage. Kill vines
completely before harvest to avoid inoculation of the tubers during harvest.
Resistant varieties are available, although some fungicides must still be applied to
resistant cultivars. See current recommendations for chemical control measures.
Table – 3.1: Summary of the PLD (Potato Leaf Dataset)

Class Labels      Samples
Early Blight      1628
Late Blight       1414
Healthy           1020
Total             4062
The tomato is cultivated in many different countries and regions. The
United Nations Food and Agriculture Organization (FAO) estimates that in 2021,
the world produced 370.750 kilotons of tomatoes.
In 2021, Turkey produced 32,600 kilotons of tomatoes, according to the
Turkish Statistical Institute. Damage from pests and diseases has an effect on
the yields of tomatoes. In order to protect crops from diseases and pests, the
agricultural industry uses a wide range of pesticides and expensive methods.
Using these chemical methods on a large scale has negative effects on
biodiversity, human health, and agricultural productivity.
The cost of production increases as a result of these methods as well.
Scientists have devoted a lot of time and energy to studying plant diseases,
primarily examining their biological characteristics. Studies conducted with
tomato and potato varieties provide an example of how disease-prone plants
can be. The problem of plant diseases has global repercussions because of its
effect on food security.
Fig-3.4: Multi-Disease Recognition In Tomato Plant
Plant diseases have a significant negative effect on farmers regardless of
location, media, or technology. Early disease detection in the modern era can be
challenging and requires careful planning. Image processing is commonly used
today for agricultural applications, as evidenced by photographs taken via remote
sensing or other field cameras.
Image processing is used in a wide variety of plant-related tasks, including
species identification, fruit grading, disease diagnosis, severity measurement,
and symptom description. Recently, researchers have tried using deep learning
for detection.
Using deep learning, Mohanty et al. were able to determine which plant
diseases were present by analyzing the leaves. Tomatoes are used in a wide
variety of cuisines for their flavor and nutritional value. The thin skins, tender
flesh, substantial sugar content, and high calories all contribute to their
popularity as one of the most commonly produced fruits in the world.
Black tomatoes, Momotaro tomatoes, golden tomatoes, and cherry tomatoes are
some of the most widely grown types of tomatoes in Taiwan, where about fifty
different kinds of tomatoes are produced.
The primary growing regions for tomatoes in Taiwan are located in the
counties of Chiayi, Kaohsiung, Tainan, Yunlin, and Nantou, covering a total area
of more than 5,000 acres of land. The average value of tomatoes produced is
close to TWD 30 billion, and it continues to rise.
Recognizing plant diseases is crucial in agriculture because it is the
fundamental step in preventing the spread of infection and the final step in
ensuring the quality of a harvested crop. Tomatoes are grown in many parts of the
world because they are both a nutritious food source and a lucrative crop for
farmers.
Diseases that manifest themselves on tomato plants' leaves reduce both
quality and yield. Multi-disease recognition in the tomato plant (Fig. 3.4) shows
that mosaic virus, yellow leaf curl virus, leaf mold, late blight, early blight, bacterial
spot, and Septoria leaf spot can damage tomato plants and their leaves.
Fig-3.5: Some Sample Images From Tomato Diseases And Healthy
Leaves (A) Bacterial Spot (B) Early Blight (C) Late Blight (D) Leaf Mold
(E) Mosaic Virus (F) Septoria Leaf Spot (G) Yellow Curl Virus (H)
Healthy Leaf.
Different diseases require different image-processing techniques and feature
sets, and it is up to the researcher to determine which ones will be used in their
investigation. Disease in plants occurs when a pathogen (a virus, bacteria, or
fungus) infects a plant and renders it unable to grow. There is a risk that the
plant's leaves will die or turn color as a result. Viruses, nematodes, fungi, and
bacteria are all represented in potential disease culprits.
Bacterial Spot: Capsicum leaves are mostly destroyed by bacterial spot.
Tomato plants often die from bacterial spot infection. Seeds, agricultural waste,
and host plants could spread the bacteria. Bacteria can spread during times of
heavy precipitation and wind power, or when water is poured from on high, as in
irrigation. Insects, animals, and machinery that pass through the crop could also
spread it.
Early blight: One of the most common tomato diseases, early blight occurs
year-round in Pakistan. The plant may begin to yellow at the roots, and the
brown, spherical spots may be as large as half an inch in diameter. It wreaks
havoc on the plant, reducing yields by damaging the leaves, fruit, and stems.
Late blight: Tomato leaves, stems, fruit, and tubers are susceptible to late
blight. In damp, cold conditions, fungus blooms spread the disease quickly.
Leaf mold: Tomato leaves are susceptible to leaf mold, especially in
greenhouses. It's easy to confuse the signs of this disease with those of
something else, like grey mold or tomato blight, when leaves are affected. High
humidity (over 85%) exacerbates disease symptoms.
Tomato mosaic virus: Weeds, infected seeds, and insects can spread this
plant pathogenic virus. Plants, on average, develop a fuller, lighter form. Leaves
can curl. Mosaic symptoms can cause fruit deformities.
Septoria leaf spot: Septoria leaf spot is caused by the fungus Septoria
lycopersici. Symptoms of this fungus usually appear on mature, lower stems and
leaves when tomatoes develop fruit. Petioles, stems, and the calyx can also
show signs, but the fruit usually does not.
Tomato Yellow Leaf Curl Virus: Tomato Yellow Leaf Curl Virus is transmitted by
whiteflies, often in early transplants. Plants with the illness take up to three weeks
to show symptoms. When symptoms appear, you will see changes in the leaves,
flower buds, and growth.
It is important to develop a system for quickly and cheaply diagnosing plant
diseases. Leaf diseases, detectable with image processing tools, are evident on
the plant's leaves. This work developed a MATLAB-based program to
automatically, cheaply, and accurately detect and classify leaf diseases. Web
images of damaged tomato leaves can be assessed using an ANN-based
clustering technique. Because of the advancement of computer-based software,
farmers may be able to boost output while saving time and money compared to
conventional methods of disease diagnosis. This study's practical application can
boost tomato production while saving farmers time and money.
The techniques described provide workable and reliable solutions for
tomato detection, but more effort is needed to improve their performance in tough
greenhouse conditions. The current group proposed a CNN-based strawberry
disease classification method, and the current study builds on this by classifying
tomatoes on the vine into three categories: ripe, immature, and damaged, using
the YOLOv5 medium model and four CNN classification models (YOLOv5m,
ResNet50, ResNet-101, and EfficientNet-B0).
There is a lot going on in the background of a picture of a plant leaf, so it's
not surprising that a single colour component can only tell you so much about the
leaf's hues. Because of this, the results of the feature extraction process are less
reliable. More is better than less when it comes to the number of colour elements
used. Using the Plant Village dataset's tomato leaf images as inputs, the
aforementioned study trained a CNN model on their RGB components. As a
classifier, the Learning Vector Quantization (LVQ) method was chosen
because of its topology and the adaptable model it uses. Knowledge of pests
and diseases, the need for specific
environmental conditions during the growing season, and corrective
measures during tomato planting are all largely determined by the closeness of
peasant communities and the experience gained from previous tomato plantings.
Commercial tomato cultivation suffers from these logistical issues.
Previously published investigations have several limitations that prevent it
from being fully applicable to the task of diagnosing diseases in tomato leaves
from plants, including but not confined to the following:
• It may be difficult for small-scale Pakistani tomato producers to identify
and track diseases without ready access to resources such as
advanced technology and specialized knowledge.
• It can be difficult for algorithms using computer vision to tell the difference
between healthy and diseased tomato leaves because some leaf
diseases cause symptoms that are otherwise indistinguishable.
• One issue is that existing datasets do not contain enough photographs of
real-world settings that have been meticulously annotated for machine
learning purposes. Therefore, training is performed with images captured
in a stable setting.
• Existing proposed algorithms are limited in their ability to recognize
multiple diseases within a single image or multiple occurrences of the
same disease within a single image.
The following is a brief overview of the major findings and conclusions from
this
study.
• A robust framework is proposed for recognizing multiple diseases on
tomato plant leaves, which can be used as a preliminary indicator of plant
health.
• Leaf samples dataset from tomato plants were collected from
university greenhouses.
• Cropping, sorting and labelling the images into categories facilitates
analysis and yields more accurate results for the training process.
• This study proposed a target detection model based on the improved
YOLOv7 to accurately detect and categorize tomato leaves in the field.
• To improve the model's feature extraction capabilities, we first
incorporate the detection mechanisms SimAM and DAiAM into the
framework of the baseline YOLOv7 network.
• To reduce the amount of information lost during the down-sampling
process, the max-pooling convolution (MPConv) structure is then
improved.
• Then, the image is segmented using the SIFT technique for
classification, and the key regions are extracted for use in
calculating feature values.
• Finally, we compare our study to previous research to show how useful
the proposed work is and to provide backing for the concept.
The rest of this paper is organized as follows. Section 2 reviews the literature on
tomato pathogen detection. Section 3 describes the analysis and the suggested
approach for disease identification, including a mathematical model. The
experiments and results are discussed in Section 4. The conclusion and some
recommendations for future work are given in Section 5.
3.6 MobileNetV2 Architecture
To improve accuracy, MobileNetV1 is modelled after the classic VGG
architecture. This involves building a network by stacking convolution layers.
However, if there are too many convolution layers in a stack, gradient vanishing
becomes an issue.
ResNet's residual block facilitates interlayer communication by, among other
things, allowing for feature reuse during forward propagation and reducing
gradient vanishing during back propagation. Therefore, MobileNetV2 utilises
ResNet's residual structure in addition to the depth separable convolution that it
inherited from MobileNetV1.
MobileNetV2 is a neural network architecture designed for mobile and
edge devices, emphasizing lightweight and efficient models without
compromising too much on accuracy. It was proposed by Google researchers in
the paper titled "MobileNetV2: Inverted Residuals and Linear Bottlenecks,"
which was presented at the Computer Vision and Pattern Recognition (CVPR)
conference in 2018.
While MobileNetV2 was not specifically designed for plant disease
diagnosis, its efficiency makes it suitable for deployment on resource-
constrained devices, such as smartphones or edge devices, which can be
valuable in the context of visual diagnosis of plant diseases.
When applying MobileNetV2 to the visual diagnosis of plant diseases, the
model would typically be fine-tuned on a dataset containing images of healthy
and diseased plants. Transfer learning is commonly employed, where the pre-
trained MobileNetV2 model, which might have been trained on a large dataset
like ImageNet, is adapted to the specific task of plant disease classification.
MobileNetV2 builds upon the original MobileNet architecture with several
improvements, aiming to provide better performance while maintaining the
efficiency and low computational cost suitable for mobile and edge devices.
The implementation details and fine-tuning process would depend on the
specific requirements of your plant disease diagnosis application, including the
dataset characteristics and the resources available for training and deployment.
Fig-3.6: Mobilenetv2 Architecture
3.6.1 Inverted Residual Linear Bottleneck
The network now uses a linear bottleneck and a residual block with its
inversion being the most notable difference between MobileNetV1 and
MobileNetV2. As can be seen in mobilenetv2 architecture, MobileNetV2 makes
use of an inverted residual and a linear bottleneck within a depth separable
convolution block. The depth wise convolutional layer's down sampling
parameter is tweaked, and a 1x1 convolution layer is stacked on top of the depth
wise convolutional layer. As an alternative to a nonlinear activation function, a
linear activation is employed.
The network consists of 19 layers; the middle layers are responsible for
feature extraction and the final layers for classification. MobileNetV1's primary
structure, depth-wise separable convolution, has the effect of decreasing the
network parameters and increasing the network speed. Although depth-wise
separable convolution produces the same output dimension as regular
convolution, it splits regular convolution into a 3 × 3 depth-wise convolution and
a 1 × 1 point-wise convolution. Because depth-wise convolution filters each input
channel separately, it can drastically cut down on computation time and the
number of parameters needed to describe an image.
However, the output of each channel is not related to the other input
channels because of this convolution method's poor channel-to-channel
information
transmission. By applying a point-wise convolution, a special type of 1 × 1
convolution, to the result of a depth-wise convolution, a linear combination
across channels can be generated. It is typical practice to use point-wise
convolution to adjust the feature dimension of the output channel, as
demonstrated in the MobileNetV2 architecture (Fig. 3.6). When compared to
depth-wise and group convolution, point-wise convolution is analogous to mixing
information between channels. This can efficiently handle the issue of poor flow
of information between channels, which is caused by convolution methods such
as depth-wise and group convolution.
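To make the depthwise-plus-pointwise factorization described above concrete, the sketch below builds a single depthwise separable block in Keras. It is an illustrative stand-alone block under these assumptions, not MobileNetV2's exact inverted-residual implementation.

import tensorflow as tf
from tensorflow.keras import layers

def depthwise_separable_block(inputs, filters):
    # 3 x 3 depth-wise convolution: each input channel is filtered separately
    x = layers.DepthwiseConv2D(kernel_size=3, padding='same')(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    # 1 x 1 point-wise convolution: mixes information across channels
    x = layers.Conv2D(filters, kernel_size=1, padding='same')(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

# Example: apply one block to a 224 x 224 RGB input
inputs = tf.keras.Input(shape=(224, 224, 3))
outputs = depthwise_separable_block(inputs, filters=32)
tf.keras.Model(inputs, outputs).summary()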
3.7 EXPERIMENTAL ANALYSIS
To conduct an experimental analysis for "Machine Vision for Plant
Malfunction Identification," several crucial steps must be undertaken. Firstly, it's
essential to establish clear metrics for evaluating the system's performance,
such as accuracy, precision, recall, and F1 score. These metrics will serve as
benchmarks for assessing the effectiveness of the machine vision system.
Next, the selection of an appropriate dataset is paramount. The dataset
should encompass a diverse range of images or videos depicting various plant
malfunctions encountered in real-world scenarios. This diversity ensures that the
machine vision system is robust and capable of accurately identifying different
types of malfunctions.
Once the dataset is chosen, pre-processing steps are applied to prepare
the data for training. This may involve resizing images, normalization,
augmentation, and other techniques to enhance the quality and diversity of the
dataset.
For the model selection phase, a suitable machine vision model must be
chosen. Options include pre-trained models like ResNet, DenseNet, or
AlexNet, or the development of a custom model using convolutional neural
networks (CNNs). Transfer learning techniques can be applied to leverage pre-
trained models and fine-tune them for the specific task of plant malfunction
identification.
Training the chosen model on the pre-processed dataset is the next step.
Techniques like transfer learning are employed to adapt the model.
During training, the model learns to recognize patterns and features
indicative of plant malfunctions.
Following training, the model is evaluated on a separate validation or test
dataset. The predefined metrics are calculated to assess the model's
performance. This evaluation provides insights into the model's accuracy and
effectiveness in identifying plant malfunctions.
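As a sketch of how the predefined metrics can be computed, the snippet below assumes the model and validation generator from the sketch in Section 3.3, with the generator rebuilt using shuffle=False so predictions align with the ground-truth labels; it is illustrative rather than the project's exact evaluation code.

import numpy as np
from sklearn.metrics import accuracy_score, classification_report

# Probabilities for every validation image; val_gen must be built with shuffle=False
probs = model.predict(val_gen)
y_pred = np.argmax(probs, axis=1)
y_true = val_gen.classes  # ground-truth labels, aligned only when shuffle=False

print('Accuracy:', accuracy_score(y_true, y_pred))
# Per-class precision, recall and F1 score
print(classification_report(y_true, y_pred, target_names=list(val_gen.class_indices)))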
The results of the evaluation are then analyzed to identify the strengths and
weaknesses of the model. Areas where the model performs well and areas for
improvement are identified based on the analysis.
Optionally, the performance of different models or variations of the same
model may be compared to determine the most effective approach. This
comparison helps in selecting the best-performing model for plant malfunction
identification.
Based on the analysis and comparison, iterative refinements are made to
the model and experimental design to improve performance and address any
shortcomings.
Thorough documentation of the experimental setup, methodology, results,
and findings is crucial for future reference and communication of the research to
others.
Finally, validation of the machine vision system's performance in real-world
scenarios or in collaboration with domain experts ensures its practical utility and
effectiveness in identifying plant malfunctions. The data samples are shown in
Fig. 3.7.
Fig-3.7: Data Samples
Both the training accuracy and the validation accuracy quickly converge
toward a high asymptote. MobileNetV2 also achieves higher levels of accuracy
with less effort and requires less time to train than MobileNetV1 did. Although
the proposed model is slightly less accurate than InceptionV3, it allows for faster
training and maintains a fair balance between the two competing priorities of
speed and precision.
When comparing MobileNetV1 and MobileNetV2 for "Machine Vision for Plant
Malfunction Identification" based on accuracy, MobileNetV2 generally outperforms
MobileNetV1. MobileNetV2 introduces architectural enhancements that lead to
improved accuracy while maintaining computational efficiency.
MobileNetV1, with its depthwise separable convolutions, offers good
performance in tasks like image classification and object detection. However, it
may suffer from issues like feature redundancy and limited representational
capacity, which can affect accuracy, especially in complex scenarios.
In contrast, MobileNetV2 addresses these limitations by introducing inverted
residual blocks with linear bottlenecks and shortcut connections. These
architectural
improvements result in better feature representation and more effective learning,
leading to higher accuracy compared to MobileNetV1.
In practical terms, MobileNetV2 is preferred when accuracy is a critical
factor for plant malfunction identification tasks. Its enhanced architecture allows
for more precise detection and classification of malfunctions, improving overall
performance.
However, it's essential to consider the trade-offs between accuracy and
computational efficiency. While MobileNetV2 offers superior accuracy, it may
require slightly more computational resources compared to MobileNetV1.
Therefore, the choice between the two models should consider factors such as
available computational resources, performance requirements, and deployment
constraints.
Overall, MobileNetV2's advancements in accuracy make it a compelling
choice for plant malfunction identification tasks where precision and reliability are
paramount. Conducting experiments with both models on your dataset will provide
empirical evidence to determine which one better meets the accuracy
requirements of your project.
In comparison to MobileNetV1, MobileNetV2 has fewer parameters
overall. Despite this, it is still the more effective algorithm, even though more
trainable parameters are involved in both rounds of the training process. This
demonstrates that MobileNetV2 has a structure that is more efficient than
MobileNetV1's.
CHAPTER – 4
RESULTS & ANALYSIS
4.1 SIMULATION RESULTS OF PROPOSED MODEL
The simulation results of the proposed model indicate its performance in
identifying plant malfunctions based on deep learning analysis. The output shows
the model accuracy and model loss with ROC curve and AUC. The learning
curve illustrates the model's training and validation performance over epochs. It
helps in assessing whether the model is overfitting or underfitting by comparing
the training and validation performance. A large gap between the two curves may
indicate overfitting, while convergence of the curves at a high accuracy level
suggests good generalization.
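The ROC curves and AUC values mentioned above can be produced along the lines of the sketch below, again assuming the fitted model and a validation generator created with shuffle=False; it is a hedged sketch, not the project's exact plotting code.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

probs = model.predict(val_gen)                      # predicted class probabilities
y_true = label_binarize(val_gen.classes,            # one-hot ground truth
                        classes=range(probs.shape[1]))

# One ROC curve (and its AUC) per class
for i, name in enumerate(val_gen.class_indices):
    fpr, tpr, _ = roc_curve(y_true[:, i], probs[:, i])
    plt.plot(fpr, tpr, label=f'{name} (AUC = {auc(fpr, tpr):.2f})')

plt.plot([0, 1], [0, 1], linestyle='--', color='grey')
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.legend(fontsize=7)
plt.show()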
The success of achieving 96.18% accuracy opens up opportunities for
further model improvement, optimization, or extension to tackle more complex
tasks or datasets. It provides a solid foundation for future research and
development efforts in the field.
Fig-4.1: Output
4.2 SYNTHESIS OF PROPOSED MODEL
The synthesis of the proposed model, with an accuracy of 96.18%,
underscores its efficacy, reliability, and potential to improve decision-making
processes across diverse industries. The model sets a strong benchmark and
paves the way for further advancements in data-driven solutions.
4.2.1 PREDICTED RESULTS
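The predicted labels shown in the figures below would typically come from an inference routine along the lines of the sketch here; the saved-model filename and the test-image path are placeholders, and the class names are assumed to come from the training generator defined in Section 3.3.

import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

model = load_model('plant_disease_mobilenetv2.h5')   # file name is a placeholder
class_names = list(train_gen.class_indices)          # class order learned during training

img = image.load_img('sample_leaf.jpg', target_size=(224, 224))  # placeholder path
arr = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)    # same rescaling as training

probs = model.predict(arr)[0]
print('Predicted class:', class_names[int(np.argmax(probs))])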
Fig-4.2: Tomato Spider mites Two spotted spider mite
Fig-4.3: Tomato Tomato mosaic virus
Fig-4.4: Pepper bell Bacterial spot
Fig-4.5: Tomato Late blight
Fig-4.6: Tomato Target Spot
CHAPTER – 5
SUMMARY, CONCLUSION & FUTURE SCOPE
5.1 SUMMARY
The proposed plant disease detection system using CNN achieves an impressive
accuracy of 96.18%, marking a significant milestone in the field of agricultural
technology. Leveraging advanced deep learning techniques and transfer learning
with MobileNetV2 as the base model, the architecture demonstrates exceptional
capabilities in accurately classifying plant images and detecting various types of
malfunctions, including blight, leaf spots, and viruses.
Through rigorous training on a diverse dataset comprising images of both
healthy plants and those affected by malfunctions, the model learns intricate
patterns and features indicative of different plant conditions. Data augmentation
techniques further enhance the model's robustness and generalization ability,
ensuring reliable performance across different environmental conditions and
plant species.
The model's high accuracy translates into real-world impact across multiple
industries, offering practical applications in agriculture, healthcare, cybersecurity,
finance, and environmental monitoring. By enabling early detection of plant
diseases, the model empowers farmers to take proactive measures, optimize
resource allocation, and improve crop yields, thereby contributing to global food
security and sustainable agricultural practices.
In summary, the proposed plant disease detection model represents a
significant advancement in leveraging artificial intelligence for addressing critical
challenges in agriculture and related fields. Its high accuracy, coupled with its
broad applicability and potential impact, underscores the transformative potential
of data-driven technologies in driving positive change and innovation across
various domains.
5.2 CONCLUSION
In conclusion, the development and successful deployment of the plant
disease detection system using CNN signify a remarkable achievement in
leveraging cutting-edge technology to address critical challenges in agriculture
and beyond. With an impressive accuracy of 96.18%, the model demonstrates
exceptional capabilities in accurately classifying various plant conditions,
including diseases, pests, and other malfunctions.
Through the utilization of advanced deep learning techniques, transfer
learning with MobileNetV2 architecture, and extensive data augmentation, the
model has been trained to recognize intricate patterns and features indicative of
different plant states. This level of accuracy and reliability empowers farmers and
agricultural stakeholders to make informed decisions, take proactive measures,
and optimize resource allocation to enhance crop yields and mitigate losses.
Moreover, the model's applicability extends beyond agriculture, with potential use
cases in healthcare, cybersecurity, finance, and environmental monitoring. By
providing valuable insights and predictive capabilities, it has the potential to drive
innovation, improve decision-making processes, and contribute to positive
societal outcomes across various industries.
Overall, the successful development of the plant disease detection model
underscores the transformative potential of artificial intelligence and data-driven
technologies in addressing complex challenges and driving sustainable solutions.
With further research, refinement, and integration into existing systems, the
model holds promise for revolutionizing agricultural practices, enhancing food
security, and fostering sustainable development worldwide.
5.3 FUTURE SCOPE
Looking into the future, the plant disease detection system using CNN
holds immense potential for further advancements and applications in agriculture.
One significant aspect of its future scope lies in continuous refinement and
enhancement of the model's capabilities. Researchers and developers can
explore avenues to improve the accuracy, efficiency, and robustness of the
model by incorporating new data sources, refining algorithms, and optimizing
model architectures.
Additionally, there is a growing need to adapt the model to address
emerging challenges and trends in agriculture. This includes extending its
applicability to new crop varieties, regions, and environmental conditions. By
expanding the scope of the model, it can cater to a broader range of agricultural
contexts and provide tailored solutions to diverse farming practices and
requirements.
Furthermore, integrating the model with advanced sensing technologies, such as
drones, satellites, and IoT devices, presents opportunities to enhance data
collection and analysis capabilities. By leveraging real-time data streams and
multi-modal inputs, the model can offer more comprehensive insights into crop
health, productivity, and sustainability.
Collaboration and knowledge-sharing among stakeholders, including
researchers, farmers, agronomists, and technology providers, will be essential for
driving innovation and adoption of the model. Collaborative initiatives can
facilitate the exchange of best practices, data sharing, and co-development of
solutions tailored to specific agricultural challenges.
Moreover, the future scope of the model extends beyond crop monitoring
and diagnosis to encompass broader applications in agricultural decision
support, precision farming, and agri-tech innovation. As advancements in artificial
intelligence, deep learning, and sensor technologies continue to evolve, the
model can play a pivotal role in shaping the future of agriculture, driving
efficiency, productivity, and sustainability across the entire food value chain.
In conclusion, the future of the plant disease detection model for plant
malfunction identification is characterized by ongoing innovation, collaboration,
and adaptation to meet the evolving needs of agriculture. By harnessing the
power of cutting-edge technologies and interdisciplinary collaboration, the model
has the potential to revolutionize crop management practices, empower farmers,
and contribute to a more resilient and sustainable agricultural future.
REFERENCES
[1]. Q. Dongyu, T. Ghebreyesus, and R. Azevedo, ‘‘Mitigating impacts of COVID-
19 on food trade and markets,’’ in Food and Agriculture organization of the
United Nations. Italy: Press Release, Sep. 2020.
[2]. The Impact of Disasters and Crises on Agriculture and Food Security, FAO,
Rome, Italy, 2018.
[3]. New Standards to Curb the Global Spread of Plant Pests and Diseases, Web
Page of the Food and Agriculture Organization of the United Nations, FAO,
Rome, Italy, 2019.
[4]. K.-H. Kim, E. Kabir, and S. A. Jahan, ‘‘Exposure to pesticides and the
associated human health effects,’’ Sci. Total Environ., vol. 575, pp. 525–535,
Jan. 2017.
[5]. D. Laborde, W. Martin, J. Swinnen, and R. Vos, ‘‘COVID-19 risks to global
food security,’’ Science, vol. 369, no. 6503, pp. 500–502, Jul. 2020.
[6]. L. Keeling, H. Tunón, G. O. Antillón, C. Berg, M. Jones, L. Stuardo, J.
Swanson, A. Wallenbeck, C. Winckler, and H. Blokhuis, ‘‘Animal welfare and
the United Nations sustainable development goals,’’ Frontiers Veterinary
Sci., vol. 6, p. 336, Jan. 2019.
[7]. M. Khan, ‘‘Integration of selected novel pesticides with Trichogramma chilonis
(Hymenoptera: Trichogrammatidae) for management of pests in cotton,’’ J.
Agricult. Sci. Technol., vol. 21, no. 4, pp. 873–882, 2019.
[8]. S. Savary, A. Ficke, J.-N. Aubertot, and C. Hollier, ‘‘Crop losses due to
diseases and their implications for global food production losses and food
security,’’ Food Secur., vol. 4, no. 4, pp. 519–537, Dec. 2012.
[9]. F. Ren, W. Liu, and G. Wu, ‘‘Feature reuse residual networks for insect pest
recognition,’’ IEEE Access, vol. 7, pp. 122758–122768, 2019.
[10]. E. A. Heinrichs and R. Muniappan, ‘‘Integrated pest management for
tropical crops: Soyabeans,’’ CABI Rev., vol. 2018, pp. 1–44, Jan. 2018.
[11]. M. Margni, D. Rossier, P. Crettaz, and O. Jolliet, ‘‘Life cycle impact
assessment of pesticides on human health and ecosystems,’’
Agricult., Ecosyst. Environ., vol. 93, nos. 1–3, pp. 379–392, Dec. 2002.
[12]. A. Sabarwal, K. Kumar, and R. P. Singh, ‘‘Hazardous effects of chemical
pesticides on human health–cancer and other associated disorders,’’
Environ. Toxicol. Pharmacol., vol. 63, pp. 103–114, Oct. 2018.
[13]. S. Savary, L. Willocquet, S. J. Pethybridge, P. Esker, N. McRoberts, and
A. Nelson, ‘‘The global burden of pathogens and pests on major food crops,’’
Nature Ecol. Evol., vol. 3, no. 3, pp. 430–439, Feb. 2019.
[14]. P. Udmale, I. Pal, S. Szabo, M. Pramanik, and A. Large, ‘‘Global food
security in the context of COVID-19: A scenario-based exploratory
analysis,’’ Prog. Disaster Sci., vol. 7, Oct. 2020, Art. no. 100120.
[15]. S. P. Mohanty, D. P. Hughes, and M. Salathé, ‘‘Using deep learning
for image-based plant disease detection,’’ Frontiers Plant Sci., vol. 7, p. 1419,
Sep. 2016.
[16]. D. Shah, V. Trivedi, V. Sheth, A. Shah, and U. Chauhan, ‘‘ResTS:
Residual deep interpretable architecture for plant disease detection,’’ Inf.
Process. Agricult., vol. 9, no. 2, pp. 212–223, Jun. 2022.
[17]. A. Picon, M. Seitz, A. Alvarez-Gila, P. Mohnke, A. Ortiz-Barredo, and J.
Echazarra, ‘‘Crop conditional convolutional neural networks for massive
multi-crop plant disease classification over cell phone acquired images taken
on real field conditions,’’ Comput. Electron. Agricult., vol. 167, Dec. 2019,
Art. no. 105093.
[18]. S. Huang, G. Zhou, M. He, A. Chen, W. Zhang, and Y. Hu, ‘‘Detection of
peach disease image based on asymptotic non-local means and PCNN-
IPELM,’’ IEEE Access, vol. 8, pp. 136421–136433, 2020.
[19]. J. G. A. Barbedo, ‘‘Plant disease identification from individual lesions and
spots using deep learning,’’ Biosystems Eng., vol. 180, pp. 96–107, Apr.
2019.
[20]. J. Chen, J. Chen, D. Zhang, Y. Sun, and Y. A. Nanehkaran, ‘‘Using deep
transfer learning for image-based plant disease identification,’’ Comput.
Electron. Agricult., vol. 173, Jun. 2020, Art. no. 105393.
MOBILENETV2 FOR PLANT DISEASE
DETECTION : A SCALABLE DEEP
LEARNING FRAMEWORK
VINEELA THONDURI 1, BODIMALLA TEJA VARDHAN REDDY 2, GUDAPATI
CHITRAHAS BALAJI 3, GANNAMANENI SASANK 4, DESAVATH YASWANTH NAIK 5
1 Assistant Professor, Dept of Electronics and Communication Engineering, Vasireddy Venkatadri
Institute of Technology, Nambur, Andhra Pradesh, India
2-5 UG Student, Dept of Electronics and Communication Engineering, Vasireddy Venkatadri Institute of
Technology, Nambur, Andhra Pradesh, India
ABSTRACT
In the Plant Disease Detection using CNN project, early and precise detection of plant diseases is treated as vital for
reducing crop loss and maintaining agricultural sustainability. This research suggests a deep learning-based method
applying convolutional neural networks (CNNs) for automatic plant disease detection. The method adopts a systematic
workflow consisting of data collection, preprocessing, model development, training, and assessment. A carefully
selected dataset of images of healthy and diseased plants is used to train the CNN to identify characteristic patterns
and features with respect to plant diseases. The model develops its ability to classify with an improvement that comes
from recognizing even minute visual indications of various diseases. The new method enhances the effectiveness of
early disease detection, which could reduce losses to agriculture and improve productivity. In addition, incorporating
deep learning in precision agriculture highlights the role of technology-based solutions in alleviating critical
agricultural challenges. Automating disease detection allows intervention strategies to be put in place promptly,
enabling more efficient crop management practices and enhancing worldwide food security. The scalability and
flexibility of deep learning algorithms also offer the potential for ongoing refinement and improvement of disease
detection systems. In general, the use of CNNs in plant disease detection is a major breakthrough in agricultural
technology that provides a robust, scalable, and efficient solution with far-reaching implications for sustainable
agriculture and food production.
Keywords : Plant disease detection, convolutional neural networks, deep learning, image classification, data
collection, data preprocessing.
1. INTRODUCTION
Agriculture is a vital component of food security and economic stability worldwide. However, plant diseases constitute
a major threat to crop yield and quality and result in enormous economic losses and food shortages. Conventional
disease detection is mostly based on visual inspection by experts, which is time-consuming, labor-intensive, and
subject to human error. The requirement of an efficient, accurate, and automated plant disease identification system
has become more evident with the advent of artificial intelligence (AI) and deep learning. Convolutional neural
networks (CNNs), a type of deep learning algorithm, have shown incredible performance in image classification,
which makes them an ideal choice for plant disease detection. Using CNNs, automated disease detection systems can
scan plant images, detect disease symptoms, and classify plant conditions with high accuracy. This method reduces
the need for human expertise and increases early disease detection, enabling timely intervention and better crop
management. This research seeks to create a CNN-based model for the detection of plant diseases using a systematic
approach involving data gathering, preprocessing, training of the model, and testing. A well-selected dataset of healthy
and infected plant images is used to train the model so that it can identify disease-specific features and patterns.
Improved detection accuracy and generalizability are achieved through data augmentation methods and model fine-
tuning. Integrating deep learning into agri-tech presents an affordable and scalable means of monitoring diseases in
real-time. Automatic identification of diseases enables farmers and players in the agricultural sector to put in place
counteractive measures to reduce damage to crops, maximize resource utilization, and enhance the productivity of
agriculture as a whole. This study adds to research in precision agriculture, showcasing the potential for AI-based
solutions in meeting international agricultural challenges.
2. LITERATURE SURVEY
Plant disease detection is still a pressing issue in agriculture, with direct implications on crop health and global food
security. Conventional disease detection is based on visual examination by experts in agriculture, which, though
effective, is prone to subjectivity, labor-intensive, and lacks scalability. To address these issues, recent advances in
computational methods, specifically deep learning-based methods, have been gaining momentum for automated plant
disease diagnosis. Convolutional Neural Networks (CNNs) have become a strong means of image-based plant disease
diagnosis, with outstanding classification performance. Research has documented CNN models reaching an accuracy
of as high as 96% in identifying plant diseases. For example, David P. Hughes (2016) created the "Plant Village"
dataset, which allows CNN-based disease detection from images taken on a smartphone, thus opening up diagnostic
tools to farmers. In the same vein, Ferentinos (2018) suggested a CNN-based system for detecting crop diseases in
various plant species, proving the strength of deep learning in agricultural use. Other deep learning methods, including
Support Vector Machines (SVMs), have also been investigated for classification of plant diseases by using
hyperspectral images and spectra. These methods provide lower accuracy than CNNs. Deep learning combined with
Internet of Things (IoT) technology improves the detection of plant diseases even further by providing real-time
monitoring through the collection of environment-based sensor data. These hybrid methods provide an integrated
solution for evaluating plant health and early disease detection. The present research builds upon previous work
using CNNs, specifically the MobileNetV2 model, to improve precision and efficiency in the detection of plant
diseases. Utilizing a large dataset and optimized training approaches, the research aims to aid in the development of
AI-powered agricultural applications, improving disease diagnosis and promoting environmentally friendly
farming techniques.
3. MATERIALS AND METHODS
The following section describes the dataset used in the suggested model, the building blocks of the model, and the
training procedure followed in order to improve its accuracy for detecting plant diseases.
3.1 DATASET USED
The model is trained on a complete dataset of 21,198 images of both healthy and infected plant leaves. The dataset is
preprocessed using image augmentation methods like rotation, flipping, zooming, and contrast change to enhance
generalization and avoid overfitting.
3.2 PROPOSED SYSTEM
The deep learning model is constructed based on the MobileNetV2 architecture, a light-weight convolutional neural
network (CNN) that is optimized for computational efficiency. The architecture is augmented with more fully
connected layers, such as dense layers, dropout layers (to avoid overfitting), and pooling layers (to downsize spatial
dimensions without losing significant features). LeakyReLU is used as the activation function to introduce non-
linearity, facilitating improved feature extraction. Model optimization is done through the Adamax optimizer, which
is robust for coping with sparse gradients, resulting in an observed classification accuracy of 96%.
3.3 MODEL TRAINING AND EVALUATION
To train the model, an ImageDataGenerator dynamically creates augmented training samples to enhance model
generalization. The data is divided into 80% training and 20% validation subsets. Model training is executed in a
batch-wise manner using a batch size of 32 and 20 epochs. Adamax is employed as the optimizer, with
adaptive learning rates to facilitate better convergence. During training, performance is evaluated on the
validation set after every epoch, and crucial metrics such as accuracy, precision, recall, and F1-score are recorded. This uniform
process guarantees highly efficient and scalable model development in plant disease classification, paving the way for
research in precision farming and plant disease monitoring.
4. IMPLEMENTATION
The proposed plant disease detection model uses the PlantVillage dataset, which consists of 21,168 images across 15 classes covering different plant diseases and healthy leaf states. For uniformity, all images are resized to 224 × 224 pixels. In addition, extensive data augmentation techniques such as rotation, zoom, and flipping are applied to increase dataset diversity and improve the model's generalization. The data is split into training and validation sets, with 80% used for training and 20% for validation, to enable robust model evaluation.
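One way such loading, resizing, and splitting could look with Keras generators is sketched below (the directory path is hypothetical, and `train_datagen` is the augmenting generator from the earlier sketch; in practice a separate, non-augmenting generator is often preferred for the validation subset):

```python
DATA_DIR = "PlantVillage/"   # hypothetical path, one sub-folder per class

train_gen = train_datagen.flow_from_directory(
    DATA_DIR,
    target_size=(224, 224),    # resize every image to 224 x 224 pixels
    batch_size=32,
    class_mode="categorical",  # one-hot labels for the 15 classes
    subset="training",         # 80% of the images
)

val_gen = train_datagen.flow_from_directory(
    DATA_DIR,
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
    subset="validation",       # 20% of the images
    shuffle=False,             # keep order so labels align with predictions
)
```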
4.1 MODEL SELECTION AND ARCHITECTURE
The MobileNetV2 architecture is chosen as the backbone for the classification task because it is efficient, lightweight, and offers strong feature extraction capabilities. The model is initialized with ImageNet pre-trained weights, enabling transfer learning to leverage existing feature representations. To adapt MobileNetV2 to the plant disease dataset, a custom classifier head is added (a minimal code sketch follows the list), consisting of:
• A Global Average Pooling (GAP) layer to reduce dimensionality while retaining the most salient features.
• Dense layers with LeakyReLU activation to enhance feature representation.
• A dropout layer to prevent overfitting by randomly disabling neurons while training.
• A final dense layer with softmax activation to enable multi-class classification.
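The listed head could be assembled roughly as follows; this is a sketch under the assumption of a Keras implementation, and the dense-layer width, dropout rate, and default LeakyReLU slope are illustrative choices rather than the exact configuration used here.

```python
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import MobileNetV2

NUM_CLASSES = 15  # plant disease and healthy leaf classes

# MobileNetV2 backbone with ImageNet weights and no built-in classifier.
base = MobileNetV2(input_shape=(224, 224, 3),
                   include_top=False,
                   weights="imagenet")
base.trainable = False  # transfer learning: keep pre-trained features frozen

x = layers.GlobalAveragePooling2D()(base.output)               # GAP layer
x = layers.Dense(256)(x)                                       # dense layer
x = layers.LeakyReLU()(x)                                      # non-linearity
x = layers.Dropout(0.3)(x)                                     # reduce overfitting
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)   # 15-way output

model = Model(inputs=base.input, outputs=outputs)
```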
4.2 IMPLEMENTATION FLOW
The whole disease detection process follows a systematic pipeline, as shown in Figure 1:
Fig-1 : Flowchart
4.3 PERFORMANCE ASSESSMENT
After training, the model is thoroughly validated and tested to determine its ability to correctly identify plant diseases.
The use of deep learning methods in the proposed system yields considerable improvement over traditional approaches, thereby strengthening disease diagnosis in agriculture.
5. RESULTS AND EVALUATION
The performance of the trained model is rigorously evaluated using several evaluation metrics. Training and validation accuracy/loss curves are plotted to examine learning trends across epochs, providing insight into model optimization and convergence. During training, the model reaches a training loss of 0.0178 with a training accuracy of 96.40%, which reflects effective feature extraction. The validation stage records a validation loss of 0.0286 with an accuracy of 96.22%, indicating good generalization to unseen validation data. Additional testing on an independent test set produces a test loss of 0.0253 and a test accuracy of 96.18%, confirming the reliability of the model in practical classification tasks. Such consistently high accuracy attests to the quality of the MobileNetV2-based method. The learning rate of 1e-4 provides controlled weight updates, enabling stable convergence without the risk of overfitting. The low loss values across all splits confirm that the model successfully minimizes classification errors. The combination of transfer learning and fine-tuning greatly improves the model's ability to adapt to plant disease classification, and data augmentation further enhances generalization, enabling accurate disease detection on a variety of leaf samples. Overall, the model shows outstanding performance in plant disease detection and can serve as a valuable tool for precision agriculture. The findings demonstrate that deep learning can automate disease diagnosis, enabling early intervention to reduce crop loss.
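As an illustrative sketch of how the accuracy/loss curves and the final test evaluation might be produced (assuming the `history` object returned by training and a held-out `test_gen` generator; both names are placeholders):

```python
import matplotlib.pyplot as plt

# Plot training vs. validation accuracy and loss over the 20 epochs.
fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))
ax_acc.plot(history.history["accuracy"], label="train")
ax_acc.plot(history.history["val_accuracy"], label="validation")
ax_acc.set_xlabel("epoch"); ax_acc.set_ylabel("accuracy"); ax_acc.legend()
ax_loss.plot(history.history["loss"], label="train")
ax_loss.plot(history.history["val_loss"], label="validation")
ax_loss.set_xlabel("epoch"); ax_loss.set_ylabel("loss"); ax_loss.legend()
plt.tight_layout()
plt.savefig("training_curves.png")

# Evaluate once on the held-out test set.
test_loss, test_acc = model.evaluate(test_gen)
print(f"test loss: {test_loss:.4f}, test accuracy: {test_acc:.4f}")
```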
Fig-2 : Training and validation accuracy/loss graphs
Fig – 3 : Tomato Healthy Leaf
Fig – 4 : Potato Late Blight Disease Leaf
6. CONCLUSION
The successful development and deployment of the plant disease detection model mark a major step forward in applying deep learning to agricultural problems. With an accuracy of 96.18%, the model proves highly reliable in classifying plant conditions such as diseases, pests, and abnormalities. By
combining MobileNetV2 architecture, transfer learning, and data augmentation, the model is able to capture complex
patterns in plant images. This ability enables farmers and agricultural stakeholders to make informed decisions,
implement proactive management practices, and maximize resource allocation, thereby enhancing crop health and
yield. Beyond agriculture, the underlying approach has potential for large-scale applications in healthcare, cybersecurity, finance, and environmental monitoring. Its predictive capability can support innovation, improve decision-making processes, and help address complex real-world problems across many sectors. The study
highlights the transformative role of artificial intelligence and machine vision in precision agriculture. Further refinements, dataset enhancements, and integration with existing agricultural frameworks can make the model more robust and scalable. Ongoing research in this area can lead to sustainable solutions, enhance food security, and support global agricultural progress, making AI-based plant disease detection a valuable asset for future farming.
7. REFERENCES
[1] Parismita Bharali, Chandrika Bhuyan, and Abhijit Boruah, "Plant Disease Detection by Leaf Image Classification Using Convolutional Neural Network," ICICCT 2019, pp. 194-205.
[2] Rahman Tashakkori, Timothy Jassmann, and R. Mitchell Parry, "Leaf Classification Convolutional Neural Network App," IEEE SoutheastCon 2015.
[3] M. B. Riley, M. R. Williamson, and O. Maloy, "Plant Disease Diagnosis," The Plant Health Instructor, 2002.
[4] Srdjan Sladojevic, Marko Arsenovic, Andras Anderla, Dubravko Culibrk, and Darko Stefanovic, "Deep Neural Networks Based Recognition of Plant Diseases by Leaf Image Classification," 2016.
[5] H. Al-Hiary, S. Bani-Ahmad, M. Reyalat, M. Braik, and Z. ALRahamneh, "Fast and Accurate Detection and Classification of Plant Diseases," International Journal of Computer Applications (0975-8887), Vol. 17, No. 1, March 2011.