
Improved Segmentation With Optimization Based
Multilevel Thresholding and K-Means Clustering for
Plant Disease Identification
Beulah David (  davidbeulah35@gmail.com )
Saveetha School of Engineering
Gomathi R
University College of Engineering
Research
Keywords: Image processing, classification, graph theory, extreme learning machine, segmentation,
optimization
Posted Date: February 16th, 2023
DOI: https://doi.org/10.21203/rs.3.rs-2373358/v1
License:   This work is licensed under a Creative Commons Attribution 4.0 International License.
IMPROVED SEGMENTATION WITH OPTIMIZATION BASED MULTILEVEL
THRESHOLDING AND K-MEANS CLUSTERING FOR PLANT DISEASE IDENTIFICATION
D. Beulah David 1*, R. Gomathi 2
(1 Institute of Information Technology, Saveetha School of Engineering, Thandalam, Chennai, Tamilnadu, India.
Email: davidbeulah35@gmail.com) (* Corresponding author)
(2 Department of Electronics and Communication Engineering, University College of Engineering-Dindigul, Tamilnadu, India.
Email: gomathiaudece@gmail.com)
ABSTRACT
Plant disease identification is an important application for plant protection in agricultural production.
Early detection of crop disease helps to reduce its effect on cultivation, and the detection must be
precise; hence, hyperspectral sensors are extensively used in plant disease detection. Artificial
intelligence and machine learning-based techniques have been presented in many works for plant disease
detection. Deep learning is the latest method used in image processing and pattern recognition with
improved accuracy, and it can provide accurate classification of plant diseases. In this paper, an adaptive
extreme learning machine (AELT) is presented for classifying the disease. Before classification,
segmentation and feature extraction are performed to improve the disease detection accuracy. Multilevel
thresholding-based K-means clustering with a probability-induced butterfly optimization algorithm
(PBOA) is presented for segmentation. Entropy-based features are extracted from the plant images and
applied to the AELT classifier. The results are evaluated on a standard dataset and compared with
state-of-the-art techniques.
Keywords: Image processing, classification, graph theory, extreme learning machine, segmentation,
optimization
1. INTRODUCTION
Farmers in rural areas may find it hard to identify the problems that affect their harvests, and it is
often not possible for them to visit an agricultural office to find out what the infection may be. The
agricultural sector faces numerous challenges, including a significant decrease in crop yield. Plant leaf
diseases are one of the major causes of output loss, and identifying them is a time-consuming operation.
The traditional method of identifying diseases is naked-eye observation, which requires considerable
manpower, is time-consuming, and is not suitable for large-scale cultivation. In addition, it requires
continuous monitoring by experts, resulting in huge expenses. Hence, machine learning, a reliable
prediction methodology, is used for detecting various diseases in plant leaves caused by fungi, bacteria,
and viruses. Nevertheless, disease detection using classification algorithms remains a difficult task, as
the detection performance changes for different kinds of input information. Lately, server-based and
mobile-based approaches have been utilized for plant leaf disease detection [1]. Features of these
approaches, such as high-resolution cameras, high-performance processing, and extensive built-in
accessories, are additional benefits that enable automatic disease detection.
Present-day approaches such as machine learning (ML) and deep learning (DL) algorithms have been
utilized to obtain high identification rates and highly accurate results. Different investigations of plant
disease discovery and analysis have been carried out with classical machine learning techniques such as
random forest [2], Artificial Neural Network (ANN) [3], Support Vector Machine (SVM) [4], Adaptive
Neuro-Fuzzy Inference System (ANFIS) [5], K-means clustering [6], Convolutional Neural Networks
(CNN) [7], etc. SVM [4] has been extensively utilized in numerous fields, as it employs a traditional
optimization method to obtain the optimal separating margin between two classes with minimum
training error. For plant disease detection, advanced imaging techniques like hyperspectral [8, 9] and
multispectral imaging [10] are employed in some works. Artificial intelligence (AI) with deep learning
techniques has become popular. Deep learning techniques are used to recognize plant infections based on
the plant's appearance and visual symptoms, similar to how humans perform disease identification [11].
Deep learning is an innovative strategy for image processing and object recognition that achieves higher
accuracy in the classification of different leaf diseases [12]. Deep learning networks like AlexNet
[13, 14], DenseNet [15], Inception-v4 [15] and ResNet [15] provide better detection results in plant
disease classification. Thus, deep learning techniques receive more attention than conventional ML
approaches. The CNN method gives excellent classification results compared with conventional ML
methods and other DL methods like AlexNet, ResNet and GoogLeNet [16, 17]. In DL networks, the
mapping from the input layer to the output layer can be improved by tuning the network parameters in the
training phase. This task involves computational complexity, and performance has so far been improved
through conceptual and engineering innovations [18, 19]. The efficacy of DL methods in plant disease
classification can be improved by introducing adaptations in DL networks, using different training
algorithms, cascading DL networks, applying optimization algorithms, and hybridizing DL with other
networks. All of these processes increase the implementation complexity.
The extreme learning technique (ELT) was first utilized by Huang et al. [20] to train single-layer
feedforward neural (SLFN) networks. ELT assigns the input weights and biases randomly and finds the
output weights with the Moore-Penrose generalized inverse. Research has observed that ELT is able to
produce the smallest training error and the smallest norm of weights. The generalization theory presented
in [21] was applied to feedforward neural networks, and it was found that networks with better
generalization performance exhibit smaller training error and smaller weights. Various studies of ELT
report benefits such as fewer training parameters, improved learning speed and better generalization
ability [22]. Improved ELT techniques have been presented in some works and applied to applications
like computerized recognition [23], image processing [24] and text recognition [25]. In the ELT feature
space, the training data can be linearly separated, with probability one, by a hyperplane passing through
the origin. This property allows ELT to attain better generalization in classification than SVM-based
classification. Unlike other deep learning techniques [26], ELT techniques do not require fine-tuning of
the parameters using backpropagation, which reduces the computational cost of training. The
generalization performance of ELT is usually not sensitive to the network parameters. ELT schemes
exploit the underlying structure of the data using unlabeled data and the manifold structure of the data
using manifold learning. This scheme only makes use of local neighborhood information and does not
take the global structure information into account. To improve the performance of ELT models, a graph
theory adaptive ELT is presented in this work.
In addition, clustering plays an important role in the leaf disease detection process. Many clustering
techniques have been presented in the literature, such as K-means clustering [28] and Fuzzy C-means
(FCM) clustering [29]. Clustering is an unsupervised segmentation process that segments images without
user involvement. The only requirement of the clustering process is a priori knowledge of the design and
of the image-related physical, medical or biological information. The number of clusters should be known
in advance [28]. Clusters of pixels are formed by combining similar pixels. Clustering algorithms are
useful in image segmentation due to their simplicity and efficacy. In some works, the clustering process
is improved with optimization algorithms: the optimal selection of cluster centers, number of clusters and
cluster parameters is accomplished through optimization techniques. Another segmentation approach is
thresholding, which includes Otsu thresholding, gradient thresholding, binary thresholding and multilevel
thresholding [31]. In thresholding-based techniques, the segmentation outcome depends on the threshold
value that is set. An optimum threshold selection for better results is presented in the multilevel
thresholding approach of [32], whose authors employed a quantum PSO algorithm. That work shows that
multilevel thresholding with optimization is suitable for various real-time applications. Hybrid
segmentation is presented in various works for improving the segmentation accuracy in complex
applications [27, 33]. Finally, the most important part of the disease detection process is to label the
identified marks. Based on the identified features of the segmented parts, the disease type is labeled
using a suitable classifier.
Based on the above-mentioned facts, an effective classifier based on AELT is employed. Since the ELT
has only a few parameters in the training stage, the speed of classification is increased. In addition, the
parameters are tuned for better performance using the probability-induced butterfly optimization
algorithm (PBOA) with ELT, which increases the detection accuracy. Prior to classification, the image
features are obtained from the parts segmented by a hybrid segmentation technique that combines
multilevel thresholding and K-means clustering using PBOA. PBOA is utilized to obtain the optimum
thresholds for multilevel thresholding, and the parameter selection in K-means clustering is
simultaneously accomplished using PBOA. The conventional butterfly optimization algorithm (BOA) is
modified with the proposed probability-based mutation process so that better results are reached within
very few iterations. To validate the suggested work, the detection performance of the approach is
evaluated on a standard dataset. The multilevel thresholding method can handle complex and multimodal
images, since the image can be segmented into multiple levels, which provides good results for
multispectral and hyperspectral images.
2. LITERATURE SURVEY
The previous works on plant disease detection using different methods are discussed in this section,
together with the classification techniques employed in different papers. In many recent works, DL-based
classifiers are employed and the results are validated to show their effectiveness. In [34], a DL-based
disease detection system is presented, which employs AlexNet for disease classification. Additionally,
there are a few associated works in which modern representation methods and adjusted or
further-developed forms of DL designs were proposed to achieve superior outcomes. The Plant Village
dataset has been utilized by the authors to evaluate the method [35]. In [36], CNN was utilized for
identifying disease in maize plants, with histogram strategies used to illustrate the importance of the
structure. In [37], essential CNN designs like AlexNet, GoogLeNet, and ResNet were applied to
distinguish tomato leaf diseases. Training and validation accuracy were observed to analyse the
effectiveness of the method, and ResNet was found to be the best CNN model. To recognize banana leaf
diseases, LeNet was employed by Amara et al. [38], and the F1-score metric was analysed.
In [39], plant diseases were detected using three classifiers: SVM, ELT and K-Nearest Neighbor (KNN).
These classifiers were employed with recent DL models like GoogLeNet, ResNet-50, ResNet-101,
Inception-v3, InceptionResNetv2, and SqueezeNet. The observed results showed that the ResNet-50
model with the SVM classifier outperforms the other combinations. In [40], another DL model,
Inception-v3, was utilized for identifying diseases in the cassava plant. In [41], DL models are fine-tuned
for parameter selection using various techniques and utilized for plant disease detection, whereas in
[42], pre-trained DL is employed for disease detection in the tomato plant.
In this proposed work, to improve performance with reduced complexity, the ELT classifier is enhanced
with graph theory. Another way to improve the detection accuracy is to correctly segment the most
relevant part of the image and to extract features from that segmented part for the classification
algorithm. Segmentation is done using several techniques in the literature. The most traditional
segmentation schemes are histogram techniques and Otsu's and Kapur's thresholding techniques [43].
Segmentation techniques based on Otsu's and Kapur's thresholding have been considered the better
techniques. However, both suffer from a serious disadvantage: their time complexity grows exponentially
with the number of thresholds, and hence they cannot be directly extended to multilevel thresholding
problems. Numerous techniques have been reported in the literature to improve the efficiency and reduce
the time complexity of multilevel thresholding strategies [44].
In [45], a hybrid clustering process using the K-means clustering technique and the graph cut
segmentation method is presented. The process makes use of the inherent features of the graph cut method
and the K-means method to improve the segmentation accuracy. However, the parameter selection is not
optimized with the objective of minimizing time complexity, and since two methods are used, the
complexity of that work is higher. In some thresholding segmentation methods, multilevel threshold
setting using bio-inspired computing models has been presented. The conventional GA, PSO and ACO
algorithms have several unsatisfactory issues, such as convergence to a local optimum rather than the
global optimum, which creates misclassification problems. Furthermore, low convergence speed is
another significant issue [46].
In [47–50], multilevel image thresholding utilizing different optimization algorithms, namely the
hybrid differential evolution algorithm, the honey-bee-mating strategy, the differential evolution
algorithm and the hybrid whale optimization algorithm with moth-flame optimization, was presented
respectively. These methods can help in finding optimal threshold values for image segmentation but
involve a tradeoff in computational complexity. In this paper, the PBOA scheme is presented for selecting
threshold values for multilevel thresholding, with the objective of reaching higher segmentation accuracy
with good convergence speed.
3. METHODOLOGY
In this paper, a graph theory adaptive extreme learning machine (GTA-ELT) is presented for classifying
the disease. Before classification, segmentation and feature extraction are performed to improve the
disease detection accuracy. Multilevel thresholding-based K-means clustering with the
probability-induced butterfly optimization algorithm is presented for segmentation. The entropy-based
features are extracted from the segmented image part and applied to the GTA-ELT classifier for
identifying plant disease. The overall disease identification process is shown as a block diagram in
Figure 1.
3.1. Multilevel thresholding based segmentation
This work presents a hybrid K-means clustering and multilevel thresholding segmentation algorithm. By
segmenting the image accurately, the proposed segmentation approach improves the accuracy of disease
identification.
Figure 2 Block diagram of proposed work
3.1.1. Otsu multilevel thresholding
The Otsu method is also termed the maximum between-class variance method. It seeks a threshold that
yields minimal variance within each class and maximal variance between the two classes, the target and
the background. The principal idea is to separate the histogram into two groups; a value is taken as the
threshold if the variance between the two groups is maximum.
The input plant leaf image is denoted as 𝒾ℓ1 and the gray level of the image can be denoted as
{0, 1, 2, … , 𝒢 − 1}. A threshold value of 𝜏 is assumed for separating the image into background ℬ0 and
target ℬ1 . Their gray scale values are represented as [0, 1, 2, … , 𝜏 − 1] and [𝜏, 𝜏 + 1, 𝜏 + 2, … , 𝒢 − 1]
respectively. For target and background, the probability, mean and variance computations are given in
Equations (1), (2) and (3) respectively.
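$$\omega_0(\tau) = \sum_{i=0}^{\tau-1} pr_i, \qquad \omega_1(\tau) = \sum_{i=\tau}^{\mathcal{G}-1} pr_i \tag{1}$$
$$\mu_0(\tau) = \sum_{i=0}^{\tau-1} \frac{i \cdot pr_i}{\omega_0}, \qquad \mu_1(\tau) = \sum_{i=\tau}^{\mathcal{G}-1} \frac{i \cdot pr_i}{\omega_1} \tag{2}$$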
where 𝑝𝑟𝑖 represents the probability of gray level 𝑖, and 𝜔0 (𝜏) and 𝜔1 (𝜏) represent the probabilities of
the background and the target respectively. Similarly, 𝜇0 (𝜏) and 𝜇1 (𝜏) denote the mean gray levels of
the background and the target respectively. The variance between ℬ0 and ℬ1 can be computed as given in
Equation (3).
$$\sigma^{2}(\tau) = \omega_0(\tau)\,\omega_1(\tau)\,\bigl(\mu_0(\tau) - \mu_1(\tau)\bigr)^{2} \tag{3}$$
The threshold that provides the highest variance between ℬ0 and ℬ1 is the best threshold, denoted as
𝜏 𝑏 . The same criterion extends to multiple-threshold segmentation. In this method, 𝑚 threshold values
are assigned as 𝜏1 , 𝜏2 , 𝜏3 , … , 𝜏𝑚 and these segment the image into 𝑚 + 1 groups
ℬ0 , ℬ1 , ℬ2 , … , ℬ𝑚 . The variance between the 𝑚 + 1 groups can be found as,
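$$\sigma^{2}(\tau_1, \tau_2, \tau_3, \ldots, \tau_m) = \sum_{0 \le i < j \le m} \omega_i \omega_j (\mu_i - \mu_j)^{2} \tag{4}$$
$$= \omega_0\omega_1(\mu_0 - \mu_1)^{2} + \omega_0\omega_2(\mu_0 - \mu_2)^{2} + \cdots + \omega_0\omega_m(\mu_0 - \mu_m)^{2} + \omega_1\omega_2(\mu_1 - \mu_2)^{2} + \omega_1\omega_3(\mu_1 - \mu_3)^{2} + \cdots + \omega_{m-1}\omega_m(\mu_{m-1} - \mu_m)^{2} \tag{5}$$
where
$$\omega_{m-1} = \sum_{i=\tau_{m-1}}^{\tau_m - 1} pr_i, \qquad \mu_{m-1} = \sum_{i=\tau_{m-1}}^{\tau_m - 1} \frac{i \cdot pr_i}{\omega_{m-1}} \tag{6}$$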
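The optimal threshold values for the levels, $\tau_1^{b}, \tau_2^{b}, \ldots, \tau_m^{b}$, are given as,
$$(\tau_1^{b}, \tau_2^{b}, \ldots, \tau_m^{b}) = \underset{0 \le \tau_1, \tau_2, \ldots, \tau_m \le \mathcal{G}-1}{\arg\max}\; \sigma^{2}(\tau_1, \tau_2, \tau_3, \ldots, \tau_m) = \underset{0 \le \tau_1, \tau_2, \ldots, \tau_m \le \mathcal{G}-1}{\arg\max} \sum_{0 \le i < j \le m} \omega_i \omega_j (\mu_i - \mu_j)^{2} \tag{7}$$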
With the maximization objective in Equation (7), the optimum threshold values can be found. However,
the traditional exhaustive search consumes considerable computational time as the number of thresholds
grows. Therefore, an optimization algorithm based on the butterfly optimization algorithm is presented in
this work. The implementation of the proposed PBOA for identifying the threshold values that provide
maximum variance is presented as follows.
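As a point of reference for the objective in Equation (7), the sketch below evaluates the between-class variance of Equation (4) on a gray-level histogram and maximizes it by exhaustive search. It illustrates the traditional search whose cost motivates PBOA, not the proposed optimizer itself. The function names and the toy histogram are assumptions made for illustration only.

```python
from itertools import combinations


def between_class_variance(hist, thresholds):
    """Pairwise between-class variance of Equation (4) for a gray-level histogram."""
    total = float(sum(hist))
    probs = [h / total for h in hist]            # pr_i of Equation (1)
    edges = [0, *sorted(thresholds), len(hist)]  # class k covers edges[k] .. edges[k+1]-1
    weights, means = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = sum(probs[lo:hi])
        if w == 0.0:
            continue  # an empty class contributes nothing to the sum
        weights.append(w)
        means.append(sum(i * probs[i] for i in range(lo, hi)) / w)
    return sum(
        weights[a] * weights[b] * (means[a] - means[b]) ** 2
        for a, b in combinations(range(len(weights)), 2)
    )


def exhaustive_multilevel_otsu(hist, m):
    """Try every increasing m-tuple of thresholds and keep the best one (Equation (7))."""
    candidates = combinations(range(1, len(hist)), m)
    return max(candidates, key=lambda ts: between_class_variance(hist, ts))


# Toy 8-level histogram with three separated modes (made-up data, not from the paper).
toy_hist = [10, 12, 0, 0, 9, 11, 0, 8]
best = exhaustive_multilevel_otsu(toy_hist, 2)
print(best, between_class_variance(toy_hist, best))
```

For 𝒢 gray levels and 𝑚 thresholds, this search evaluates every combination of 𝑚 values out of 𝒢 − 1, which is why the number of evaluations grows quickly with 𝑚.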
3.2. Optimum selection of threshold values for segmentation using the butterfly optimization algorithm
3.2.1. Butterfly optimization algorithm
Bio-inspired optimization algorithms have attracted considerable attention from researchers in recent
years. One such algorithm is the Butterfly Optimization Algorithm (BOA), which is fundamentally
motivated by the food foraging process of butterflies. Butterflies are used in BOA as search agents to
carry out optimization. Butterflies naturally have sensory organs that they use to detect the fragrance of
their food or flowers. Chemoreceptors, a kind of sensing receptor, are dispersed throughout the
butterfly's legs, palps, and antennae [51]. The algorithm assumes that each butterfly emits a fragrance of
a certain intensity, which is associated with the butterfly's fitness. The fitness of a butterfly changes when
it flies from one location to another. While flying, the fragrance emitted by a butterfly propagates over
distance to all other butterflies in the region. The butterflies sense this fragrance, and a collective social
network is formed. When a butterfly senses the fragrance of the best butterfly, it starts moving in the
direction of the best butterfly. This searching process is known as the global search phase in BOA. When
a butterfly cannot sense any fragrance, it moves randomly; this is the local search phase. In BOA,
fragrance is emitted by many butterflies at a particular time, and the butterflies differentiate the
fragrances from one another using their sensory modality (𝑠𝜇). Stimulus intensity (𝑆𝐼) denotes the
magnitude of the physical stimulus, and its value is connected with the fitness of the butterfly.
Butterflies with more fragrance tend to attract other butterflies in their search region.
The parameter 𝛼 denotes response compression, which states that as the strength of the stimulus
increases, the sensitivity of insects to changes in the stimulus progressively decreases [29, 30]. Thus the
optimization algorithm inspired by butterflies rests on two important concerns: the variation of 𝑆𝐼 and
the formulation of the fragrance 𝑓. In BOA, the 𝑆𝐼 of a butterfly is determined by the objective function,
while the fragrance 𝑓 is relative, i.e., it is sensed by the other butterflies. Consequently, the fragrance is
expressed as a function of the physical intensity of the stimulus.
$$f_i = s\mu \cdot SI^{\alpha} \tag{8}$$
where 𝑓𝑖 is the sensed amount of fragrance, which represents the magnitude of the fragrance associated
with the 𝑖 𝑡ℎ butterfly. 𝑠𝜇 and 𝑆𝐼 denote the sensory modality and the stimulus intensity respectively.
The total number of butterflies in the solution space is denoted as ℬ.
Like many optimization algorithms (OAs), BOA has two searching phases: global search and local search.
The choice between the local and the global search process is determined by a switching probability 𝑠𝑤.
In the global search phase, the butterfly moves towards the fittest butterfly, whose solution vector is
denoted 𝒷 𝑏𝑒𝑠𝑡 , which can be expressed as
$$b_i^{t+1} = b_i^{t} + \left( rand^{2} \times b^{best} - b_i^{t} \right) \times f_i \tag{9}$$
Here 𝒷𝑖𝑡 is the solution vector of the 𝑖 𝑡ℎ butterfly in iteration 𝑡, 𝒷 𝑏𝑒𝑠𝑡 is the current best solution
among all solutions obtained in the present stage, 𝑓𝑖 is the fragrance of the 𝑖𝑡ℎ butterfly and 𝑟𝑎𝑛𝑑 is a
random number generated between 0 and 1. The local search stage can be expressed as,
$$b_i^{t+1} = b_i^{t} + \left( rand^{2} \times b_j^{t} - b_k^{t} \right) \times f_i \tag{10}$$
where 𝒷𝑗𝑡 and 𝒷𝑘𝑡 denote the solutions of the 𝑗𝑡ℎ and 𝑘𝑡ℎ butterflies, which are chosen randomly from
the present population. The search switches between the local and global phases based on the switching
probability 𝑠𝑤, selected between 0 and 1. The fitness values are evaluated and the best butterflies are
updated in the population. The process is repeated for the given number of iterations. The BOA is shown
in the algorithm below.
Conventional butterfly optimization algorithm

1. Develop the objective function of the prescribed problem.
2. Initialize the solution space with butterflies 𝒷𝑖 = {𝒷1 , 𝒷2 , … , 𝒷ℬ }.
3. Find the fittest solution 𝒷 𝑏𝑒𝑠𝑡 in the set.
4. Assign the switching probability 𝑠𝑤 between 0 and 1.
5. while (stopping condition is not met)
6.   for each butterfly 𝑖 = 1 𝑡𝑜 ℬ
7.     Obtain a random value using the 𝑟𝑎𝑛𝑑() function.
8.     Calculate the fragrance 𝑓𝑖 of the butterfly using (8).
9.     if 𝑟𝑎𝑛𝑑 < 𝑠𝑤
10.      Perform global search using (9).
11.    else
12.      Choose random positions 𝑗 and 𝑘 from the solution space.
13.      Perform local search using (10).
14.    end if
15.    Compute the new solution.
16.    Update the solution space if the new solution is better.
17.  end for
18.  Update 𝒷 𝑏𝑒𝑠𝑡 with the current best solution.
19. end while
20. Return the best solution for the given objective.
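The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the stimulus intensity 𝑆𝐼 is taken to be the (non-negative) fitness value itself, the objective is maximized, positions are clipped to a box, and the defaults for `sw`, `s_mu` and `alpha` are illustrative values. The function and parameter names are hypothetical.

```python
import random


def butterfly_optimize(fitness, dim, bounds, n_butterflies=20, n_iter=100,
                       sw=0.8, s_mu=0.01, alpha=0.1, seed=0):
    """Maximise a non-negative `fitness` with a basic BOA loop.

    Returns (best_position, best_fitness, history); history[t] is the best
    fitness after iteration t, and history[0] belongs to the initial population.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_butterflies)]
    fit = [fitness(b) for b in pop]
    best_idx = max(range(n_butterflies), key=fit.__getitem__)
    best, best_fit = pop[best_idx][:], fit[best_idx]
    history = [best_fit]
    for _ in range(n_iter):
        for i in range(n_butterflies):
            frag = s_mu * fit[i] ** alpha      # Equation (8), assuming SI = fitness
            r = rng.random()
            if rng.random() < sw:              # global search, Equation (9)
                step = [r * r * g - x for g, x in zip(best, pop[i])]
            else:                              # local search, Equation (10)
                j, k = rng.sample(range(n_butterflies), 2)
                step = [r * r * a - b for a, b in zip(pop[j], pop[k])]
            cand = [min(max(x + s * frag, lo), hi) for x, s in zip(pop[i], step)]
            cand_fit = fitness(cand)
            if cand_fit > fit[i]:              # keep only improving moves
                pop[i], fit[i] = cand, cand_fit
        best_idx = max(range(n_butterflies), key=fit.__getitem__)
        if fit[best_idx] > best_fit:
            best, best_fit = pop[best_idx][:], fit[best_idx]
        history.append(best_fit)
    return best, best_fit, history


def toy_fitness(x):
    """Made-up objective with its maximum of 1.0 at x = (1, 1)."""
    return 1.0 / (1.0 + sum((v - 1.0) ** 2 for v in x))


position, value, trace = butterfly_optimize(toy_fitness, dim=2, bounds=(-5.0, 5.0), n_iter=50)
print(position, value)
```

Because only improving moves are accepted, the best fitness recorded in `trace` never decreases from one iteration to the next.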
3.2.2. Probability induced butterfly optimization algorithm
In the conventional BOA, the choice between local and global search is governed by the switch
probability factor. The diversity of solutions in BOA is maintained very well, since the local and global
search processes occur in a random sequence. However, selecting between local and global search based
only on the switch probability can cause the solution to become trapped in local optima or to overshoot
the global optimum. Therefore, in some works, improvements have been introduced to the conventional
BOA. In [52], a local search algorithm based on a mutation function is presented, which alters the
butterflies' values using the mutation operator to improve the solution diversity and to avoid local
optimum points. The authors also presented a dynamic BOA that improves the solution by selecting a
random solution from the other solutions, which improves the exploitation process of BOA [52]. In [51],
an intensive exploitation process is performed after the local and global search; however, this increases
computation time. In this paper, a probability-induced function based on the deviation of the butterflies'
positions is developed. The deviation between the updated butterfly position 𝒷𝑖𝑡+1 and the previous
position 𝒷𝑖𝑡 is computed as in Equation (11). Then, from the deviations obtained for all positions, a
mean value is computed as in Equation (12).
$$\Delta b_i = b_i^{t+1} - b_i^{t} \tag{11}$$
$$\Delta b_{th} = \frac{1}{\mathcal{B}} \sum_{i=1}^{\mathcal{B}} \Delta b_i \tag{12}$$
Consequently, the mutation is performed based on the value of ∆𝒷𝑡ℎ . If the deviation is lower than
∆𝒷𝑡ℎ , a higher mutation rate is assigned using Equation (13), and vice versa using Equation (14).
The new value is computed as,
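$$b_i^{t+1}(new) = b_i^{t+1} + \left( \mu_i \times \max_{0 \le x \le \mathcal{B}} \Delta b_x \right) \tag{13}$$
$$b_i^{t+1}(new) = b_i^{t+1} + \left( \mu_i \times \min_{0 \le x \le \mathcal{B}} \Delta b_x \right) \tag{14}$$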
The new value computed for each butterfly position is updated in search space. Then the global search
process and local search process would be executed. The algorithm for PBOA is given as follows.
Probability induced butterfly optimization algorithm

1. Develop the objective function of the prescribed problem.
2. Initialize the solution space with butterflies 𝒷𝑖 = {𝒷1 , 𝒷2 , … , 𝒷ℬ }.
3. Find the fittest solution 𝒷 𝑏𝑒𝑠𝑡 in the set.
4. Assign the switching probability 𝑠𝑤 between 0 and 1.
5. while (stopping condition is not met)
6.   for each butterfly 𝑖 = 1 𝑡𝑜 ℬ
7.     Obtain a random value using the 𝑟𝑎𝑛𝑑() function.
8.     Calculate the fragrance 𝑓𝑖 of the butterfly using (8).
9.     if 𝑟𝑎𝑛𝑑 < 𝑠𝑤
10.      Perform global search using (9).
11.    else
12.      Choose random positions 𝑗 and 𝑘 from the solution space.
13.      Perform local search using (10).
14.    end if
15.    Compute the deviation ∆𝒷𝑖 of the 𝒷𝑖𝑡+1 obtained from (9) or (10) using (11).
16.    Compute ∆𝒷𝑡ℎ using (12).
17.    if ∆𝒷𝑖 ≤ ∆𝒷𝑡ℎ
18.      Apply mutation using Equation (13).
19.    else
20.      Apply mutation using Equation (14).
21.    end if
22.    Compute the new solution.
23.    Update the solution space if the new solution is better.
24.  end for
25.  Update 𝒷 𝑏𝑒𝑠𝑡 with the current best solution.
26. end while
27. Return the best solution for the given objective.
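The mutation step of Equations (11) to (14) can be illustrated on one coordinate of the population. The sketch below is an assumption-laden reading of those equations: the mutation rate 𝜇𝑖 is not specified in the text, so a single constant `mu` is used for every butterfly, and positions are treated as scalars. The function name is hypothetical.

```python
def probability_induced_mutation(prev, curr, mu=0.5):
    """Apply the deviation-based mutation to one coordinate of every butterfly.

    prev, curr: positions before and after the BOA search step.
    mu: constant mutation rate (an assumption; the text leaves mu_i unspecified).
    """
    deviations = [c - p for p, c in zip(prev, curr)]   # Equation (11)
    threshold = sum(deviations) / len(deviations)      # Equation (12), mean deviation
    d_max, d_min = max(deviations), min(deviations)
    return [
        c + mu * (d_max if d <= threshold else d_min)  # Equation (13) or (14)
        for c, d in zip(curr, deviations)
    ]


# Toy positions (made-up values): deviations are 1, 2 and 6, so the mean is 3.
mutated = probability_induced_mutation([0.0, 0.0, 0.0], [1.0, 2.0, 6.0])
print(mutated)  # [4.0, 5.0, 6.5]
```

Butterflies that moved less than the mean receive the larger push (the maximum deviation), while those that already moved far receive the smaller one.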
With the implementation of PBOA, the optimum threshold value that provides maximum variance is
computed. The threshold values for each level 𝜏1 , 𝜏2 , 𝜏3 , … , 𝜏𝑚 are given as input to the solution space.
Thus the population matrix can be formed as,
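$$b_{\mathcal{B}} = \begin{bmatrix} \tau_{11} & \tau_{12} & \cdots & \tau_{1\mathcal{B}} \\ \tau_{21} & \tau_{22} & \cdots & \tau_{2\mathcal{B}} \\ \vdots & \vdots & \ddots & \vdots \\ \tau_{m1} & \tau_{m2} & \cdots & \tau_{m\mathcal{B}} \end{bmatrix} \tag{15}$$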
The optimum values of these multilevel thresholds are found using PBOA to meet the objective given in
Equation (7).
3.3. K-means clustering
The K-means algorithm is a conventional method for segmenting an image into a specified number of
clusters according to some performance metric. The clustering problem that the K-means algorithm
addresses can be expressed as follows. For a given sample of 𝒫 patterns, 𝐾 clusters are computed
according to a measure of similarity such that the patterns inside a cluster are more alike to one another
(maximum intra-cluster similarity) than to the patterns of other clusters (minimum inter-cluster
similarity). A set of 𝒫 patterns, denoted as 𝑍 = {𝑧𝑖 , 𝑖 = 1, … , 𝒫}, is clustered into a set of 𝐾 clusters,
𝐶 = {𝑐𝑘 , 𝑘 = 1, … , 𝐾}. Here 𝐾 ≪ 𝒫 and each pattern 𝑧𝑖 is a 𝑑-dimensional vector, 𝑧𝑖 ∈ 𝑅 𝑑 .
The clusters inside the image are formed using K-means algorithm with the aim of reducing the squared
Euclidean distance between the cluster center and the patterns of the cluster. The mean of cluster 𝑐𝑘 is
denoted as 𝑚𝑒𝑎𝑛𝑘 and it can be computed as,
$$mean_k = \frac{1}{\mathcal{P}_k} \sum_{z_i \in c_k} z_i \tag{16}$$
where 𝒫𝑘 is the number of patterns in cluster 𝑐𝑘 . The squared error between 𝑚𝑒𝑎𝑛𝑘 and 𝑧𝑖 can be
expressed as,
$$SE(c_k) = \sum_{z_i \in c_k} \lVert z_i - mean_k \rVert^{2} \tag{17}$$
The ultimate aim of the K-means algorithm is to reduce the sum of the squared error among K clusters as
given in Equation (18).
$$SE(C) = \sum_{k=1}^{K} \sum_{z_i \in c_k} \lVert z_i - mean_k \rVert^{2} \tag{18}$$
At the start of K-means clustering, the cluster centers are initialized randomly. Each pattern 𝑧𝑖 is
allocated to its nearest cluster, according to the distance between the pattern and the cluster center. This
assignment is repeated in every iteration, and the cluster centers for the subsequent iteration are
calculated from the mean value of the patterns in each cluster. The algorithm ends when no pattern is
relocated from one cluster to another.
In the allocation of patterns to the cluster centers, the objective of minimizing the squared error is taken
as the fitness function, and the proposed PBOA is applied to select the fittest cluster centers from the
randomly initialized cluster center population. Therefore, the allocation of patterns to cluster centers for
forming clusters in the segmentation process can be done with less processing with the help of PBOA.
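The assignment and update steps described above, together with the squared-error objective of Equation (18), can be sketched as follows. This is the plain K-means iteration only; the initial centers are passed in as an argument, which is where centers chosen by an optimizer such as PBOA could be supplied. The function name and the toy data are assumptions for illustration.

```python
def kmeans(patterns, centers, max_iter=100):
    """Plain K-means on d-dimensional tuples; returns (centers, labels, sse)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centers = [tuple(c) for c in centers]  # work on a copy of the initial centers
    labels = None
    for _ in range(max_iter):
        # Assign every pattern to its nearest cluster center.
        new_labels = [
            min(range(len(centers)), key=lambda k: sq_dist(z, centers[k]))
            for z in patterns
        ]
        if new_labels == labels:           # no pattern changed cluster: stop
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [z for z, lab in zip(patterns, labels) if lab == k]
            if members:                    # Equation (16); empty clusters keep their center
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    sse = sum(sq_dist(z, centers[lab]) for z, lab in zip(patterns, labels))  # Equation (18)
    return centers, labels, sse


# Toy 1-D gray values (made-up data) with deliberately poor initial centers.
toy_patterns = [(1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,)]
final_centers, final_labels, final_sse = kmeans(toy_patterns, [(0.0,), (5.0,)])
print(final_centers, final_labels, final_sse)
```

The returned `sse` is the quantity that the text proposes as the fitness function when cluster centers are selected by the optimizer.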
3.4. Feature extraction
The features used to improve the classification accuracy are taken from the segmented image using
entropy-based techniques. Three types of entropy computation strategies, namely approximate entropy
(𝐴𝑝𝐸𝑛), sample entropy (𝑆𝑎𝑚𝑝𝐸𝑛) and fuzzy entropy (𝐹𝑢𝑧𝑧𝑦𝐸𝑛), are employed in this paper.
3.4.1. Approximate entropy
Approximate entropy quantifies the complexity of a nonlinear time series [53]. It gives a non-negative
statistical description of short-length series. For an 𝑁-length time series 𝑓(𝑖), 𝑖 = 1 ~ 𝑁, a series of
𝑐-length vectors can be obtained as in Equation (18).
$$\vec{h}^{c}(i) = [f(i), f(i+1), \ldots, f(i+c-1)] \tag{18}$$
Then, 𝐴𝑝𝐸𝑛 is calculated as,
$$ApEn(c, s, N) = p^{c}(s) - p^{c+1}(s) \tag{19}$$
where
$$p^{c}(s) = (N - c + 1)^{-1} \sum_{i=1}^{N-c+1} \ln C_i^{c}(s) \tag{20}$$
$$C_i^{c}(s) = \frac{N_i^{c}}{N - c + 1} \tag{21}$$
Here the tolerance threshold is represented by 𝑠, which is set equal to 20% of the standard deviation of
the amplitude, and $N_i^{c}$ is the number of vectors $\vec{h}^{c}(j)$ that satisfy $d_{ij}^{c}/s \le 1$.
$d_{ij}^{c}$ is the distance between vectors $\vec{h}^{c}(i)$ and $\vec{h}^{c}(j)$, defined as the maximum
difference of their corresponding scalar components.
$$d_{ij}^{c} = \max_{k \in (0, c-1)} |f(i+k) - f(j+k)| \tag{22}$$
where 𝑗 varies from 1 to 𝑁 − 𝑐 + 1.
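A direct transcription of Equations (18) to (22) is sketched below for a one-dimensional series. How the segmented image is turned into such a series is not covered here. The sketch assumes the population standard deviation for the default tolerance and counts self-matches, as Equation (21) implies; the function names are hypothetical.

```python
import math
import statistics


def _phi(series, c, s):
    """p^c(s) of Equation (20): mean log-fraction of c-length vectors within tolerance."""
    n_vec = len(series) - c + 1
    vectors = [series[i:i + c] for i in range(n_vec)]           # Equation (18)
    total = 0.0
    for vi in vectors:
        matches = sum(
            1 for vj in vectors
            if max(abs(a - b) for a, b in zip(vi, vj)) <= s     # Equation (22)
        )
        total += math.log(matches / n_vec)                      # Equation (21)
    return total / n_vec


def approximate_entropy(series, c=2, s=None):
    """ApEn(c, s, N) of Equation (19); s defaults to 20% of the standard deviation."""
    if s is None:
        s = 0.2 * statistics.pstdev(series)  # assumption: population standard deviation
    return _phi(series, c, s) - _phi(series, c + 1, s)


# A strictly periodic toy series (made-up data) gives a value close to zero.
periodic = [1.0, 2.0] * 5
print(approximate_entropy(periodic, c=2, s=0.1))
```

Each vector always matches itself, so the logarithm in `_phi` is never taken of zero.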
3.4.2. Sample Entropy
Sample Entropy (SampEn) is an improved version of 𝐴𝑝𝐸𝑛. For an 𝑁-length time series 𝑓(𝑖), a series of
𝑐-length vectors $\vec{h}^{c}(i)$, 𝑖 = 1 ~ 𝑁 − 𝑐 + 1, can be obtained. The distance $d_{ij}^{c}$ between
vectors $\vec{h}^{c}(i)$ and $\vec{h}^{c}(j)$ is obtained as in Equation (22), with the condition 𝑗 ≠ 𝑖. For
each 𝑖, $B_i^{c}(s)$ is defined as,
$$B_i^{c}(s) = \frac{N_i^{c}}{N - c} \tag{23}$$
where $N_i^{c}$ is the number of vectors $\vec{h}^{c}(j)$ that satisfy $d_{ij}^{c}/s \le 1$. Consequently,
$B^{c}(s)$ is computed, which denotes the probability that two sequences match at 𝑐 points. This
probability is expressed as,
$$B^{c}(s) = (N - c)^{-1} \sum_{i=1}^{N-c} B_i^{c}(s) \tag{24}$$
The 𝑆𝑎𝑚𝑝𝐸𝑛 can be computed using Equation (25).
$$SampEn(c, s, N) = -\ln \frac{B^{c+1}(s)}{B^{c}(s)} \tag{25}$$
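The following sketch computes 𝑆𝑎𝑚𝑝𝐸𝑛 for a one-dimensional series. It follows the common convention of using the same 𝑁 − 𝑐 template vectors for both lengths 𝑐 and 𝑐 + 1, so the normalisation constants of Equations (23) and (24) cancel in the ratio of Equation (25); this convention is an assumption, since the text does not spell it out. The function name is hypothetical.

```python
import math


def sample_entropy(series, c=2, s=0.2):
    """SampEn(c, s, N) of Equation (25), excluding self-matches (j != i)."""
    n_templates = len(series) - c

    def count_matches(length):
        vectors = [series[i:i + length] for i in range(n_templates)]
        return sum(
            1
            for i, vi in enumerate(vectors)
            for j, vj in enumerate(vectors)
            if i != j and max(abs(a - b) for a, b in zip(vi, vj)) <= s
        )

    matches_c = count_matches(c)           # proportional to B^c(s)
    matches_c1 = count_matches(c + 1)      # proportional to B^{c+1}(s)
    if matches_c == 0 or matches_c1 == 0:
        return math.inf                    # no matches: the entropy is undefined/infinite
    return -math.log(matches_c1 / matches_c)


# Toy series (made-up data): the final sample breaks an otherwise periodic pattern.
print(sample_entropy([1, 2, 1, 2, 1, 3], c=2, s=0.1))  # ln 2, about 0.693
```

Unlike the approximate entropy sketch, self-matches are excluded here, which is why the no-match case has to be handled explicitly.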
3.4.3. Fuzzy Entropy
Fuzzy entropy is a measure that denotes the information of uncertainty. Its definition differs from those
of 𝐴𝑝𝐸𝑛 and 𝑆𝑎𝑚𝑝𝐸𝑛. Fuzzy entropy is denoted as 𝐹𝑢𝑧𝑧𝑦𝐸𝑛. A vector of length 𝑐 can be defined as
$$z^{c}(i) = [f(i), f(i+1), \ldots, f(i+c-1)] - f_0(i) \tag{26}$$
where $f_0(i)$ is the average value of the 𝑐 samples in the vector,
$$f_0(i) = \frac{1}{c} \sum_{l=0}^{c-1} f(i+l) \tag{27}$$
For each $z^{c}(i)$, the distance $d_{ij}^{c}$ between two vectors can be computed as in Equation (28).
$$d_{ij}^{c} = \max_{k \in (0, c-1)} |f(i+k) - f_0(i) - f(j+k) + f_0(j)| \tag{28}$$
The similarity degree between $z^{c}(i)$ and $z^{c}(j)$ can be calculated as
$$D_{ij}^{c}(n, \sigma) = \exp\left( \frac{-(d_{ij}^{c})^{n}}{\sigma^{n}} \right) \tag{29}$$
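where $\exp(\cdot)$, $n$ and $\sigma$ represent the exponential fuzzy membership function, its gradient
and its width, respectively. The function $p^{c}(n, \sigma)$ is defined as
$$p^{c}(n, \sigma) = \frac{1}{N-c} \sum_{i=1}^{N-c} \left( \frac{1}{N-c-1} \sum_{j=1,\, j \ne i}^{N-c} D_{ij}^{c} \right) \tag{30}$$
Then the $FuzzyEn$ can be computed as
$$FuzzyEn(c, n, \sigma, N) = -\ln \frac{p^{c+1}(n, \sigma)}{p^{c}(n, \sigma)} \tag{31}$$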
The provided entropy features are computed using the expressions given above, for the segmented plant
leaf image.
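The fuzzy entropy of Equations (26) to (31) can be sketched in the same way. The sketch below is
illustrative only: the membership function is taken to be exp(−(d/σ)ⁿ), which is one reading of
Equation (29), and the defaults c = 2, n = 2 and σ equal to 20% of the standard deviation, as well as all
function names and test series, are assumptions rather than settings reported in this work.

```python
import math
import random
import statistics


def _mean_similarity(f, length, n_vec, n, sigma):
    """Return p^length(n, sigma) of Equation (30)."""
    # Mean-removed vectors z(i), Equations (26) and (27).
    z = []
    for i in range(n_vec):
        window = f[i:i + length]
        mean = sum(window) / length
        z.append([v - mean for v in window])
    total = 0.0
    for i in range(n_vec):
        for j in range(n_vec):
            if j == i:
                continue
            d = max(abs(a - b) for a, b in zip(z[i], z[j]))   # Equation (28)
            total += math.exp(-((d / sigma) ** n))             # assumed form of Equation (29)
    return total / (n_vec * (n_vec - 1))


def fuzzy_entropy(f, c=2, n=2, sigma_ratio=0.2):
    """FuzzyEn(c, n, sigma, N) of Equation (31)."""
    sigma = sigma_ratio * statistics.pstdev(f)
    n_vec = len(f) - c
    p_c = _mean_similarity(f, c, n_vec, n, sigma)
    p_c1 = _mean_similarity(f, c + 1, n_vec, n, sigma)
    return -math.log(p_c1 / p_c)


# Illustrative inputs (assumed): a perfectly regular series and white noise.
regular = [0.0, 1.0] * 50
rng = random.Random(0)
noisy = [rng.gauss(0.0, 1.0) for _ in range(200)]
fuzzyen_regular = fuzzy_entropy(regular)
fuzzyen_noisy = fuzzy_entropy(noisy)
```

Unlike the hard tolerance test of the two previous measures, every pair of vectors contributes a graded
similarity here, so the result changes smoothly with σ.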
3.5. Classification
The extreme learning machine (ELT) [55] was initially introduced for single-hidden-layer feedforward
neural networks (SLFNs) and was afterwards extended to generalized SLFNs, in which the hidden layer need
not be neuron-like. In ELT, all the hidden node parameters are randomly generated without tuning, even
before ELT sees the training data. The output of ELT is
$$f(x) = \sum_{i=1}^{L} \beta_i\, G(a_i, b_i, x) = \beta \cdot h(x) \tag{32}$$
where $\beta_i$ is the output weight from the $i$th hidden node to the output node and $G(a_i, b_i, x)$ is
the output of the $i$th hidden node. $h(x) = [G(a_1, b_1, x), \ldots, G(a_L, b_L, x)]^{T}$ is the output
vector of the hidden layer with respect to the input $x$. $h(x)$ maps the data from the $d$-dimensional
input space to the $L$-dimensional hidden-layer feature space (ELT feature space) $H$. For binary
classification applications, the decision function of ELT is
$$f(x) = \mathrm{sign}\left( \sum_{i=1}^{L} \beta_i\, G(a_i, b_i, x) \right) = \mathrm{sign}(\beta \cdot h(x)) \tag{33}$$
Compared with conventional learning algorithms, ELT tends to reach not only the smallest training error
but also the smallest norm of output weights. According to Bartlett's theory [21], for feedforward neural
networks that reach a small training error, the smaller the norm of the weights is, the better the
generalization performance of the networks tends to be. We conjecture that this may also hold for
generalized SLFNs. ELT minimizes the training error as well as the norm of the output weights [2,3]:
$$\text{Minimize:} \quad \sum_{i=1}^{N} \| \beta \cdot h(x_i) - t_i \| \tag{34a}$$
$$\text{Minimize:} \quad \| \beta \| \tag{34b}$$
From Equation (34b), minimizing the norm of the output weights $\|\beta\|$ amounts to maximizing the
distance between the separating margins of the two classes in the ELT feature space, $2/\|\beta\|$,
although the minimal-norm least-squares method, rather than the standard optimization method, was used in
the original implementation of ELT.
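To make Equations (32) to (34) concrete, the following Python sketch trains a basic extreme learning
machine on synthetic two-class data. It is a sketch under stated assumptions, not the classifier evaluated
in this work: the sigmoid form of G, the number of hidden nodes, the toy data and all function names are
chosen here for illustration, and the output weights come from the Moore-Penrose pseudoinverse, which
gives the minimal-norm least-squares solution mentioned above.

```python
import numpy as np


def hidden_output(X, A, b):
    """h(x) for every row of X, using sigmoid hidden nodes (an assumed choice of G)."""
    return 1.0 / (1.0 + np.exp(-(X @ A + b)))


def train_elm(X, t, n_hidden=50, seed=0):
    """Draw the hidden parameters at random and solve for the output weights."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(X.shape[1], n_hidden))   # input weights a_i, never tuned
    b = rng.normal(size=n_hidden)                 # hidden biases b_i, never tuned
    H = hidden_output(X, A, b)
    beta = np.linalg.pinv(H) @ t                  # minimal-norm least-squares solution
    return A, b, beta


def predict_elm(X, A, b, beta):
    """Binary decision function of Equation (33)."""
    return np.sign(hidden_output(X, A, b) @ beta)


# Synthetic two-class data with targets -1 and +1 (illustrative only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, (100, 2)), rng.normal(2.0, 1.0, (100, 2))])
t = np.hstack([-np.ones(100), np.ones(100)])

A, b, beta = train_elm(X, t)
train_accuracy = float(np.mean(predict_elm(X, A, b, beta) == t))
```

Only the output weights are solved for; the hidden layer stays at its random initial values, which is why
training reduces to a single linear least-squares problem.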
The weights between the layers of the ELT network are updated using the PBOA technique to obtain optimum
values. The fitness function for this optimization is the minimization of the classification error. The
weight matrix is taken as the population matrix in PBOA and is initialized randomly.
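The fitness evaluation described above can be sketched as follows. The sketch only shows how a randomly
initialized population of candidate weight vectors could be scored by classification error; it does not
implement the PBOA update rules. Treating each candidate as the hidden-layer weights and biases, solving
the output weights by least squares, and the toy data and names used are all assumptions made for
illustration.

```python
import numpy as np


def classification_error(candidate, X, t, n_hidden):
    """Fitness of one candidate: fraction of misclassified training samples."""
    d = X.shape[1]
    A = candidate[:d * n_hidden].reshape(d, n_hidden)   # hidden-layer weights
    b = candidate[d * n_hidden:]                        # hidden-layer biases
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))              # sigmoid hidden layer (assumed)
    beta = np.linalg.pinv(H) @ t                        # least-squares output weights
    return float(np.mean(np.sign(H @ beta) != t))


# Synthetic two-class data with targets -1 and +1 (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (60, 2)), rng.normal(2.0, 1.0, (60, 2))])
t = np.hstack([-np.ones(60), np.ones(60)])

n_hidden, pop_size = 10, 8
dim = X.shape[1] * n_hidden + n_hidden

# Population matrix: one candidate weight vector per row, initialized at random.
population = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
fitness = np.array([classification_error(p, X, t, n_hidden) for p in population])
best = population[np.argmin(fitness)]
```

An optimizer such as PBOA would then move the rows of the population matrix between fitness evaluations
and keep the candidate with the lowest error.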
4. RESULT AND DISCUSSION
The performance of the proposed method is analysed using the PlantVillage dataset [54], an open-source
database of 54,323 images collected from 14 different plants and covering 38 kinds of plant disease. The
images were resized to 224 × 224 × 3 and normalized by dividing the pixel values by 255 to make them
suitable as initial input values for the models. For validating the proposed work, the images are divided
into training and testing sets in the proportion of 90% for training and 10% for testing. An input leaf
image taken from the dataset is shown in Figure 2.
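The preprocessing described above (resizing, scaling by 255 and a 90/10 split) can be sketched in Python
as follows. This is an illustration, not the MATLAB pipeline used in this work: the nearest-neighbour
resize is a simple stand-in for whatever interpolation was actually applied, the random images stand in
for PlantVillage photographs, and the function names and the random seed are assumptions.

```python
import numpy as np


def resize_nearest(image, size=(224, 224)):
    """Nearest-neighbour resize of an H x W x 3 array (stand-in for a library resize)."""
    rows = np.arange(size[0]) * image.shape[0] // size[0]
    cols = np.arange(size[1]) * image.shape[1] // size[1]
    return image[rows][:, cols]


def preprocess(image):
    """Resize to 224 x 224 x 3 and scale pixel values from [0, 255] to [0, 1]."""
    return resize_nearest(image).astype(np.float32) / 255.0


def train_test_split(n_samples, train_fraction=0.9, seed=0):
    """Return shuffled index arrays for the training and testing sets."""
    order = np.random.default_rng(seed).permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]


# Synthetic 8-bit images stand in for leaf photographs.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(20, 256, 256, 3), dtype=np.uint8)

batch = np.stack([preprocess(img) for img in images])
train_idx, test_idx = train_test_split(len(batch))
```

Splitting by shuffled indices rather than by copying the arrays keeps the same preprocessed batch usable
for both subsets.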
Figure 2 Input leaf image
The corresponding multilevel thresholding segmentation results for this image are given in Figure 3.
Figure 3 Multilevel thresholding segmentation results a) Level 1 b) Level 2 c) Level 3
Multilevel thresholding gives a different segmented result at each threshold level. From Figure 3, it can
be observed that the level 3 segmentation gives a better segmented leaf image for identifying differences
effectively. The implementation is done in MATLAB. The segmentation results are evaluated by measuring
segmentation accuracy. The conventional Otsu thresholding method, K-means clustering, hybrid Otsu
thresholding with K-means clustering and multilevel thresholding are considered for comparative analysis
(as listed in Table 1 and plotted in Figure 4). The evaluated accuracy metric shows that multilevel
thresholding provides better accuracy. Furthermore, additional performance can be attained with the
application of PBOA. Since the optimum threshold values are identified, the segmentation process achieves
good accuracy. The accuracy obtained with the proposed segmentation technique is 98.82%.
Another significant analysis is to evaluate the efficacy of the proposed PBOA. This can be accomplished
by analyzing the objective function values of various optimization algorithms (OAs). In multilevel
thresholding segmentation, the objective is to attain maximum between-class variance. Table 2 displays the
objective function values obtained using different OAs.
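As a small illustration of the objective that the optimization algorithms maximize, the following Python
sketch evaluates the between-class variance of a grey-level histogram for a given set of thresholds. It
uses the standard weighted form of Otsu's criterion; the exact scaling behind the values in Table 2 is not
restated here, so the function name, the toy histogram and the two candidate thresholds are assumptions
for illustration only.

```python
def between_class_variance(histogram, thresholds):
    """Between-class variance of the classes cut out by the given thresholds."""
    total = sum(histogram)
    probs = [count / total for count in histogram]
    global_mean = sum(level * p for level, p in enumerate(probs))

    # Class k covers the grey levels in [edges[k], edges[k + 1]).
    edges = [0] + sorted(thresholds) + [len(histogram)]
    variance = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        weight = sum(probs[lo:hi])                       # class probability
        if weight == 0.0:
            continue                                     # an empty class adds nothing
        mean = sum(level * probs[level] for level in range(lo, hi)) / weight
        variance += weight * (mean - global_mean) ** 2
    return variance


# Toy bimodal histogram: 50 pixels at grey level 10 and 50 at grey level 200.
histogram = [0] * 256
histogram[10] = 50
histogram[200] = 50

good = between_class_variance(histogram, [100])   # threshold between the two modes
poor = between_class_variance(histogram, [5])     # threshold below both modes
```

A threshold that separates the two modes scores far higher than one that leaves them in the same class,
which is the property an optimizer exploits when searching for several thresholds at once.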
Table 1 Comparative analysis of Segmentation Accuracy for various segmentation methods

| Method | Segmentation Accuracy (%) |
|---|---|
| Otsu thresholding | 97.45 |
| K-means clustering | 97.84 |
| Otsu thresholding with K-means clustering | 97.57 |
| Multilevel thresholding | 98.56 |
| Multilevel thresholding with K-means clustering and PBOA | 98.82 |
The results obtained for 8 images are provided together with the number of threshold levels used in
multilevel thresholding to achieve the optimum result. Some images need 2 levels for optimum segmentation,
while others need as many as 9 levels.
Figure 4 Comparative analysis of Segmentation Accuracy for various methods
Thus, manually setting the number of levels and the threshold values is impractical for a huge dataset.
Here, the optimization algorithm performs the required task and yields better results within constrained
time limits, as tabulated in Table 1.
Table 2 Comparative analysis of objective function value of multilevel Otsu thresholding method
for different optimization algorithms

| Input image | No. of thresholds | GA | PSO | ACO | BBO | KHO | GOA | BOA | PBOA |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 12.3344 | 12.3470 | 12.3459 | 12.334 | 12.37 | 12.3523 | 12.3613 | 12.3653 |
| 2 | 2 | 14.995 | 15.2206 | 15.1336 | 15.1336 | 15.2367 | 15.2467 | 15.2628 | 15.2678 |
| 3 | 7 | 17.0892 | 17.8388 | 17.0892 | 17.8388 | 18.0431 | 17.9333 | 18.0431 | 18.086 |
| 4 | 2 | 12.3344 | 12.3470 | 12.3459 | 12.334 | 12.37 | 12.3523 | 12.3613 | 12.3653 |
| 5 | 9 | 14.995 | 15.2206 | 15.1336 | 15.1336 | 15.2367 | 15.2467 | 15.2628 | 15.2678 |
| 6 | 7 | 17.0892 | 17.8388 | 17.0892 | 17.8388 | 18.0431 | 17.9333 | 18.0431 | 18.086 |
| 7 | 8 | 12.3344 | 12.3470 | 12.3459 | 12.334 | 12.37 | 12.3523 | 12.3613 | 12.3653 |
| 8 | 4 | 14.995 | 15.2206 | 15.1336 | 15.1336 | 15.2367 | 15.2467 | 15.2628 | 15.2678 |
As discussed previously, the computation time is analysed for different OAs. In addition, the
segmentation accuracy is measured by employing different OAs with multilevel thresholding and K-means
clustering. The algorithms taken for comparative analysis are genetic algorithm (GA), particle swarm
optimization (PSO), ant colony optimization (ACO), biogeography based optimization (BBO), krill herd
optimization (KHO), grasshopper optimization algorithm (GOA), BOA and PBOA; the results are tabulated in
Table 3 and depicted in Figure 5. The computation time analysis shows that the proposed PBOA converges
fastest while also giving improved accuracy.
Table 3 Comparative analysis of performance of different optimization algorithms

| Method | Segmentation Accuracy (%) | Computation time (ms) |
|---|---|---|
| GA | 96.49 | 26419 |
| PSO | 96.86 | 24192 |
| ACO | 97.14 | 23149 |
| BBO | 97.45 | 22489 |
| KHO | 97.84 | 21499 |
| GOA | 97.57 | 19849 |
| BOA | 98.56 | 18456 |
| PBOA | 98.82 | 17494 |
Figure 5 Comparative segmentation analyses of various optimization methods
The ELT classifier labels the disease based on the extracted features. The performance of the
classification process is evaluated by measuring accuracy (as shown in Figure 6), sensitivity,
specificity, precision and F1 score, which are tabulated in Table 4. It is observed that the proposed
method performs better than the other techniques.
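For reference, the five measures named above can be computed from confusion-matrix counts as in the
following Python sketch. The counts in the example are invented for illustration and are unrelated to the
results reported in Table 4; treating each disease class in a one-vs-rest manner to obtain such counts is
also an assumption made here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, precision and F1 score from counts."""
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for one class, used only to exercise the function.
metrics = classification_metrics(tp=90, fp=20, tn=80, fn=10)
```

The F1 score is the harmonic mean of precision and sensitivity, so it drops sharply when either of the
two is low.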
Table 4 Comparative analysis of disease detection for different techniques

| Method | Accuracy | Sensitivity | Specificity | Precision | F1 score |
|---|---|---|---|---|---|
| Otsu thresholding+SVM | 97.08 | 98.51 | 97.79 | 99.31 | 98.26 |
| K-means clustering+SVM | 98.57 | 98.71 | 97.98 | 99.60 | 97.96 |
| Otsu+K-means clustering+SVM | 98.69 | 98.61 | 98.09 | 99.81 | 98.34 |
| Otsu+K-means clustering+ANFIS | 98.77 | 98.47 | 98.24 | 99.57 | 98.35 |
| Otsu+K-means clustering+ELT | 98.80 | 98.62 | 98.31 | 99.60 | 98.47 |
| Proposed Multilevel thresholding with K means+SVM | 98.83 | 98.70 | 98.35 | 99.77 | 98.53 |
| Proposed Multilevel thresholding with K means+ANFIS | 98.85 | 98.64 | 98.36 | 99.79 | 98.44 |
| Proposed Multilevel thresholding with K means+ELT | 98.93 | 98.74 | 98.38 | 99.88 | 98.54 |
Figure 6 Comparative Accuracy analyses of several methodologies
5. CONCLUSION
A new segmentation technique with proficient performance is presented in this work for detecting leaf
disease in plants. The scheme is implemented using PBOA-based multilevel thresholding and PBOA-based
K-means clustering for the segmentation process. From the segmented leaf images, the entropy features are
extracted and supplied to the classifier. The ELT classifier performance is enhanced by tuning the weights
using the PBOA method. Thus the overall detection performance is increased. The application of PBOA in
various steps helps to meet the required objectives, namely less computation time, more accurate disease
discrimination and reduced error. The method is validated with the standard PlantVillage dataset and the
results are compared with different techniques.
ABBREVIATIONS
Not applicable
DATA AVAILABILITY STATEMENT
My manuscript has no associated data.
DECLARATION OF COMPETING INTEREST
The authors declare that they do not have any conflict of interest with organizations.
FUNDING STATEMENT
No funding received from any organization.
AUTHOR CONTRIBUTIONS
Conceptualization, Methodology, Software, Visualization, data curation, Writing-original draft
Validation, resources, Investigation, Writing-review and editing, supervision, project administration.
ACKNOWLEDGEMENTS
The authors would like to thank Technical head and team members Saveetha School of Engineering for
their helpful discussions and technical assistance. Also authors would like to thank research scholars of
Computer science, Electrical and Electronics, Electronics and Communication Engineering Saveetha
School of Engineering for frequency pattern development, which is used in software level checking
process.
References
1. Saleem, Muhammad Hammad, Johan Potgieter, and Khalid Mahmood Arif. "Plant disease detection and
classification by deep learning." Plants 8.11 (2019): 468.
2. Phan, Thanh Noi, Verena Kuch, and Lukas W. Lehnert. "Land cover classification using Google Earth
Engine and random forest classifier—the role of image composition." Remote Sensing 12.15 (2020):
2411.
3. Mehdy, M. M., et al. "Artificial neural networks in image processing for early detection of breast
cancer." Computational and mathematical methods in medicine 2017 (2017).
4. Suthaharan, Shan. "Support vector machine." Machine learning models and algorithms for big data
classification. Springer, Boston, MA, 2016. 207-235.
5. Zhou, Jian, et al. "Performance evaluation of hybrid FFA-ANFIS and GA-ANFIS models to predict
particle size distribution of a muck-pile after blasting." Engineering with computers 37.1 (2021): 265-
274.
6. Kanungo, Tapas, et al. "An efficient k-means clustering algorithm: Analysis and implementation." IEEE
transactions on pattern analysis and machine intelligence 24.7 (2002): 881-892.
7. Kim, Hankil, Jinyoung Kim, and Hoekyung Jung. "Convolutional neural network based image
processing system." Journal of information and communication convergence engineering 16.3 (2018):
160-165.
8. Chen, Tingting, et al. "Detection of peanut leaf spots disease using canopy hyperspectral
reflectance." Computers and electronics in agriculture 156 (2019): 677-683.
9. Xie, Chuanqi, Ce Yang, and Yong He. "Hyperspectral imaging for classification of healthy and gray
mold diseased tomato leaves with different infection severities." Computers and Electronics in
Agriculture 135 (2017): 154-162.
10. Kobayashi, T., et al. "Detection of rice panicle blast with multispectral radiometer and the potential of
using airborne multispectral scanners." Phytopathology 91.3 (2001): 316-323.
11. Camargo, A., and J. S. Smith. "An image-processing based algorithm to automatically identify plant
disease visual symptoms." Biosystems engineering 102.1 (2009): 9-21.
12. Kamilaris, Andreas, and Francesc X. Prenafeta-Boldú. "Deep learning in agriculture: A
survey." Computers and electronics in agriculture 147 (2018): 70-90.
13. Ferentinos, Konstantinos P. "Deep learning models for plant disease detection and
diagnosis." Computers and electronics in agriculture 145 (2018): 311-318.
14. Türkoğlu, Muammer, and Davut Hanbay. "Plant disease and pest detection using deep learning-based
features." Turkish Journal of Electrical Engineering & Computer Sciences 27.3 (2019): 1636-1651.
15. Too, Edna Chebet, et al. "A comparative study of fine-tuning deep learning models for plant disease
identification." Computers and Electronics in Agriculture 161 (2019): 272-279.
16. Singh, Uday Pratap, et al. "Multilayer convolution neural network for the classification of mango leaves
infected by anthracnose disease." IEEE Access 7 (2019): 43721-43729.
17. Liu, Bin, et al. "Identification of apple leaf diseases based on deep convolutional neural
networks." Symmetry 10.1 (2018): 11.
18. LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." nature 521.7553 (2015): 436-
444.
19. Schmidhuber, Jürgen. "Deep learning in neural networks: An overview." Neural networks 61 (2015):
85-117.
20. Huang, Guang-Bin, Qin-Yu Zhu, and Chee-Kheong Siew. "Extreme learning machine: a new learning
scheme of feedforward neural networks." 2004 IEEE international joint conference on neural networks
(IEEE Cat. No. 04CH37541). Vol. 2. Ieee, 2004.
21. Bartlett, P. "The sample complexity of pattern classification with neural networks: the size of the
weights is more important than the size of the network." Technical report, Australian National
University (1996).
22. Huang, Guang-Bin, Qin-Yu Zhu, and Chee-Kheong Siew. "Extreme learning machine: theory and
applications." Neurocomputing 70.1-3 (2006): 489-501.
23. Nida, Nudrat, et al. "Instructor activity recognition through deep spatiotemporal features and
feedforward extreme learning machines." Mathematical Problems in Engineering 2019 (2019).
24. Sasank, V. V. S., and S. Venkateswarlu. "Brain tumor classification using modified kernel based
softplus extreme learning machine." Multimedia Tools and Applications 80.9 (2021): 13513-13534.
25. Mukherjee, Himadri, et al. "Line spectral frequency-based features and extreme learning machine for
voice activity detection from audio signal." International Journal of Speech Technology 21.4 (2018):
753-760.
26. Chang, Ken, et al. "Distributed deep learning networks among institutions for medical
imaging." Journal of the American Medical Informatics Association 25.8 (2018): 945-954.
27. Vu, Hoai Nam, et al. "Automatic extraction of text regions from document images by multilevel
thresholding and k-means clustering." 2015 IEEE/ACIS 14th International Conference on Computer
and Information Science (ICIS). IEEE, 2015.
28. Patil, Rupali, et al. "Grape leaf disease detection using k-means clustering algorithm." International
Research Journal of Engineering and Technology (IRJET) 3.4 (2016): 2330-2333.
29. Pravin Kumar, S. K., M. G. Sumithra, and N. Saranya. "Artificial bee colony-based fuzzy c means
(ABC-FCM) segmentation algorithm and dimensionality reduction for leaf disease detection in
bioinformatics." The Journal of Supercomputing 75.12 (2019): 8293-8311.
30. Vijay, PatilPriyanka, and N. C. Patil. "Gray scale image segmentation using OTSU Thresholding
optimal approach." Journal for Research 2.05 (2016).
31. Khan, Muhammad Waseem. "A survey: Image segmentation techniques." International Journal of
Future Computer and Communication 3.2 (2014): 89.
32. Cao, Lian-Lian, et al. "Otsu multilevel thresholding segmentation based on quantum particle swarm
optimisation algorithm." International Journal of Wireless and Mobile Computing 10.3 (2016): 272-
277.
33. Xiong, Lu, et al. "Color disease spot image segmentation algorithm based on chaotic particle swarm
optimization and FCM." The Journal of Supercomputing 76.11 (2020): 8756-8770.
34. Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep
convolutional neural networks." Advances in neural information processing systems 25 (2012).
35. Mohanty, S. P. "Dataset of diseased plant leaf images and corresponding labels." Available at:
https://github.com/spMohanty/PlantVillage-Dataset. Accessed on: June 28 (2016).
36. Sibiya, Malusi, and Mbuyu Sumbwanyambe. "A computational procedure for the recognition and
classification of maize leaf diseases out of healthy leaves using convolutional neural
networks." AgriEngineering 1.1 (2019): 119-131.
37. Zhang, Keke, et al. "Can deep learning identify tomato leaf disease?." Advances in multimedia 2018
(2018).
38. Amara, Jihen, Bassem Bouaziz, and Alsayed Algergawy. "A deep learning-based approach for banana
leaf diseases classification." Datenbanksysteme für Business, Technologie und Web (BTW 2017)-
Workshopband (2017).
39. Türkoğlu, Muammer, and Davut Hanbay. "Plant disease and pest detection using deep learning-based
features." Turkish Journal of Electrical Engineering & Computer Sciences 27.3 (2019): 1636-1651.
40. Ramcharan, Amanda, et al. "Deep learning for image-based cassava disease detection." Frontiers in
plant science 8 (2017): 1852.
41. Too, Edna Chebet, et al. "A comparative study of fine-tuning deep learning models for plant disease
identification." Computers and Electronics in Agriculture 161 (2019): 272-279
42. Rangarajan, Aravind Krishnaswamy, Raja Purushothaman, and Aniirudh Ramesh. "Tomato crop disease
classification using pre-trained deep learning algorithm." Procedia computer science 133 (2018): 1040-
1047.
43. Kaur, Nirpjeet, and Rajpreet Kaur. "A review on various methods of image thresholding." International
Journal on Computer Science and Engineering 3.10 (2011): 3441.
44. Song, Jianhua, Wang Cong, and Jin Li. "A Fuzzy C-means Clustering Algorithm for Image
Segmentation Using Nonlinear Weighted Local Information." J. Inf. Hiding Multim. Signal Process. 8.3
(2017): 578-588.
45. Gandhimathi Alias Usha, S., and S. Vasuki. "Improved segmentation and change detection of multi-
spectral satellite imagery using graph cut based clustering and multiclass SVM." Multimedia Tools and
Applications 77.12 (2018): 15353-15383.
46. Resma, KP Baby, and Madhu S. Nair. "Multilevel thresholding for image segmentation using Krill Herd
Optimization algorithm." Journal of King Saud University-Computer and Information Sciences 33.5
(2021): 528-541.
47. Mlakar, Uroš, Božidar Potočnik, and Janez Brest. "A hybrid differential evolution for optimal multilevel
image thresholding." Expert Systems with Applications 65 (2016): 221-232.
48. Jiang, Yunzhi, et al. "A honey-bee-mating based algorithm for multilevel image segmentation using
Bayesian theorem." Applied Soft Computing 52 (2017): 1181-1190.
49. Muangkote, Nipotepat, Khamron Sunat, and Sirapat Chiewchanwattana. "Rr-cr-IJADE: An efficient
differential evolution algorithm for multilevel image thresholding." Expert Systems with Applications 90
(2017): 272-289.
50. Abd El Aziz, Mohamed, Ahmed A. Ewees, and Aboul Ella Hassanien. "Whale optimization algorithm
and moth-flame optimization for multilevel thresholding image segmentation." Expert Systems with
Applications 83 (2017): 242-256.
51. Arora, Sankalap, Satvir Singh, and Kaan Yetilmezsoy. "A modified butterfly optimization algorithm for
mechanical design optimization problems." Journal of the Brazilian Society of Mechanical Sciences and
Engineering 40.1 (2018): 1-17.
52. Tubishat, Mohammad, et al. "Dynamic butterfly optimization algorithm for feature selection." IEEE
Access 8 (2020): 194303-194314.
53. Chen, Xin, et al. "Entropy-based surface electromyogram feature extraction for knee osteoarthritis
classification." IEEE Access 7 (2019): 164144-164151.
54. Hughes, David, and Marcel Salathé. "An open access repository of images on plant health to enable the
development of mobile disease diagnostics." arXiv preprint arXiv:1511.08060 (2015).
55. Ge, Hongwei, et al. "Stacked denoising extreme learning machine autoencoder based on graph
embedding for feature representation." IEEE Access 7 (2019): 13433-13444.
Figure Legends
Figure 1 Block diagram of proposed work
Figure 2 Input leaf image
Figure 3 Multilevel thresholding segmentation results a) Level 1 b) Level 2 c) Level 3
Figure 4 Comparative analysis of Segmentation Accuracy for various methods
Figure 5 Comparative segmentation analyses of various optimization methods
Figure 6 Comparative Accuracy analyses of several methodologies

Dr. Beulah David is a professor in the Institute of Information Technology at Saveetha School of
Engineering, Thandalam, Chennai.
R Gomathi is currently working as Assistant Professor at University College of Engineering, Dindigul
Campus, Dindigul, Tamilnadu. Her special fields of interest include Digital Image Processing, Remote
Sensing and Neural Networks and Applications.
