Bacterial Image Classification Using
Convolutional Neural Networks
2020 IEEE 17th India Council International Conference (INDICON) | 978-1-7281-6916-3/20/$31.00 ©2020 IEEE | DOI: 10.1109/INDICON49873.2020.9342356
Tumun Shaily and Kala S
Department of Computer Science and Engineering,
Indian Institute of Information Technology, Kottayam, Kerala, India
Email: {tumunshaily2017, kala}@iiitkottayam.ac.in
Abstract—Bacteria classification is an essential task in the medical field, for the diagnosis and treatment of various diseases. Typically, classification has been done by clinical specialists using conventional techniques, which do not rely on prediction approaches. Manual classification of bacteria is a time-consuming and challenging task that requires considerable human effort. As technology has advanced, classification of micro-organisms has become possible with the aid of novel machine learning algorithms implemented on computers. Deep Neural Network (DNN) is one such promising technology, which has been widely used for image classification. One variant of DNN is the Convolutional Neural Network (CNN), an efficient technique for classification problems, which has been used in this paper for bacteria classification. We have used the ResNet-50 CNN model for classifying bacterial images into twenty medically relevant categories. Using our approach, we obtained a classification accuracy of 99.9%. Experimental results show that our technique gives better results than the state-of-the-art approaches for bacteria classification.

Index Terms—Microbes Classification, Machine learning, Computer vision, GPU, Convolutional Neural Networks

I. INTRODUCTION

The COVID-19 pandemic and horrific incidents like Influenza (1847-1848), Bubonic plague (1855-1860), Cholera (1817-1824) and more have shown the need for better medical equipment and faster testing. More lives were lost due to lack of proper diagnosis. Most of these pandemics were caused by microbes, especially bacteria and viruses. Studying these incidents, one realizes that one of the major problems was identification of the micro-organism family. In this paper we focus on classification of one such micro-organism, i.e., bacteria, using computer-aided techniques. Typically, microbiologists use standard equipment and testing methodologies for recognizing the species of bacteria. This is a time-consuming process and requires large human effort. Also, manual classification is prone to errors. In the field of microbiology, computer-aided methods can be used in analyzing various micro-organisms and also for faster diagnosis. With the advancement of technology, image processing and pattern recognition have improved significantly, resulting in high-performance classification algorithms and approaches. This requires efficient machine learning algorithms and techniques which can provide accurate results, based on the features extracted.

Bacteria are micro-organisms without a nucleus, and their size is in the range of micrometers. Classification of bacteria is not a trivial task, since their shape varies from spiral and sphere to rod. Therefore classification is generally done based on their cell structure and components. For the past few years, machine learning algorithms have been popular for image classification. Deep learning networks have become a topic of wide interest for researchers all over the world due to their high performance and accuracy. Convolutional Neural Network (CNN) is a promising deep learning approach for classification tasks and is widely used in medical imaging, cancer detection and other emerging medical applications [4], [15], [20]. In this paper we focus on classifying bacterial images using Convolutional Neural Networks. We have used two variants of the ResNet CNN model, i.e., ResNet-34 and ResNet-50, for classification of microscopic bacterial images. Key contributions of this paper are:
• We train the CNN models ResNet-34 and ResNet-50 using the DIBaS dataset
• Execution and training of the ResNet models were performed on GPU
• Classification of bacteria into twenty categories has been done
• Our approach outperforms existing approaches with an accuracy of 99.9%

The rest of the paper is organized as follows. Section II gives the background and related works on microscopic image classification. Section III discusses the dataset used and the pre-processing for the CNN model. The proposed approach and the system set-up are explained in Section IV. Implementation and results are discussed in Section V. Finally, the paper is concluded in Section VI.

II. BACKGROUND AND RELATED WORKS

The size of a bacterium falls in the range from 0.2 to 20 micrometers, and therefore microscopes are used to identify or classify them. Microscopic observation and different types of chemical testing are the current approaches used in practice for microbe detection, and these require expensive equipment. These processes need more time under human observation and hence they are slow. When there are a lot of samples for testing, this approach gets slower and might be more expensive. In today's scenario, IT sectors are using image processing and many medical labs are also integrating their equipment with it for fast and efficient microbe classification. Artificial Neural Networks (ANN) have been widely used for classification problems. A typical ANN structure is shown in Fig. 1.

Fig. 1. Typical Artificial Neural Network Model

Image processing based bacteria classification has been presented in several recent research papers [1], [2], [6], [16]–[19]. In [1], the authors used a Naive Bayes classifier to identify bacteria from microscopic morphology. There are several ways to recognize bacteria species automatically, such as statistical methods as presented in [2], artificial neural networks as discussed in [6], [7], or other machine learning classifiers [8]. In [2], statistical methods like Fisher vector and local image descriptors were used. For classification, Support Vector Machines (SVM) and Random Forest (RF) approaches were used. Classification of bacteria into thirty-three categories has been performed in that paper. The authors in [9] used a transfer learning technique to retrain the Inception CNN model, using a dataset of 500 bacteria images of five different species, and achieved 95% accuracy. Nie et al. in [10] used convolutional deep belief networks, which follow an unsupervised learning method. One of the demerits of the model discussed in [10] was that the approach was less efficient for multiple bacterial colonies. The authors in [3] used the Bag-of-Words technique for feature extraction. Classification of bacterial images into ten categories has been done in that paper. Both [2] and [3] used the DIBaS dataset for classification. VGGNet and AlexNet CNN models were used for classification in [5], where thirty-three classes of bacterial images were classified.

Several popular CNN models like AlexNet [13], VGGNet [14], ResNet [12] etc. are available in the literature. Deep Residual Networks (ResNet) have become one of the most popular CNN models due to their accuracy on classification problems. ResNet has fewer parameters and lower complexity compared to the VGGNet model, and the number of layers can range up to 1000. Filter sizes used for convolutions vary widely in this network. As our problem contains many different classes of bacteria, we use Residual Neural Networks for higher accuracy in classifying different species. In this paper, we present ResNet-34 and ResNet-50 CNN models to solve the microscopic bacteria image classification problem. The ResNet architecture is shown in Fig. 2.
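The residual idea behind ResNet can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' implementation: it assumes dense (matrix) layers in place of convolutions for brevity, and the names residual_block, w1 and w2 are hypothetical. The block computes y = ReLU(F(x) + x), where F is the residual mapping and x is carried unchanged by the skip connection.

```python
import numpy as np


def relu(x):
    """Rectified linear unit applied element-wise."""
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """Return relu(F(x) + x), with F(x) = relu(x @ w1) @ w2.

    Assumption: dense layers stand in for the convolutional layers
    of a real ResNet block; shapes must keep the feature size fixed
    so that F(x) and x can be added element-wise.
    """
    residual = relu(x @ w1) @ w2   # F(x): the residual mapping
    return relu(residual + x)      # skip connection: element-wise addition


# Toy input: a batch of 4 feature vectors of length 8 (made-up data).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w1 = rng.normal(scale=0.1, size=(8, 8))
w2 = rng.normal(scale=0.1, size=(8, 8))

y = residual_block(x, w1, w2)

# With zero weights the residual vanishes and only the skip path remains.
zeros = np.zeros((8, 8))
y_identity = residual_block(x, zeros, zeros)
```

When both weight matrices are zero, F(x) vanishes and the block reduces to ReLU(x). This is the intuition usually given for why stacking many such blocks does not make the identity mapping harder to represent.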
Authorized licensed use limited to: University of Exeter. Downloaded on May 27,2021 at 20:37:17 UTC from IEEE Xplore. Restrictions apply.
III. PRE-PROCESSING

A. DIBaS dataset

The Digital Images of Bacteria Species dataset (DIBaS) contains 33 bacteria species with an average of 20 images for each of them. It was collected by the Chair of Microbiology of the Jagiellonian University in Krakow, Poland (http://www.km.cm-uj.krakow.pl/). Bacteria images were collected from the DIBaS database. Table I summarizes some of the genera and species of the bacteria in this dataset, while Fig. 3 presents fragments of the images. All of the samples were stained using Gram's method. The images were taken with an Olympus CX31 Upright Biological Microscope equipped with an SC30 camera (Olympus Corporation, Japan). They were evaluated using a 100 times objective under oil immersion (Nikon50, Japan). The DIBaS dataset is publicly available to other researchers.

TABLE I
SOME OF THE BACTERIA SPECIES FROM DIBAS DATASET
Species Name                   Number
Veionella                      22
Lactobacillus johnsonii        20
Lactobacillus gasseri          20
Proteus                        20
Neisseria gonorrhoeae          23
Escherichia coli               20
Staphylococcus epidermidis     20

Fig. 3. Bacteria Data set Sample

We use Python for cropping the bacteria images (224 pixel size) and then we manually analyze the resulting images, where blank images were removed. Finally, an augmentation process has been applied which involves random flipping and horizontal or vertical translation. At the end of the pre-processing step, the total number of microscopic bacteria images was increased from 400 to 343,000, where each species contains a minimum of 17,000 images.

After the data is divided into training and test datasets using Python, we check for over-fitting, under-fitting and a just-right dataset split. Then we cross-validate using the K-Folds Cross Validation method, which gives us the final training and validation datasets. After surveying the most common bacterial diseases, we have chosen for classification 20 species which cause more disease and are common in most testing cases.

IV. PROPOSED ARCHITECTURE AND SYSTEM SET-UP

In the ResNet architecture, a residual mapping is fitted instead of directly fitting a desired mapping. Here, residual blocks are stacked together and each residual block has two 3×3 convolutional layers. There is only one fully connected layer at the end, which classifies into 1000 output categories. A typical ResNet model is shown in Fig. 2. There are several variants of the ResNet model, such as ResNet-18, ResNet-34, ResNet-50, ResNet-101 and ResNet-152, based on the convolutional layers present in them. These models bring much better classification performance compared to self-made (hand-crafted) feature extraction models in image processing. The general architecture of the residual network model is shown in Fig. 4. Here, computing residues provides significant improvement in the accuracy of classification.

Fig. 2. ResNet-34 Architecture

Fig. 4. General ResNet Architecture

We have used ResNet-34 and ResNet-50 as feature selector and classifier for different scenarios. ResNet-50 has 53 convolutional layers, a max-pooling layer and an average pooling layer. At the end, one fully connected layer is present, and there are sixteen element-wise layers. These element-wise layers compute element-wise addition to connect the skipped branch with the residual layers. In between convolution layers, batch normalization, scaling and ReLU activation layers are also present.

A. Classification and Training

We classify bacteria on the basis of a color classifier, i.e., purple for Gram-positive (G+) and pink for Gram-negative (G-). We also classify based on the shape of a single bacterial cell, i.e., round, rod-shaped, stick, club, donut and boat, and on a size classifier, i.e., large and small. A further classifier relates to the number of clusters formed, i.e., single cells, diplococci, tetrads, larger.

B. System Set-up

A cloud service is used for training and processing, i.e., the Google TPU (Tensor Processing Unit) facility is used. TPUs are highly optimized for large scale inputs and CNNs. As our model is used for classification purposes only, GPU is faster than TPU for training and testing. In the background we use TensorFlow as the framework, integrated with Python modules. For all the dataset analysis and feature realization we have used the ggplot module (R). All images in the dataset are resized to 224×224 pixels using a Python script for easy feature extraction.

For training both ResNet-34 and ResNet-50, we have used the following system configuration: Google Colab with 12 GB RAM, 68 GB disk and the Python 3 Google Compute Engine backend (GPU).

V. RESULTS AND DISCUSSIONS

We have used two CNN models, i.e., ResNet-34 and ResNet-50, for bacteria classification. Using ResNet-34 as our training model, we could achieve 99.35 percent accuracy for 20 different bacterial species. We also trained ResNet-50 with the same dataset, which indeed proved to be a better solution for classification. We implemented ResNet-50 with 20 different classes and the accuracy is 99.99 percent, which can be treated as a remarkable accuracy value.

The learning rates of the ResNet-34 and ResNet-50 models are shown in Fig. 5 and Fig. 6 respectively.

Fig. 5. Learning Rate of ResNet-34

Final epoch cycle values of ResNet-34 and ResNet-50 are shown in Fig. 7 and Fig. 8 respectively.

For a detailed performance analysis, the confusion matrix shown in Fig. 9 has been obtained during the testing and validation of data. Fig. 9 shows the confusion matrices for both ResNet-34 and ResNet-50 models.
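An overall accuracy figure can be read off a confusion matrix as the sum of its diagonal divided by the total number of samples. The sketch below illustrates this with NumPy on made-up labels for three hypothetical classes. The labels and the helper names confusion_matrix and accuracy are assumptions for illustration and are not taken from the experiments in this paper.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count samples per (true class, predicted class) pair.

    Rows index the true class, columns index the predicted class.
    """
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def accuracy(cm):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    return np.trace(cm) / cm.sum()


# Made-up example: six samples, three classes, one misclassification
# (a class-1 sample predicted as class 2).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = accuracy(cm)
print(cm)
print(f"accuracy = {acc:.4f}")
```

Off-diagonal entries show which pairs of classes are confused with each other, which a single accuracy number does not reveal.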
Fig. 6. Learning Rate of ResNet-50

Fig. 7. Final Epoch cycle values of ResNet-34

Fig. 8. Final Epoch cycle values of ResNet-50

Fig. 9. Confusion Matrix: a) ResNet-34 model b) ResNet-50 model

Comparison of our approach with existing classification approaches is given in Table II. Table II shows that our approach, where we have used ResNet-50 for classification, achieves better accuracy than the other approaches.

TABLE II
COMPARISON OF VARIOUS BACTERIA CLASSIFIERS
Reference    Method                           Accuracy
[7]          Decision Tree                    83.77%
[9]          CNN                              95%
[1]          CNN- Naive Bayes                 95.5%
[3]          Bag-of-Words and SVM             97%
[5]          CNN                              98.25%
[2]          CNN, SVM and Random Forest       97.24%
[11]         CNN SVM                          98.7%
Our work     CNN                              99.9%

VI. CONCLUSION

In this work, we have presented a computer-aided approach for classification of bacteria for diagnosis of diseases. We have used deep learning neural networks for this classification problem. For feature extraction and classification we have used deep residual networks, i.e., different variants of ResNet CNN models, namely ResNet-34 and ResNet-50. The learning rate of the models has been presented and, for validation, the confusion matrix has also been presented. We have classified twenty categories of bacteria with an accuracy of 99.9% using the ResNet-50 CNN model. Comparisons with some of the existing works show that our approach gives better accuracy.

REFERENCES

[1] N. A. Mohamad, N. A. Jusoh, Z. Z. Htike, S. L. Win, Bacteria Identification from Microscopic Morphology using Naive Bayes, International Journal of Computer Science Engineering and Information Technology, Vol. 4, No. 2, April 2014.
[2] Bartosz Zieliński, Anna Plichta, Krzysztof Misztal, Przemysław Spurek, Monika Brzychczy-Włoch, Dorota Ochońska, "Deep learning approach to bacterial colony classification", PloS one, 12(9), p. e0184554, 2017.
[3] Mohamed, B.A. and Afify, H.M., Automated classification of Bacterial Images extracted from Digital Microscope via Bag of Words Model. In 2018 9th Cairo International Biomedical Engineering Conference (CIBEC) (pp. 86-89). IEEE. 2018, December.
[4] S. Kala, B. R. Jose, J. Mathew and S. Nalesh, "High-Performance CNN Accelerator on FPGA Using Unified Winograd-GEMM Architecture," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 27, no. 12, pp. 2816-2828, Dec. 2019.
[5] Nasip, Ö.F. and Zengin, K., Deep Learning Based Bacteria Classification. In 2018 2nd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT) (pp. 1-5). IEEE. 2018, October.
[6] Trattner S, Greenspan H, Tepper G, Abboud S. Automatic identification of bacterial types using statistical imaging methods. IEEE transactions on medical imaging. 2004; 23(7):807-820. https://doi.org/10.1109/TMI.2004.827481 PMID: 15250633
[7] Anna Plichta, Methods of Classification of the Genera and Species of Bacteria Using Decision Tree. Journal of Telecommunications and Information Technology, 2019. https://doi.org/10.26636/jtit.2019.137419
[8] Perner P. Classification of HEp-2 cells using fluorescent image analysis and data mining. International Symposium on Medical Data Analysis. Springer; 2001. p. 219-224.
[9] Md. Ferdous Wahid, Tasnim Ahmed and Md. Ahsan Habib, Classification of Microscopic Images of Bacteria Using Deep Convolutional Neural Network. 10th IEEE International Conference on Electrical and Computer Engineering, 20-22 December, 2018, Dhaka, Bangladesh.
[10] Dong Nie, Elizabeth A. Shank, Vladimir Jojic, A Deep Framework for Bacterial Image Segmentation and Classification. BCB'15, September 9-12, 2015, Atlanta, GA, USA. http://dx.doi.org/10.1145/2808719.2808751
[11] Md. Ferdous Wahid, Md. Jahid Hasan, Md. Shahin Alom, Shamim Mahbub. Performance Analysis of Machine Learning Techniques for Microscopic Bacteria Image Classification. 10th ICCCNT 2019, July 6-8, 2019, IIT - Kanpur, Kanpur, India.
[12] K. He, X. Zhang, S. Ren and J. Sun, Deep Residual Learning for Image Recognition, Computer Vision and Pattern Recognition, CVPR, 2016.
[13] Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton. (2012). ImageNet Classification with Deep Convolutional Neural Networks. NIPS.
[14] K. Simonyan and A. Zisserman. "Very Deep Convolutional Networks for Large-Scale Image Recognition". ICLR 2015.
[15] Kala S, J. Mathew, B. R. Jose and Nalesh S, "UniWiG: Unified Winograd-GEMM Architecture for Accelerating CNN on FPGAs," 2019 32nd International Conference on VLSI Design and 2019 18th International Conference on Embedded Systems (VLSID), Delhi, NCR, India, 2019, pp. 209-214. doi: 10.1109/VLSID.2019.00055
[16] D. Castro and J. New, The Promise of Artificial Intelligence, Center for Data Innovation (2016), MIT Press.
[17] Dr. R J. Ramteke and Khachane Monali Y, Automatic Medical Image Classification and Abnormality Detection Using K Nearest Neighbor, International Journal of Advanced Computer Research, Vol. 2, No. 4, December 2012.
[18] Mahbub Hussain, Jordan J. Bird, and Diego R., A Study on CNN Transfer Learning for Image Classification. Advances in Computational Intelligence Systems, Springer International Publishing, 2019.
[19] De Bruyne K, Slabbinck B, Waegeman W, Vauterin P, De Baets B, Vandamme P. Bacterial species identification from MALDI-TOF mass spectra through data analysis and machine learning. Systematic and applied microbiology. 2011; 34(1):20-29. https://doi.org/10.1016/j.syapm.2010.11.003 PMID: 21295428
[20] Huang, Gao; Liu, Zhuang; Weinberger, Kilian Q.; van der Maaten, Laurens (2017). "Densely Connected Convolutional Networks". Proc. Computer Vision and Pattern Recognition (CVPR), IEEE.