Base Paper
Yellapragada SS Bharadwaj1, Rajaram P2, Sriram V.P3, Sudhakar S4, Kolla Bhanu Prakash5
1 Dept. of CSE, Koneru Lakshmaiah Education Foundation, India, saisrinivasabharadwaj@gmail.com
2 Dept. of CSE, AKT Memorial College of Engineering and Technology, Kallakurichi, Tamil Nadu, India, rajaramnov82@gmail.com
3 Dept. of Management Studies, Acharya Bangalore B School (ABBS), Bengaluru, India, dr.vpsriram@gmail.com
4 Dept. of CSE, Sree Sakthi Engineering College, Coimbatore, Tamil Nadu, India, sudhasengan@gmail.com
5 Dept. of CSE, Koneru Lakshmaiah Education Foundation, India, drkbp@kluniversity.in
Yellapragada SS Bharadwaj et al., International Journal of Advanced Trends in Computer Science and Engineering, 9(2), March - April 2020, 1335 – 1339
the pixels, and the weight carried by that neuron is called its activation. The last layer, the output layer, contains as many neurons as there are target classes; for handwritten digit recognition there are ten classes, the digits 0 to 9, each with a probability value [3][4]. The number of input and output neurons depends on the task, but the hidden layers are arbitrary and do not depend on the task. That is why they are called hidden layers, although propagation between the hidden layers depends on the activations of the previous layers [11][12]. Intuitively, the pattern of activations in the presentation layer causes specific patterns in the next layer; thus, the neuron with the highest activation is the network's choice of class. In this paper, we implement a convolutional neural network architecture with ReLU and sigmoid activation functions to predict real-world handwritten digits by training the network on the MNIST dataset [10].

2. RELATED WORK

Many techniques have been developed to recognize handwritten digits, and most A.I. practitioners use this task to test their models' performance. In past decades, a segmentation-based approach was used to solve this problem; later, with advances in machine learning, a segmentation-free approach was introduced. Even though the implementation has changed, the problem remains the same and is still open for anyone to solve.

Bailing Zhang [5] uses the ASSOM technique to provide numerical stability that can predict precise digits. The main idea was to use the SOM clustering algorithm with autoencoder neural networks in a nonlinear approach. The modularity of SOM helped in extracting several features of a digit. An individual ASSOM was constructed for each digit, and the ten resulting construction-related errors were compared to minimize misclassification. Their model shows promising results, even with small training samples.

Saleh Aly [3] proposed a technique for recognizing handwritten numeral strings of arbitrary length using SVM and PCA, addressing the major challenge in word detection, which is overlapping characters. Their method uses a hybrid PCA called PCANet for segmentation and an SVM for segmentation classification, together called PCA-SVMNet. Their experiments show high efficiency in recognizing handwritten numeral strings of unknown length without any segmentation method applied [15].

Yue Yin and Wei Zhang [1] concluded that, of all neural network implementations, the CNN method is a valid choice for OCR-based image classification systems. They claimed that OCR has become a preliminary technique in the field of computer vision, and they state that if an image classification model performs well in OCR, it can be used for any image classification system.

Mahmoud M. Abu Ghosh [8] compared CNN, DNN, and DBN approaches to determine which neural network is most useful in computer vision, specifically in image classification tasks such as OCR. Their analysis claims that the DNN (deep neural network) beats the other neural networks with 98.08% accuracy but falls behind the CNN in execution time. They concluded that any algorithm would have an error rate of only 1 or 2 percent caused by the similarity of digits [13][14].

A. K. Jain [9] presented an approximation-based KNN classification algorithm for handprinted digit recognition. Matching was performed on the deformed edges of two characters and their dissimilarities. The proposed work operates on patterns in a low-dimensional space, where the scaling is 2000 times smaller, and achieves 99.25% accuracy.

R. Alhajj [10] applied a completely new approach called the agent-oriented approach. In short, agents are assigned to each character, and their only purpose is to identify hills and valleys, which are simply the black and white regions. The agents' ability to socialize with each other is the distinguishing feature compared with other image classification techniques. Overlapping digits are identified from the cut-points, which are the intersections of agent paths. Their results are surprisingly high compared with many other ANN-based approaches, at about 97% accuracy.

3. METHODOLOGY

In this paper, we used MNIST as the primary dataset to train the model. It consists of 70,000 handwritten raster images from 250 different sources, of which 60,000 are used for training and the rest for validation. MNIST data is stored in the IDX file format and looks like the samples in Figure 1. Our proposed method is divided into stages, as shown in Figure 2: Pre-Processing, Data Encoding, Model Construction, Training & Validation, and Model Evaluation & Prediction. Since loading the dataset is necessary for every stage, all the steps come after it.

Figure 1: Sample MNIST data
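The IDX format mentioned in the methodology is simple enough to parse by hand. As a minimal sketch (not the authors' loader, which the paper does not show), the Python function below reads the IDX header: a 4-byte magic number whose third byte is the data type code and whose fourth byte is the number of dimensions, followed by one big-endian 32-bit size per dimension. The names `parse_idx` and `image_at` and the tiny in-memory sample are illustrative assumptions, and only unsigned-byte data (type code 0x08, which MNIST uses) is handled.

```python
import struct


def parse_idx(buf: bytes):
    """Parse an in-memory IDX buffer of unsigned bytes.

    Returns (dims, data): dims is a tuple of dimension sizes and
    data is the flat payload as a bytes object.
    """
    # Magic number: two zero bytes, a type code, and the dimension count.
    zero, dtype_code, ndim = struct.unpack(">HBB", buf[:4])
    if zero != 0:
        raise ValueError("not an IDX buffer: first two bytes must be zero")
    if dtype_code != 0x08:
        raise ValueError("this sketch only handles unsigned-byte data (0x08)")

    # Each dimension size is a big-endian unsigned 32-bit integer.
    dims = struct.unpack(f">{ndim}I", buf[4:4 + 4 * ndim])

    expected = 1
    for d in dims:
        expected *= d
    data = buf[4 + 4 * ndim:]
    if len(data) != expected:
        raise ValueError("payload length does not match the header")
    return dims, data


def image_at(dims, data, index):
    """Return image `index` as a list of pixel rows (hypothetical helper)."""
    _, rows, cols = dims
    start = index * rows * cols
    flat = data[start:start + rows * cols]
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


# Tiny synthetic example: two 2x3 "images" instead of real 28x28 MNIST data.
sample = struct.pack(">HBBIII", 0, 0x08, 3, 2, 2, 3) + bytes(range(12))
dims, data = parse_idx(sample)
second = image_at(dims, data, 1)
```

For the real MNIST image files the header would report three dimensions (image count, 28 rows, 28 columns), while the label files have a single dimension.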
3.1 Pre-Processing

Figure 2: Flowchart

This is an optional step; since we are using categorical cross-entropy as the loss function, we have to tell the network that the given labels are categorical in nature.

3.3 Model Construction

After data encoding, the images and labels are ready to be fitted to our model [22][23]. A summary of the model can be seen in Figure 4.

3.4 Training & Validation

After building the model [24], we compiled it with the Adam optimizer and the categorical cross-entropy loss function, which are standard choices for a convolutional neural network. Once the model is compiled, we can train it on the training data for up to 100 iterations; however, as the number of iterations increases, so does the chance of overfitting. Therefore we stop training once accuracy reaches 98%. Because we use real-world data for prediction, the test data was used to validate the model [16].
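To make the label-encoding and loss steps concrete, here is a minimal pure-Python sketch of one-hot encoding and categorical cross-entropy. It is an illustration under stated assumptions rather than the authors' code: the paper does not list its implementation, and the helper names `one_hot` and `categorical_cross_entropy` and the example probabilities are hypothetical.

```python
import math


def one_hot(label: int, num_classes: int = 10):
    """Encode a digit label as a one-hot vector with a 1.0 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def categorical_cross_entropy(target, predicted, eps: float = 1e-12):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


# Hypothetical network output for one image: most probability on class 3.
probs = [0.01] * 10
probs[3] = 0.91
loss_correct = categorical_cross_entropy(one_hot(3), probs)  # true label is 3
loss_wrong = categorical_cross_entropy(one_hot(7), probs)    # true label is 7
```

Only the probability assigned to the true class contributes to the loss, so a confident correct prediction gives a loss near zero while a confident wrong one is penalized heavily.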
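The stopping rule described in Section 3.4, training for at most 100 iterations but halting once accuracy reaches 98%, can be sketched as a small loop. This is a hedged illustration, not the authors' training code: `train_one_epoch` is a hypothetical callback that runs one epoch and returns the training accuracy, and the accuracy values below are made up purely to exercise the loop.

```python
def train_with_accuracy_cap(train_one_epoch, target_accuracy=0.98, max_epochs=100):
    """Run epochs until accuracy reaches the target or max_epochs is hit.

    Returns (epochs_run, history), where history lists per-epoch accuracy.
    """
    history = []
    for epoch in range(1, max_epochs + 1):
        accuracy = train_one_epoch(epoch)
        history.append(accuracy)
        if accuracy >= target_accuracy:
            # Stop early to limit the risk of overfitting.
            return epoch, history
    return max_epochs, history


# Synthetic stand-in for a real training step (values are made up).
fake_accuracies = [0.90, 0.985, 0.99]


def fake_epoch(epoch):
    return fake_accuracies[min(epoch, len(fake_accuracies)) - 1]


epochs_run, history = train_with_accuracy_cap(fake_epoch)
```

In Keras, a similar effect is usually obtained with a custom callback that sets `model.stop_training = True` once the monitored accuracy crosses the threshold; the paper does not state which mechanism was used.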
Our model stopped training at the 2nd epoch, when it reached 98.21% training accuracy and 98.51% validation accuracy with a training loss of 5% and a validation loss of 4%. The progression of accuracy and loss is shown in Figure 5.

Figure 5: Loss and Accuracy Learning Curves

From the above curves, we can observe that loss and accuracy change together at every fold during k-fold cross-validation. Before two folds, accuracy had almost reached 98%, which is why training stopped at the 2nd epoch. Stability in the accuracy score can be observed from the 2nd iteration.

4.2 Prediction

Our model is able to recognize computer-generated digits as well as handwritten digits. Computer-generated digit prediction is more accurate than real-world digit prediction, as can be observed in Figure 6.

Figure 6: Raster vs. Real image prediction

The CNN performed well for handwritten digit recognition. The proposed method obtained 98% accuracy and is able to identify real-world images as well; the loss in both training and evaluation is less than 0.1, which is negligible. The only challenging part is the noise present in real-world images, which needs to be addressed. The learning rate of the model depends strongly on the number of dense neurons and the cross-validation measure.

REFERENCES
1. Y. Yin, W. Zhang, S. Hong, J. Yang, J. Xiong, and G. Gui, "Deep Learning-Aided OCR Techniques for Chinese Uppercase Characters in the Application of Internet of Things," in IEEE Access, vol. 7, pp. 47043-47049, 2019. https://doi.org/10.1109/ACCESS.2019.2909401
2. N. B. Venkateswarlu, "New raster, adaptive document binarization technique," in Electronics Letters, vol. 30, no. 25, pp. 2114-2115, 8 Dec. 1994. https://doi.org/10.1049/el:19941439
3. S. Aly and A. Mohamed, "Unknown-Length Handwritten Numeral String Recognition Using Cascade of PCA-SVMNet Classifiers," in IEEE Access, vol. 7, pp. 52024-52034, 2019.
4. Seong-Whan Lee and Sang-Yup Kim, "Integrated segmentation and recognition of handwritten numerals with cascade neural network," in IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 29, no. 2, pp. 285-290, May 1999. https://doi.org/10.1109/5326.760572
5. Bailing Zhang, Minyue Fu, Hong Yan, and M. A. Jabri, "Handwritten digit recognition by adaptive-subspace self-organizing map (ASSOM)," in IEEE Transactions on Neural Networks, vol. 10, no. 4, pp. 939-945, July 1999.
6. L. Deng, "The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web]," in IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 141-142, Nov. 2012. https://doi.org/10.1109/MSP.2012.2211477
7. S. Sudhakar, V. Vijayakumar, C. SathiyaKumar, V. Priya, LogeshRavi, V. Subramaniyaswamy, "Unmanned Aerial Vehicle (UAV) based Forest Fire Detection and monitoring for reducing false alarms in forest-fires," Elsevier Computer Communications 149 (2020) 1–16. https://doi.org/10.1016/j.comcom.2019.10.007
8. M. M. A. Ghosh and A. Y. Maghari, "A Comparative Study on Handwriting Digit Recognition Using Neural Networks," 2017 International Conference on Promising Electronic Technologies (ICPET), Deir El-Balah, 2017, pp. 77-81.
9. A. K. Jain and D. Zongker, "Representation and recognition of handwritten digits using deformable