Numberplate Detection Final Report
Submitted By:
THAMIZHARASAN B 212922106048
VIGNESH O 212922106051
THIRUVENGATAKRISHNAN M 212922106050
VIKRAM S 212922106053
SANJAY V 212922106036
Anna University Chennai 602 117
BONAFIDE CERTIFICATE
Certified that this project report (VEHICLE NUMBER PLATE DETECTION) is the bonafide work of THAMIZHARASAN B (212922106048), VIGNESH O (212922106051), THIRUVENGATAKRISHNAN M (212922106050), VIKRAM S (212922106053), and SANJAY V (212922106036), who carried out the project work under my supervision.
SIGNATURE: SIGNATURE:
Submitted for the semester Mini Project Viva Voce Examination held on _______
ABSTRACT
In the last couple of decades, the number of vehicles on the road has increased drastically. Hence, it has become very difficult to keep track of each and every vehicle for the purposes of traffic management and law enforcement. An automated, fast, accurate, and robust vehicle plate recognition system has become a necessity for traffic control and the enforcement of traffic regulations, and the solution is ANPR. Automatic Number Plate Recognition (ANPR) detects and reads the number plates of vehicles. In this work, methods such as number plate detection and optical character recognition (OCR) were implemented. The number plate is detected using the YOLOv7 object detection model, trained on more than 1500 images. The proposed system can mainly be used to monitor road traffic activities, such as identifying vehicles during traffic violations like speeding and detecting lane violations at street traffic signals. Every vehicle can thereby be traced for traffic rule violations, and the information can be provided to the concerned authority for further effective action, so that traffic flows smoothly. The results indicate that the proposed method can keep track of a large number of vehicles.
TABLE OF CONTENTS
ABSTRACT
LIST OF ABBREVIATIONS
LIST OF FIGURES
1. INTRODUCTION
1.1 OVERVIEW
1.2 INTRODUCTION TO YOLO ALGORITHM
1.2.1 FEATURES
1.3 YOLOv7
1.4 OPTICAL CHARACTER RECOGNITION (OCR)
1.4.3 PADDLEOCR
2.1 METHODOLOGY
2.2 DETECTION OF LICENSE PLATE USING YOLOv7
2.2.1 DATASET
2.2.3 TRAINING
2.3.4 IMPLEMENTATION
4. CONCLUSION
4.1 DISCUSSION
LIST OF ABBREVIATIONS
ABBREVIATION TERM
YOLO YOU ONLY LOOK ONCE
CNN CONVOLUTIONAL NEURAL NETWORKS
RNN RECURRENT NEURAL NETWORKS
LIST OF FIGURES
FIGURE NO. TITLE
1.2 E-ELAN ARCHITECTURE
2.2 DATASET
3.2 PRECISION CURVE
3.4 PRECISION-RECALL CURVE
3.6 RESULTS OF TRAINED MODEL
3.7 ANPR OUTPUTS
CHAPTER 1
INTRODUCTION
1.1 OVERVIEW
The license plate number can be used to retrieve more information about the vehicle and its owner, which can be used for further processing. Such an automated system should be small in size, portable, and able to process data at a sufficient rate. Various number plate detection algorithms have been developed in the past few years. The objective of the proposed design is to detect a license plate number from an image captured by a camera. An efficient algorithm is proposed to detect a license plate under various conditions.
1.2 INTRODUCTION TO YOLO ALGORITHM
YOLO (You Only Look Once) is a popular single-stage algorithm for detecting objects in images and video in real time.
1.2.1 FEATURES
• Speed: This algorithm improves the speed of detection because it can predict
objects in real-time.
• High accuracy: YOLO is a predictive technique that provides accurate results
with minimal background errors.
• Learning capabilities: The algorithm has excellent learning capabilities that
enable it to learn the representations of objects and apply them in object
detection.
YOLO works using the following three techniques, described below:
• Residual blocks
• Bounding box regression
• Intersection Over Union (IOU)
Residual blocks
First, the image is divided into an S x S grid of equally sized cells. Every grid cell detects objects that appear within it: if an object's centre falls within a certain grid cell, that cell is responsible for detecting it.
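As an illustrative sketch (not code from the report), the cell responsible for an object can be found from the object's normalized centre coordinates:

# Map a normalized object centre (cx, cy in [0, 1]) to the responsible
# cell of an S x S grid; the values below are hypothetical.
def responsible_cell(cx, cy, S=7):
    col = min(int(cx * S), S - 1)  # clamp so cx == 1.0 stays in range
    row = min(int(cy * S), S - 1)
    return row, col

print(responsible_cell(0.52, 0.31, S=7))  # (2, 3)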
Bounding box regression
Every bounding box in the image is described by the following attributes:
• Width (bw)
• Height (bh)
• Class (for example, person, car, traffic light, etc.), represented by the letter c
• Bounding box centre (bx, by)
YOLO uses a single bounding box regression to predict the height, width, centre, and class of objects, together with a confidence score that represents the probability of an object appearing in the bounding box.
Intersection Over Union (IOU)
Each grid cell is responsible for predicting bounding boxes and their confidence scores. The IOU is equal to 1 if the predicted bounding box is the same as the real box. This mechanism eliminates bounding boxes that do not match the real box well.
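A minimal sketch of the IOU computation for two axis-aligned boxes given as (x1, y1, x2, y2) corners (illustrative, not the report's code):

# IOU of two axis-aligned boxes in (x1, y1, x2, y2) corner form.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # overlap area
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7, about 0.14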
1.3 YOLOv7:
• YOLOv7 is among the fastest and most accurate real-time object detection models for computer vision tasks. The base YOLOv7 model is optimized for ordinary GPU computing.
• YOLOv7 requires far cheaper computing hardware than other deep learning models, and it can be trained much faster on small datasets without any pre-trained weights.
Compound model scaling in YOLOv7 adjusts attributes such as:
• Stage (number of feature pyramids)
YOLOv7 also uses model-level re-parameterization, which can be performed in two ways:
• Using different training data but the same settings, train multiple models, then average their weights to obtain the final model.
• Take the average of the weights of models at different epochs.
With the help of an assistant loss, the weights of the auxiliary heads are updated. This allows for deep supervision, and the model learns better. These concepts are closely coupled with the Lead Head and the Label Assigner.
Lead Head Guided Label Assigner and Coarse-to-Fine Lead Head Guided Label
Assigner
The Lead Head Guided Label Assigner encapsulates the following three concepts.
• Lead Head
• Auxiliary Head
• Soft Label Assigner
The Lead Head in the YOLOv7 network predicts the final results. Soft labels
are generated based on these final results. The important part is that the loss is
calculated for both the lead head and the auxiliary head based on the same soft
labels that are generated.
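As an illustrative sketch of this deep-supervision idea (the function names and the 0.25 auxiliary weight are assumptions, not values from the report):

# Illustrative deep-supervision objective: the auxiliary head is trained
# on the same soft labels as the lead head, with a smaller weight.
def total_loss(lead_pred, aux_pred, soft_labels, criterion, aux_weight=0.25):
    lead_loss = criterion(lead_pred, soft_labels)  # lead head: final results
    aux_loss = criterion(aux_pred, soft_labels)    # auxiliary head: middle layers
    return lead_loss + aux_weight * aux_loss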
1.4 OPTICAL CHARACTER RECOGNITION (OCR)
The OCR engine or OCR software works by using the following steps:
Image acquisition
A scanner reads documents and converts them to binary data. The OCR
software analyzes the scanned image and classifies the light areas as background
and the dark areas as text.
Pre-processing
The OCR software first cleans the image and removes errors to prepare it for reading, using techniques such as deskewing the scanned image and removing noise.
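A minimal pre-processing sketch using OpenCV (an illustrative choice; the report does not name a library, and the file names are hypothetical):

# Grayscale, denoise and binarize an image so dark text stands out
# against a light background before OCR.
import cv2

img = cv2.imread("plate.jpg")                 # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # drop colour information
gray = cv2.medianBlur(gray, 3)                # remove salt-and-pepper noise
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("plate_clean.png", binary)        # Otsu-binarized result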
The two main types of OCR algorithms, or software processes, that OCR software uses for text recognition are called pattern matching and feature extraction.
Pattern matching
Pattern matching works by isolating a character image, called a glyph, and comparing it with a similarly stored glyph.
Feature extraction
Feature extraction breaks the glyphs down into features such as lines and loops, and then uses these features to find the best match or the nearest neighbor among its various stored glyphs.
The main types of OCR software include:
• Simple optical character recognition software
• Intelligent character recognition software
• Intelligent word recognition
OCR improves operational efficiency and supports artificial intelligence solutions: it saves time, decreases errors, and minimizes effort.
1.4.3 PADDLEOCR
PaddleOCR offers users multilingual, practical OCR tools that help them apply and train different models in a few lines of code. PaddleOCR offers many models in its toolkit, including PP-OCR (a series of high-quality pre-trained OCR models), the latest algorithms such as SRN, and popular OCR algorithms like CRNN.
CHAPTER 2
2.1 METHODOLOGY:
2.2 DETECTION OF LICENSE PLATE USING YOLOv7
2.2.1 DATASET:
Data is the core of any AI application and one of the first and most important steps. For training the YOLOv7 number plate detector, a dataset of vehicle images is used. Our dataset consists of 1200 training images, 300 validation images, and 100 test images in the YOLO format.
2.2.3 TRAINING
FACTORS IN TRAINING A YOLOv7 MODEL:
img: the size of the images on which the model will train; the default value is 640.
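As a hedged sketch of launching training (assuming the official YOLOv7 repository's train.py and a custom dataset YAML; the paths and hyperparameters below are assumptions):

# Launch YOLOv7 training on a custom number plate dataset.
import subprocess

subprocess.run([
    "python", "train.py",
    "--img-size", "640", "640",    # training image size (default 640)
    "--batch-size", "16",          # fit to available GPU memory
    "--epochs", "100",
    "--data", "data/plates.yaml",  # hypothetical dataset config
    "--weights", "yolov7.pt",      # start from pre-trained weights
    "--name", "plate-detector",
], check=True)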
Feature Extraction
The first layer is the convolutional neural network (CNN), which consists of convolutional and max-pooling layers. These are responsible for extracting features from the input images and producing feature maps as outputs. To feed the output to the next layer, the feature maps are first converted into a sequence of feature vectors. According to the original paper, "Each feature vector of a feature sequence is generated from left to right on the feature maps by column. This means the i-th feature vector is the concatenation of the i-th columns of all the maps."
Due to the feature extraction, each column of the feature maps corresponds to a rectangular region of the input image; that region is called a receptive field. Each feature vector in the feature sequence is associated with a receptive field and can be called an appearance descriptor for that region. The feature sequence is then passed to the next layer, the RNN.
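A small sketch of this map-to-sequence step (shapes are illustrative; assumes PyTorch and a CNN that pools the feature-map height down to 1):

# Convert CNN feature maps (B, C, H=1, W) into a left-to-right sequence
# of per-column feature vectors (W, B, C) for the recurrent layers.
import torch

feature_map = torch.randn(8, 512, 1, 40)  # batch of 8, 512 channels, 40 columns
seq = feature_map.squeeze(2)              # (B, C, W)
seq = seq.permute(2, 0, 1)                # (W, B, C): one vector per column
print(seq.shape)                          # torch.Size([40, 8, 512])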
Sequence Labelling
This layer is the Recurrent Neural Network (RNN), which is built on top of the convolutional neural network. In CRNN, two bi-directional LSTMs are used in the architecture to address the vanishing gradient problem and to obtain a deeper network. The recurrent layers predict a label for each feature vector, or frame, in the feature sequence received from the CNN layers. Mathematically, the layers predict a label y_t for each frame x_t in the feature sequence x = x_1, ..., x_T.
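A sketch of this recurrent labelling stage (the layer sizes and the 37-class head, e.g. 26 letters plus 10 digits plus a CTC blank, are illustrative assumptions):

# Two stacked bidirectional LSTMs predict per-frame label scores.
import torch
import torch.nn as nn

rnn = nn.LSTM(input_size=512, hidden_size=256,
              num_layers=2, bidirectional=True)
head = nn.Linear(2 * 256, 37)      # per-frame scores over the label set

seq = torch.randn(40, 8, 512)      # (T, B, C) from the map-to-sequence step
out, _ = rnn(seq)                  # (T, B, 2 * hidden_size)
logits = head(out)                 # (T, B, 37), fed to CTC for training/decoding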
Transcription
This layer is responsible for translating the per-frame predictions into a final
sequence according to the highest probability. These predictions are used to
compute CTC or Connectionist Temporal Classification loss which makes the
model learn and decode the output.
CTC loss
The output received from the RNN layer is a tensor that contains the probability of each label for each receptive field. This is where Connectionist Temporal Classification (CTC) loss comes in. CTC loss is responsible for training the network as well as for inference, that is, decoding the output tensor. CTC works on the following major principles:
• Text encoding: CTC solves the issue of a character taking more than one time step. It does this by merging all the repeating characters into one and inserting a blank character "-" to separate characters; this goes on for further characters. For example, the 'S' in 'STATE' may span three time steps, so the network might predict those time steps as 'SSS'. CTC merges those outputs and predicts 'S'. For the whole word, a possible encoding could be SSS-TT-A-TT-EEE, hence the output 'STATE'.
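A minimal sketch of greedy CTC decoding on this example (illustrative, not the project's decoder):

# Collapse repeated characters, then drop blanks.
def ctc_greedy_decode(frames, blank="-"):
    out, prev = [], None
    for ch in frames:
        if ch != prev and ch != blank:
            out.append(ch)
        prev = ch
    return "".join(out)

print(ctc_greedy_decode(list("SSS-TT-A-TT-EEE")))  # STATE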
2.3.4 Implementation:
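A minimal initialization sketch using the public PaddleOCR API (the exact arguments used in the project are assumptions):

# Initialize PaddleOCR; pre-trained detection, angle classification and
# recognition weights are downloaded automatically on first use.
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang="en")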
In the above code snippet, we have initialized PP-OCRv3 and the required weights
will be downloaded automatically. This package by default provides all of the
models of the system which are detection, angle classification and recognition. It
provides several arguments to access only the required functionalities.
• lang: The language which we want to recognise is passed here. For example,
en for English, ch for Chinese, french for French, etc. The OCR can
recognise English and Chinese by default.
• rec_algorithm: Takes the recognition algorithm to be used as an argument. The OCR uses CRNN as its default recognition algorithm.
• det_algorithm: Takes the text detection algorithm to be used as an argument. The OCR uses a DB text detector as its default detector.
• use_angle_cls: Specifies if angle classifier is to be used or not and takes bool
as the argument.
The OCR is now initialized and can be used in just one line of code.
• img: This is the first parameter in the ocr function. In this, the image array
or the image path is passed to perform OCR.
• det: Takes a bool as an argument and specifies whether to use the detector or not.
• rec: Takes a bool as an argument and specifies whether to use the recognizer or not.
• cls: Takes a bool as an argument and specifies whether to use the angle classifier or not.
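A one-line usage sketch with these parameters (the image path is hypothetical):

# Run detection, recognition and angle classification on one image.
result = ocr.ocr("vehicle.jpg", det=True, rec=True, cls=True)
for line in result:
    print(line)  # box coordinates with (recognized text, confidence)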
It detected all the fields, like the boat number, date, and ID number, which are the key information here, even while the text was at an angle.
DISCUSSION
Several metrics are used to evaluate object detection tasks; one such metric is mean average precision, known in short as mAP. The evaluation here is based on the following:
• Confusion Matrix
• Recall
• Precision
CONFUSION MATRIX:
True Positives (TP): The model predicted a label that correctly matches the ground truth.
True Negatives (TN): The model does not predict the label and is not a part of the
ground truth.
False Positives (FP): The model predicted a label, but it is not a part of the ground
truth (Type I Error).
False Negatives (FN): The model does not predict a label, but it is part of the
ground truth. (Type II Error).
Precision:
Precision measures how well you can find true positives (TP) out of all positive predictions: Precision = TP / (TP + FP).
Figure 3.2 PRECISION CURVE
Recall:
Recall measures how well you can find true positives (TP) out of all ground-truth positives: Recall = TP / (TP + FN).
Precision-Recall Curve:
Figure 3.4 PRECISION-RECALL CURVE
F1 SCORE:
The F-measure is the weighted harmonic mean of the precision (P) and recall (R) of a classifier. Taking α = 1 gives the F1 score, in which both metrics have the same importance: F1 = 2PR / (P + R).
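A small sketch computing these metrics from raw counts (the counts below are hypothetical):

# Precision, recall and F1 from confusion-matrix counts.
def prf1(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

print(prf1(tp=80, fp=20, fn=10))  # (0.8, ~0.89, ~0.84) on hypothetical counts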
Figure 3.6 RESULTS OF TRAINED MODEL
F1 SCORE = 0.77
PRECISION = 1.00
RECALL = 0.93
Figure 3.7 ANPR OUTPUTS
CHAPTER 4
CONCLUSION
4.1 DISCUSSION:
The ideal method for our ANPR is to use a tracker with it, which keeps the best OCR result across all of a vehicle's frames. Various other trackers, such as OpenCV trackers, CenterTrack, and Tracktor, tackle more advanced problems like occlusion and re-identification (Re-ID).
This ANPR system works quite well; however, there is still room for improvement. The speed of this ANPR system can be increased with a high-resolution camera that can capture clear images of the vehicle. The OCR method is sensitive to misalignment and to different sizes, so we have to create different kinds of templates for different Regional Transport Office (RTO) specifications. Statistical analysis can also be used to define the probability of detection and recognition of the vehicle number plate. At present there are certain limits on parameters like the speed of the vehicle, the script on the vehicle number plate, and skew in the image, which can be removed by enhancing the algorithms further.