Road Traffic Sign Detection and Classification
I. INTRODUCTION
Manuscript received July 23, 1996; revised September 3, 1997. This work
was supported by the Spanish Government under CICYT Project TAP940711-C03-02.
The authors are with the Area de Ingenieria de Sistemas y Automatica, Universidad Carlos III de Madrid, 28911 Madrid, Spain (e-mail: escalera@ing.uc3m.es).
Publisher Item Identifier S 0278-0046(97)08490-6.
Authorized licensed use limited to: Univ Carlos III. Downloaded on March 03,2010 at 04:13:08 EST from IEEE Xplore. Restrictions apply.
The algorithm presented here has two steps. The first localizes the sign in the image based on its color and shape. The second recognizes the sign through a neural network.
II. TRAFFIC SIGN DETECTION
There are four types of traffic signs defined in the traffic code: 1) warning; 2) prohibition; 3) obligation; and 4) informative. Regarding form and color, warning signs are equilateral triangles with one vertex upwards; they have a white background and are surrounded by a red border. Prohibition signs are circles with a white or blue background and a red border. Both warning and prohibition signs have a yellow background if they are located in an area of public works. Obligation signs are circles with a blue background. Informative signs are rectangles with the same blue background. Finally, there are two exceptions: 1) the yield sign, an inverted triangle; and 2) the stop sign, an octagon. To detect the position of a sign in the image, the two properties mentioned above, color and shape, are used.
A. Color Thresholding
The most intuitive color space is the RGB system. The color
of every pixel is defined by three components: red, green, and
blue. Because of this, the color threshold has the following expression:

s(x, y) = 1, if R_min <= R(x, y) <= R_max, G_min <= G(x, y) <= G_max, and B_min <= B(x, y) <= B_max
s(x, y) = 0, otherwise

where R(x, y), G(x, y), and B(x, y) are, respectively, the functions that give the red, green, and blue levels of each point of the image.
Fig. 1.
One of the greatest inconveniences of this color space is that it is very sensitive to lighting changes.
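As a sketch of the band thresholding described above, the following assumes an H x W x 3 RGB image and purely illustrative band limits (the paper does not give numeric thresholds):

```python
import numpy as np

def rgb_threshold(img, r_range, g_range, b_range):
    """Binary mask of pixels whose R, G, and B levels each fall in a band.

    img: H x W x 3 uint8 array; each *_range is an inclusive (min, max)
    pair. The bounds below are illustrative, not the paper's values.
    """
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    mask = ((r >= r_range[0]) & (r <= r_range[1]) &
            (g >= g_range[0]) & (g <= g_range[1]) &
            (b >= b_range[0]) & (b <= b_range[1]))
    return mask.astype(np.uint8)

# Example: a crude band for the red border of a warning sign.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[1, 1] = (200, 30, 30)                    # one "red" pixel
mask = rgb_threshold(img, (150, 255), (0, 80), (0, 80))
```

The weakness noted in the text is visible here: if the illumination drops, the same red surface falls out of the fixed band and the mask misses it.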
That is the reason why other color spaces are used in computer vision applications, especially the hue, saturation, intensity (HSI) system, which is largely invariant to lighting changes [13]. The problem with HSI is that its formulas are nonlinear, and the computational cost is prohibitive if special hardware is not used. That is why we have modified the approach suggested by Kamada and Yoshida [14], i.e., the ratio between the intensity of the specified color and the sum of the RGB intensities. Instead, we have used the relation between the components. Thus, the thresholding is (assuming that the red component is chosen as the reference)

s(x, y) = 1, if G(x, y)/R(x, y) <= Th_G  (1)
             and B(x, y)/R(x, y) <= Th_B  (2)
s(x, y) = 0, otherwise

where Th_G and Th_B are the thresholds on the green-to-red and blue-to-red ratios.
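A minimal sketch of this ratio-based thresholding follows. The threshold values and the dark-pixel cutoff `r_min` are illustrative assumptions, not values from the paper:

```python
import numpy as np

def red_ratio_threshold(img, th_g=0.5, th_b=0.5, r_min=30):
    """Mark pixels as 'red' when the G/R and B/R ratios are small.

    Ratios between components are less sensitive to overall lighting
    than fixed RGB bands. Thresholds here are illustrative; r_min
    rejects dark pixels, where the ratios are dominated by noise.
    """
    rgb = img.astype(np.float32)
    r = np.maximum(rgb[..., 0], 1.0)          # avoid division by zero
    g_ratio = rgb[..., 1] / r
    b_ratio = rgb[..., 2] / r
    mask = (rgb[..., 0] >= r_min) & (g_ratio <= th_g) & (b_ratio <= th_b)
    return mask.astype(np.uint8)

img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = (180, 40, 40)   # red border pixel under bright light
img[1, 1] = (90, 20, 20)    # same hue, darker scene: still detected
mask = red_ratio_threshold(img)
```

Note that the two example pixels have the same component ratios but very different brightness, and both pass the test; a fixed RGB band would have to be widened considerably to capture both.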
C. Corner Extraction
The mask for the lower right corner (type T3) is symmetrical, with respect to a vertical axis, to the mask for the lower left corner. To detect the corners of the yield sign (types T4, T5, and T6), one has to use masks that are symmetrical, with respect to a horizontal axis, to the ones used for the warning signs. There are, then, six 9 x 9 masks for the triangular signs. To reduce the number of masks, the 9 x 9 masks for the detection of the lower left and right corners can also be used for the upper left and right corners of the triangular signs. If the masks are compared, the difference is not very large and, since the grey level of the background is going to be low, the results, which follow, are very similar.
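The mask-based corner detection above amounts to correlating the grey-level image with each corner template and looking for strong responses. The following toy sketch uses a hypothetical 3 x 3 lower-left-corner mask (the paper's actual 9 x 9 coefficients are not reproduced here):

```python
import numpy as np

def corner_response(gray, mask):
    """Correlate a grey-level image with a corner mask (valid region only).

    High responses mark locations that match the mask pattern, as in
    mask-based corner detection. The 3x3 mask below is a stand-in for
    the paper's 9x9 masks, whose coefficients are not given here.
    """
    mh, mw = mask.shape
    h, w = gray.shape
    out = np.zeros((h - mh + 1, w - mw + 1), dtype=np.float32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(gray[i:i + mh, j:j + mw] * mask)
    return out

# Toy lower-left-corner mask: positive where the sign border should be,
# negative over the background.
mask = np.array([[ 1, -1, -1],
                 [ 1,  1, -1],
                 [ 1,  1,  1]], dtype=np.float32)
gray = np.zeros((5, 5), dtype=np.float32)
gray[2:, :3] = np.tril(np.ones((3, 3)))   # a bright lower-left corner shape
resp = corner_response(gray, mask)
```

The maximum response lands exactly where the corner pattern sits, which is what allows mirrored versions of one mask to serve the symmetric corner types.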
TABLE I
9 x 9 MASK DECOMPOSITION IN SMALLER ONES
(Submasks Sbm1, Sbm3, and Sbm4 with their heights; table body omitted.)
TABLE II
IMAGE NORMALIZATION
III. SIGN CLASSIFICATION
A. Image Normalization
otherwise, it is taken as a rectangular one (Fig. 7). The result
of applying the algorithm to a real image can be observed in
Fig. 8(b).
A consequence of the algorithms described above is that occlusions have not been considered in the detection.
Fig. 8. Real sign detection. (a) Triangular sign detection. (b) Circular sign detection. (c) Rectangular sign detection.
Instead of this nearest-neighbor approach, bilinear interpolation was tried, but with insignificant improvement. As bilinear interpolation is computationally costlier, the nearest-neighbor method was used.
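Nearest-neighbor normalization of a detected sign to the network's fixed input size can be sketched as follows. The output size and the test patch are illustrative assumptions:

```python
import numpy as np

def normalize_nearest(img, out_h=30, out_w=30):
    """Rescale a detected sign patch to a fixed size by nearest neighbor.

    Each output pixel simply copies the closest input pixel, which is
    cheaper than bilinear interpolation. The 30x30 default is an
    illustrative choice, not necessarily the paper's input size.
    """
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

patch = np.arange(16, dtype=np.uint8).reshape(4, 4)
norm = normalize_nearest(patch, out_h=2, out_w=2)
```

Because the mapping is a pure index lookup, it costs one integer computation per output pixel, which is why the insignificant quality gain from bilinear interpolation did not justify its extra cost.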
B. Training Patterns
Nine ideal signs were chosen for the net training (Fig. 9).
The training patterns are obtained from these signs through
the following modifications.
1) We mentioned before that the slope accepted for a sign
was 6°. From each of the nine signs, another five
were obtained by covering that slope range.
2) Three Gaussian noise levels were added to each of the
previous signs. This way, during the training of the net,
low weights were associated with the background pixels
of the inner part of the sign.
3) Four different thresholds were applied to the resulting
image, in order to obtain the information located in the
inner part of the sign. In this way, the system is adapted
to various lighting conditions that the real images will
present.
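The three augmentation steps above can be sketched as follows. The slope values, noise levels, thresholds, and toy sign are illustrative assumptions (the rotation itself is left as a stub to keep the sketch self-contained):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(sign, slopes=(-6, -3, 0, 3, 6), noise_sigmas=(5, 10, 15),
            thresholds=(64, 96, 128, 160)):
    """Expand one ideal sign into many binary training patterns.

    Mirrors the three steps in the text: small rotations over the
    accepted slope range, additive Gaussian noise, and several
    binarization thresholds. All parameter values are illustrative.
    """
    patterns = []
    for angle in slopes:
        rotated = sign  # placeholder: rotate `sign` by `angle` degrees
        for sigma in noise_sigmas:
            noisy = rotated + rng.normal(0.0, sigma, sign.shape)
            for th in thresholds:
                patterns.append((noisy >= th).astype(np.uint8))
    return patterns

sign = np.full((8, 8), 200.0)   # toy ideal sign: uniform grey level
pats = augment(sign)
# 5 slopes x 3 noise levels x 4 thresholds = 60 patterns per ideal sign
```

Each ideal sign thus yields dozens of binary patterns, so the network sees the sign under simulated tilt, noise, and lighting variation during training.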
TABLE III
NEURAL NETWORK DIMENSIONS SELECTION
TABLE IV
TRIANGULAR SIGNS CLASSIFICATION
The implementation of the neural network in a digital signal processor (DSP) is undergoing research, and the expected processing time is between 30 and 40 ms.
IV. CONCLUSION
A method for the perception of traffic signs by image analysis has been tested successfully. The algorithm has two main parts, detection and classification. For the first part, the color and the corners of the sign's shape were chosen as the features that extract the sign from the environment; this has been tested with different signs and under different conditions. For the classification, the detected sign was used as the input pattern of a neural network. The multilayer perceptron was chosen, and several networks with different numbers of layers and nodes were trained and compared. All the algorithms can be executed in real time with a PC and a pipeline image-processing board. Future improvements include the study of partial occlusions and the use of other neural-network paradigms.
ACKNOWLEDGMENT
The authors gratefully acknowledge C. Gagnon for her help
during the preparation of this paper.
REFERENCES
[1] J. Crisman and C. E. Thorpe, "UNSCARF, a color vision system for the detection of unstructured roads," in Proc. IEEE Int. Conf. Robotics and Automation, Sacramento, CA, Apr. 1991, pp. 2496-2501.
[2] L. Davis, "Visual navigation at the University of Maryland," in Proc. Int. Conf. Intelligent Autonomous Systems 2, Amsterdam, The Netherlands, 1989, pp. 119.
[3] E. D. Dickmanns, "Machine perception exploiting high-level spatio-temporal models," presented at the AGARD Lecture Series 185, Madrid, Spain, Sept. 17-18, 1992.
[4] D. Pomerleau, "Neural network based autonomous navigation," in Vision and Navigation: The Carnegie Mellon Navlab, C. E. Thorpe, Ed. Norwell, MA: Kluwer, 1990, ch. 5.
[5] I. Masaki, Ed., Vision-Based Vehicle Guidance. Berlin, Germany: Springer-Verlag, 1992.
[6] R. Luo, H. Potlapalli, and D. Hislop, "Autocorrelation," in Proc. Int. Conf. Industrial Electronics, Control, Instrumentation and Automation, San Diego, CA, Nov. 9-13, 1992, pp. 700-705.
[7] ----, "Translation and scale invariant landmark recognition using receptive field neural networks," in Proc. Int. Conf. Intelligent Robots and Systems (IROS '92), 1992, pp. 527-533.
[8] R. Luo and H. Potlapalli, "Landmark recognition using projection learning for mobile robot navigation," in Proc. Int. Conf. Neural Networks, Orlando, FL, June 27-29, 1994, pp. 2703-2708.
Jose Maria Armingol received the degree in automation and electronics engineering from the Universidad Politecnica de Madrid, Madrid, Spain, in 1992. He is currently working toward the Ph.D. degree in the Department of Engineering, Universidad Carlos III de Madrid, Madrid, Spain.
He is also currently an Assistant Professor in the
Department of Engineering, Universidad Carlos III
de Madrid. His research interests are in the areas of
image processing and pattern recognition for mobile
robot relocalization.