Characterization of Road Condition With Data Mining
Research Article
Characterization of Road Condition with Data Mining Based on
Measured Kinematic Vehicle Parameters
Copyright © 2018 Johannes Masino et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
This work aims at classifying the road condition with data mining methods using simple acceleration sensors and gyroscopes
installed in vehicles. Two classifiers are developed with a support vector machine (SVM) to distinguish between different types of
road surfaces, such as asphalt and concrete, and obstacles, such as potholes or railway crossings. From the sensor signals, frequency-
based features are extracted and evaluated automatically with MANOVA. The selected features and their relevance for predicting the classes
are discussed. The best features are used for designing the classifiers. Finally, the methods, which are developed and applied in this
work, are implemented in a Matlab toolbox with a graphical user interface. The toolbox visualizes the classification results on
maps, thus enabling manual verification of the results. Cross-validation yields an average accuracy of 81.0% for classifying obstacles
and of 96.1% for classifying road material. The results are discussed on a comprehensive exemplary data set.
the complete road section in front of the vehicle, including the neighboring lane. However, they are integrated into high-class vehicles only. Cameras currently used in vehicles are of limited accuracy and can detect potholes with a minimum depth of about 3 cm only.
Presently, the state of motorways is measured automatically using expensive and complex measurement vehicles, while that of roads in urban and rural areas is determined manually [9]. These methods are associated with a high expenditure. Due to manual evaluation, it takes a long time until the road network quality is updated. Safety-relevant damage may be detected too late. This may have severe consequences, such as traffic accidents or cost-intensive and complete renewal of the road.
For road maintenance, some countries determine the stochastic road profile depth or international roughness index (IRI), as outlined in [10]. However, the latter is often calculated for 100 m intervals only. As a result, certain obstacles, such as potholes, are not detected. In countries pursuing a systematic road maintenance scheme, not only the IRI but also individual obstacles are measured. This is also the objective of the present study.
Approaches to automatic road state monitoring using inertial sensors exist, e.g., [6, 11–13]. They only concentrate on single road features (such as potholes), do not have any representative dataset, or are based on data measured under restricted conditions, e.g., in speed limit areas or on certain sections only. Moreover, the validation phase only covers checks as to whether the road damage detected actually is damage or not (true or false positives), but not whether road damage was overlooked (false negatives).
Road construction offices also need to know the material (road surface), as repairs on different surfaces produce different results and may cause different types of damage [14]. It is also important to distinguish between safety-relevant damage that has to be repaired within 24 hours and damage that is not relevant to safety and the repair of which can be planned and postponed.
The main contribution of this paper is to evaluate the feasibility in principle of automatic road surface and road damage measurement with an inertial sensor in the vehicle body. Therefore, this work is aimed at
(i) designing a processing chain to evaluate road data based on measurements of inertial sensors,
(ii) automatically recording an adequate dataset,
(iii) developing and evaluating a method to estimate road surfaces and damage, and at
(iv) integrating the algorithms developed into a graphic user interface for evaluation of datasets with alternative parameterizations by nonexperts as well.
The methodology will be presented in Section 2. Section 3 will outline the implementation derived, while Section 4 will explain the results based on a first dataset. The result, its applicability, and open problems will be discussed in Section 5.
Figure 1: Overview of method. Block labels recoverable from the diagram: road measurement, time series, reference data set (ground truth), data mining, parameter, classification, evaluation, visualization.
2. Methods
2.1. Design. Figure 1 presents an overview of the method to evaluate the road state [15]. In a first step, the road state is to be measured by suitable sensors. For measurement, acoustic sensors, such as the sensors described in [16], acceleration sensors and gyroscopes, cameras, and similar devices, can be used. As a result, several synchronized time series will be obtained. To obtain a representative reference data set, sensor data have to cover a maximum of framework conditions, e.g., variations of external temperature, driver, and speed. Every point of time/road section has to be assigned a label, e.g., type of road surface, simultaneously or afterwards. In this way, a data set with correct allocations of sensor data to labels is obtained (ground truth). By means of data mining, models can be designed (offline) for retrospective evaluation (offline) or classification during driving operation (online). The results of the classification models then have to be visualized and evaluated on the basis of map material. To estimate the information on the road surface and event or damage that is of relevance to road construction offices, two separate classification routines have to be developed.
2.2. Data Acquisition. The data measured by the sensors installed in the vehicle, e.g., GPS and inertial sensors, are encoded on the CAN bus and cannot be read without the communication matrix that is available to the control system developer and automotive manufacturer only. Hence, an inexpensive measurement system similar to the inertial sensor incorporated in the vehicle is proposed for the easy measurement and readout of data. Measurements cover the position and dynamics of the vehicle, in particular vertical dynamics caused by unevenness [17]. In addition, the data may be assigned labels during measurement already. The measurement system (Figure 2) mainly consists of a GPS receiver (Adafruit ultimate GPS Hat) and a MEMS inertial sensor (LSM9DS1) measuring accelerations and rotation rates of the vehicle along all three axes. The sensor data are acquired using a Raspberry Pi and stored as a csv-table in fused form. As soon as the engine of the vehicle is turned off, the uninterruptible power supply (UPS) is activated and data can be transmitted via WiFi to a central data base, if the Raspberry Pi is connected to a known WiFi network.
The GPS receiver has a sample rate of 10 Hz, a position resolution of 3 m, and a speed resolution of 0.1 m/s. As the
Journal of Advanced Transportation 3
speeds in longitudinal and transverse direction ($\omega_x$ and $\omega_y$) are very important [6]. Furthermore, the roll and pitch accelerations ($\dot{\omega}_x$ and $\dot{\omega}_y$) as well as the jerk ($\dot{a}_z$) of the vehicle are obtained by differentiation in the time domain, the jerk being the derivative of the vertical acceleration. The distance series of the vertical, roll, and pitch acceleration are transformed into the frequency domain with the short-time Fourier transform, which contains the short-term, distance-localized frequency content of the signal. In this way, features based on specific frequency bands can be investigated. Hence, the three distance series are extended by further data streams, which leads to seven data streams in total:
(i) vertical acceleration,
(ii) roll acceleration,
(iii) pitch acceleration,
(iv) derivative of vertical acceleration (jerk),
(v) short-time Fourier transformed vertical acceleration,
(vi) short-time Fourier transformed pitch acceleration, and
(vii) short-time Fourier transformed roll acceleration.
2.3.3. Feature Extraction. The features are calculated for windows with a specific length in distance domain and a specific overlap. A window $U_i$ denotes all indexes $[n_i - L/2,\ n_i + L/2 - 1]$ with the running index $n$, window index $i$, and window length $L$. The window overlap $r$ corresponds to the fraction of values from the window $U_i$ that are contained in the previous window $U_{i-1}$, i.e., $n_{i+1} = n_i + L(1 - r)$. If a longer distance is chosen, short amplitudes, for example, due to potholes, have a weaker impact on the value of features that incorporate the overall signal, such as the standard deviation. These short amplitudes can be captured by shortening the window size or by using features that calculate extrema.
For the feature extraction for material and events, we use window sizes of 50 m and 5 m, respectively, and an overlap of 20%.
From the distance series data, we calculate the standard deviation as well as the peak-to-peak value. The root mean square value (effective value) for specific frequencies and the spectral centroid are extracted from the short-time Fourier transformed data streams for the following spatial frequency bands (in $\mathrm{m}^{-1}$):
$[0.1, 0.5]$, $[0.5, 15]$, $[15, 20]$, $[0.1, 25]$, $[0.1, 50]$ (1)
The vehicle vibration is highly sensitive to the vehicle velocity. Previous research suggests performing a linear regression with each feature as the dependent variable and the velocity as the independent variable [12]. The velocity dependency is then removed by subtracting the estimated linear equation from the corresponding feature. However, the vehicle vibration and the extracted features do not depend linearly on the velocity. Therefore, the velocity-dependent features are retained and the mean velocity is calculated for each window as an additional feature. To allow nonlinear relationships, a kernel function of higher order can be applied for the classification.
Of the GPS latitude $lat$ and longitude $lon$ time series, the medians in every window are used for later visualization.
2.3.4. Classification. Based on the extracted individual features and the corresponding labels, two classifiers are designed for material and event. For the design and application of classification, a combination of feature selection, feature aggregation, and classifier is chosen.
For the surface classification, the five best individual features are determined using the multivariate analysis of variances (MANOVA) method; for event classification, the ten best features are selected. For visualization purposes, the selected individual features are then aggregated to two features using linear discriminant analysis (DA), which can also minimize the calculation expenditure. A support vector machine (SVM) classifier with a polynomial kernel function of order 2 is used. Validation is carried out with 5-fold cross-validation.
2.3.5. Performance Measures. From the correctly and falsely predicted instances, we can calculate a confusion matrix $M = (m_{ij}) \in \mathbb{N}^{k \times k}$ for classes $K_i$, $i = 1, \ldots, k$. In the confusion matrix, $m_{ii}$ represents the true positives for class $i$. The other elements in column $i$ are called false negatives, the other elements in row $i$ false positives, and the remaining elements, including the other diagonal entries, true negatives for class $i$.
From the confusion matrix, one can calculate multiple performance measures to evaluate the model, such as the recall $m_{ii} / \sum_{j=1}^{k} m_{ji}$ for class $K_i$, the overall accuracy of the classifier $\sum_{i=1}^{k} m_{ii} / \sum_{i=1}^{k} \sum_{j=1}^{k} m_{ij}$, or the precision $\pi_{ij} = m_{ij} / \sum_{l=1}^{k} m_{il}$. The precision represents the fraction of retrieved instances that are relevant and can be seen as the probability $\pi_{ij}$ of the classifier to predict class $i$ as class $j$ for $i, j = 1, \ldots, k$. An overview of performance measures for different classification tasks can be found in [20].
3. Implementation
To facilitate operation by non-experts, the methods are implemented in a graphical user interface called Vehicle Learner Toolbox, which is available in [21]. It is based on Matlab and implements several machine learning operations of the freely available toolbox SciXMiner [22] (formerly, Gait-CAD [23]). The Vehicle Learner Toolbox provides the possibility to
(i) import vehicle sensor data in different file formats,
(ii) compress the imported data and automatically extract various features,
(iii) train a classifier model with a wide-ranging set of options,
(iv) test the trained classifier with a test set,
(v) visualize the results with the help of plots and maps.
A project folder can be selected and sensor data can be imported in the corresponding frame Data (Figure 3). There is the option to assign the sensor data to specific vehicles, since they vary in suspensions, damping, and other
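The windowing scheme of Section 2.3.3 can be illustrated with a minimal Python sketch. It assumes a distance series resampled to a fixed sample spacing, an even window length L given in samples, and the update n_{i+1} = n_i + L(1 - r) rounded to whole samples. The function names `window_centers` and `window_bounds` and all numbers are hypothetical and are not taken from the Vehicle Learner Toolbox.

```python
def window_centers(n_samples, L, r):
    """Return the center indices n_i of all windows that fit into the series.

    Each window covers the indices [n_i - L/2, n_i + L/2 - 1], and
    consecutive centers follow n_{i+1} = n_i + L * (1 - r).
    Assumption: L is an even number of samples and 0 <= r < 1.
    """
    if L <= 0 or L % 2 != 0:
        raise ValueError("L must be a positive even number of samples")
    if not 0 <= r < 1:
        raise ValueError("overlap r must satisfy 0 <= r < 1")
    half = L // 2
    step = max(1, round(L * (1 - r)))  # advance by at least one sample
    centers = []
    n = half  # the first window then starts at index 0
    while n + half - 1 < n_samples:
        centers.append(n)
        n += step
    return centers


def window_bounds(center, L):
    """Return the inclusive first and last index of the window around center."""
    half = L // 2
    return center - half, center + half - 1


# Illustrative values only: 100 samples, 20-sample windows, 20% overlap.
centers = window_centers(100, 20, 0.2)
bounds = [window_bounds(c, 20) for c in centers]
```

With these illustrative values, consecutive windows share 4 of their 20 samples, which corresponds to the 20% overlap stated in the text.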
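The performance measures of Section 2.3.5 can be made concrete with a small Python sketch. It assumes, as the recall formula and the description of false negatives and false positives suggest, that the columns of the confusion matrix index the true class and the rows the predicted class. The function name `performance_measures` and the example matrix are made up for illustration.

```python
def performance_measures(M):
    """Compute overall accuracy, per-class recall, and per-class precision.

    Assumed convention: M[i][j] counts instances predicted as class i
    whose true class is j (rows: predicted, columns: true).
    """
    k = len(M)
    total = sum(sum(row) for row in M)
    if total == 0:
        raise ValueError("confusion matrix must contain at least one instance")
    accuracy = sum(M[i][i] for i in range(k)) / total
    recall, precision = [], []
    for i in range(k):
        column_sum = sum(M[j][i] for j in range(k))  # instances of true class i
        row_sum = sum(M[i])                          # instances predicted as i
        recall.append(M[i][i] / column_sum if column_sum else 0.0)
        precision.append(M[i][i] / row_sum if row_sum else 0.0)
    return accuracy, recall, precision


# Made-up 2x2 confusion matrix, for illustration only.
M = [[8, 2],
     [1, 9]]
accuracy, recall, precision = performance_measures(M)
```

If the opposite orientation is used (rows for the true class), the roles of the row and column sums simply swap.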
Figure 3: The import data frame of the Vehicle Learner Toolbox. Different vehicles can be chosen and the data type can be set.
Figure 4: The train classifier frame of the Vehicle Learner Toolbox. Multiple classifiers can be trained with a wide range of options.
Figure 5: Classification results with two aggregated features and borders of the classifier in black. (Two panels; axis label: aggregated feature 2.)
by setting the k-fold cross-validation value to higher than 1. The last section offers a variety of settings for the classifier, e.g., for an SVM, including the kernel function and penalty term. Afterwards, the classifier can be trained and data can be plotted on open street maps. Furthermore, the confusion matrix and the total loss are shown in the Matlab console.
For testing new data, a data set with modifications in time range and area to be analyzed can be generated as described for training, and a trained classifier must be selected. If the test data set is labeled, the output of the prediction is again a confusion matrix and the classification error. Moreover, the results can be visualized and plotted on open street maps, as will be presented in Section 4. The trajectories will be cut into segments of different color referring to the corresponding predicted classes.
(i) peak-to-peak of pitch acceleration
(ii) peak-to-peak of roll acceleration
(iii) maximum of jerk in vertical direction
(iv) root mean square (RMS) of the vertical acceleration
(v) speed
By comparing each class with each other, it emerges that the peak-to-peak values of pitch and roll acceleration are mainly responsible for separating events that
(i) occur on both vehicle lanes (railway crossing, speed bump),
(ii) occur on only one side of the vehicle (manhole cover, pothole),
(iii) or have only little impact on the vehicle vibration (light damages, road segments in good condition).
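The signal features named in this section (peak-to-peak value, maximum jerk, RMS, and speed) can be illustrated for a single window with a minimal Python sketch. It is a simplification under stated assumptions: the RMS is computed directly on the samples rather than for specific frequency bands, the jerk is approximated by a finite difference with a fixed sampling interval `dt`, and its largest magnitude is taken. All function names and sample values are hypothetical.

```python
import math


def peak_to_peak(values):
    """Difference between the largest and the smallest sample."""
    return max(values) - min(values)


def root_mean_square(values):
    """Root mean square (effective value) of the samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def max_abs_jerk(a_z, dt):
    """Largest magnitude of the finite-difference derivative of a_z."""
    return max(abs((b - a) / dt) for a, b in zip(a_z, a_z[1:]))


def window_features(pitch_acc, roll_acc, a_z, speed, dt):
    """Collect the illustrative features for one window of samples."""
    return {
        "pp_pitch": peak_to_peak(pitch_acc),
        "pp_roll": peak_to_peak(roll_acc),
        "max_jerk": max_abs_jerk(a_z, dt),
        "rms_az": root_mean_square(a_z),
        "mean_speed": sum(speed) / len(speed),
    }


# Made-up samples of one short window, for illustration only.
features = window_features(
    pitch_acc=[0.1, -0.3, 0.2],
    roll_acc=[0.0, 0.4, -0.1],
    a_z=[0.0, 1.0, -1.0, 0.5],
    speed=[10.0, 12.0, 14.0],
    dt=0.1,
)
```

An event that excites both wheels of an axle would be expected to show up mainly in `pp_pitch`, a one-sided event in `pp_roll`, in line with the grouping described above.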
The first example shows the event classification results on two different high speed roads (Figure 6). The upper one with Label 1 is a freshly renovated asphalt highway with close to no damages and the lower one with Label 2 is a poorly patched asphalt road with a lot of medium and severe damages. The classification successfully predicted the upper roadway as being in good condition. Most parts of the lower street were predicted as light damage and some points even as potholes. The results represent the road condition very accurately. The only noticeable misclassification is a railway crossing that was predicted once (Label 3).
The second example presents data acquired in an urban area in Karlsruhe; the predictions are shown in Figure 7. The roads in this area are poorly preserved and there is a speed bump at a pedestrian crossing (Label 1). The classification model correctly predicts the speed bump (Label 1) for all passes and a pothole (Label 2) in both driving directions.
The third interesting sector is shown in Figure 8. Potholes (Labels 2 and 3), which were at the edge of the driving lane, were driven over multiple times and the classifier predicts the severe damage accordingly. Sometimes the output at the road segments is not pothole but light damages or even good road condition. The reason might be that the pothole was avoided by the driver.
Figure 7: Event classification results for road segments in the city of Karlsruhe, Germany.
Figure 8: Event classification results of road segments outside of Karlsruhe, Germany, for the events potholes and railway crossing.
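The maps in this section show trajectories cut into segments colored by predicted class. One way to derive such segments from per-window predictions is to group consecutive windows with the same label. The following Python sketch assumes that the predictions are given in driving order. The function name `class_segments` and the example labels are hypothetical, and the sketch does not claim to reproduce how the Vehicle Learner Toolbox does it.

```python
from itertools import groupby


def class_segments(predictions):
    """Group consecutive equal labels into (label, first_index, last_index)."""
    segments = []
    start = 0
    for label, run in groupby(predictions):
        length = len(list(run))
        segments.append((label, start, start + length - 1))
        start += length
    return segments


# Made-up sequence of predicted classes for consecutive windows.
predicted = ["good", "good", "pothole", "light damages", "light damages", "good"]
segments = class_segments(predicted)
```

Each resulting segment can then be drawn in the color of its class, using the median GPS position of the windows it spans.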
useful. Transferability to other vehicles with different chassis and dimensions has not been examined so far. Presumably, the parameters of the algorithms are adapted to the vehicle with which the learning data set was recorded. Here, fusion of learning data sets from several vehicles and an accordingly adapted classification routine might help. It can be assumed that the results will be slightly worse.
Generally, the inertial sensor represents a very good option to collect information on the tire/road contact at low costs and over wide areas. Use of information of several vehicles can compensate for the drawback of some drivers passing by safety-relevant damage that, hence, is not measured by the sensor. Moreover, obstacles at the roadside are not crossed and, hence, cannot be detected.
Fusion of camera and inertial sensor data probably would be the optimum solution for a mobile determination of the state of road traffic infrastructure. For road construction offices, use of a low-cost and computationally efficient system, consisting of an inertial sensor, Raspberry Pi, and simple signal processing, is sufficient and can be recommended.
Data Availability
The raw data used to support the findings of this study have been deposited in http://doi.org/10.5281/zenodo.1461243 [28]. The data can be processed with the presented toolbox available in http://doi.org/10.5281/zenodo.1216187 [21].
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was funded by the Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety, Germany, within the Environmental Research Plan 2014 (Project no. 3714541000).
References
[1] P. De Gonneville and G. Martin, “Le mécanisme d’accident,” in Sétra: Service d’études techniques des routes et autoroutes, Bagneux, 2006.
[2] “Destatis. Verkehrsunfälle. Destatis, Mar. 2017. Artikelnummer: 2080700161124”.
[3] P. Mohan, V. N. Padmanabhan, and R. Ramachandran, “Nericell: Rich monitoring of road and traffic conditions using mobile smartphones,” in Proceedings of the ACM Conference on Embedded Networked Sensor Systems, 2008.
[4] M. Schade, “Der intelligente Löcher-Sucher,” Auto Bild, 2016.
[5] E. Esmailzadeh and F. Fahimi, “Optimal adaptive active suspensions for a full car model,” Vehicle System Dynamics, vol. 27, no. 2, pp. 89–107, 1997.
[6] J. Eriksson, L. Girod, B. Hull, R. Newton, S. Madden, and H. Balakrishnan, “The pothole patrol: using a mobile sensor network for road surface monitoring,” in Proceedings of the 6th International Conference on Mobile Systems, Applications, and Services (MobiSys ’08), pp. 29–39, Breckenridge, CO, USA, June 2008.
[7] C. Koch and I. Brilakis, “Pothole detection in asphalt pavement images,” Advanced Engineering Informatics, vol. 25, no. 3, pp. 507–515, 2011.
[8] S. C. Radopoulou and I. Brilakis, “Patch detection for pavement assessment,” Automation in Construction, vol. 53, pp. 95–104, 2015.
[9] S. C. Radopoulou and I. Brilakis, “Improving Road Asset Condition Monitoring,” pp. 3004–3012.
[10] ISO, ISO 8608: Mechanical Vibration - Road Surface Profiles - Reporting of Measured Data, International Organization for Standardization, 2016.
[11] K. Chen, G. Tan, M. Lu, and J. Wu, “CRSM: a practical crowdsourcing-based road surface monitoring system,” Wireless Networks, vol. 22, no. 3, pp. 765–779, 2016.
[12] M. Perttunen, O. Mazhelis, F. Cong et al., “Distributed road surface condition monitoring using mobile phones,” in Proceedings of the International Conference on Ubiquitous Intelligence and Computing, pp. 64–78, Springer, 2011.
[13] F. Seraj, A. Dilo, T. Luarasi et al., “RoADS: A Road Pavement Monitoring System for Anomaly Detection Using Smart Phones,” in Big data analytics in the social and ubiquitous context, pp. 128–146, 2016.
[14] I. Tekin, Economic investigations of the current state of the road system in order to determine the advantages and areas of application of a road monitoring system [Bachelors Thesis], Karlsruhe Institute of Technology, Institute of Vehicle Systems Technology.
[15] J. Masino, G. Levasseur, M. Frey, F. Gauterin, R. Mikut, and M. Reischl, “Charakterisierung der Fahrbahnbeschaffenheit durch Data Mining von gemessenen kinematischen Fahrzeuggrößen,” Automatisierungstechnik, vol. 65, no. 12, 2017.
[16] J. Masino, J. Pinay, M. Reischl, and F. Gauterin, “Road surface prediction from acoustical measurements in the tire cavity using support vector machine,” Applied Acoustics, vol. 125, pp. 41–48, 2017.
[17] J. Masino, M. Frey, F. Gauterin, and R. Sharma, “Development of a highly accurate and low cost measurement device for Field Operational Tests,” in Proceedings of the 3rd IEEE International Symposium on Inertial Sensors and Systems, ISS 2016, pp. 74–77, USA, February 2016.
[18] M. Sayers, T. Gillespie, and C. Queiroz, “The international road roughness experiment: Establishing correlation and a calibration standard for measurements,” World Bank Technical Paper, 1986.
[19] C. C. Ward and K. Iagnemma, “Speed-independent vibration-based terrain classification for passenger vehicles,” Taylor & Francis, Vehicle System Dynamics, vol. 47, no. 9, 2009.
[20] M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks,” Information Processing & Management, vol. 45, no. 4, pp. 427–437, 2009.
[21] J. Thumm and J. Masino, Vehicle Learner Toolbox for Road Condition Estimation, Matlab files, 2018.
[22] R. Mikut, A. Bartschat, and W. Doneit, “The MATLAB toolbox SciXMiner: User’s manual and programmer’s guide,” 2017, https://arxiv.org/abs/1704.03298.
[23] R. Mikut, O. Burmeister, S. Braun, and M. Reischl, “The open source MATLAB toolbox Gait-CAD and its application to bioelectric signal processing,” in Proceedings of the DGBMT-Workshop Biosignalverarbeitung, pp. 109–111, Potsdam, Germany, 2008.