An LSTM-Based Method with Attention Mechanism for Travel Time Prediction
"> Figure 1
<p>An architecture of LSTM unit.</p> "> Figure 2
<p>An architecture of LSTM NN for travel time prediction.</p> "> Figure 3
<p>An architecture of attentive neural network.</p> "> Figure 4
<p>Architecture of the LSTM-based model with attention mechanism for travel time prediction.</p> "> Figure 5
<p>A change of MAPE when the amount of LSTM units in the proposed model increase. The experiment is made based on AL1896.</p> "> Figure 6
<p>Comparisons between the predicted travel times and its observed travel times through the links under different traffic conditions: (<b>a</b>) a comparison under low traffic conditions; (<b>b</b>) a comparison under medium traffic conditions; and (<b>c</b>) a comparison under heavy traffic conditions.</p> "> Figure 7
<p>A comparison between the predicted travel times and the observed travel times through an medium length link <math display="inline"><semantics> <mi mathvariant="script">L</mi> </semantics></math>.</p> "> Figure 8
<p>A comparison of convergence speed between the proposed model and the LSTM NN. The comparison is based on Link AL1896. The unfold size of the LSTM NN and the proposed model were 7.</p> "> Figure 9
<p>A road network consists of the links AL3069A, AL3070, AL2202, AL1900, AL1896, AL1891, AL1885, AL1883, AL1877 and AL2991 in <a href="#sensors-19-00861-f009" class="html-fig">Figure 9</a>. The medium length link <math display="inline"><semantics> <mi mathvariant="script">L</mi> </semantics></math> consists of the links from link AL3070 to link AL 1877. A comparison between the proposed model and the proposed model was made based on link <math display="inline"><semantics> <mi mathvariant="script">L</mi> </semantics></math>.</p> "> Figure 10
<p>A comparison of time cost between the proposed model and the baseline methods.</p> "> Figure 11
<p>The residuals of the test set on link AL1996 over time. The horizontal axis is time interval and the vertical axis is travel time. The horizontal axis contains time intervals of three days and there are 96 time intervals per day.</p> "> Figure 12
<p>Heat plots of travel times during three days. The horizontal axis are time intervals that spans three days and 96 time intervals per day. The color depth expresses the travel time value on link AL1996.</p> "> Figure 13
<p>Travel time correlations on link AL1167 with length of 18.48 km. The best results in the table appear in bold.</p> ">
Abstract
1. Introduction
- We propose an LSTM-based method with an attention mechanism for travel time prediction. In the proposed model, the traditional recurrent way of constructing depth in an LSTM NN to model long-term dependence is replaced by an attention mechanism applied over the output layers of the LSTM. The departure time is used as the aspect of the attention mechanism and is integrated into the proposed model.
- Experiments were performed based on the dataset provided by Highways England. The experimental results show that the proposed model can achieve better accuracy than the existing LSTM NN and other baseline methods.
- The case study suggests that the attention mechanism can effectively concentrate on the differences of input features for travel time prediction. The proposed model is feasible and effective.
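The core idea in the first contribution, attention pooled over the LSTM's output states with the departure time as the aspect, can be sketched as follows. This is a minimal NumPy illustration under our own assumptions (additive scoring, random stand-in values; `attention_pool`, `Wh`, `Wa`, and `v` are hypothetical names), not the authors' implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H, aspect, Wh, Wa, v):
    """Additive attention over LSTM output states.

    H      : (w, d) LSTM outputs at each of the w unfold steps
    aspect : (a,)   embedding of the departure time (the attention aspect)
    Returns the context vector (d,) and the attention weights (w,).
    """
    scores = np.tanh(H @ Wh + aspect @ Wa) @ v   # one score per step, (w,)
    alpha = softmax(scores)                      # weights sum to 1
    context = alpha @ H                          # weighted sum of outputs, (d,)
    return context, alpha

rng = np.random.default_rng(0)
w, d, a, k = 7, 4, 3, 5            # unfold steps, LSTM dim, aspect dim, score dim
H = rng.normal(size=(w, d))        # stand-in for real LSTM outputs
aspect = rng.normal(size=a)        # stand-in for a departure-time embedding
Wh = rng.normal(size=(d, k))
Wa = rng.normal(size=(a, k))
v = rng.normal(size=k)

context, alpha = attention_pool(H, aspect, Wh, Wa, v)
print(alpha.round(3))  # seven nonnegative weights summing to 1
```

The pooled `context` vector, rather than a deep recurrent stack, then feeds the prediction layer; the unfold size w = 7 and dimension d = 4 follow the parameter table later in the paper.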
2. Related Work
2.1. Data-Driven Models for Traffic Prediction
2.2. Attention Mechanism
3. Methodology
3.1. LSTM NN for Travel Time Prediction
3.2. Attention Mechanism
3.3. LSTM-Based Method with Attention Mechanism
3.4. Model Training
4. Experiment
4.1. Dataset
4.2. Task Definition
4.3. Parameters Setting for the Proposed Method
4.4. Parameters Setting for the Baseline Methods
4.5. Similarity between the Prediction Value and the Observation Value
4.6. Accuracy Comparison Based on a Short Length Link
4.7. Accuracy Comparison Based on a Medium Length Link
4.8. A Further Evaluation on Link AL1167
4.9. Result Analysis
4.10. Case Study
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
| Link Ref | Date | Time Period (0–95) | Average JT (s) | Link Length (km) |
|---|---|---|---|---|
| AL1896 | 1 March 2014 | 0 | 626.12 | 18.32 |
| AL1896 | 1 March 2014 | 1 | 612.60 | 18.32 |
| AL1896 | 1 March 2014 | 2 | 604.23 | 18.32 |
| … | … | … | … | … |
| Variable | Description | Value |
|---|---|---|
| w | The unfold size of the proposed model | 7 |
| d | The dimension of the hidden layers and output layers of the LSTM | 4 |
| | The mini-batch size | 50 |
| | The initial learning rate of | 0.1 |
| | The initial learning rate of other parameters | 0.01 |
| | The range of the dataset normalization | (0, 1) |
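Read as a preprocessing recipe, the table implies that travel times are normalized into (0, 1) and unfolded into windows of w = 7 past observations. A minimal sketch, assuming min-max scaling and one-step-ahead targets (both our assumptions; `minmax_scale` and `make_windows` are hypothetical helpers):

```python
import numpy as np

def minmax_scale(x, lo=0.0, hi=1.0):
    """Normalize a series into the (lo, hi) range, per the parameter table."""
    x = np.asarray(x, dtype=float)
    return lo + (hi - lo) * (x - x.min()) / (x.max() - x.min())

def make_windows(series, w=7):
    """Slide a window of w past travel times over the series; the value one
    step ahead of each window is its prediction target."""
    X = np.array([series[i:i + w] for i in range(len(series) - w)])
    y = np.array(series[w:])
    return X, y

# Average JT values in the style of the dataset table (illustrative only)
jt = [626.12, 612.60, 604.23, 610.50, 620.00, 615.30, 608.80, 612.10, 618.40]
scaled = minmax_scale(jt)
X, y = make_windows(scaled, w=7)
print(X.shape, y.shape)  # (2, 7) (2,)
```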
Departure Time in 15 min

| Models | | RMSE | MAE | MAPE (%) |
|---|---|---|---|---|
| RW | | 9.787 | 2.5648 | 8.87 |
| k-NNR | uniform | 9.800 | 2.566 | 8.73 |
| | distance | 9.800 | 2.566 | 8.73 |
| SARIMA | | 8.637 | 2.582 | 9.01 |
| SVR | linear | 8.202 | 2.487 | 8.30 |
| | rbf | 8.180 | 2.486 | 8.30 |
| | poly | 8.216 | 2.497 | 8.36 |
| LR | | 10.323 | 2.786 | 8.92 |
| CNN | | 9.270 | 2.716 | 8.41 |
| SSNN | | 9.205 | 2.650 | 8.37 |
| LSTM NN | | 8.472 | 2.485 | 7.34 |
| LSAM | | 8.133 | 2.356 | 7.01 |
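The RMSE, MAE, and MAPE columns follow their standard definitions. As a quick reference, the snippet below computes them and reproduces the per-day MAPE values of the later case-study table from its observed and predicted travel times:

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

# Observed and predicted JT of the proposed model, 03-29 to 03-31 (case study)
y_true = np.array([74.32, 96.87, 97.21])
y_pred = np.array([82.77, 84.80, 91.05])

per_day = np.round(np.abs((y_true - y_pred) / y_true) * 100, 1)
print(per_day)                       # matches the MAPE column: 11.4, 12.5, 6.3
print(round(mape(y_true, y_pred), 1))
```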
| Models | | Training RMSE | Training MAE | Training MAPE (%) | Test RMSE | Test MAE | Test MAPE (%) | Variance (%) |
|---|---|---|---|---|---|---|---|---|
| RW | | 87.499 | 8.018 | 3.33 | 78.7770 | 7.5445 | 2.99 | 0.34 |
| k-NNR | uniform | 77.955 | 7.645 | 3.03 | 78.854 | 7.545 | 2.99 | 0.04 |
| | distance | 14.313 | 1.663 | 0.14 | 78.854 | 7.545 | 2.99 | 2.85 |
| SARIMA | | 413.931 | 19.963 | 20.51 | 93.271 | 8.508 | 3.85 | 16.66 |
| SVR | linear | 80.628 | 7.778 | 3.14 | 72.983 | 7.408 | 2.90 | 0.24 |
| | rbf | 80.646 | 7.775 | 3.14 | 73.036 | 7.427 | 2.92 | 0.22 |
| | poly | 84.016 | 7.892 | 3.25 | 74.670 | 7.571 | 3.04 | 0.21 |
| LR | | 83.963 | 7.897 | 2.96 | 92.880 | 8.565 | 3.45 | 0.49 |
| CNN | | 83.093 | 7.878 | 3.20 | 72.131 | 7.399 | 2.87 | 0.33 |
| SSNN | | 83.631 | 7.873 | 3.18 | 72.060 | 7.361 | 2.84 | 0.34 |
| LSTM NN | | 80.885 | 7.781 | 3.14 | 71.631 | 7.318 | 2.83 | 0.31 |
| LSAM | | 79.498 | 7.650 | 2.81 | 70.146 | 7.188 | 2.69 | 0.22 |
| Models | Lag 1 MAE | Lag 1 MAPE (%) | Lag 2 MAE | Lag 2 MAPE (%) | Lag 3 MAE | Lag 3 MAPE (%) | Lag 4 MAE | Lag 4 MAPE (%) |
|---|---|---|---|---|---|---|---|---|
| CNN | 6.432 | 6.87 | 6.947 | 7.92 | 7.080 | 8.28 | 7.267 | 8.75 |
| SSNN | 6.338 | 6.10 | 6.912 | 7.87 | 7.265 | 8.70 | 7.415 | 9.03 |
| LSTM NN | 5.979 | 5.95 | 6.858 | 7.78 | 7.197 | 8.47 | 7.342 | 8.98 |
| LSAM | 5.788 | 5.61 | 6.796 | 7.66 | 7.069 | 8.28 | 7.294 | 8.76 |
| Date | | 8:15 (7-step) | 8:30 (6-step) | 8:45 (5-step) | 9:00 (4-step) | 9:15 (3-step) | 9:30 (2-step) | 9:45 (1-step) | 10:00 Observed | Predicted | MAPE (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 03-29 | Average JT | 70.82 | 71.82 | 79.10 | 78.77 | 83.30 | 83.30 | 83.30 | 74.32 | 82.77 | 11.4 |
| | Weights | 0.227 | 0.222 | 0.142 | 0.133 | 0.096 | 0.090 | 0.089 | | | |
| 03-30 | Average JT | 86.38 | 86.91 | 91.71 | 77.73 | 79.47 | 89.04 | 83.73 | 96.87 | 84.80 | 12.5 |
| | Weights | 0.148 | 0.118 | 0.083 | 0.199 | 0.203 | 0.109 | 0.140 | | | |
| 03-31 | Average JT | 83.63 | 85.33 | 84.57 | 89.38 | 87.10 | 91.99 | 95.78 | 97.21 | 91.05 | 6.3 |
| | Weights | 0.222 | 0.170 | 0.170 | 0.124 | 0.137 | 0.100 | 0.076 | | | |
| Date | | 8:15 (7-step) | 8:30 (6-step) | 8:45 (5-step) | 9:00 (4-step) | 9:15 (3-step) | 9:30 (2-step) | 9:45 (1-step) | 10:00 Observed | Predicted | MAPE (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 03-29 | Average JT | 70.82 | 71.82 | 79.10 | 78.77 | 83.30 | 83.30 | 83.30 | 74.32 | 82.80 | 11.4 |
| | Weights | 0.181 | 0.179 | 0.140 | 0.140 | 0.120 | 0.120 | 0.120 | | | |
| 03-30 | Average JT | 86.38 | 86.91 | 91.71 | 77.73 | 79.47 | 89.04 | 83.73 | 96.87 | 84.54 | 12.7 |
| | Weights | 0.134 | 0.121 | 0.104 | 0.189 | 0.180 | 0.124 | 0.147 | | | |
| 03-31 | Average JT | 83.63 | 85.33 | 84.57 | 89.38 | 87.10 | 91.99 | 95.78 | 97.21 | 90.81 | 6.6 |
| | Weights | 0.170 | 0.150 | 0.161 | 0.137 | 0.150 | 0.125 | 0.107 | | | |
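Each "Weights" row in the two tables above is an attention distribution over the seven input steps: the entries are nonnegative and sum to approximately 1 (up to the rounding shown in the tables), with larger weights marking the departure-time steps the model attends to most. A quick check on the first table's rows:

```python
# Attention weights tabulated for the proposed model, one row per day
weights = {
    "03-29": [0.227, 0.222, 0.142, 0.133, 0.096, 0.090, 0.089],
    "03-30": [0.148, 0.118, 0.083, 0.199, 0.203, 0.109, 0.140],
    "03-31": [0.222, 0.170, 0.170, 0.124, 0.137, 0.100, 0.076],
}
for day, w in weights.items():
    print(day, round(sum(w), 2))  # each distribution sums to ~1.0
```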
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ran, X.; Shan, Z.; Fang, Y.; Lin, C. An LSTM-Based Method with Attention Mechanism for Travel Time Prediction. Sensors 2019, 19, 861. https://doi.org/10.3390/s19040861