Enhancing PM2.5 Prediction Using NARX-Based Combined CNN and LSTM Hybrid Model
Figure 1. NARX model.
Figure 2. 1D CNN process.
Figure 3. LSTM RNN elemental network structure.
Figure 4. An overview of the proposed CNN–LSTM NARX layers.
Figure 5. Imputation flowchart for every feature in Piccadilly station, Manchester, UK.
Figure 6. Sample of data from 1 May 2016 to 1 September 2016 from Piccadilly station, Manchester, UK, before imputation.
Figure 7. Sample of data from 1 May 2016 to 1 September 2016 from Piccadilly station, Manchester, UK, after imputation.
Figure 8. A sample dataset showing how data shifting is done for two look-back hours.
Figure 9. Training vs. testing in time-series split cross-validation (n = 10) for the Beijing dataset.
Figure 10. Training vs. testing in time-series split cross-validation (n = 10) for the Manchester dataset.
Figure 11. CNN–LSTM layers for the seventh iteration with (d0, o1) for the Beijing dataset.
Figure 12. CNN–LSTM layers for the seventh iteration with (d0, o1) for the Manchester dataset.
Figure 13. Real PM2.5 data from part of the seventh-iteration results, comparing the ranges of the Beijing and Manchester datasets.
Figure 14. Real vs. CNN–LSTM and its NARX variants in part of the seventh-iteration results for the Beijing dataset.
Figure 15. Real vs. CNN–LSTM and its NARX variants in part of the seventh-iteration results for the Manchester dataset.
Figure 16. Real vs. LSTM and its NARX variants in part of the seventh-iteration results for the Beijing dataset.
Figure 17. Real vs. LSTM and its NARX variants in part of the seventh-iteration results for the Manchester dataset.
Figure 18. Real vs. Extra Trees and its NARX variants in part of the seventh-iteration results for the Beijing dataset.
Figure 19. Real vs. Extra Trees and its NARX variants in part of the seventh-iteration results for the Manchester dataset.
Figure 20. Real vs. XGBRF and its NARX variants in part of the seventh-iteration results for the Beijing dataset.
Figure 21. Real vs. XGBRF and its NARX variants in part of the seventh-iteration results for the Manchester dataset.
Figure 22. Evaluation results of non-NARX and NARX in terms of coefficient of determination for the Beijing dataset.
Figure 23. Evaluation results of non-NARX and NARX in terms of index of agreement for the Beijing dataset.
Figure 24. Evaluation results of non-NARX and NARX in terms of root mean square error for the Beijing dataset.
Figure 25. Evaluation results of non-NARX and NARX in terms of normalised root mean square error for the Beijing dataset.
Figure 26. Evaluation results of non-NARX and NARX in terms of offline training time for the Beijing dataset.
Figure 27. Evaluation results of non-NARX and NARX in terms of coefficient of determination for the Manchester dataset.
Figure 28. Evaluation results of non-NARX and NARX in terms of index of agreement for the Manchester dataset.
Figure 29. Evaluation results of non-NARX and NARX in terms of root mean square error for the Manchester dataset.
Figure 30. Evaluation results of non-NARX and NARX in terms of normalised root mean square error for the Manchester dataset.
Figure 31. Evaluation results of non-NARX and NARX in terms of offline training time for the Manchester dataset.
Abstract
1. Introduction
- Proposing an enhanced version of CNN–LSTM using NARX architecture.
- Evaluating multiple configurations of NARX using CNN–LSTM, LSTM, Extra Trees, and XGBRF.
- Executing our experiments on two cities located on distant continents (Beijing, China; Manchester, UK) and showing that our hybrid model works well regardless of location.
2. Related Work
3. Prediction Algorithms
3.1. Nonlinear Autoregression with Exogenous Input (NARX)
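A NARX model regresses the next output on delayed copies of both the output itself and the exogenous inputs. As a generic formulation (the delay orders d_y and d_u below are illustrative symbols, not the paper's (d, o) configuration labels):

```latex
% Generic NARX regression: the next output depends nonlinearly on past
% outputs (autoregressive feedback) and past exogenous inputs u.
y(t) = f\bigl(y(t-1), \dots, y(t-d_y),\; u(t-1), \dots, u(t-d_u)\bigr) + \varepsilon(t)
```

Here f is any nonlinear approximator (a neural network or tree ensemble in this paper), and the feedback terms y(t-1), …, y(t-d_y) are what distinguish a NARX variant of an algorithm from its plain counterpart.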
3.2. 1-D Convolution Neural Network (1-D CNN)
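A 1-D convolutional layer slides a learned kernel along the time axis to extract local features. A minimal NumPy sketch of the core operation, single-channel cross-correlation with "valid" padding (as CNN layers compute it), under the assumption of no stride or bias:

```python
import numpy as np

def conv1d_valid(signal, kernel):
    """Single-channel 1-D cross-correlation with 'valid' padding:
    each output is the dot product of the kernel with one window."""
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    n_out = len(signal) - len(kernel) + 1
    return np.array([np.dot(signal[i:i + len(kernel)], kernel)
                     for i in range(n_out)])

# A moving-average kernel smooths the series, one example of a local feature.
smoothed = conv1d_valid([1, 2, 3, 4, 5], [1/3, 1/3, 1/3])  # -> [2., 3., 4.]
```

In a trained CNN the kernel weights are learned rather than fixed, and many kernels run in parallel to produce multiple feature maps.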
3.3. Long Short-Term Memory (LSTM)
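One LSTM time step combines forget, input, and output gates with a candidate cell state, which is what lets the network retain long-range dependencies. A minimal NumPy sketch of a single step (weight layout and names are illustrative, not a specific framework's):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W (4h x d), U (4h x h), b (4h,) stack the four
    gate transforms: forget, input, output, candidate (in that order)."""
    h = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[:h])          # forget gate: how much old cell state to keep
    i = sigmoid(z[h:2 * h])     # input gate: how much new candidate to admit
    o = sigmoid(z[2 * h:3 * h]) # output gate: how much cell state to expose
    g = np.tanh(z[3 * h:])      # candidate cell state
    c = f * c_prev + i * g      # new cell state
    h_new = o * np.tanh(c)      # new hidden state
    return h_new, c
```

Unrolling this step over a window of look-back hours, then reading the final hidden state, is the essence of using an LSTM for sequence prediction.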
3.4. Extra Trees (ET)
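Extra Trees differs from a plain Random Forest in that split thresholds are drawn at random, which tends to cut training time, consistent with the low Ttr values in the results tables. A minimal scikit-learn sketch on synthetic data (hyperparameters and data are illustrative, not the paper's configuration):

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Synthetic stand-in for the exogenous features: a noisy linear target.
rng = np.random.default_rng(42)
X = rng.uniform(size=(500, 4))
y = X @ np.array([3.0, -2.0, 1.0, 0.5]) + 0.1 * rng.standard_normal(500)

# Chronological split, as required for time-series data (no shuffling).
model = ExtraTreesRegressor(n_estimators=100, random_state=0)
model.fit(X[:400], y[:400])
score = model.score(X[400:], y[400:])  # R^2 on the held-out tail
```

xgboost's XGBRFRegressor (Section 3.5) exposes the same fit/score interface, so the two ensembles can be compared with symmetrical code.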
3.5. Random Forests in XGBoost (XGBRF)
4. Proposed Algorithm
Algorithm 1: CNN–LSTM NARX Architecture Steps
5. Performance Evaluation
5.1. Validation Metrics
5.1.1. Coefficient of Determination (R2)
5.1.2. Index of Agreement (IA)
5.1.3. Root Mean Square Error (RMSE)
5.1.4. Normalised Root Mean Square Error (NRMSE)
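The four validation metrics above can be written compactly. A minimal NumPy sketch; note that NRMSE normalisation conventions vary (by observed range, mean, or standard deviation), so the range-based version below is one common choice rather than a restatement of the paper's exact formula:

```python
import numpy as np

def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def index_of_agreement(obs, pred):
    """Willmott's index of agreement (IA), bounded in [0, 1]."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    num = np.sum((obs - pred) ** 2)
    den = np.sum((np.abs(pred - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return 1.0 - num / den

def rmse(obs, pred):
    """Root mean square error, in the units of the observations."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def nrmse(obs, pred):
    """RMSE normalised by the observed range (one common convention)."""
    obs = np.asarray(obs, float)
    return rmse(obs, pred) / (obs.max() - obs.min())
```

For a perfect prediction, R2 and IA are 1 and RMSE is 0; higher R2/IA and lower RMSE/NRMSE indicate a better model, matching the arrows in the results tables.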
5.2. Data Description and Preprocessing
5.2.1. Beijing, China Dataset
5.2.2. Manchester, UK Dataset
5.2.3. Data Preprocessing before Feeding to ML Algorithms
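One preprocessing step is shifting the series into a supervised-learning frame, where each row holds the previous look-back hours of every feature and the current PM2.5 as the label. A minimal pandas sketch for two look-back hours (column names and the helper are hypothetical, for illustration):

```python
import pandas as pd

def make_lookback_frame(df, target="PM2.5", look_back=2):
    """Shift every feature back 1..look_back hours; the unshifted target
    becomes the label. Rows with missing lags at the start are dropped."""
    frames = {f"{col}(t-{k})": df[col].shift(k)
              for k in range(1, look_back + 1)
              for col in df.columns}
    out = pd.DataFrame(frames)
    out[f"{target}(t)"] = df[target]
    return out.dropna()

df = pd.DataFrame({"PM2.5": [130, 153, 110, 21, 14],
                   "wind_speed": [1.78, 2.23, 1.79, 3.58, 9.39]})
supervised = make_lookback_frame(df)  # 3 rows x 5 columns
```

Appending lagged copies of the target column in the same way is how the NARX variants obtain their autoregressive feedback inputs.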
5.3. Results Analysis and Discussion
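The results are reported per fold of a 10-split time-series cross-validation (e.g., the "seventh iteration"). Unlike plain K-fold, a time-series split keeps every training window strictly before its test window; a minimal scikit-learn sketch (the series length here is a stand-in):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(110).reshape(-1, 1)  # stand-in for the hourly feature matrix

tscv = TimeSeriesSplit(n_splits=10)
for fold, (train_idx, test_idx) in enumerate(tscv.split(X), start=1):
    # Each fold trains on an expanding prefix and tests on the block after it,
    # so no future information leaks into training.
    assert train_idx.max() < test_idx.min()
```

This ordering constraint is why the earlier K-fold evaluation criticised in the related work is unsuitable for forecasting tasks.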
6. Conclusions
7. Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Goujon, A. Human Population Growth. In Encyclopedia of Ecology; Elsevier: Amsterdam, The Netherlands, 2019; pp. 344–351. [Google Scholar]
- Natural Resources Defense Council. Air Pollution Facts, Causes and the Effects of Pollutants in the Air|NRDC. Available online: https://www.nrdc.org/stories/air-pollution-everything-you-need-know (accessed on 7 January 2022).
- Manisalidis, I.; Stavropoulou, E.; Stavropoulos, A.; Bezirtzoglou, E. Environmental and Health Impacts of Air Pollution: A Review. Front. Public Health 2020, 8, 14. [Google Scholar] [CrossRef] [PubMed]
- United States Environmental Protection Agency. Air Quality and Climate Change Research|US EPA. Available online: https://www.epa.gov/air-research/air-quality-and-climate-change-research (accessed on 21 December 2021).
- United States Environmental Protection Agency. Criteria Air Pollutants|US EPA. Available online: https://www.epa.gov/criteria-air-pollutants (accessed on 21 December 2021).
- United States Environmental Protection Agency. Particulate Matter (PM) Basics|US EPA. Available online: https://www.epa.gov/pm-pollution/particulate-matter-pm-basics#PM (accessed on 7 January 2022).
- Air Quality and Health. Available online: https://www.who.int/teams/environment-climate-change-and-health/air-quality-and-health/health-impacts/types-of-pollutants (accessed on 3 June 2022).
- Yang, M.; Guo, Y.M.; Bloom, M.S.; Dharmagee, S.C.; Morawska, L.; Heinrich, J.; Jalaludin, B.; Markevychd, I.; Knibbsf, L.D.; Lin, S.; et al. Is PM1 Similar to PM2.5? A New Insight into the Association of PM1 and PM2.5 with Children’s Lung Function. Environ. Int. 2020, 145, 106092. [Google Scholar] [CrossRef] [PubMed]
- Xing, X.; Hu, L.; Guo, Y.; Bloom, M.S.; Li, S.; Chen, G.; Yim, S.H.L.; Gurram, N.; Yang, M.; Xiao, X.; et al. Interactions between Ambient Air Pollution and Obesity on Lung Function in Children: The Seven Northeastern Chinese Cities (SNEC) Study. Sci. Total Environ. 2020, 699, 134397. [Google Scholar] [CrossRef] [PubMed]
- United States Environmental Protection Agency. National Ambient Air Quality Standards Table|US EPA. Available online: https://www.epa.gov/criteria-air-pollutants/naaqs-table (accessed on 2 March 2022).
- World Health Organization. WHO Global Air Quality Guidelines: Particulate Matter (PM2.5 and PM10), Ozone, Nitrogen Dioxide, Sulfur Dioxide and Carbon Monoxide; World Health Organization: Geneva, Switzerland, 2021. [Google Scholar]
- Plaia, A.; Ruggieri, M. Air Quality Indices: A Review. Rev. Environ. Sci. Bio/Technol. 2011, 10, 165–179. [Google Scholar] [CrossRef]
- Peng, H. Air Quality Prediction by Machine Learning Methods; The University of British Columbia: Vancouver, BC, Canada, 2015. [Google Scholar]
- Liang, Y.-C.; Maimury, Y.; Chen, A.H.-L.; Juarez, J.R.C. Machine Learning-Based Prediction of Air Quality. Appl. Sci. 2020, 10, 9151. [Google Scholar] [CrossRef]
- Aljanabi, M.; Shkoukani, M.; Hijjawi, M. Comparison of Multiple Machine Learning Algorithms for Urban Air Quality Forecasting. Period. Eng. Nat. Sci. 2021, 9, 1013–1028. [Google Scholar] [CrossRef]
- Bellinger, C.; Mohomed Jabbar, M.S.; Zaïane, O.; Osornio-Vargas, A. A Systematic Review of Data Mining and Machine Learning for Air Pollution Epidemiology. BMC Public Health 2017, 17, 907. [Google Scholar] [CrossRef]
- Hsieh, W.W. Machine Learning Methods in the Environmental Sciences; Cambridge University Press: Cambridge, CA, USA, 2009; ISBN 9780511627217. [Google Scholar]
- Machine Learning for Ecology and Sustainable Natural Resource Management; Humphries, G.; Magness, D.R.; Huettmann, F. (Eds.) Springer International Publishing: Cham, Switzerland, 2018; ISBN 978-3-319-96976-3. [Google Scholar]
- Huang, C.-J.; Kuo, P.-H. A Deep CNN-LSTM Model for Particulate Matter (PM2.5) Forecasting in Smart Cities. Sensors 2018, 18, 2220. [Google Scholar] [CrossRef]
- Moursi, A.S.; El-Fishawy, N.; Djahel, S.; Shouman, M.A. An IoT Enabled System for Enhanced Air Quality Monitoring and Prediction on the Edge. Complex Intell. Syst. 2021, 7, 2923–2947. [Google Scholar] [CrossRef]
- Liang, X.; Zou, T.; Guo, B.; Li, S.; Zhang, H.; Zhang, S.; Huang, H.; Chen, S.X. Assessing Beijing’s PM2.5 Pollution: Severity, Weather Impact, APEC and Winter Heating. Proc. R. Soc. A Math. Phys. Eng. Sci. 2015, 471, 20150257. [Google Scholar] [CrossRef]
- Qin, D.; Yu, J.; Zou, G.; Yong, R.; Zhao, Q.; Zhang, B. A Novel Combined Prediction Scheme Based on CNN and LSTM for Urban PM2.5 Concentration. IEEE Access 2019, 7, 20050–20059. [Google Scholar] [CrossRef]
- Kaya, K.; Gündüz Öğüdücü, Ş. Deep Flexible Sequential (DFS) Model for Air Pollution Forecasting. Sci. Rep. 2020, 10, 3346. [Google Scholar] [CrossRef]
- Li, T.; Hua, M.; Wu, X. A Hybrid CNN-LSTM Model for Forecasting Particulate Matter (PM2.5). IEEE Access 2020, 8, 26933–26940. [Google Scholar] [CrossRef]
- O’Neil, C.; Schutt, R. Doing Data Science: Straight Talk from the Frontline; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2013; ISBN 978-1-449-35865-5. [Google Scholar]
- Kapasi, H. Modeling Non-Linear Dynamic Systems with Neural Networks. Available online: https://towardsdatascience.com/modeling-non-linear-dynamic-systems-with-neural-networks-f3761bc92649 (accessed on 4 May 2020).
- Xie, J.; Wang, Q. Benchmark Machine Learning Approaches with Classical Time Series Approaches on the Blood Glucose Level Prediction Challenge. In Proceedings of the CEUR Workshop Proceedings, Stockholm, Sweden, 13 July 2018; Volume 2148, pp. 97–102. [Google Scholar]
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
- Nelles, O. Nonlinear Dynamic System Identification. In Nonlinear System Identification; Springer: Berlin/Heidelberg, Germany, 2001; pp. 547–577. [Google Scholar]
- Irani, T.; Amiri, H.; Deyhim, H. Evaluating Visibility Range on Air Pollution Using NARX Neural Network. J. Environ. Treat. Tech. 2021, 9, 540–547. [Google Scholar] [CrossRef]
- Liu, B.; Jin, Y.; Xu, D.; Wang, Y.; Li, C. A Data Calibration Method for Micro Air Quality Detectors Based on a LASSO Regression and NARX Neural Network Combined Model. Sci. Rep. 2021, 11, 21173. [Google Scholar] [CrossRef]
- Kodogiannis, V.S.; Lisboa, P.J.G.; Lucas, J. Neural Network Modelling and Control for Underwater Vehicles. Artif. Intell. Eng. 1996, 10, 203–212. [Google Scholar] [CrossRef]
- Zhao, J.; Mao, X.; Chen, L. Speech Emotion Recognition Using Deep 1D & 2D CNN LSTM Networks. Biomed. Signal Process. Control 2019, 47, 312–323. [Google Scholar] [CrossRef]
- Abdeljaber, O.; Avci, O.; Kiranyaz, S.; Gabbouj, M.; Inman, D.J. Real-Time Vibration-Based Structural Damage Detection Using One-Dimensional Convolutional Neural Networks. J. Sound Vib. 2017, 388, 154–170. [Google Scholar] [CrossRef]
- Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-Based Learning Applied to Document Recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
- Shin, H.-C.; Roth, H.R.; Gao, M.; Lu, L.; Xu, Z.; Nogues, I.; Yao, J.; Mollura, D.; Summers, R.M. Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning. IEEE Trans. Med. Imaging 2016, 35, 1285–1298. [Google Scholar] [CrossRef] [PubMed]
- Zhang, Q.; Han, Y.; Li, V.O.K.; Lam, J.C.K. Deep-AIR: A Hybrid CNN-LSTM Framework for Fine-Grained Air Pollution Estimation and Forecast in Metropolitan Cities. IEEE Access 2022, 10, 55818–55841. [Google Scholar] [CrossRef]
- Kim, T.-Y.; Cho, S.-B. Predicting Residential Energy Consumption Using CNN-LSTM Neural Networks. Energy 2019, 182, 72–81. [Google Scholar] [CrossRef]
- Ahlawat, S.; Choudhary, A. Hybrid CNN-SVM Classifier for Handwritten Digit Recognition. Procedia Comput. Sci. 2020, 167, 2554–2560. [Google Scholar] [CrossRef]
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
- Azzouni, A.; Pujolle, G. NeuTM: A Neural Network-Based Framework for Traffic Matrix Prediction in SDN. In Proceedings of the NOMS 2018–2018 IEEE/IFIP Network Operations and Management Symposium, Taipei, Taiwan, 23–27 April 2018; pp. 1–5. [Google Scholar]
- Li, X.; Peng, L.; Hu, Y.; Shao, J.; Chi, T. Deep Learning Architecture for Air Quality Predictions. Environ. Sci. Pollut. Res. 2016, 23, 22408–22417. [Google Scholar] [CrossRef] [PubMed]
- Li, X.; Peng, L.; Yao, X.; Cui, S.; Hu, Y.; You, C.; Chi, T. Long Short-Term Memory Neural Network for Air Pollutant Concentration Predictions: Method Development and Evaluation. Environ. Pollut. 2017, 231, 997–1004. [Google Scholar] [CrossRef] [PubMed]
- Kovincic, N.; Gattringer, H.; Müller, A.; Brandstötter, M. A Boosted Decision Tree Approach for a Safe Human-Robot Collaboration in Quasi-Static Impact Situations. In Proceedings of the International Conference on Robotics in Alpe-Adria Danube Region, Kaiserslautern, Germany, 19 June 2020; Volume 84, pp. 235–244. [Google Scholar]
- Dong, X.; Yu, Z.; Cao, W.; Shi, Y.; Ma, Q. A Survey on Ensemble Learning. Front. Comput. Sci. 2020, 14, 241–258. [Google Scholar] [CrossRef]
- Random Forests in XGBoost. Available online: https://xgboost.readthedocs.io/en/latest/tutorials/rf.html (accessed on 8 May 2020).
- Bhatele, K.R.; Bhadauria, S.S. Glioma Segmentation and Classification System Based on Proposed Texture Features Extraction Method and Hybrid Ensemble Learning. Traitement Du Signal 2020, 37, 989–1001. [Google Scholar] [CrossRef]
- Rybarczyk, Y.; Zalakeviciute, R. Machine Learning Approaches for Outdoor Air Quality Modelling: A Systematic Review. Appl. Sci. 2018, 8, 2570. [Google Scholar] [CrossRef]
- Willmott, C.J.; Ackleson, S.G.; Davis, R.E.; Feddema, J.J.; Klink, K.M.; Legates, D.R.; O’Donnell, J.; Rowe, C.M. Statistics for the Evaluation and Comparison of Models. J. Geophys. Res. 1985, 90, 8995. [Google Scholar] [CrossRef]
- Shcherbakov, M.V.; Brebels, A.; Shcherbakova, N.L.; Tyukov, A.P.; Janovsky, T.A.; Kamaev, V.A. A Survey of Forecast Error Measures. World Appl. Sci. J. 2013, 24, 171–176. [Google Scholar] [CrossRef]
- Data Selector—Defra, UK. Available online: https://uk-air.defra.gov.uk/data/data_selector_service (accessed on 15 May 2022).
- Brownlee, J. How to Convert a Time Series to a Supervised Learning Problem in Python. Available online: https://machinelearningmastery.com/convert-time-series-supervised-learning-problem-python/ (accessed on 14 July 2019).
- Moursi, A.S.; Shouman, M.; Hemdan, E.E.; El-Fishawy, N. PM2.5 Concentration Prediction for Air Pollution Using Machine Learning Algorithms. Menoufia J. Electron. Eng. Res. 2019, 28, 349–354. [Google Scholar] [CrossRef]
- Buitinck, L.; Louppe, G.; Blondel, M.; Pedregosa, F.; Mueller, A.; Grisel, O.; Niculae, V.; Prettenhofer, P.; Gramfort, A.; Grobler, J.; et al. API Design for Machine Learning Software: Experiences from the Scikit-Learn Project. In Proceedings of the European Conference on Machine Learning and Principles and Practices of Knowledge Discovery in Databases, Prague, Czech Republic, 13–17 September 2013. [Google Scholar]
- Lee, V.W.; Kim, C.; Chhugani, J.; Deisher, M.; Kim, D.; Nguyen, A.D.; Satish, N.; Smelyanskiy, M.; Chennupaty, S.; Singhal, R.; et al. Debunking the 100X GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU. In Proceedings of the 37th Annual International Symposium on Computer Architecture—ISCA ’10, Saint-Malo, France, 19–23 June 2010. [Google Scholar] [CrossRef]
- Does Not Work with CPU: Grouped Convolution Issue #1 Hoangthang1607/Nfnets-Tensorflow-2. Available online: https://github.com/hoangthang1607/nfnets-Tensorflow-2/issues/1 (accessed on 14 May 2022).
Reference | Algorithms | Prediction Horizon | Evaluation Metrics | Pros | Cons |
---|---|---|---|---|---|
[19] | APNet (CNN–LSTM with normalised batching) | Used past 24 h to predict next hour | RMSE, MAE, IA | Experimentally validated the viability and usefulness of their proposal for predicting PM2.5. | The forecasts did not precisely follow real trends and were shifted and distorted. |
[22] | CNN–LSTM | Used past 24–72 h to predict next 3 h | RMSE, correlation coefficient | Their model is used for processing input from many sites in a city. | They did not verify that their model can be applied to other cities than the one experimented upon. |
[23] | CNN–LSTM | Used past 4, 12, and 24 h to predict next hour | MAE, RMSE | They combined data from meteorological and traffic sources and air pollution stations to compare the effectiveness of adding external sources for better air-quality prediction. | They used all the data and features available, which would incur a high computation cost and long execution time. |
[24] | Multivariate CNN–LSTM | Used past week to predict next 24 h | MAE, RMSE | CNN obtained air-quality features, decreasing training time; meanwhile, long-term historical input data aided LSTM in the prediction process. | More evaluation metrics could have been applied to verify their models’ performance, stating proximity to actual values such as R2 or IA. |
[20] | LSTM | Used past 24 h to predict next hour | RMSE, NRMSE, R2, IA | Using NARX reduced the required input data, speeding up training and improving LSTM accuracy. | Evaluation used K-fold cross-validation, which ignores temporal order and is therefore unreliable for time series. |
Statistic | PM2.5 | Cumulated Hours of Rain | Cumulated Wind Speed |
---|---|---|---|
Count | 41,757 | 43,824 | 43,824 |
Mean | 98.61321 | 0.194916 | 23.88914 |
Standard Deviation | 92.04928 | 1.415851 | 50.01006 |
Minimum | 0 | 0 | 0.45 |
Percentile (25%) | 29 | 0 | 1.79 |
Percentile (50%) | 72 | 0 | 5.37 |
Percentile (75%) | 137 | 0 | 21.91 |
Maximum | 994 | 36 | 585.6 |
Empty Count | 2067 | 0 | 0 |
Loss Percentage | 4.95% | 0.00% | 0.00% |
Coverage Percentage | 95.28% | 100.00% | 100.00% |
Statistic | PM2.5 | M_DIR | M_SPED | M_T | NO | NO2 | O3 |
---|---|---|---|---|---|---|---|
Count | 39,962 | 42,768 | 42,768 | 42,768 | 42,801 | 42,710 | 42,790 |
Mean | 10.2795 | 197.5673 | 3.3021 | 9.1598 | 18.0077 | 37.2121 | 28.2244 |
Standard Deviation | 10.2253 | 82.0140 | 1.8266 | 5.6743 | 29.9828 | 18.2559 | 19.3880 |
Minimum | −4 | 0.1 | 0 | −6.9 | 0 | 1.5181 | 0.0998 |
Percentile (25%) | 4.3 | 138.9 | 1.9 | 5.2 | 3.3162 | 22.9991 | 11.8744 |
Percentile (50%) | 7.6 | 205.4 | 2.9 | 8.9 | 8.1880 | 34.7902 | 26.4430 |
Percentile (75%) | 13.1 | 258.1 | 4.4 | 13.1 | 19.7654 | 49.0941 | 41.8099 |
Maximum | 404.3 | 360 | 13.8 | 30.6 | 671.7575 | 256.1077 | 138.5515 |
Empty Count | 3862 | 1056 | 1056 | 1056 | 1023 | 1114 | 1034 |
Loss Percentage | 8.81% | 2.41% | 2.41% | 2.41% | 2.33% | 2.54% | 2.36% |
Coverage Percentage | 91.19% | 97.59% | 97.59% | 97.59% | 97.67% | 97.46% | 97.64% |
Statistic | PM2.5 | M_DIR | M_SPED | M_T | NO | NO2 | O3 |
---|---|---|---|---|---|---|---|
Count | 43,824 | 43,824 | 43,824 | 43,824 | 43,824 | 43,824 | 43,824 |
Mean | 10.4240 | 197.6653 | 3.3172 | 9.1687 | 17.9776 | 37.3214 | 28.2042 |
Standard Deviation | 9.7384 | 81.0646 | 1.8110 | 5.6170 | 29.6567 | 18.1270 | 19.2519 |
Minimum | 0 | 0.1 | 0 | −6.9 | 0 | 1.5181 | 0.0998 |
Percentile (25%) | 4.8 | 141.6 | 1.9 | 5.3 | 3.4017 | 23.2407 | 12.0241 |
Percentile (50%) | 7.9 | 205.6 | 3 | 9 | 8.4868 | 34.9435 | 26.4929 |
Percentile (75%) | 12.7940 | 256.6 | 4.4 | 13 | 19.8489 | 49.2579 | 41.6438 |
Maximum | 404.3 | 360 | 13.8 | 30.6 | 671.7575 | 256.1077 | 138.5515 |
No | Algorithm Name | R2 ↑ | IA ↑ | RMSE (µg/m3) ↓ | NRMSE ↓ | Ttr (Seconds) ↓ |
---|---|---|---|---|---|---|
1 | CNN–LSTM | 0.93151 | 0.98237 | 23.22744 | 0.03776 | 31.83709 |
2 | (d0, o1) | 0.93498 | 0.98304 | 22.56670 | 0.03670 | 33.48102 |
3 | (d0, o4) | 0.93358 | 0.98264 | 22.88752 | 0.03715 | 31.94185 |
4 | (d0, o24) | 0.93136 | 0.98185 | 23.23515 | 0.03780 | 30.90029 |
5 | (d8, o1) | 0.93472 | 0.98309 | 22.60365 | 0.03677 | 34.92095 |
6 | LSTM | 0.93000 | 0.98157 | 23.45492 | 0.03817 | 23.10278 |
7 | (d0, o1) | 0.93372 | 0.98266 | 22.81122 | 0.03709 | 24.75054 |
8 | (d0, o4) | 0.93329 | 0.98270 | 22.86120 | 0.03719 | 24.73670 |
9 | (d0, o24) | 0.92800 | 0.98108 | 23.77952 | 0.03870 | 23.65251 |
10 | (d8, o1) | 0.93119 | 0.98220 | 23.30740 | 0.03764 | 25.30951 |
11 | ET | 0.92624 | 0.98027 | 24.21871 | 0.03926 | 3.86640 |
12 | (d0, o1) | 0.92583 | 0.98018 | 24.27789 | 0.03936 | 1.67357 |
13 | (d0, o4) | 0.92609 | 0.98028 | 24.21005 | 0.03927 | 1.97124 |
14 | (d0, o24) | 0.92633 | 0.98030 | 24.15589 | 0.03921 | 4.09777 |
15 | (d8, o1) | 0.92482 | 0.97992 | 24.43607 | 0.03964 | 1.75481 |
16 | XGBRF | 0.92051 | 0.97881 | 25.32772 | 0.04087 | 1.39812 |
17 | (d0, o1) | 0.92106 | 0.97893 | 25.24395 | 0.04061 | 0.90726 |
18 | (d0, o4) | 0.92137 | 0.97904 | 25.19564 | 0.04058 | 0.98165 |
19 | (d0, o24) | 0.92124 | 0.97901 | 25.21104 | 0.04064 | 1.70556 |
20 | (d8, o1) | 0.92116 | 0.97897 | 25.22721 | 0.04060 | 1.11933 |
21 | APNet [19] | N/A | 0.97831 | 24.22874 | N/A | N/A |
22 | NARX LSTM (d8, o1) [20] | 0.9291 | 0.98150 | 23.64560 | 0.03750 | 15.518 |
No | Algorithm Name | R2 ↑ | IA ↑ | RMSE (µg/m3) ↓ | NRMSE ↓ | Ttr (Seconds) ↓ |
---|---|---|---|---|---|---|
1 | CNN–LSTM | 0.73343 | 0.91014 | 4.60168 | 0.04338 | 45.65250 |
2 | (d0, o1) | 0.75676 | 0.92043 | 4.41522 | 0.04093 | 62.93308 |
3 | (d0, o4) | 0.75719 | 0.92129 | 4.40502 | 0.04082 | 63.98762 |
4 | (d0, o24) | 0.72561 | 0.90614 | 4.68568 | 0.04383 | 68.29444 |
5 | (d8, o1) | 0.75587 | 0.92121 | 4.42376 | 0.04098 | 67.65048 |
6 | LSTM | 0.71410 | 0.90178 | 4.80527 | 0.04494 | 36.36427 |
7 | (d0, o1) | 0.75132 | 0.91719 | 4.46954 | 0.04131 | 56.96864 |
8 | (d0, o4) | 0.74757 | 0.91746 | 4.50223 | 0.04162 | 51.03376 |
9 | (d0, o24) | 0.70886 | 0.89991 | 4.85817 | 0.04536 | 54.52106 |
10 | (d8, o1) | 0.74860 | 0.91608 | 4.48958 | 0.04164 | 55.04288 |
11 | ET | 0.75236 | 0.91677 | 4.48561 | 0.04100 | 10.69692 |
12 | (d0, o1) | 0.75413 | 0.91787 | 4.47682 | 0.04096 | 2.29273 |
13 | (d0, o4) | 0.75144 | 0.91707 | 4.50112 | 0.04108 | 3.35555 |
14 | (d0, o24) | 0.75453 | 0.91775 | 4.47306 | 0.04095 | 10.22563 |
15 | (d8, o1) | 0.74594 | 0.91575 | 4.55117 | 0.04158 | 2.41416 |
16 | XGBRF | 0.73285 | 0.91280 | 4.64339 | 0.04220 | 5.45900 |
17 | (d0, o1) | 0.74011 | 0.91516 | 4.59084 | 0.04192 | 1.77346 |
18 | (d0, o4) | 0.74169 | 0.91564 | 4.57746 | 0.04181 | 2.10689 |
19 | (d0, o24) | 0.74247 | 0.91579 | 4.57905 | 0.04183 | 4.59505 |
20 | (d8, o1) | 0.74069 | 0.91578 | 4.58845 | 0.04193 | 1.71298 |
Timestep | Date and Time | PM2.5 | Cumulated Wind Speed | Combined Wind Direction |
---|---|---|---|---|
30126 | 9 June 2013 5:00 | 130 | 1.78 | cv |
30127 | 9 June 2013 6:00 | 153 | 2.23 | cv |
30128 | 9 June 2013 7:00 | 110 | 1.79 | NW |
30129 | 9 June 2013 8:00 | 21 | 3.58 | NW |
30130 | 9 June 2013 9:00 | 14 | 9.39 | NW |
30131 | 9 June 2013 10:00 | 13 | 17.44 | NW |
30132 | 9 June 2013 11:00 | 36 | 23.25 | NW |
30133 | 9 June 2013 12:00 | 14 | 29.06 | NW |
Testing Count = 3722 | Mean | SD | Min | Percentile (25%) | Percentile (50%) | Percentile (75%) | Percentile (95%) | Percentile (99%) | Percentile (99.99%) | Max |
---|---|---|---|---|---|---|---|---|---|---|
Training | 101.9 | 95.1 | 0 | 29 | 75 | 144 | 289 | 434 | 915.5 | 994 |
Testing | 78 | 56 | 4 | 37 | 66 | 107.3 | 182 | 257.3 | 459.2 | 466 |
CNN–LSTM | 77.5 | 53.8 | 4 | 36 | 66 | 107 | 179 | 248.6 | 379.7 | 382 |
(d0, o1) | 78.4 | 54.9 | 4 | 37 | 67 | 108 | 182 | 255.3 | 396.7 | 399 |
(d0, o4) | 78.2 | 53.5 | 5 | 38 | 67 | 107 | 178 | 248 | 383.6 | 387 |
(d0, o24) | 77.1 | 52.7 | −3.0 | 37 | 66 | 106 | 176.5 | 247.9 | 363.5 | 365 |
(d8, o1) | 78.8 | 55.1 | −15.0 | 38 | 67 | 108 | 182 | 254.6 | 400.4 | 403 |
LSTM | 78.2 | 53.8 | −7.0 | 36 | 66 | 107 | 181.5 | 255.6 | 359.2 | 360 |
(d0, o1) | 77.8 | 52.8 | −7.0 | 38 | 67 | 107 | 177 | 246.3 | 368.1 | 370 |
(d0, o4) | 77.1 | 53.1 | −11.0 | 36 | 66 | 106 | 177 | 248 | 361.7 | 364 |
(d0, o24) | 77 | 51.8 | −12.0 | 36 | 67 | 106 | 174 | 247.3 | 357.1 | 359 |
(d8, o1) | 77.8 | 52.8 | 8 | 38 | 67 | 107 | 177 | 244.3 | 368 | 371 |
Testing Count = 3722 | Mean | SD | Min | Percentile (25%) | Percentile (50%) | Percentile (75%) | Percentile (95%) | Percentile (99%) | Percentile (99.99%) | Max |
---|---|---|---|---|---|---|---|---|---|---|
Training | 101.9 | 95.1 | 0 | 29 | 75 | 144 | 289 | 434 | 915.5 | 994 |
Testing | 78 | 56 | 4 | 37 | 66 | 107.3 | 182 | 257.3 | 459.2 | 466 |
ET | 79.5 | 54.2 | 5 | 39 | 69 | 108 | 179.5 | 252 | 418.2 | 422 |
(d0, o1) | 79.6 | 54.7 | 6 | 39 | 69 | 108 | 179.5 | 254.2 | 451.5 | 473 |
(d0, o4) | 79.6 | 54.7 | 5 | 39 | 69 | 107 | 179.5 | 253.3 | 439.7 | 442 |
(d0, o24) | 79.6 | 54.4 | 5 | 39 | 69 | 109 | 180 | 253 | 424.2 | 425 |
(d8, o1) | 79.6 | 55 | 5 | 39 | 69 | 108 | 181 | 254.3 | 447.9 | 449 |
XGBRF | 79.4 | 54.5 | 10 | 39 | 71 | 105 | 178 | 252.3 | 442.3 | 448 |
(d0, o1) | 79.4 | 54.7 | 10 | 39 | 70 | 105 | 178 | 251.3 | 449 | 455 |
(d0, o4) | 79.4 | 54.6 | 9 | 39 | 70 | 105 | 178 | 251.3 | 449.3 | 455 |
(d0, o24) | 79.4 | 54.6 | 10 | 39 | 70 | 105 | 178 | 252.3 | 446.7 | 452 |
(d8, o1) | 79.4 | 54.7 | 10 | 39 | 70.5 | 105 | 178.5 | 252 | 449 | 455 |
Testing Count = 3960 | Mean | SD | Min | Percentile (25%) | Percentile (50%) | Percentile (75%) | Percentile (95%) | Percentile (99%) | Percentile (99.99%) | Max |
---|---|---|---|---|---|---|---|---|---|---|
Training | 10 | 9.7 | 0 | 4.5 | 7.5 | 12.3 | 27.8 | 45.2 | 253.9 | 404.3 |
Testing | 11.8 | 10.1 | 0 | 6 | 9 | 15 | 28 | 48.4 | 131 | 135 |
CNN–LSTM | 11.3 | 7.9 | −1.0 | 6 | 9 | 15 | 26 | 40 | 65.4 | 67 |
(d0, o1) | 11.1 | 7.7 | −3.0 | 6 | 9 | 14 | 25 | 40 | 66 | 66 |
(d0, o4) | 11.4 | 7.9 | −1.0 | 6 | 9 | 15 | 26 | 40 | 65 | 65 |
(d0, o24) | 11.1 | 7.4 | −2.0 | 6 | 9 | 14 | 25 | 38 | 59.6 | 60 |
(d8, o1) | 11.3 | 7.6 | −1.0 | 6 | 9 | 14 | 26 | 39.4 | 66 | 66 |
LSTM | 11.4 | 7.6 | −2.0 | 6 | 9 | 14 | 26 | 41 | 61.4 | 63 |
(d0, o1) | 11.6 | 8.1 | −3.0 | 6 | 9 | 14 | 26 | 42 | 78.4 | 80 |
(d0, o4) | 11.2 | 8 | −1.0 | 6 | 9 | 14 | 26 | 41 | 71.4 | 73 |
(d0, o24) | 11.5 | 7.6 | −3.0 | 6 | 9 | 15 | 26 | 40 | 66.6 | 67 |
(d8, o1) | 11.3 | 8.4 | −4.0 | 6 | 9 | 14 | 27 | 43 | 73.8 | 75 |
Testing Count = 3960 | Mean | SD | Min | Percentile (25%) | Percentile (50%) | Percentile (75%) | Percentile (95%) | Percentile (99%) | Percentile (99.99%) | Max |
---|---|---|---|---|---|---|---|---|---|---|
Training | 10 | 9.7 | 0 | 4.5 | 7.5 | 12.3 | 27.8 | 45.2 | 253.9 | 404.3 |
Testing | 11.8 | 10.1 | 0 | 6 | 9 | 15 | 28 | 48.4 | 131 | 135 |
ET | 11.4 | 8.5 | 2 | 6 | 9 | 14 | 26 | 43 | 113.7 | 120 |
(d0, o1) | 11.4 | 8.5 | 2 | 6 | 9 | 15 | 26 | 40 | 110.8 | 112 |
(d0, o4) | 11.5 | 8.7 | 2 | 6 | 9 | 15 | 26 | 42 | 140 | 142 |
(d0, o24) | 11.4 | 8.5 | 2 | 6 | 9 | 15 | 26 | 40 | 122.9 | 130 |
(d8, o1) | 11.6 | 8.8 | 2 | 6 | 9 | 15 | 27 | 43 | 130.1 | 138 |
XGBRF | 11.6 | 9.8 | 3 | 6 | 9 | 14 | 27 | 46.4 | 189.2 | 190 |
(d0, o1) | 11.6 | 9.4 | 3 | 6 | 9 | 14 | 27 | 46 | 157.2 | 158 |
(d0, o4) | 11.5 | 9.3 | 3 | 6 | 9 | 14 | 27 | 45.4 | 147.2 | 148 |
(d0, o24) | 11.5 | 9.2 | 3 | 6 | 9 | 14 | 27 | 46 | 139.6 | 140 |
(d8, o1) | 11.6 | 9.4 | 3 | 6 | 9 | 14 | 27 | 46 | 157.2 | 158 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Moursi, A.S.A.; El-Fishawy, N.; Djahel, S.; Shouman, M.A. Enhancing PM2.5 Prediction Using NARX-Based Combined CNN and LSTM Hybrid Model. Sensors 2022, 22, 4418. https://doi.org/10.3390/s22124418
Moursi ASA, El-Fishawy N, Djahel S, Shouman MA. Enhancing PM2.5 Prediction Using NARX-Based Combined CNN and LSTM Hybrid Model. Sensors. 2022; 22(12):4418. https://doi.org/10.3390/s22124418
Chicago/Turabian Style: Moursi, Ahmed Samy AbdElAziz, Nawal El-Fishawy, Soufiene Djahel, and Marwa A. Shouman. 2022. "Enhancing PM2.5 Prediction Using NARX-Based Combined CNN and LSTM Hybrid Model" Sensors 22, no. 12: 4418. https://doi.org/10.3390/s22124418