Abstract
With the increasing number of geographically distributed scientific collaborations and the growing size of scientific data, it has become challenging for users to achieve the best possible network performance on a shared network. We have developed a model to forecast expected bandwidth utilization on high-bandwidth wide area networks. The forecast model can improve the efficiency of resource utilization and the scheduling of data movements on high-bandwidth networks, accommodating the ever-increasing data volumes of large-scale scientific applications. The model is a univariate time-series forecast model that applies Seasonal decomposition of Time series by Loess (STL) and the AutoRegressive Integrated Moving Average (ARIMA) to Simple Network Management Protocol (SNMP) path utilization measurement data. Compared with a traditional approach to training the ARIMA model, such as the Box-Jenkins methodology, our forecast model reduces computation time by up to 92.6%. It is also resilient to abrupt changes in network usage. Our forecast model produces a large number of multi-step forecasts, and the forecast errors are within the mean absolute deviation (MAD) of the monitored measurements.
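The accuracy criterion stated above, forecast errors falling within the MAD of the monitored measurements, can be illustrated with a minimal sketch. It assumes the forecast error is summarized as a mean absolute error, which the abstract does not specify; the function names and the sample utilization values are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def mean_absolute_deviation(values):
    """Mean absolute deviation of the values around their own mean."""
    center = fmean(values)
    return fmean(abs(v - center) for v in values)


def mean_absolute_error(actual, forecast):
    """Mean absolute error between observations and forecasts of equal length."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return fmean(abs(a - f) for a, f in zip(actual, forecast))


def within_mad(history, actual, forecast):
    """True if the forecast error does not exceed the MAD of the history.

    Assumption: error is measured as mean absolute error over the horizon.
    """
    return mean_absolute_error(actual, forecast) <= mean_absolute_deviation(history)


# Hypothetical path utilization samples (for example, in Gbps).
history = [2.0, 4.0, 6.0, 8.0]   # monitored measurements
actual = [5.0, 7.0]              # values later observed over the forecast horizon
forecast = [4.0, 8.0]            # multi-step forecast for the same horizon

print(within_mad(history, actual, forecast))
```

In this sketch the MAD is computed over the monitored history only, so the threshold does not depend on the values being forecast.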
Cite this article
Yoo, W., Sim, A. Time-Series Forecast Modeling on High-Bandwidth Network Measurements. J Grid Computing 14, 463–476 (2016). https://doi.org/10.1007/s10723-016-9368-9
DOI: https://doi.org/10.1007/s10723-016-9368-9