Abstract
Polygon overlay is one of the important operations in Geographic Information System (GIS) spatial analysis, and it is time-consuming in many big data cases. In this paper, a specially designed MapReduce algorithm with a grid index is proposed to decrease the running time. With the aid of the grid index, the proposed algorithm reduces the number of calls to the intersection computation. The experiments are carried out on a Hadoop-based cloud framework that we built ourselves. Experimental results show that our algorithm with the spatial grid index takes less time than its counterpart without a spatial index. Moreover, the speed-up ratio of the proposed algorithm rises as more nodes of the Hadoop framework are used, although the rise slows down as the number of nodes increases.
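The role of the grid index can be illustrated with a small single-process sketch. It is not the authors' implementation: the function names (`map_phase`, `candidate_pairs`), the uniform square grid, the `cell_size` parameter, and the bounding-box pre-check are all assumptions made for illustration. The map step keys each polygon by the grid cells its bounding box touches, and the reduce step only pairs polygons from the two layers that share a cell, so the expensive intersection routine would be called on far fewer pairs than in an all-against-all comparison. The actual polygon intersection is omitted.

```python
from collections import defaultdict
from itertools import product


def bbox(polygon):
    """Axis-aligned bounding box (min_x, min_y, max_x, max_y) of a vertex list."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def cells_for(box, cell_size):
    """Return every grid cell (col, row) that the bounding box touches."""
    min_x, min_y, max_x, max_y = box
    cols = range(int(min_x // cell_size), int(max_x // cell_size) + 1)
    rows = range(int(min_y // cell_size), int(max_y // cell_size) + 1)
    return product(cols, rows)


def map_phase(layer_name, polygons, cell_size):
    """Emit (cell, (layer, polygon_id)) pairs, as a mapper would (hypothetical)."""
    for pid, poly in polygons.items():
        for cell in cells_for(bbox(poly), cell_size):
            yield cell, (layer_name, pid)


def boxes_overlap(a, b):
    """True if two bounding boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_pairs(base, overlay, cell_size):
    """Group by cell (the shuffle), then pair base/overlay polygons per cell."""
    groups = defaultdict(lambda: {"base": [], "overlay": []})
    for layer_name, polys in (("base", base), ("overlay", overlay)):
        for cell, (name, pid) in map_phase(layer_name, polys, cell_size):
            groups[cell][name].append(pid)

    pairs = set()  # a set removes duplicates from polygons spanning several cells
    for members in groups.values():
        for a, b in product(members["base"], members["overlay"]):
            if boxes_overlap(bbox(base[a]), bbox(overlay[b])):
                pairs.add((a, b))  # only these pairs would reach the intersection step
    return pairs
```

With two polygons per layer, an all-against-all comparison would test four pairs; in this sketch only pairs that share a grid cell are considered, so polygons in distant cells are never compared.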
Acknowledgments
The authors would like to thank Professor Shanyu Tang for his valuable suggestions. The project was supported by the Fundamental Research Funds for National University, China University of Geosciences (Wuhan) under Grants CUGL110228 and CUGL120292.
Cite this article
Wang, Y., Liu, Z., Liao, H. et al. Improving the performance of GIS polygon overlay computation with MapReduce for spatial big data processing. Cluster Comput 18, 507–516 (2015). https://doi.org/10.1007/s10586-015-0428-x
DOI: https://doi.org/10.1007/s10586-015-0428-x