[go: up one dir, main page]

Skip to main content
Log in

A fault-tolerant optimization mechanism for spatiotemporal data analysis in flink

  • Published:
World Wide Web Aims and scope Submit manuscript

Abstract

Spatiotemporal data analysis plays a vital role in big data processing, and it is also a research hotspot in location-aware and recommender systems. In these applications, graph modeling and distributed iterative computing are the basis and guarantee for data query and mining. Because of the constant repeated execution of specific calculation logic, iterative jobs have the characteristics of being time-consuming and exerting high pressure on system resources. However, iterative jobs always face the risk of stopping due to computing node fault, which in turn causes serious economic losses. At present, the latest generation of distributed computing system Flink’s recovery strategy for node faults in batch processing mode is to restart the job from the beginning, which is extremely time-consuming. If the checkpoint mechanism in Flink’s stream-processing mode is used to recover from batch jobs failures, it will greatly increase the running time and storage overhead in trouble-free state. Therefore, a lightweight fault-tolerant mechanism is needed to reduce failure recovery time while ensuring the job efficiency of spatiotemporal data analysis. In view of the above situation and the characteristics of the iterative computing model for graph computing, a single-node failure recovery mechanism only for the failed node is proposed, which reduces the failure recovery time by introducing lightweight checkpoints and local logs. Based on the proposed single-node failure recovery mechanism, a failure recovery mechanism under multi-node fault and associated fault is proposed, which can cope with more complex failure situations occurs. Experimental results show that the proposed method can quickly and effectively recover jobs after failure, reducing the average recovery time by 37% in the case of single node fault, and reducing the average recovery time by 24% in the case of multi-node fault.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11

Similar content being viewed by others

Explore related subjects

Discover the latest articles, news and stories from top researchers in related subjects.

Data availability

All data in the experiment is authoritative and available.

Code availability

All the codes in this research are available.

References

  1. Cantarella, G.E., Improta, G., Sforza, A.: Iterative procedure for equilibrium network traffic signal setting. Transportation Research Part A General 25(5), 241–249 (1991)

    Article  Google Scholar 

  2. Carbone, P., Katsifodimos, A., Kth, ., Sweden, S., Tzoumas, K.: Apache flink : Stream and batch processing in a single engine (2015)

  3. Chandy, K.M., Lamport, L.: Distributed snapshots: Determining global states of a distributed system. Acm Trans on Computer Systems 3(1), 63–75 (2016)

    Article  Google Scholar 

  4. Chen, L., Shang, S., Jensen, C.S., Yao, B., Zhang, Z., Shao, L.: Effective and efficient reuse of past travel behavior for route recommendation. KDD, 488–498 (2019)

  5. Chen, D., Yuan, Y., Du, W., Cheng, Y., Wang, G.: Online route planning over time-dependent road networks. In: ICDE, pp. 325– 335. IEEE, ??? (2021)

  6. Deo, N., Pang, C.Y.: Shortest path algorithms: a taxonomy and annotation (1984)

  7. Dijkstra, E.W.: The distributed snapshot of k.m. chandy and l. lamport. Springer, Berlin (1986)

  8. Doan, H., Zhang, W., Min, Z., Ogata, K.: Model checking chandy-lamport distributed snapshot algorithm revisited. In: International Symposium on Dependable Computing & Internet of Things (2016)

  9. Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. AAAI Press (1996)

  10. Failure, H., Failure, H., Access, S.D., Access, S.D., Sets, L.D., Sets, L.D., Model, S.C., Model, S.C., Computation, M., Computation, M.: The hadoop distributed file system: Architecture and design. Hadoop Project Website 11(11), 1–10 (2007)

    Google Scholar 

  11. Fang, Y., Feng, C., Mammar, S., Che, A.: Iterative algorithm for lane reservation problem on transportation network. In: IEEE International Conference on Networking (2011)

  12. Hartigan, J.A., Wong, M.A.: A k-means clustering algorithm. Appl. Stat. 28(1) (1979)

  13. Iserngonzalez, J., Hernandezsosa, D., Fernandezperdomo, E., Cabreragamez, J., Dominguezbrito, A.C., Prietomaranon, V.: Path planning for underwater gliders using iterative optimization. In: IEEE International Conference on Robotics & Automation (2011)

  14. Javed, M.A., Younis, M.S., Latif, S., Qadir, J., Baig, A.: Community detection in networks: A multidisciplinary review. Journal of Network and Computer Applications 108(APR.), 87–111 (2018)

    Article  Google Scholar 

  15. Kambhatla, S., Walpole, J.: Recovery with limited replay: fault-tolerant processes in linda. In: Proceedings of the Second IEEE Symposium on Parallel and Distributed Processing, 1990. (1991)

  16. Kunegis, J.: Handbook of network analysis [konect - the koblenz network collection]. Computer Science 2, 1343–1350 (2014)

    Google Scholar 

  17. Lou, Y.S., Zhang, W.Y., Xu, F., Wang, Y., Chen, S.: Parallel implementation of single-source shortest path algorithm based on haloop. Applied Mechanics & Materials 220–223, 2428–2432 (2012)

    Article  Google Scholar 

  18. Luo, W.: A real-time fault-tolerant scheduling algorithm for distributed systems based on deferred active backup-copy. Journal of Computer Research and Development 44(44), 521–528 (2007)

    Article  Google Scholar 

  19. Patriksson, M.: The traffic assignment problem: Models and methods. VSP (1994)

  20. Pfoser, D., Tryfona, N., Jensen, C.S.: Indeterminacy and spatiotemporal data: Basic definitions and case study. Geoinformatica 9(3), 211–236 (2005)

    Article  Google Scholar 

  21. Shang, S., Ding, R., Yuan, B., Xie, K., Zheng, K., Kalnis, P.: User oriented trajectory search for trip recommendation. EDBT, 156–167 (2012)

  22. Shang, S., Chen, L., Wei, Z., Jensen, C.S., Wen, J., Kalnis, P.: Collective travel planning in spatial networks. IEEE Trans. Knowl. Data Eng. 28(5), 1132–1146 (2016)

    Article  Google Scholar 

  23. Shang, S., Chen, L., Jensen, C.S., Wen, J., Kalnis, P.: Searching trajectories by regions of interest. IEEE Trans. Knowl. Data Eng. 29(7), 1549–1562 (2017)

    Article  Google Scholar 

  24. Shang, S., Chen, L., Wei, Z., Jensen, C.S., Zheng, K., Kalnis, P.: Trajectory similarity join in spatial networks. Proc. VLDB Endow. 10(11), 1178–1189 (2017)

    Article  Google Scholar 

  25. Shang, S., Chen, L., Wei, Z., Jensen, C.S., Zheng, K., Kalnis, P.: Parallel trajectory similarity joins in spatial networks. VLDB J. 27(3), 395–420 (2018)

    Article  Google Scholar 

  26. Shang, S., Chen, L., Zheng, K., Jensen, C.S., Wei, Z., Kalnis, P.: Parallel trajectory-to-location join. IEEE Trans. Knowl. Data Eng. 31(6), 1194–1207 (2019)

    Article  Google Scholar 

  27. Tian C, Hu Z, Vora K, Gupta R: Coral: Confined recovery in distributed asynchronous graph processing. In: Acm Sigplan Notices A Monthly Publication of the Special Interest Group on Programming Languages (2017)

  28. Venkateswara, R.K.: Spatiotemporal data mining: Issues, tasks and applications. International Journal of Computer Science & Engineering Survey 3(1), 39–52 (2012)

    Article  MathSciNet  Google Scholar 

  29. Wang, Y., Yuan, Y., Wang, H., Zhou, X., Mu, C., Wang, G.: Constrained route planning over large multi-modal time-dependent networks. ICDE, 313–324 (2021)

  30. Xu, C., Holzemer, M., Kaul, M., Soto, J., Markl, V.: On fault tolerance for distributed iterative dataflow processing. IEEE Transactions on Knowledge and Data Engineering PP, 1–1 (2017)

    Google Scholar 

  31. Yuan, Y., Lian, X., Wang, G., Chen, L., Ma, Y., Wang, Y.: Weight-constrained route planning over time-dependent graphs. ICDE, 914–925 (2019)

  32. Yuan, Y., Lian, X., Chen, L., Wang, G., Yu, J.X., Wang, Y., Ma, Y.: Gcache: Neighborhood-guided graph caching in a distributed environment. IEEE Trans. Parallel Distributed Syst. 30(11), 2463–2477 (2019)

    Article  Google Scholar 

  33. Yuan, Y., Lian, X., Wang, G., Ma, Y., Wang, Y.: Constrained shortest path query in a large time-dependent graph. Proc. VLDB Endow. 12(10), 1058–1070 (2019)

    Article  Google Scholar 

  34. Zaharia, M., Chowdhury, M., Franklin, M.J., Shenker, S., Stoica, I.: Spark: Cluster computing with working sets (2010)

  35. Zhang, Y., Gao, Q., Gao, L., Wang, C.: Imapreduce: A distributed computing framework for iterative computation. Journal of Grid Computing 10(1), 47–68 (2012)

    Article  Google Scholar 

  36. Zhu, P., Yang, F., Tu, G.: Real-time fault-tolerant scheduling for distributed systems based on improving priority of passive backup. Journal of Computer Research and Development 47(11), 2003–2010 (2010)

    Google Scholar 

Download references

Funding

This research was supported by the National Key R&D Program of China under Grant No. 2018YFB1004402; and the NSFC under Grant No. 61872072, 62072087, 61772124, 61932004, 61732003, and 61729201; and the Fundamental Research Funds for the Central Universities under Grant No. N2016009.

Author information

Authors and Affiliations

Authors

Contributions

conceptualization, Hangxu Ji; software, Hangxu Ji; methodology, Hangxu Ji and Yuhai Zhao; supervision, Hangxu Ji and Liuguo Wei; validation, Yuchen Fan; writing-original draft, Hangxu Ji; writing-review and editing, Gang Wu and Guoren Wang.

Corresponding author

Correspondence to Gang Wu.

Ethics declarations

Conflicts of interest

The authors declare no conflict of interest.

Ethics approval

This article does not contain any studies involving human participants and/or animals by any of the authors.

Consent to participate

All authors have agreed to participate in the research described in this manuscript.

Consent for publication

All authors have read and agreed to the published version of the manuscript.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Spatiotemporal Data Management and Analytics for Recommend

Guest Editors: Shuo Shang, Xiangliang Zhang, and Panos Kalnis

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ji, H., Wu, G., Zhao, Y. et al. A fault-tolerant optimization mechanism for spatiotemporal data analysis in flink. World Wide Web 26, 867–887 (2023). https://doi.org/10.1007/s11280-022-01006-5

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11280-022-01006-5

Keywords

Navigation