[go: up one dir, main page]

Skip to main content
Log in

FCHM-stream: fast closed high utility itemsets mining over data streams

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

The high-speed, continuous and endless characteristics of data streams make it a challenging task to quickly mine high utility itemsets in limited memory space. The sliding window model, which focuses only on the most recent data, has received extensive research and attention as it can effectively adapt to the data stream environment. However, the presence of many communal batches in adjacent sliding windows causes the algorithm to repeatedly generate a large number of identical itemsets, which reduces the spatiotemporal performance of the algorithm. In order to solve these problems and provide users with a concise and lossless resultset, a new closed high utility pattern mining algorithm over data stream is proposed, named FCHM-Stream. A new utility list structure based on batch division and a resultset maintenance strategy based on skip-list structure are designed to effectively reduce identical itemsets repeatedly generated and thus reduce the running time of the algorithm. Extensive experimental results show that the proposed algorithm has a large improvement in runtime compared to the state-of-the-art algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10

Similar content being viewed by others

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

References

  1. Zida S, Fournier-Viger P, Lin JC-W, Wu C-W, Tseng VS (2017) EFIM: a fast and memory efficient algorithm for high-utility itemset mining. Knowl Inf Syst 51(2):595–625

    Article  Google Scholar 

  2. Krishnamoorthy S (2017) HMiner: efficiently mining high utility itemsets. Expert Syst Appl 90:168–183

    Article  Google Scholar 

  3. Dawar S, Sharma V, Goyal V (2017) Mining top-k high-utility itemsets from a data stream under sliding window model. Appl Intell 47(4):1240–1255

    Article  Google Scholar 

  4. Liu Y, Liao W-k and Choudhary A (2005) A two-phase algorithm for fast discovery of high utility itemsets. In Pacific-Asia conference on knowledge discovery and data mining. Springer Berlin Heidelberg, Berlin, pp. 689–695.

  5. Tseng VS, Wu C-W, Shie B-E and Yu PS (2010) UP-Growth: an efficient algorithm for high utility itemset mining. Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. pp. 253–262.

  6. Liu M and Qu J (2012) Mining high utility itemsets without candidate generation. Proceedings of the 21st ACM international conference on Information and knowledge management. pp. 55–64.

  7. Krishnamoorthy S (2015) Pruning strategies for mining high utility itemsets. Expert Syst Appl 42(5):2371–2381

    Article  Google Scholar 

  8. Fournier-Viger P, Wu C-W, Zida S and Tseng VS (2014) FHM: faster high-utility itemset mining using estimated utility co-occurrence pruning. International symposium on methodologies for intelligent systems. Springer International Publishing, Cham, pp. 83–92.

  9. Duong Q-H, Fournier-Viger P, Ramampiaro H, Nørvåg K, Dam T-L (2018) Efficient high utility itemset mining using buffered utility-lists. Appl Intell 48(7):1859–1877

    Article  Google Scholar 

  10. Wu P, Niu X, Fournier-Viger P, Huang C, Wang B (2022) UBP-Miner: An efficient bit based high utility itemset mining algorithm. Knowl Based Syst 248:108865

    Article  Google Scholar 

  11. Sohrabi MK (2020) An efficient projection-based method for high utility itemset mining using a novel pruning approach on the utility matrix. Knowl Inf Syst 62(11):4141–4167

    Article  Google Scholar 

  12. Ahmed CF, Tanbeer SK, Jeong B-S, Choi H-J (2012) Interactive mining of high utility patterns over data streams. Expert Syst Appl 39(15):11979–11991

    Article  Google Scholar 

  13. Baek Y, Yun U, Kim H, Nam H, Kim H, Lin JC-W, Vo B, Pedrycz W (2021) Rhups: Mining recent high utility patterns with sliding window–based arrival time control over data streams. ACM Trans Intell Syst Technol TIST 12(2):1–27

    Article  Google Scholar 

  14. Ryang H, Yun U (2016) High utility pattern mining over data streams with sliding window technique. Expert Syst Appl 57:214–231

    Article  Google Scholar 

  15. Jaysawal BP and Huang J-W (2020) Sohupds: a single-pass one-phase algorithm for mining high utility patterns over a data stream. In Proceedings of the 35th annual acm symposium on applied computing. pp. 490–497.

  16. Chen H, Han M, Zhang N, Li X, Wang L (2021) Closed high utility itemsets mining over data stream based on sliding window. J Comput Res Dev 58(11):1–15

    Google Scholar 

  17. Chen X, Zhai P and Fang Y (2021) High utility pattern mining based on historical data table over data streams. In 2021 4th International Conference on Data Science and Information Technology. pp. 368–376.

  18. Wu C-W, Fournier-Viger P, Gu J-Y and Tseng VS (2015) Mining closed+ high utility itemsets without candidate generation. In 2015 conference on technologies and applications of artificial intelligence (TAAI). IEEE, pp. 187–194.

  19. Tseng VS, Wu C-W, Fournier-Viger P, Philip SY (2014) Efficient algorithms for mining the concise and lossless representation of high utility itemsets. IEEE Trans Knowl Data Eng 27(3):726–739

    Article  Google Scholar 

  20. Fournier-Viger P, Zida S, Lin JC-W, Wu C-W, Tseng VS (2016) EFIM-closed: fast and memory efficient discovery of closed high-utility itemsets. In: Perner P (ed) Machine learning and data mining in pattern recognition. Springer, Cham, pp 199–213

    Chapter  Google Scholar 

  21. Dam T-L, Li K, Fournier-Viger P, Duong Q-H (2019) CLS-Miner: efficient and effective closed high-utility itemset mining. Front Comp Sci 13(2):357–381

    Article  Google Scholar 

  22. Nguyen LT, Vu VV, Lam MT, Duong TT, Manh LT, Nguyen TT, Vo B, Fujita H (2019) An efficient method for mining high utility closed itemsets. Inf Sci 495:78–99

    Article  Google Scholar 

  23. Dam T-L, Ramampiaro H, Nørvåg K, Duong Q-H (2019) Towards efficiently mining closed high utility itemsets from incremental databases. Knowl Based Syst 165:13–29

    Article  Google Scholar 

  24. Yun U, Lee G, Yoon E (2017) Efficient high utility pattern mining for establishing manufacturing plans with sliding window control. IEEE Trans Indust Electron 9:7239–7249

    Article  Google Scholar 

  25. Lee J, Yun U, Lee G, Yoon E (2018) Efficient incremental high utility pattern mining based on pre-large concept. Eng Appl Artif Intell 72:111–123

    Article  Google Scholar 

  26. Fournier-Viger P, Lin JC-W, Gueniche T, Barhate P (2015) Efficient incremental high utility itemset mining. Proc ASE Big Data Soc Inform 2015:1–6

    Google Scholar 

  27. Yun U, Ryang H, Lee G, Fujita H (2017) An efficient algorithm for mining high utility patterns from incremental databases with one database scan. Knowl-Based Syst 124:188–206

    Article  Google Scholar 

  28. Yun U, Nam H, Lee G, Yoon E (2019) Efficient approach for incremental high utility pattern mining with indexed list structure. Futur Gener Comput Syst 95:221–239

    Article  Google Scholar 

  29. Kim H, Yun U, Baek Y, Kim H, Nam H, Lin JC-W, Fournier-Viger P (2021) Damped sliding based utility oriented pattern mining over stream data. Knowl-Based Syst 213:106653

    Article  Google Scholar 

  30. Lee C, Ryu T, Kim H, Kim H, Vo B, Lin JC-W, Yun U (2022) Efficient approach of sliding window-based high average-utility pattern mining with list structures. Knowl Based Syst 256:109702

    Article  Google Scholar 

  31. Lucchese C, Orlando S, Perego R (2005) Fast and memory efficient mining of frequent closed itemsets. IEEE Trans Knowl Data Eng 18(1):21–36

    Article  Google Scholar 

  32. Sahoo J, Das AK, Goswami A (2016) An efficient fast algorithm for discovering closed+ high utility itemsets. Appl Intell 45(1):44–74

    Article  Google Scholar 

  33. Lin JC-W, Djenouri Y, Srivastava G, Yun U, Fournier-Viger P (2021) A predictive GA-based model for closed high-utility itemset mining. Appl Soft Comput 108:107422

    Article  Google Scholar 

  34. Pramanik S, Goswami A (2022) Discovery of closed high utility itemsets using a fast nature-inspired ant colony algorithm. Appl Intell 52(8):8839–8855

    Article  Google Scholar 

  35. Lin JC-W, Djenouri Y, Srivastava G, Fourier-Viger P (2022) Efficient evolutionary computation model of closed high-utility itemset mining. Appl Intell 14:1–13

    Google Scholar 

  36. Hidouri A, Jabbour S, Raddaoui B, Yaghlane BB (2021) Mining closed high utility itemsets based on propositional satisfiability. Data Knowl Eng 136:101927

    Article  Google Scholar 

Download references

Acknowledgements

This work was supported by the National Natural Science Foundation of China (62062004), the Natural Science Foundation of Ningxia Province (2022AAC03279), and the Graduate Innovation Project of North Minzu University (YCX22195).

Author information

Authors and Affiliations

Authors

Contributions

M.L. and M.H. wrote the main manuscript text. Z.C., H.W. and X.Z. prepared figures and tables. And all authors have checked the manuscript and have agreed to the submission.

Corresponding author

Correspondence to Meng Han.

Ethics declarations

Conflict of interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Li, M., Han, M., Chen, Z. et al. FCHM-stream: fast closed high utility itemsets mining over data streams. Knowl Inf Syst 65, 2509–2539 (2023). https://doi.org/10.1007/s10115-023-01831-8

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-023-01831-8

Keywords

Navigation