Online Multi-threshold Learning with Imbalanced Data Stream

Xufen Cai, Min Yang, Rong Zhu, Xiaoyan Li, Long Ye & Qin Zhang

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10261)

Included in the following conference series:

International Symposium on Neural Networks

Abstract

This paper addresses the imbalanced data problem in an online fashion based on multi-threshold learning. Most existing work on processing large-scale imbalanced data streams assumes a prior distribution of the data based on a training dataset. We consider a more challenging setting that makes no assumption about the prior, and propose an online multi-threshold learning (OMTL) method that simultaneously learns multiple classifiers with different thresholds based on incremental updating of the F-measure. The proposed approach shows its potential on recent benchmark datasets compared with previous online learning algorithms based on cost-sensitive learning and threshold fine-tuning.
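The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of tracking several decision thresholds at once and updating an F-measure incrementally as labelled examples arrive. It is not the authors' OMTL method: the class names `RunningF1` and `MultiThresholdTracker` are hypothetical, the scores are assumed to come from some external model, and a single score is compared against every threshold rather than learning a separate classifier per threshold.

```python
from dataclasses import dataclass


@dataclass
class RunningF1:
    """Confusion counts from which F1 is recomputed in O(1) per example."""
    tp: int = 0
    fp: int = 0
    fn: int = 0

    def update(self, predicted_positive: bool, is_positive: bool) -> None:
        # True negatives are not needed for the F-measure, so they are not counted.
        if predicted_positive and is_positive:
            self.tp += 1
        elif predicted_positive:
            self.fp += 1
        elif is_positive:
            self.fn += 1

    @property
    def value(self) -> float:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 before any relevant example.
        denom = 2 * self.tp + self.fp + self.fn
        return 2 * self.tp / denom if denom else 0.0


class MultiThresholdTracker:
    """Hypothetical helper: one running F1 per candidate threshold."""

    def __init__(self, thresholds):
        self.thresholds = list(thresholds)
        self.stats = [RunningF1() for _ in self.thresholds]

    def best_threshold(self) -> float:
        # Threshold whose running F1 is currently highest (first one wins ties).
        best = max(range(len(self.thresholds)), key=lambda i: self.stats[i].value)
        return self.thresholds[best]

    def predict(self, score: float) -> bool:
        # Predict with the threshold that has performed best so far.
        return score >= self.best_threshold()

    def update(self, score: float, is_positive: bool) -> None:
        # Once the true label is revealed, update every threshold's counts.
        for threshold, stat in zip(self.thresholds, self.stats):
            stat.update(score >= threshold, is_positive)


# Usage on a small, made-up imbalanced stream of (score, label) pairs.
tracker = MultiThresholdTracker([0.2, 0.5, 0.8])
stream = [(0.9, True), (0.6, False), (0.7, False),
          (0.85, True), (0.1, False), (0.3, False)]
for score, label in stream:
    prediction = tracker.predict(score)  # predict first, then learn from the label
    tracker.update(score, label)
```

Keeping only true-positive, false-positive and false-negative counts per threshold means no past examples need to be stored, which is the property that makes an F-measure usable in a streaming setting.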

Notes

1. http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/

Acknowledgments

The authors acknowledge funding from the State Key Laboratory of Software Engineering, Computer School, Wuhan University, under research project SKLSE-2015-A-06. This work was also partially supported by the National Natural Science Foundation of China under Project 61371191.

Author information

Authors and Affiliations

Department of Information Engineering, Communication University of China, Beijing, China
Xufen Cai, Long Ye & Qin Zhang
State Key Laboratory of Software Engineering, Computer School, Wuhan University, Hubei, China
Min Yang & Rong Zhu
Institute of Computer Technology, CAS, Beijing, China
Xufen Cai & Xiaoyan Li

Corresponding authors

Correspondence to Xufen Cai, Rong Zhu or Long Ye.

Editor information

Editors and Affiliations

Dalian University of Technology, Dalian, China
Fengyu Cong
City University of Hong Kong, Kowloon Tong, Hong Kong
Andrew Leung
Chinese Academy of Sciences, Beijing, China
Qinglai Wei

About this paper

Cite this paper

Cai, X., Yang, M., Zhu, R., Li, X., Ye, L., Zhang, Q. (2017). Online Multi-threshold Learning with Imbalanced Data Stream. In: Cong, F., Leung, A., Wei, Q. (eds) Advances in Neural Networks - ISNN 2017. ISNN 2017. Lecture Notes in Computer Science, vol 10261. Springer, Cham. https://doi.org/10.1007/978-3-319-59072-1_1

DOI: https://doi.org/10.1007/978-3-319-59072-1_1
Published: 31 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59071-4
Online ISBN: 978-3-319-59072-1
eBook Packages: Computer Science, Computer Science (R0)
