Class overlap handling methods in imbalanced domain: A comprehensive survey

Published in: Multimedia Tools and Applications

Abstract

Class overlap in imbalanced datasets is one of the most common challenges facing researchers in deep learning (DL), machine learning (ML), and big data (BD) based applications. Class overlap and class imbalance are intrinsic data characteristics that negatively affect the performance of classification models. Data-level, algorithm-level, ensemble, and hybrid methods are the most commonly used solutions for reducing the bias of standard classification models towards the majority class. Data-level methods change the distribution of class instances, which can increase information loss and overfitting. Algorithm-level methods modify the structure of the learning algorithm so that it gives more weight to misclassified minority class instances during the learning phase. However, such changes to the algorithm are harder for users to adopt. To address the issues in these methods, an in-depth discussion of the state-of-the-art methods is required and is presented here. In this survey, we present a detailed discussion of the existing methods for handling class overlap in imbalanced datasets, covering their advantages, disadvantages, limitations, and the key performance metrics on which each method outperformed others. The detailed comparative analysis, drawn mainly from papers of recent years, summarizes the research gaps and future directions for researchers in ML, DL, and BD-based applications.
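
As a rough illustration of the difference between a data-level and an algorithm-level method, the following minimal Python sketch may help. It is not taken from any of the surveyed papers: the function names `overlap_undersample` and `balanced_class_weights`, the toy data, and the choice of `k` are all assumptions made for illustration. The first function removes majority instances that have a minority instance among their k nearest neighbours (a simple neighbourhood-based undersampling idea), and the second computes inverse-frequency class weights, a common heuristic for cost-sensitive learning.

```python
import numpy as np

def overlap_undersample(X, y, k=3, majority=0):
    """Data-level sketch: drop majority points that have at least one
    minority point among their k nearest neighbours (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances between all instances.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]  # indices of the k nearest neighbours
    has_minority_neighbour = (y[nn] != majority).any(axis=1)
    keep = ~((y == majority) & has_minority_neighbour)
    return X[keep], y[keep]

def balanced_class_weights(y):
    """Algorithm-level sketch: weight each class by
    n_samples / (n_classes * class_count) (hypothetical helper)."""
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

# Toy data (assumed): seven majority points, one of which sits
# inside the region occupied by the two minority points.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8],
     [5, 5], [5.2, 5], [5, 5.3]]
y = [0, 0, 0, 0, 0, 0, 0, 1, 1]

X_res, y_res = overlap_undersample(X, y, k=2, majority=0)
weights = balanced_class_weights(y)
```

In this toy setting the undersampler discards only the majority point lying in the overlapping region, which also shows the information loss the abstract refers to: any majority instance near the minority class is removed, whether or not it was informative. The class weights leave the data untouched and instead change how strongly each class counts during training.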

Data availability statement

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.


    Google Scholar 

  161. Rai HM, Chatterjee K (2022) Hybrid CNN LSTM deep learning model and ensemble technique for automatic detection of myocardial infarction using big ECG data. Appl Intell 52(5):5366–5384

    Google Scholar 

  162. Gao J, Zhang H, Lu P, Wang Z (2019) An effective LSTM recurrent network to detect arrhythmia on imbalanced ecg dataset. J Healthc Eng

  163. Tran D, Mac H, Tong V, Tran HA, Nguyen LG (2018) A LSTM based framework for handling multiclass imbalance in DGA botnet detection. Neurocomputing 275:2401–2413

    Google Scholar 

  164. Dong Q, Gong S, Zhu X (2018) Imbalanced deep learning by minority class incremental rectification. IEEE Trans Pattern Anal Mach Intell 41(6):1367–1381

    Google Scholar 

  165. Lin E, Chen Q, Qi X (2020) Deep reinforcement learning for imbalanced classification. Appl Intell 50(8):2488–2502

    Google Scholar 

  166. Wang S, Liu W, Wu J, Cao L, Meng Q, Kennedy PJ (2016) Training deep neural networks on imbalanced data sets. In: 2016 International joint conference on neural networks (IJCNN). IEEE, pp 4368–4374

  167. Zhang C, Tan KC, Li H, Hong GS (2018) A cost-sensitive deep belief network for imbalanced classification. IEEE Trans Neural Netw Learn Syst 30(1):109–122

    Google Scholar 

  168. Dablain D, Krawczyk B, Chawla NV (2022) DeepSMOTE: fusing deep learning and smote for imbalanced data. IEEE Trans Neural Netw Learn Syst

  169. Andrei V, Cucu H, Burileanu C (2019) Overlapped speech detection and competing speaker counting–humans versus deep learning. IEEE J Sel Topics Signal Process 13(4):850–862

    Google Scholar 

  170. Alia A, Maree M, Chraibi M (2022) A hybrid deep learning and visualization framework for pushing behavior detection in pedestrian dynamics. Sensors 22(11):4040

    Google Scholar 

  171. Wang X, Jing L, Lyu Y, Guo M, Wang J, Liu H, Yu J, Zeng T (2022) Deep generative mixture model for robust imbalance classification. IEEE Trans Pattern Anal Mach Intell

  172. Yue X, Li H, Fujikawa Y, Meng L (2022) Dynamic dataset augmentation for deep learning-based oracle bone inscriptions recognition. J Comput Cult Herit (JOCCH)

  173. Liu T, Bao J, Wang J, Wang J (2021) Deep learning for industrial image: challenges, methods for enriching the sample space and restricting the hypothesis space, and possible issue. Int J Comput Integr Manuf 35:1–30

    Google Scholar 

  174. ArunKumar KE, Kalaga DV, Kumar CMS, Kawaji M, Brenza TM (2021) Forecasting of covid-19 using deep layer recurrent neural networks (RNNs) with gated recurrent units (GRUs) and long short term memory (LSTM) cells. Chaos, Solitons Fractals 146:110861

    MathSciNet  Google Scholar 

  175. Zhang Q, Wang W, Zhu SC (2018) Examining cnn representations with respect to dataset bias. In: Proceedings of the AAAI conference on artificial intelligence, vol 32

  176. Doshi-Velez F, Kim B (2017) Towards a rigorous science of interpretable machine learning. arXiv:1702.08608

  177. Ibrahim M, Louie M, Modarres C, Paisley J (2019) Global explanations of neural networks: mapping the landscape of predictions. In: Proceedings of the 2019 AAAI/ACM conference on AI, ethics, and society, pp 279–287

  178. Wu JMT, Li Z, Herencsar N, Vo B, Lin JCW (2021) A graph-based CNN-LSTM stock price prediction algorithm with leading indicators. Multimedia Syst 29:1–20

    Google Scholar 

  179. Sherstinsky A (2020) Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys D Nonlin Phenom 404:132306

    MathSciNet  Google Scholar 

  180. Chen MY, Chiang HS, Huang WK (2022) Efficient generative adversarial networks for imbalanced traffic collision datasets. IEEE Trans Intell Transp Syst

  181. Lee HK, Lee J, Kim SB (2022) Boundary-focused generative adversarial networks for imbalanced and multimodal time series. IEEE Trans Knowl Data Eng

  182. Li W, Chen J, Cao J, Ma C, Wang J, Cui X, Chen P (2022) EID-GAN: generative adversarial nets for extremely imbalanced data augmentation. IEEE Trans Ind Inform

  183. Gao S, Dai Y, Li Y, Liu K, Chen K, Liu Y (2022) Multiview wasserstein generative adversarial network for imbalanced pearl classification. Meas Sci Technol 33(8):085406

    Google Scholar 

  184. Suh S, Lee H, Lukowicz P, Lee YO (2021) CEGAN: classification enhancement generative adversarial networks for unraveling data imbalance problems. Neural Netw 133:69–86

    Google Scholar 

  185. De Oliveira Nogueira T, Palacio GBA, Braga FD, Maia PPN, De Moura EP, De Andrade CF, Rocha PAC (2022) Imbalance classification in a scaled-down wind turbine using radial basis function kernel and support vector machines. Energy 238:122064

    Google Scholar 

  186. Satapathy SK, Mishra S, Mallick PK, Chae GS (2021) ADASYN and ABC-optimized RBF convergence network for classification of electroencephalograph signal. Pers Ubiquitous Comput 27:1–17

    Google Scholar 

  187. Zhang D, Zhang N, Ye N, Fang J, Han X (2020) Hybrid learning algorithm of radial basis function networks for reliability analysis. IEEE Trans Reliab 70(3):887–900

    Google Scholar 

  188. Kamaruddin SK, Ravi V (2019) A parallel and distributed radial basis function network for big data analytics. In: TENCON 2019-2019 IEEE Region 10 Conference (TENCON). IEEE, pp 395–399

  189. Akter S, Das D, Haque RU, Tonmoy MIQ, Hasan MR, Mahjabeen S, Ahmed M (2022) AD-covNet: an exploratory analysis using a hybrid deep learning model to handle data imbalance, predict fatality, and risk factors in Alzheimer’s patients with covid-19. Comput Biol Med 146:105657

    Google Scholar 

  190. Ram PK, Kuila P (2022) GAAE: a novel genetic algorithm based on autoencoder with ensemble classifiers for imbalanced healthcare data. J Supercomput 79:1–32

    Google Scholar 

  191. Hassib EM, El-Desouky AI, Labib LM, El-Kenawy ESM (2020) WOA+BRNN: an imbalanced big data classification framework using whale optimization and deep neural network. Soft Comput 24(8):5573–5592

    Google Scholar 

  192. Dumas J, Boukas I, De Villena MM, Mathieu S, Cornélusse B (2019) Probabilistic forecasting of imbalance prices in the Belgian context. In: 2019 16th International conference on the European energy market (EEM). IEEE, pp 1–7

  193. Ghanem WA, Jantan A (2018) A cognitively inspired hybridization of artificial bee colony and dragonfly algorithms for training multi-layer perceptrons. Cogn Comput 10(6):1096–1134

    Google Scholar 

  194. Zhu G, Wu X, Ge J, Liu F, Zhao W, Wu C (2020) Influence of mining activities on groundwater hydrochemistry and heavy metal migration using a self-organizing map (SOM). J Clean Prod 257:120664

    Google Scholar 

  195. Hameed AA, Karlik B, Salman MS, Eleyan G (2019) Robust adaptive learning approach to self-organizing maps. Knowl Based Syst 171:25–36

    Google Scholar 

  196. Huysmans D, Smets E, De Raedt W, Van Hoof C, Bogaerts K, Van Diest I, Helic D (2018) Unsupervised learning for mental stress detection-exploration of self-organizing maps. Proceedings of the 11th international joint conference on biomedical engineering systems and technologies, vol 4, pp 26–35

  197. Xie H, Wu L, Xie W, Lin Q, Liu M, Lin Y (2021) Improving ECMWF short-term intensive rainfall forecasts using generative adversarial nets and deep belief networks. Atmos Res 249:105281

    Google Scholar 

  198. Vinayakumar R, Alazab M, Srinivasan S, Pham QV, Padannayil SK, Simran K (2020) A visualized botnet detection system based deep learning for the internet of things networks of smart cities. IEEE Trans Ind Appl 56:4436–4456

    Google Scholar 

  199. Leonelli FE, Agliari E, Albanese L, Barra A (2021) On the effective initialisation for restricted Boltzmann machines via duality with Hopfield model. Neural Netw 143:314–326

    Google Scholar 

  200. Savitha R, Ambikapathi A, Rajaraman K (2020) Online RBM: growing restricted boltzmann machine on the fly for unsupervised representation. Appl Soft Comput 92:106278

    Google Scholar 

  201. Huang K, Wang X (2022) ADA-INCVAE: improved data generation using variational autoencoder for imbalanced classification. Appl Intell 52(3):2838–2853

    Google Scholar 

  202. Chen J, Wu Z, Zhang J (2019) Driving safety risk prediction using cost-sensitive with nonnegativity-constrained autoencoders based on imbalanced naturalistic driving data. IEEE Trans Intell Transp Syst 20(12):4450–4465

    Google Scholar 

  203. Alhassan Z, Budgen D, Alshammari R, Daghstani T, McGough AS, Al Moubayed N (2018) Stacked denoising autoencoders for mortality risk prediction using imbalanced clinical data. In: 2018 17th IEEE international conference on machine learning and applications (ICMLA). IEEE, pp 541–546

  204. Johnson JM, Khoshgoftaar TM (2020) The effects of data sampling with deep learning and highly imbalanced big data. Inf Syst Front 22(5):1113–1131

    Google Scholar 

  205. Yan M, Li N (2022) Borderline-margin loss based deep metric learning framework for imbalanced data. Appl Intell 53:1–18

    Google Scholar 

  206. Lin WC, Tsai CF, Hu YH, Jhang JS (2017) Clustering-based undersampling in class-imbalanced data. Inf Sci 409:17–26

    Google Scholar 

  207. Vannucci M, Colla V (2018) Self–organizing–maps based undersampling for the classification of unbalanced datasets. In: 2018 International joint conference on neural networks (IJCNN). IEEE, pp 1–6

  208. Tsai CF, Lin WC, Hu YH, Yao GT (2019) Under-sampling class imbalanced datasets by combining clustering analysis and instance selection. Inf Sci 477:47–54

    Google Scholar 

  209. More A (2016) Survey of resampling techniques for improving classification performance in unbalanced datasets. arXiv:1608.06048

  210. Yang Z, Gao D (2013) Classification for imbalanced and overlapping classes using outlier detection and sampling techniques. Appl Math Inf Sci 7(1):375–381

    Google Scholar 

  211. Douzas G, Bacao F, Last F (2018) Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE. Inf Sci 465:1–20

    Google Scholar 

  212. Barua S, Islam MM, Yao X, Murase K (2012) MWMOTE-majority weighted minority oversampling technique for imbalanced data set learning. IEEE Trans Knowl Data Eng 26(2):405–425

    Google Scholar 

  213. He H, Bai Y et al (2008) ADASYN: adaptive synthetic sampling for imbalanced data. In: IEEE international joint conference on neural networks (IEEE world congress on computational intelligence), vol 69. https://doi.org/10.1109/ijcnn

  214. Ren R, Yang Y, Sun L (2020) Oversampling technique based on fuzzy representativeness difference for classifying imbalanced data. Appl Intell 50(8):2465–2487

    Google Scholar 

  215. Elyan E, Moreno-Garcia CF, Jayne C (2021) CDSMOTE: class decomposition and synthetic minority class oversampling technique for imbalanced-data classification. Neural Comput Appl 33(7):2839–2851

    Google Scholar 

  216. Liu G, Yang Y, Li B (2018) Fuzzy rule-based oversampling technique for imbalanced and incomplete data learning. Knowl Based Syst 158:154–174

    Google Scholar 

  217. Koziarski M, Krawczyk B, Wozniak M (2019) Radial-based oversampling for noisy imbalanced data classification. Neurocomputing 343:19–33

    Google Scholar 

  218. Yan Y, Liu R, Ding Z, Du X, Chen J, Zhang Y (2019) A parameter-free cleaning method for SMOTE in imbalanced classification. IEEE Access 7:23537–23548

    Google Scholar 

  219. Patel H, Thakur GS (2016) A hybrid weighted nearest neighbor approach to mine imbalanced data. In: Proceedings of the international conference on data science (ICDATA), The steering committee of the world congress in computer, science, Computer, pp 106

  220. Tang B, He H (2015) ENN: extended nearest neighbor method for pattern recognition [research frontier]. IEEE Comput Intell Mag 10(3):52–60

    Google Scholar 

  221. Wang P, Yao Y (2018) CE3: a three-way clustering method based on mathematical morphology. Knowl Based Syst 155:54–65

    Google Scholar 

  222. Masson MH, Denoeux T (2009) RECM: relational evidential c-means algorithm. Pattern Recognit Lett 30(11):1015–1026

    Google Scholar 

  223. Batista GE, Prati RC, Monard MC (2004) A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explor Newsl 6(1):20–29

    Google Scholar 

  224. Fan Q, Wang Z, Li D, Gao D, Zha H (2017) Entropy-based fuzzy support vector machine for imbalanced datasets. Knowl Based Syst 115:87–99

    Google Scholar 

  225. Zhu C, Wang Z (2017) Entropy-based matrix learning machine for imbalanced data sets. Pattern Recognit Lett 88:72–80

    Google Scholar 

  226. Wang BX, Japkowicz N (2010) Boosting support vector machines for imbalanced data sets. Knowl Inf Syst 25(1):1–20

    Google Scholar 

  227. Ju H, Li H, Yang X, Zhou X, Huang B (2017) Cost-sensitive rough set: a multi-granulation approach. Knowl Based Syst 123:137–153

    Google Scholar 

  228. Ju H, Yang X, Yu H, Li T, Yu DJ, Yang J (2016) Cost-sensitive rough set approach. Inf Sci 355:282–298

    Google Scholar 

  229. Cabitza F, Ciucci D, Locoro A (2017) Exploiting collective knowledge with three-way decision theory: cases from the questionnaire-based research. Int J Approx Reason 83:356–370

    MathSciNet  Google Scholar 

  230. Maulidevi NU, Surendro K (2021) SMOTE-LOF for noise identification in imbalanced data classification. J King Saud Univ Comput Inf Sci

  231. Armano G, Tamponi E (2018) Building forests of local trees. Pattern Recognit 76:380–390

    Google Scholar 

  232. Galar M, Fernández A, Barrenechea E, Herrera F (2013) EUSboost: enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling. Pattern Recognit 46(12):3460–3471

    Google Scholar 

  233. Galar M, Fernandez A, Barrenechea E, Bustince H, Herrera F (2011) A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. IEEE Trans Syst, Man, Cybern, Part C (Appl Rev) 42(4):463–484

  234. Sesmero MP, Ledezma AI, Sanchis A (2015) Generating ensembles of heterogeneous classifiers using stacked generalization. Wiley Interdiscip Rev Data Min Knowl Discov 5(1):21–34

    Google Scholar 

  235. Kim S, Zhang H, Wu R, Gong L (2011) Dealing with noise in defect prediction. In: 2011 33rd International conference on software engineering (ICSE). IEEE, pp 481–490

  236. Tang W, Khoshgoftaar TM (2004) Noise identification with the k-means algorithm. In: 16th IEEE international conference on tools with artificial intelligence. IEEE, pp 373–378

  237. Sundqvist T, Bhuyan MH, Forsman J, Elmroth E (2020) Boosted ensemble learning for anomaly detection in 5G RAN. In: IFIP international conference on artificial intelligence applications and innovations. Springer, pp 15–30

  238. Seiffert C, Khoshgoftaar TM, Van Hulse J, Napolitano A (2009) RUSBoost: a hybrid approach to alleviating class imbalance. IEEE Trans Syst Man Cybern Part A Syst Hum 40(1):185–197

    Google Scholar 

  239. Tosin MC, Majolo M, Chedid R, Cene VH, Balbinot A (2017) sEMG feature selection and classification using SVM-RFE. In: 2017 39th Annual international conference of the IEEE engineering in medicine and biology society (EMBC). IEEE, pp 390–393

  240. Alcala-Fdez J, Alcala R, Herrera F (2011) A fuzzy association rule-based classification model for high-dimensional problems with genetic rule selection and lateral tuning. IEEE Trans Fuzzy Syst 19(5):857–872

    Google Scholar 

  241. Akhter S, Sharmin S, Ahmed S, Sajib AA, Shoyaib M (2021) mRelief: a reward penalty based feature subset selection considering data overlapping problem. In: International conference on computational science. Springer, pp 278–292

  242. Min F, Hu Q, Zhu W (2014) Feature selection with test cost constraint. Int J Approx Reason 55(1):167–179

    MathSciNet  Google Scholar 

  243. Zhao H, Wang P, Hu Q (2016) Cost-sensitive feature selection based on adaptive neighborhood granularity with multi-level confidence. Inf Sci 366:134–149

    MathSciNet  Google Scholar 

  244. Emekter R, Tu Y, Jirasakuldech B, Lu M (2015) Evaluating credit risk and loan performance in online peer-to-peer (P2P) lending. Appl Econ 47(1):54–70

    Google Scholar 

  245. Vorraboot P, Rasmequan S, Chinnasarn K, Lursinsap C (2015) Improving classification rate constrained to imbalanced data between overlapped and non-overlapped regions by hybrid algorithms. Neurocomputing 152:429–443

    Google Scholar 

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Anil Kumar.

Ethics declarations

Conflicts of interests/Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kumar, A., Singh, D. & Shankar Yadav, R. Class overlap handling methods in imbalanced domain: A comprehensive survey. Multimed Tools Appl 83, 63243–63290 (2024). https://doi.org/10.1007/s11042-023-17864-8

  • DOI: https://doi.org/10.1007/s11042-023-17864-8
