Imbalanced data classification remains a research hotspot and a challenging problem in machine learning. The challenge lies not only in class imbalance but also in the more complex problem of class overlapping. However, most existing algorithms focus mainly on the former, which limits their performance. To address this limitation, this paper proposes an ensemble algorithm based on dual clustering and stage-wise hybrid sampling (DCSHS) that handles both class imbalance and class overlapping. DCSHS has three main parts: a projection clustering combination framework (PCC), stage-wise hybrid sampling (SHS), and an envelope clustering transfer mapping mechanism (CTM). PCC creates multiple subsets through projective clustering. SHS identifies the overlapping region of each subset and performs hybrid sampling. CTM extracts more information from the samples in each subset by combining clustering and transfer learning. First, we design a PCC framework guided by the Davies-Bouldin clustering effectiveness index (DBI), which is used to obtain high-quality clusters and combine them into a set of cross-complete subsets (CCS) with low overlapping. Second, according to the class characteristics of each subset, an SHS algorithm is designed to de-overlap and balance the subsets. Finally, a CTM is constructed for all processed subsets by means of transfer learning, thereby reducing class overlapping and exploring the structural information of the samples. Weak classifiers are then trained on the balanced subsets and fused, as in other imbalanced ensemble algorithms.
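The abstract does not spell out how DBI guides the clustering, so the following is only a minimal sketch of the index itself, assuming the standard Davies-Bouldin definition with Euclidean distance and assuming that "guided by DBI" means preferring the candidate partition with the lowest score. The function names `centroid`, `davies_bouldin`, and `pick_partition` are hypothetical and are not taken from the paper.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def davies_bouldin(points, labels):
    """Standard Davies-Bouldin index of a labelled partition (lower is better).

    Assumes at least two clusters with distinct centroids.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("DBI needs at least two clusters")

    centers = {lab: centroid(pts) for lab, pts in clusters.items()}
    # Scatter: mean distance from each cluster's points to its centroid.
    scatter = {
        lab: sum(math.dist(p, centers[lab]) for p in pts) / len(pts)
        for lab, pts in clusters.items()
    }
    # For each cluster, take its worst (largest) similarity to any other cluster.
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst) / len(worst)


def pick_partition(points, candidates):
    """Assumed selection rule: keep the candidate labelling with the lowest DBI."""
    return min(candidates, key=lambda labels: davies_bouldin(points, labels))


# Toy data: two tight groups that are far apart along the first axis.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
compact = [0, 0, 1, 1]   # groups nearby points together
stretched = [0, 1, 0, 1]  # groups distant points together

print(davies_bouldin(points, compact))
print(davies_bouldin(points, stretched))
print(pick_partition(points, [stretched, compact]))
```

On this toy data the compact labelling scores far lower than the stretched one, so the assumed selection rule keeps it. How the paper combines the selected clusters into cross-complete subsets is not described in the abstract and is not modelled here.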
The major advantage of our algorithm is that it exploits the intersectionality of the CCS to softly eliminate overlapping majority samples while learning as much information from overlapping samples as possible, thereby improving resistance to class overlapping while balancing the classes. In the experimental section, more than 30 public datasets and over ten representative algorithms are used for verification. The experimental results show that DCSHS performs significantly better than the compared algorithms in terms of anti-overlapping, Recall, F1-M, G-M, AUC, and diversity.
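For readers unfamiliar with the reported metrics, the sketch below computes three of them from binary confusion-matrix counts. It assumes that "F1-M" and "G-M" denote the usual F1-measure and G-mean (the geometric mean of recall and specificity) and that the minority class is treated as positive; the abstract does not define them. The function name `imbalance_metrics` and the counts are illustrative only and are not results from the paper. AUC is omitted because it requires ranked scores rather than counts.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Recall, F1-measure and G-mean, with the minority class as positive."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # G-mean rewards classifiers that do well on both classes at once.
    g_mean = math.sqrt(recall * specificity)
    return {"recall": recall, "f1": f1, "g_mean": g_mean}


# Made-up example: 10 minority and 90 majority test samples.
scores = imbalance_metrics(tp=8, fn=2, fp=18, tn=72)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example recall and G-mean are both 0.8, while the F1-measure is much lower because the 18 false positives outnumber the 8 true positives. This is why imbalanced-learning studies typically report several such metrics together.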
Applied Intelligence – Springer Journals
Published: Sep 1, 2023
Keywords: Imbalanced data classification; Class overlapping; Projective clustering; Cross complete set; Hybrid sampling; Envelope learning; Ensemble learning