A Robust Ensemble Method for Classification in Imbalanced Datasets in the Presence of Noise

Author(s):  
Chongomweru Halimu ◽  
Asem Kasem
2017 ◽  
Vol 32 (10) ◽  
pp. 5735-5744 ◽  
Author(s):  
Arkaitz Artetxe ◽  
Manuel Graña ◽  
Andoni Beristain ◽  
Sebastián Ríos

2013 ◽  
Vol 2013 ◽  
pp. 1-6 ◽  
Author(s):  
Yong Zhang ◽  
Dapeng Wang

In imbalanced learning, resampling methods modify an imbalanced dataset to form a balanced one. Many base classifiers perform better on balanced datasets than on imbalanced ones. This paper proposes a cost-sensitive ensemble method, based on a cost-sensitive support vector machine (SVM) and query-by-committee (QBC), for imbalanced data classification. The proposed method first divides the majority-class dataset into several subdatasets according to the proportion of imbalanced samples and trains subclassifiers using the AdaBoost method. It then generates candidate training samples with the QBC active learning method and uses the cost-sensitive SVM to learn from these samples. Experimental results on 5 class-imbalanced datasets show that the proposed method achieves a higher area under the ROC curve (AUC), F-measure, and G-mean than many existing class-imbalanced learning methods.
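The building blocks named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the AdaBoost and SVM training steps are omitted, and vote entropy is assumed as the QBC disagreement measure because the abstract does not say which one is used. The sketch shows how a majority class might be cut into subsets of roughly minority size, how committee disagreement on one sample can be scored, and how G-mean is computed from a binary confusion matrix.

```python
import math
import random
from collections import Counter


def split_majority(majority, minority_size, seed=0):
    """Shuffle the majority class and cut it into subsets of about minority_size.

    Hypothetical helper: the abstract only says the split follows the
    imbalance proportion, so a simple size-matched split is assumed here.
    """
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    n_subsets = max(1, round(len(shuffled) / minority_size))
    # Deal samples round-robin so subset sizes differ by at most one.
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def vote_entropy(votes):
    """Entropy of the committee's label votes for one sample.

    Assumed QBC disagreement measure: 0 when all members agree,
    larger when the votes are spread across labels.
    """
    total = len(votes)
    return sum(-(c / total) * math.log(c / total) for c in Counter(votes).values())


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (recall on positives) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)
```

In a QBC loop, the unlabeled samples with the highest `vote_entropy` would be the natural candidates to pass on to the final learner, since they are the ones the committee disagrees on most.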


Author(s):  
S. Sridhar ◽  
A. Kalaivani

Data imbalance in multiclass datasets is very common in real-world applications. Existing studies report various attempts to overcome this multiclass imbalance problem, which is a severe issue for typical supervised machine learning methods such as classification and regression. However, there is still a need to handle the imbalance problem efficiently, because the datasets include both safe and unsafe minority samples. Most widely used oversampling techniques, such as SMOTE and its variants, struggle to replicate or generate new data instances that balance multiple classes, particularly when the imbalance is high and the number of rare samples is very small, which leads the classifier to misclassify data instances. To lessen this problem, we propose a new data balancing method, a two-stage iterative ensemble method, to tackle imbalance in a multiclass environment. The proposed approach focuses on the influence of rare minority samples on learning from imbalanced datasets. Its main idea is to balance the data without any change in class distribution before the data are used to train the learner, so that the learner's learning process improves. The proposed approach is also compared against two widely used oversampling techniques, and the results reveal that it significantly improves the learning process on multiclass imbalanced data.
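The difficulty with very small minority classes can be seen in a simplified sketch of SMOTE-style interpolation. This illustrates the baseline technique mentioned in the abstract, not the proposed two-stage method. The function name `smote_like` and its parameters are hypothetical. Each synthetic point is placed on the segment between a minority sample and one of its nearest minority neighbours, so a class with a single sample offers nothing to interpolate toward.

```python
import random


def smote_like(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating toward minority neighbours.

    Simplified, hypothetical sketch of SMOTE-style oversampling for
    points given as tuples of floats.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Candidate neighbours: every other minority point, closest first
        # (squared Euclidean distance), truncated to the k nearest.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic
```

With only two or three rare samples, every synthetic point falls on the few segments joining them, which is one way to picture why oversampling adds little new information in that regime.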


2011 ◽  
Vol 31 (2) ◽  
pp. 441-445 ◽  
Author(s):  
Guang LING ◽  
Ming-chun WANG ◽  
Jia-yi FENG
