The Effects of Class Label Noise on Highly-Imbalanced Big Data

Author(s):  
Robert K. L. Kennedy ◽  
Justin M. Johnson ◽  
Taghi M. Khoshgoftaar
Keyword(s):  
Big Data

2018 ◽  
Vol 275 ◽  
pp. 2374-2383 ◽  
Author(s):  
Maryam Sabzevari ◽  
Gonzalo Martínez-Muñoz ◽  
Alberto Suárez

Text classification and clustering approaches are essential in big data environments. Many classification algorithms have been proposed for supervised learning applications, and in the era of big data a large volume of training data is available for many machine learning tasks. However, some of this data may be mislabeled or not labeled properly. Incorrect labels result in label noise, which in turn degrades the learning performance of a classifier. A common way to address label noise is to apply noise filtering techniques that identify and remove noisy instances before learning, and a range of such filters have been developed to improve classifier performance. This paper proposes a noise filtering approach for text data that is applied during the training phase. Many supervised learning algorithms produce high error rates when the training dataset is noisy; our work eliminates such noise and provides a more accurate classification system.
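As a rough illustration of the general idea described above, not the authors' specific method, the Python sketch below filters suspected label noise before training: examples whose out-of-fold prediction disagrees with their given label are treated as noisy and dropped. The classifier, features, and toy data are illustrative assumptions.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def filter_label_noise(texts, labels, n_splits=5):
    """Return indices of examples whose given label agrees with an
    out-of-fold prediction; disagreeing examples are treated as noisy."""
    labels = np.asarray(labels)
    model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    # Each sample is predicted by a model that never saw it during training.
    predicted = cross_val_predict(model, texts, labels, cv=n_splits)
    return np.where(predicted == labels)[0]

# Toy usage: identify the kept subset; a final classifier would then be
# trained only on the retained (noise-filtered) examples.
texts = ["cheap pills offer", "free prize winner", "meeting moved to noon",
         "cheap prize pills", "lunch at noon?", "project meeting notes"]
labels = [1, 1, 0, 1, 0, 0]   # 1 = spam, 0 = ham (labels possibly noisy)
keep = filter_label_noise(texts, labels, n_splits=3)
print("kept indices:", keep)

The key design point is that the filtering model must not have seen an example when predicting it, otherwise it would simply reproduce the (possibly wrong) training label; cross-validated predictions give that separation cheaply.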


2017 ◽  
Vol 9 (2) ◽  
pp. 173 ◽  
Author(s):  
Charlotte Pelletier ◽  
Silvia Valero ◽  
Jordi Inglada ◽  
Nicolas Champion ◽  
Claire Marais Sicre ◽  
...  

2014 ◽  
Vol 2014 ◽  
pp. 1-14 ◽  
Author(s):  
Shehzad Khalid ◽  
Sannia Arshad ◽  
Sohail Jabbar ◽  
Seungmin Rho

We present a classification framework that combines multiple heterogeneous classifiers in the presence of class label noise. An extension of m-Mediods-based modeling is presented that generates models of the various classes while identifying and filtering noisy training data. The noise-free data is then used to learn models for other classifiers such as GMM and SVM. A weight learning method is introduced to learn per-class weights for the different classifiers and construct an ensemble. For this purpose, we apply a genetic algorithm to search for the weight vector on which the classifier ensemble is expected to give the best accuracy. The proposed approach is evaluated on a variety of real-life datasets and compared with standard ensemble techniques such as AdaBoost, Bagging, and Random Subspace methods. Experimental results show the superiority of the proposed ensemble method over its competitors, especially in the presence of class label noise and class imbalance.
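As a minimal sketch of the weight-learning step, not the authors' exact algorithm, the Python code below uses a simple genetic algorithm to search for per-classifier, per-class weights that maximize the weighted-vote accuracy of an ensemble on a validation set. The population size, selection, crossover, mutation scheme, and the synthetic predictions are all assumptions made for the sketch.

import numpy as np

rng = np.random.default_rng(0)

def ensemble_accuracy(weights, probas, y_val):
    """Accuracy of the ensemble that combines base-classifier probabilities
    with per-classifier, per-class weights (shape: n_clf x n_classes)."""
    # Weighted sum over classifiers, then argmax over classes.
    combined = np.einsum('kc,knc->nc', weights, probas)
    return np.mean(np.argmax(combined, axis=1) == y_val)

def ga_learn_weights(probas, y_val, pop_size=40, generations=100, mut_std=0.1):
    n_clf, _, n_classes = probas.shape
    # Initial population of random non-negative weight matrices.
    pop = rng.random((pop_size, n_clf, n_classes))
    for _ in range(generations):
        fitness = np.array([ensemble_accuracy(w, probas, y_val) for w in pop])
        # Selection: keep the better half of the population (elitism).
        order = np.argsort(fitness)[::-1]
        parents = pop[order[: pop_size // 2]]
        # Crossover: average two random parents; mutation: Gaussian noise.
        idx_a = rng.integers(0, len(parents), pop_size - len(parents))
        idx_b = rng.integers(0, len(parents), pop_size - len(parents))
        children = (parents[idx_a] + parents[idx_b]) / 2.0
        children += rng.normal(0.0, mut_std, children.shape)
        pop = np.concatenate([parents, np.clip(children, 0.0, None)])
    fitness = np.array([ensemble_accuracy(w, probas, y_val) for w in pop])
    return pop[np.argmax(fitness)]

# Usage with synthetic predictions: 3 classifiers, 200 samples, 4 classes.
n_clf, n_samples, n_classes = 3, 200, 4
y_val = rng.integers(0, n_classes, n_samples)
probas = rng.random((n_clf, n_samples, n_classes))
probas /= probas.sum(axis=2, keepdims=True)   # normalize to probabilities
best_w = ga_learn_weights(probas, y_val)
print("validation accuracy:", ensemble_accuracy(best_w, probas, y_val))

In practice the probability arrays would come from the trained GMM, SVM, and m-Mediods models rather than random data, and the fitness would be evaluated on a held-out validation set to avoid overfitting the weights.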

