Ensemble-based Noise Filter for Noisy and Imbalanced Data

Author(s):  
Akram M. Radwan
Keyword(s):  
2021 ◽  
Vol 147 (3) ◽  
pp. 04020165
Author(s):  
Amin Ariannezhad ◽  
Abolfazl Karimpour ◽  
Xiao Qin ◽  
Yao-Jan Wu ◽  
Yasamin Salmani

2010 ◽  
Vol 30 (10) ◽  
pp. 2815-2818
Author(s):  
Shuang-shuang WANG ◽  
Shi-tong WANG ◽  
Ke-cai LI
Keyword(s):  

Author(s):  
Gunjan Saraogi ◽  
Deepa Gupta ◽  
Lavanya Sharma ◽  
Ajay Rana

Background: Backorders are an accepted abnormality affecting accumulation alternation and logistics, sales, chump service, and manufacturing, which generally leads to low sales and low chump satisfaction. A predictive archetypal can analyse which articles are best acceptable to acquaintance backorders giving the alignment advice and time to adjust, thereby demography accomplishes to aerate their profit. Objective: To address the issue of predicting backorders, this paper has proposed an un-supervised approach to backorder prediction using Deep Autoencoder. Method: In this paper, artificial intelligence paradigms are researched in order to introduce a predictive model for the present unbalanced data issues, where the number of products going on backorder is rare. Result: Un-supervised anomaly detection using deep auto encoders has shown better Area under the Receiver Operating Characteristic and precision-recall curves than supervised classification techniques employed with resampling techniques for imbalanced data problems. Conclusion: We demonstrated that Un-supervised anomaly detection methods specifically deep auto-encoders can be used to learn a good representation of the data. The method can be used as predictive model for inventory management and help to reduce bullwhip effect, raise customer satisfaction as well as improve operational management in the organization. This technology is expected to create the sentient supply chain of the future – able to feel, perceive and react to situations at an extraordinarily granular level


2013 ◽  
Vol 756-759 ◽  
pp. 3652-3658
Author(s):  
You Li Lu ◽  
Jun Luo

Under the study of Kernel Methods, this paper put forward two improved algorithm which called R-SVM & I-SVDD in order to cope with the imbalanced data sets in closed systems. R-SVM used K-means algorithm clustering space samples while I-SVDD improved the performance of original SVDD by imbalanced sample training. Experiment of two sets of system call data set shows that these two algorithms are more effectively and R-SVM has a lower complexity.


2021 ◽  
Vol 13 (2) ◽  
pp. 268
Author(s):  
Xiaochen Lv ◽  
Wenhong Wang ◽  
Hongfu Liu

Hyperspectral unmixing is an important technique for analyzing remote sensing images which aims to obtain a collection of endmembers and their corresponding abundances. In recent years, non-negative matrix factorization (NMF) has received extensive attention due to its good adaptability for mixed data with different degrees. The majority of existing NMF-based unmixing methods are developed by incorporating additional constraints into the standard NMF based on the spectral and spatial information of hyperspectral images. However, they neglect to exploit the nature of imbalanced pixels included in the data, which may cause the pixels mixed with imbalanced endmembers to be ignored, and thus the imbalanced endmembers generally cannot be accurately estimated due to the statistical property of NMF. To exploit the information of imbalanced samples in hyperspectral data during the unmixing procedure, in this paper, a cluster-wise weighted NMF (CW-NMF) method for the unmixing of hyperspectral images with imbalanced data is proposed. Specifically, based on the result of clustering conducted on the hyperspectral image, we construct a weight matrix and introduce it into the model of standard NMF. The proposed weight matrix can provide an appropriate weight value to the reconstruction error between each original pixel and the reconstructed pixel in the unmixing procedure. In this way, the adverse effect of imbalanced samples on the statistical accuracy of NMF is expected to be reduced by assigning larger weight values to the pixels concerning imbalanced endmembers and giving smaller weight values to the pixels mixed by majority endmembers. Besides, we extend the proposed CW-NMF by introducing the sparsity constraints of abundance and graph-based regularization, respectively. The experimental results on both synthetic and real hyperspectral data have been reported, and the effectiveness of our proposed methods has been demonstrated by comparing them with several state-of-the-art methods.


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Ho Sun Shon ◽  
Erdenebileg Batbaatar ◽  
Wan-Sup Cho ◽  
Seong Gon Choi
Keyword(s):  

2021 ◽  
Vol 40 (5) ◽  
pp. 9471-9484
Author(s):  
Yilun Jin ◽  
Yanan Liu ◽  
Wenyu Zhang ◽  
Shuai Zhang ◽  
Yu Lou

With the advancement of machine learning, credit scoring can be performed better. As one of the widely recognized machine learning methods, ensemble learning has demonstrated significant improvements in the predictive accuracy over individual machine learning models for credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to deal with the imbalanced data. Then, a new selective sampling mechanism is proposed to select the better-performing base classifiers adaptively. Finally, a new feature-enhanced stacking method is proposed to construct an effective ensemble model by composing the shortlisted base classifiers. In the experiments, four datasets with four evaluation indicators are used to evaluate the performance of the proposed model, and the experimental results prove the superiority of the proposed model over other benchmark models.


Sign in / Sign up

Export Citation Format

Share Document