Flight delay classification warning based on evolutionary under-sampling bagging ensemble learning

Machine-learning-based software defect prediction (SDP) methods are receiving considerable attention from researchers in intelligent software engineering. Most existing SDP methods operate in a within-project setting. However, a new SDP task usually has little or no within-project training data from which to learn a usable supervised prediction model. Cross-project defect prediction (CPDP), which uses labeled data from source projects to learn a defect predictor for a target project, was therefore proposed as a practical SDP solution. In real CPDP tasks, the class imbalance problem is ubiquitous and strongly affects the performance of CPDP models. Unlike previous studies that focus on subsampling and individual methods, this study investigated 15 imbalanced learning methods for CPDP tasks, with particular attention to the effectiveness of imbalanced ensemble learning (IEL) methods. We evaluated the 15 methods in extensive experiments on 31 open-source projects drawn from five datasets. Analyzing a total of 37504 results, we found that in most cases the IEL method combining under-sampling and bagging is more effective than the other investigated methods.
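The combination of under-sampling and bagging mentioned in the abstract above can be illustrated with a minimal sketch. This is not the study's implementation: it assumes plain random under-sampling (not the evolutionary variant), a toy nearest-centroid base learner, and small synthetic data, and all function names are hypothetical.

```python
import random
from collections import Counter


def undersample(X, y, rng):
    """Randomly drop rows so every class keeps as many rows as the smallest class."""
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    n_min = min(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        for xi in rng.sample(rows, n_min):
            Xb.append(xi)
            yb.append(label)
    return Xb, yb


def fit_centroids(X, y):
    """Toy base learner (an assumption): one mean vector per class."""
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_centroids(model, x):
    """Return the class whose centroid is closest to x."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, model[c]))
    return min(model, key=dist2)


def under_bagging_fit(X, y, n_estimators=11, seed=0):
    """Train each ensemble member on its own balanced random subsample."""
    rng = random.Random(seed)
    return [fit_centroids(*undersample(X, y, rng)) for _ in range(n_estimators)]


def under_bagging_predict(models, x):
    """Aggregate the members by majority vote."""
    votes = Counter(predict_centroids(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Synthetic imbalanced data (hypothetical): 95 majority rows in [0, 0.9]^2
# and 5 minority rows near (3, 3).
X = [[(i % 10) * 0.1, (i // 10) * 0.1] for i in range(95)]
X += [[3 + 0.1 * k, 3 + 0.1 * k] for k in range(5)]
y = [0] * 95 + [1] * 5

models = under_bagging_fit(X, y)
print(under_bagging_predict(models, [3.0, 3.0]))  # minority region
print(under_bagging_predict(models, [0.5, 0.5]))  # majority region
```

Each member sees all minority rows but a different slice of the majority class, so the ensemble as a whole still uses most of the majority data while every individual learner trains on a balanced set.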

Evolutionary under-sampling based bagging ensemble method for imbalanced data classification

Frontiers of Computer Science ◽

10.1007/s11704-016-5306-z ◽

2018 ◽

Vol 12 (2) ◽

pp. 331-350 ◽

Cited By ~ 17

Author(s):

Bo Sun ◽

Haiyan Chen ◽

Jiandong Wang ◽

Hua Xie

Keyword(s):

Imbalanced Data ◽

Data Classification ◽

Ensemble Method ◽

Imbalanced Data Classification ◽

Under Sampling ◽

Bagging Ensemble

Stator Single-Line-to-Ground Fault Protection for Bus-Connected Powerformers Based on S-Transform and Bagging Ensemble Learning

IEEE Access ◽

10.1109/access.2020.2993692 ◽

2020 ◽

Vol 8 ◽

pp. 88322-88332

Author(s):

Yuanyuan Wang ◽

Yongsheng Guo ◽

Xiangjun Zeng ◽

Jun Chen ◽

Yang Kong ◽

...

Keyword(s):

Ensemble Learning ◽

Single Line ◽

Ground Fault ◽

Fault Protection ◽

S Transform ◽

Ground Fault Protection ◽

Bagging Ensemble

Handling incomplete data classification using imputed feature selected bagging (IFBag) method

Intelligent Data Analysis ◽

10.3233/ida-205331 ◽

2021 ◽

Vol 25 (4) ◽

pp. 825-846

Author(s):

Ahmad Jaffar Khan ◽

Basit Raza ◽

Ahmad Raza Shahid ◽

Yogan Jaya Kumar ◽

Muhammad Faheem ◽

...

Keyword(s):

Multiple Imputation ◽

Ensemble Learning ◽

Incomplete Data ◽

Missing Values ◽

Learning Approach ◽

Imputation Methods ◽

Real World Datasets ◽

Almost All ◽

Bagging Ensemble

Almost all real-world datasets contain missing values. Missing values can degrade the performance of a classifier if they are not handled correctly. A common approach to classification with incomplete data is imputation, which transforms incomplete data with missing values into complete data. Single imputation methods are mostly less accurate than multiple imputation methods, which in turn are often computationally much more expensive. This study proposes an imputed feature selected bagging (IFBag) method, which uses multiple imputation, feature selection and a bagging ensemble learning approach to construct a number of base classifiers that classify new incomplete instances without any need for imputation in the testing phase. In the bagging ensemble learning approach, the data is resampled multiple times with replacement, which can lead to diversity in the data and thus to more accurate classifiers. The experimental results show that the proposed IFBag method is considerably fast and gives 97.26% accuracy for classification with incomplete data compared with commonly used methods.
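The resampling step that bagging relies on can be sketched as follows. This is only an illustration of drawing a bootstrap sample with replacement, not the IFBag method itself; the helper names are hypothetical. It also shows the standard result that a bootstrap sample of size n contains on average a fraction 1 - (1 - 1/n)^n of the distinct original rows, which approaches 1 - 1/e for large n.

```python
import random


def bootstrap_indices(n, rng):
    """Draw n row indices uniformly with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]


def expected_unique_fraction(n):
    """Expected share of distinct rows in a bootstrap sample of size n."""
    return 1.0 - (1.0 - 1.0 / n) ** n


rng = random.Random(0)
n = 1000
sample = bootstrap_indices(n, rng)
observed = len(set(sample)) / n

print(f"expected unique fraction: {expected_unique_fraction(n):.3f}")
print(f"observed unique fraction: {observed:.3f}")
```

Because every base classifier is trained on a different draw, each one sees a somewhat different subset of rows, which is the source of the diversity the abstract refers to.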

Improving the Accuracy for Analyzing Heart Diseases Prediction Based on the Ensemble Method

Complexity ◽

10.1155/2021/6663455 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Xiao-Yan Gao ◽

Abdelmegeid Amin Ali ◽

Hassan Shaban Hassan ◽

Eman M. Anwar

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Ensemble Learning ◽

Heart Diseases ◽

Principal Component ◽

Extraction Methods ◽

Machine Learning Algorithms ◽

Learning Methods ◽

Linear Discriminant ◽

Bagging Ensemble

Heart disease is one of the leading causes of death worldwide. Machine learning is playing an essential role in medicine. In this paper, ensemble learning methods are used to improve the prediction of heart disease. Two feature extraction methods, linear discriminant analysis (LDA) and principal component analysis (PCA), are used to select essential features from the dataset. Machine learning algorithms and ensemble learning methods are then compared on the selected features. Several measures are used to evaluate the models: accuracy, recall, precision, F-measure, and ROC. The results show that the bagging ensemble learning method with a decision tree achieved the best performance.

PSO-BP Neural Network Grade Prediction Model Based on Bagging Ensemble Learning

Journal of Physics Conference Series ◽

10.1088/1742-6596/1069/1/012103 ◽

2018 ◽

Vol 1069 ◽

pp. 012103 ◽

Cited By ~ 1

Author(s):

Hongyi Li ◽

Xinhang Li ◽

Di Zhao

Keyword(s):

Neural Network ◽

Prediction Model ◽

Ensemble Learning ◽

Bp Neural Network ◽

Model Based ◽

Grade Prediction ◽

Bagging Ensemble

A pragmatic convolutional bagging ensemble learning for recognition of Farsi handwritten digits

The Journal of Supercomputing ◽

10.1007/s11227-021-03822-4 ◽

2021 ◽

Author(s):

Y. A. Nanehkaran ◽

Junde Chen ◽

Soheil Salimi ◽

Defu Zhang

Keyword(s):

Ensemble Learning ◽

Bagging Ensemble

Acute Lymphoblastic Leukemia Cells Image Analysis with Deep Bagging Ensemble Learning

Lecture Notes in Bioengineering - ISBI 2019 C-NMC Challenge: Classification in Cancer Cell Imaging ◽

10.1007/978-981-15-0798-4_12 ◽

2019 ◽

pp. 113-121 ◽

Cited By ~ 1

Author(s):

Ying Liu ◽

Feixiao Long

Keyword(s):

Image Analysis ◽

Acute Lymphoblastic Leukemia ◽

Ensemble Learning ◽

Lymphoblastic Leukemia ◽

Leukemia Cells ◽

Bagging Ensemble

Application of sample balance-based multi-perspective feature ensemble learning for prediction of user purchasing behaviors on mobile wireless network platforms

10.21203/rs.3.rs-35874/v1 ◽

2020 ◽

Author(s):

Huibing Zhang ◽

Junchao Dong

Keyword(s):

Wireless Communication ◽

Sample Size ◽

Ensemble Learning ◽

Prediction Models ◽

Rapid Development ◽

Sliding Window ◽

Learning Model ◽

Great Success ◽

Prediction Ability ◽

Under Sampling

In recent years, with the rapid development of wireless communication networks, M-Commerce has achieved great success. Shopping online with mobile phones, tablets and other wireless communication devices has become a mainstream way for users to consume. Users leave a large amount of historical behavior data when shopping on an M-Commerce platform. Using these data to predict users' future purchasing behaviors is of great significance for improving user experience and achieving mutual benefit for merchants and users. This study therefore proposes a sample balance-based multi-perspective feature ensemble learning method for predicting user purchasing behaviors, which includes the following: 1) A sample balance method combining a “sliding window” with centroid under-sampling was used: the positive sample size was enlarged using the “sliding window”, and centroid under-sampling was used to reduce the negative sample size within the “sliding window”, so as to obtain balanced historical purchasing behavior data. 2) Features influencing user purchasing behaviors were extracted from three perspectives (user, commodity and interaction) to further enrich the feature dimensions, and feature selection was carried out using the XGBSFS algorithm. 3) An ensemble learning model based on five-fold cross-validation stacking was proposed for predicting user purchasing behaviors. Three prediction models (XGBoost-Logistics, LightGBM-L2 and cascaded deep forest) were combined so that they could collaborate and improve the overall prediction ability of the ensemble learning model. 4) Experiments were conducted on large-scale real datasets from the Alibaba M-Commerce platform. The experimental results show that the proposed method achieves better prediction performance on various evaluation indexes such as precision and recall.

A Hybrid MultiLayer Perceptron Under-Sampling with Bagging Dealing with a Real-Life Imbalanced Rice Dataset

Information ◽

10.3390/info12080291 ◽

2021 ◽

Vol 12 (8) ◽

pp. 291

Author(s):

Moussa Diallo ◽

Shengwu Xiong ◽

Eshete Derb Emiru ◽

Awet Fesseha ◽

Aminu Onimisi Abdulsalami ◽

...

Keyword(s):

Multilayer Perceptron ◽

Real Life ◽

Class Imbalance ◽

Classification Algorithms ◽

Training Set ◽

Under Sampling ◽

Mlp Classifier ◽

F Measure ◽

Imbalance Dataset ◽

Bagging Ensemble

Classification algorithms have shown exceptional prediction results in supervised learning. However, they are not always effective on real-life datasets because of their class distributions: datasets for real-life applications are generally imbalanced. Several methods have been proposed to solve the class imbalance problem. In this paper, we propose a hybrid method combining preprocessing techniques with ensemble learning. The original training set is undersampled by evaluating the samples with a stochastic measurement (SM); the selected samples are then trained with a Multilayer Perceptron to return a balanced training set. The MLPUS (Multilayer Perceptron undersampling) balanced training set is aggregated using the bagging ensemble method. We applied our method to the real-life Niger_Rice dataset and forty-four other imbalanced datasets from the KEEL repository. We also compared our method with six existing methods from the literature: the MLP classifier on the original imbalanced dataset, MLPUS, UnderBagging (combining random under-sampling and bagging), RUSBoost, SMOTEBagging (Synthetic Minority Oversampling Technique and bagging), and SMOTEBoost. The results show that our method is competitive with the other methods. On the real-life Niger_Rice dataset, our proposed method obtains 75.6, 0.73, 0.76, and 0.86 for accuracy, F-measure, G-mean, and ROC, respectively. In contrast, the MLP classifier on the original imbalanced Niger_Rice dataset obtains 72.44, 0.82, 0.59, and 0.76, respectively, for the same measures.
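The measures reported above can be computed from a binary confusion matrix, as in the following sketch. The counts are hypothetical and are not taken from the Niger_Rice experiments; the sketch also assumes that the minority class is treated as the positive class and that F-measure means the F1 score, neither of which the abstract states.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Accuracy, F-measure (F1) and G-mean from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if tp + fn else 0.0        # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return {"accuracy": accuracy, "f_measure": f_measure, "g_mean": g_mean}


# Hypothetical test set: 10 minority (positive) and 90 majority (negative) rows.
always_majority = imbalance_metrics(tp=0, fn=10, fp=0, tn=90)
balanced_model = imbalance_metrics(tp=8, fn=2, fp=18, tn=72)

print(always_majority)
print(balanced_model)
```

In this made-up case, a classifier that always predicts the majority class has the higher accuracy but a G-mean of zero, which is why imbalanced-learning studies report G-mean and F-measure alongside accuracy.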

An Investigation of Imbalanced Ensemble Learning Methods for Cross-Project Defect Prediction