Applying an ensemble learning method for improving multi-label classification performance

Author(s):  
Amirreza Mahdavi-Shahri ◽  
Mahboobeh Houshmand ◽  
Mahdi Yaghoobi ◽  
Mehrdad Jalali
Author(s):  
Adem Doganer

In this study, different models were created to reduce bias using ensemble learning methods. Reducing the bias error improves classification performance. To increase classification performance, the most appropriate ensemble learning method and the ideal sample size were investigated. Bias values and learning performances of different ensemble learning methods were compared. The AdaBoost ensemble learning method provided the lowest bias value at a sample size of n = 250, while the Stacking ensemble learning method provided the lowest bias value at n = 500, 750, 1000, 2000, 4000, 6000, 8000, 10000, and 20000. When learning performances were compared, the AdaBoost ensemble learning method with the RBF classifier achieved the best performance at n = 250 (ACC = 0.956, AUC = 0.987), and the AdaBoost ensemble learning method with the REPTree classifier achieved the best performance at n = 20000 (ACC = 0.990, AUC = 0.999). In conclusion, for bias reduction, stacking-based methods performed better than the other methods.


2021 ◽  
Vol 21 (S2) ◽  
Author(s):  
Kun Zeng ◽  
Yibin Xu ◽  
Ge Lin ◽  
Likeng Liang ◽  
Tianyong Hao

Abstract. Background: Eligibility criteria are the primary strategy for screening the target participants of a clinical trial. Automated classification of clinical trial eligibility criteria text using machine learning methods improves recruitment efficiency and reduces the cost of clinical research. However, existing methods suffer from poor classification performance due to the complexity and imbalance of eligibility criteria text data. Methods: An ensemble learning-based model with metric learning is proposed for eligibility criteria classification. The model integrates a set of pre-trained models including Bidirectional Encoder Representations from Transformers (BERT), A Robustly Optimized BERT Pretraining Approach (RoBERTa), XLNet, Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA), and Enhanced Representation through Knowledge Integration (ERNIE). Focal Loss is used as the loss function to address the data imbalance problem. Metric learning is employed to train the embedding of each base model so that features are easier to distinguish. Soft Voting is applied to produce the final classification of the ensemble model. The dataset is from standard evaluation task 3 of the 5th China Health Information Processing Conference and contains 38,341 eligibility criteria texts in 44 categories. Results: Our ensemble method had an accuracy of 0.8497, a precision of 0.8229, and a recall of 0.8216 on the dataset. The macro F1-score was 0.8169, outperforming state-of-the-art baseline methods by 0.84% on average. In addition, the performance improvement had a p-value of 2.152e-07 with a standard t-test, indicating that our model achieved a significant improvement. Conclusions: A model for classifying eligibility criteria text of clinical trials based on multi-model ensemble learning and metric learning was proposed. The experiments demonstrated that our ensemble model significantly improved classification performance. In addition, metric learning improved the word embedding representation, and the focal loss reduced the impact of data imbalance on model performance.
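
The two generic building blocks named in this abstract, soft voting and focal loss, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `soft_vote` and `focal_loss`, the equal default weights, and the defaults `gamma=2.0` and `alpha=1.0` are assumptions chosen for illustration, and the probabilities in the usage note are made up.

```python
import math


def soft_vote(prob_lists, weights=None):
    """Average per-class probabilities from several base models.

    prob_lists: one probability vector per base model (same class order).
    weights: optional per-model weights (assumption: equal weights if omitted).
    Returns (index of the winning class, averaged probability vector).
    """
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total = sum(weights)
    n_classes = len(prob_lists[0])
    avg = [
        sum(w * p[c] for w, p in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example: -alpha * (1 - p_t)^gamma * log(p_t).

    Well-classified examples (p_t close to 1) are down-weighted, so rare,
    hard classes contribute relatively more to the total loss.
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

For example, `soft_vote([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])` selects class 1 even though the first model prefers class 0, because the averaged probability of class 1 is higher. Setting `gamma=0` reduces `focal_loss` to ordinary cross-entropy.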


2021 ◽  
pp. 1-1
Author(s):  
Sutong Wang ◽  
Jiacheng Zhu ◽  
Yunqiang Yin ◽  
Dujuan Wang ◽  
T.C. Edwin Cheng ◽  
...  

Sensors ◽  
2019 ◽  
Vol 19 (21) ◽  
pp. 4784 ◽  
Author(s):  
Chern-Sheng Lin ◽  
Shih-Hua Chen ◽  
Che-Ming Chang ◽  
Tsu-Wang Shen

In this study, an innovative ensemble learning method for the dynamic imaging system of an unmanned vehicle is presented. The feasibility of the system was tested by detecting cracks in a retaining wall in a climbing area or along a mountain road. The unmanned vehicle provides a lightweight, remote cruise routine with a Geographic Information System sensor, a gyro sensor, and a charge-coupled device camera. With the crack as the test target, the retaining wall was patrolled along a preset drone flight path, and the horizontal image was returned immediately over the system's wireless transmission. Building on the cascade classifier, a feature comparison classifier was further designed, and a machine vision correlation algorithm was then used to analyze the target type information. First, the system collects target and background images to establish the sample database, and then uses the Local Binary Patterns feature extraction algorithm to extract feature values for classification. When the first-stage classification is completed, the classification results are target features and edge feature comparisons. The innovative ensemble learning classifier was then used to analyze the image and determine the location of the crack for risk assessment.
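
The Local Binary Patterns step mentioned above can be sketched as follows. This is the basic 3x3 LBP operator with a 256-bin histogram, not the authors' pipeline: the neighbor ordering, the `>=` comparison, and the names `lbp_code` and `lbp_histogram` are assumptions made for illustration.

```python
# Neighbor offsets (row, col), clockwise from the top-left pixel.
# The ordering is an assumption; any fixed ordering yields a valid LBP code.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit LBP code of pixel (r, c): bit i is set when neighbor i >= center."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels.

    image: 2D list of grayscale intensities (at least 3x3).
    The histogram is the feature vector passed to a classifier.
    """
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

On a perfectly flat patch every neighbor equals the center, so all interior pixels fall into bin 255. A thin dark crack on a brighter wall changes which neighbors pass the comparison, which shifts mass into other bins and gives a classifier something to separate.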


2013 ◽  
Vol 22 (04) ◽  
pp. 1350025 ◽  
Author(s):  
BYUNGWOO LEE ◽  
SUNGHA CHOI ◽  
BYONGHWA OH ◽  
JIHOON YANG ◽  
SUNGYONG PARK

We present a new ensemble learning method that employs a set of regional classifiers, each of which learns to handle a subset of the training data. We split the training data and generate classifiers for different regions in the feature space. When classifying an instance, we apply a weighted voting scheme among the classifiers whose regions include the instance. We used 11 datasets to compare the performance of our new ensemble method with that of single classifiers as well as other ensemble methods such as RBE, bagging, and AdaBoost. We found that the performance of our method is comparable to that of AdaBoost and bagging when the base learner is C4.5. In the remaining cases, our method outperformed the other approaches.
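
The idea of voting only among classifiers whose region contains the instance can be sketched as below. The abstract does not say how regions are formed, what the base learners are, or how votes are weighted, so everything specific here is a hypothetical stand-in: spherical regions given by a center and radius, a majority-label base learner, and inverse-distance vote weights.

```python
import math
from collections import Counter, defaultdict


class RegionalClassifier:
    """Toy regional classifier covering a ball in feature space.

    Assumption: the 'classifier' simply predicts the majority label of the
    training points that fall inside its region.
    """

    def __init__(self, center, radius, points, labels):
        self.center = center
        self.radius = radius
        inside = [y for x, y in zip(points, labels)
                  if math.dist(x, center) <= radius]
        self.label = Counter(inside).most_common(1)[0][0] if inside else None

    def covers(self, x):
        return math.dist(x, self.center) <= self.radius

    def predict(self, x):
        return self.label


def regional_vote(classifiers, x):
    """Weighted vote among the classifiers whose region includes x.

    Assumption: a classifier's weight is 1 / (1 + distance to its center).
    Returns None when no trained region covers x.
    """
    votes = defaultdict(float)
    for clf in classifiers:
        if clf.label is not None and clf.covers(x):
            votes[clf.predict(x)] += 1.0 / (1.0 + math.dist(x, clf.center))
    if not votes:
        return None
    return max(votes, key=votes.get)
```

With overlapping regions, an instance near one center is dominated by that region's classifier, while an instance outside every region gets no prediction; a practical system would need a fallback for that case.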

