Verification for generalizability and accuracy of a thinning-trees selection model with the ensemble learning algorithm and the cross-validation method

2008 ◽  
Vol 13 (5) ◽  
pp. 275-285 ◽  
Author(s):  
Yasushi Minowa
2020 ◽  
Vol 17 ◽  
Author(s):  
Hongwei Liu ◽  
Bin Hu ◽  
Lei Chen ◽  
Lin Lu

Background: Identifying protein subcellular location is an important problem because subcellular location is closely related to protein function. Determining locations through biological experiments is fundamental, but such experiments are costly and time-consuming. An alternative is to design effective computational methods. Objective: Several computational methods have been proposed to date, but they mainly adopt features derived from the proteins themselves. Meanwhile, with the development of network techniques, several embedding algorithms have been proposed that encode the nodes of a network into feature vectors. Such algorithms connect networks with traditional classification algorithms and thus provide a new way to construct models for predicting protein subcellular location. Method: In this study, we analyzed features produced by three network embedding algorithms (DeepWalk, Node2vec and Mashup) applied to one or multiple protein networks. The obtained features were fed to one machine learning algorithm (support vector machine or random forest) to construct the model. The cross-validation method was adopted to evaluate all constructed models. Results: Under cross-validation, the embedding features yielded by Mashup on multiple networks were quite informative for predicting protein subcellular location. The model based on these features was superior to some classic models. Conclusion: Embedding features yielded by a proper and powerful network embedding algorithm were effective for building models to predict protein subcellular location, providing new pipelines for building more efficient models.
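The evaluation step described in this abstract, k-fold cross-validation over feature vectors, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the nearest-centroid classifier is a stand-in for the support vector machine or random forest, the two-dimensional vectors are toy substitutes for real embedding features, and all function names and labels are hypothetical.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def nearest_centroid_fit(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def nearest_centroid_predict(centroids, features):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))

def cross_validate(X, y, k=5, seed=0):
    """Return mean accuracy over k folds; each fold is held out exactly once."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k, seed):
        test_set = set(held_out)
        train = [i for i in range(len(X)) if i not in test_set]
        model = nearest_centroid_fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(nearest_centroid_predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

# Toy "embedding" vectors for two hypothetical subcellular locations.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [0.0, 0.0], [0.1, 0.1],
     [1.0, 0.9], [0.9, 1.0], [1.1, 1.0], [1.0, 1.0], [0.9, 0.9]]
y = ["nucleus"] * 5 + ["cytoplasm"] * 5
mean_accuracy = cross_validate(X, y, k=5)
```

Holding each fold out exactly once means every sample is used for testing once and for training k - 1 times, which is why cross-validation gives a less optimistic estimate than accuracy on the training data.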


2021 ◽  
Author(s):  
Yu Tang ◽  
Qi Dai ◽  
Mengyuan Yang ◽  
Lifang Chen

Abstract: In traditional ensemble learning algorithms for software defect prediction, the base predictor has too many parameters that are difficult to optimize, so the model cannot reach its best performance. An ensemble learning algorithm for software defect prediction is proposed that uses an improved sparrow search algorithm to optimize the extreme learning machine. The work is divided into three parts. Firstly, the improved sparrow search algorithm (ISSA) is proposed to improve optimization ability and convergence speed, and its performance is tested on eight benchmark test functions. Secondly, ISSA is used to optimize the extreme learning machine (ISSA-ELM) to improve its prediction ability. Finally, the optimized predictor is embedded in the Bagging algorithm to form the ensemble learning algorithm ISSA-ELM-Bagging, which improves the prediction performance of ELM on software defect datasets. Experiments are carried out on six groups of software defect datasets. The experimental results show that the ISSA-ELM-Bagging ensemble learning algorithm is significantly better than the other four comparison algorithms under the six evaluation indexes of Precision, Recall, F-measure, MCC, Accuracy and G-mean, and that it has better stability and generalization ability.
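The six evaluation indexes named in this abstract can all be computed from a binary confusion matrix. The sketch below uses the standard textbook definitions. The abstract does not define G-mean, so the common form (the geometric mean of recall and specificity) is assumed here, and the function name and zero-division conventions are illustrative choices.

```python
import math

def evaluation_indexes(tp, fp, tn, fn):
    """Compute six binary-classification indexes from confusion-matrix counts.

    tp/fp/tn/fn: true positives, false positives, true negatives, false negatives.
    Undefined ratios (zero denominators) are reported as 0.0 by convention here.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient (MCC).
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    # Assumed definition: geometric mean of recall and specificity.
    g_mean = math.sqrt(recall * specificity)
    return {
        "precision": precision, "recall": recall, "f_measure": f_measure,
        "mcc": mcc, "accuracy": accuracy, "g_mean": g_mean,
    }

# Example: 40 defective modules found, 5 missed, 10 false alarms, 45 clean.
scores = evaluation_indexes(tp=40, fp=10, tn=45, fn=5)
```

MCC and G-mean are commonly reported for defect datasets because they stay informative when defective modules are a small minority, a situation in which plain accuracy can look high for a model that rarely predicts a defect.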


2013 ◽  
Vol 22 (04) ◽  
pp. 1350025 ◽  
Author(s):  
BYUNGWOO LEE ◽  
SUNGHA CHOI ◽  
BYONGHWA OH ◽  
JIHOON YANG ◽  
SUNGYONG PARK

We present a new ensemble learning method that employs a set of regional classifiers, each of which learns to handle a subset of the training data. We split the training data and generate classifiers for different regions in the feature space. When classifying an instance, we apply a weighted voting scheme among the classifiers that include the instance in their region. We used 11 datasets to compare the performance of our new ensemble method with that of single classifiers as well as other ensemble methods such as RBE, bagging and AdaBoost. We found that the performance of our method is comparable to that of AdaBoost and bagging when the base learner is C4.5. In the remaining cases, our method outperformed the other approaches.
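The voting scheme described in this abstract, in which only classifiers whose region contains the instance cast a weighted vote, might look roughly like the sketch below. The abstract does not say how regions are formed or how weights are set, so everything specific here is an assumption: regions are intervals on a single feature, each regional classifier predicts the majority label of its training points, and its vote weight is the purity of that majority.

```python
from collections import Counter, defaultdict

class RegionalClassifier:
    """Hypothetical regional learner covering the interval [low, high]."""

    def __init__(self, low, high):
        self.low, self.high = low, high
        self.label = None   # majority label inside the region
        self.weight = 0.0   # vote weight (assumed: training purity)

    def covers(self, x):
        return self.low <= x <= self.high

    def fit(self, xs, ys):
        inside = [label for x, label in zip(xs, ys) if self.covers(x)]
        if inside:
            self.label, count = Counter(inside).most_common(1)[0]
            self.weight = count / len(inside)
        return self

def predict(classifiers, x):
    """Weighted vote among the classifiers whose region includes x."""
    votes = defaultdict(float)
    for clf in classifiers:
        if clf.label is not None and clf.covers(x):
            votes[clf.label] += clf.weight
    return max(votes, key=votes.get) if votes else None

xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = ["a", "a", "a", "a", "b", "b", "b", "b"]
regions = [(0.0, 0.5), (0.25, 0.65), (0.5, 1.0)]  # overlapping on purpose
ensemble = [RegionalClassifier(lo, hi).fit(xs, ys) for lo, hi in regions]

label_left = predict(ensemble, 0.3)    # covered by regions 1 and 2
label_right = predict(ensemble, 0.6)   # covered by regions 2 and 3
label_outside = predict(ensemble, 2.0) # covered by no region
```

At x = 0.6 the middle region votes "a" with weight 2/3 while the right region votes "b" with weight 1, so the purer classifier wins. An instance outside every region gets no vote in this sketch; a practical system would need a fallback such as the nearest region.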


2019 ◽  
Vol 23 (1) ◽  
pp. 395-406 ◽  
Author(s):  
Yanyun Tao ◽  
Yenming J. Chen ◽  
Xiangyu Fu ◽  
Bin Jiang ◽  
Yuzhen Zhang

2018 ◽  
Vol 173 ◽  
pp. 03004
Author(s):  
Gui-fang Shen ◽  
Yi-Wen Zhang

To improve the accuracy of corporate financial early warning, and to address the slow learning speed, tendency to become trapped in local solutions, and inaccurate results of the traditional BP neural network with random initial weights and thresholds, a parallel ensemble learning algorithm is proposed in which an improved harmony search algorithm using a good point set (GIHS) optimizes BP_AdaBoost. Firstly, the good point set is used to construct a higher-quality initial harmony library; the algorithm adjusts its parameters dynamically during the search and generates several solutions in each iteration, making full use of the information in the harmony memory to improve global search ability and convergence speed. Secondly, ten financial indicators are chosen as the inputs of the BP neural network, and the GIHS algorithm and the BP neural network are combined to construct the parallel ensemble learning algorithm, which optimizes the initial weights and output thresholds of the BP neural network. Finally, many of these weak classifiers are combined into a strong classifier through the AdaBoost algorithm. The improved algorithm is validated on corporate financial early warning. Simulation results show that the GIHS algorithm performs better than the basic HS and IHS algorithms, and that the GIHS-BP_AdaBoost classifier has higher classification and prediction accuracy.
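The final step of this abstract, combining weak classifiers into a strong one with AdaBoost, can be illustrated with a minimal sketch. It is not the GIHS-BP_AdaBoost method: one-dimensional threshold stumps stand in for the GIHS-optimized BP neural networks, labels are +1 and -1, and the data are a small toy set chosen so that no single stump separates the classes.

```python
import math

def stump_predict(threshold, polarity, x):
    """Weak classifier: predict `polarity` left of the threshold, else its opposite."""
    return polarity if x <= threshold else -polarity

def train_stump(xs, ys, weights):
    """Pick the (threshold, polarity) pair with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for w, x, y in zip(weights, xs, ys)
                        if stump_predict(threshold, polarity, x) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost(xs, ys, rounds=3):
    """Train `rounds` stumps, reweighting samples after each round."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # vote strength of this stump
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples gain weight, correctly classified ones lose it.
        weights = [w * math.exp(-alpha * y * stump_predict(threshold, polarity, x))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted sum of stump votes."""
    score = sum(alpha * stump_predict(t, p, x) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
predictions = [predict(ensemble, x) for x in xs]
```

On this toy set each stump alone misclassifies at least three points, while the weighted combination of three stumps fits all ten training points. That is the sense in which boosting turns weak classifiers into a strong one.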


2019 ◽  
Vol 9 (15) ◽  
pp. 3143 ◽  
Author(s):  
Lu Han ◽  
Chongchong Yu ◽  
Cuiling Liu ◽  
Yong Qin ◽  
Shijie Cui

The rolling bearing is a key component of the bogie of a rail train. Its working environment is complex, and cracks and other faults occur easily. Effective rolling bearing fault diagnosis can provide an important guarantee for the safe operation of the track while improving the resource utilization of the rolling bearing and greatly reducing operating costs. Because the characteristics of the vibration data of rail train rolling bearings and the vibration mechanism model are difficult to establish, a method for long-term fault diagnosis of rail train rolling bearings based on Exponential Smoothing Predictive Segmentation and an Improved Ensemble Learning Algorithm is proposed. Firstly, an exponential smoothing sliding time window segmentation algorithm is used to segment the rolling bearing vibration data, and the segmentation points are then used to construct localized features of the data. Finally, an Improved AdaBoost Algorithm (IAA) is proposed to enhance anti-noise ability. IAA, the Back Propagation (BP) neural network, the Support Vector Machine (SVM), and AdaBoost are used to classify the same dataset, and the evaluation indexes show that IAA has the best classification effect. The experiments use raw data from the bearing experiment platform provided by the State Key Laboratory of Rail Traffic Control and Safety of Beijing Jiaotong University and the standard dataset of Case Western Reserve University in the United States. Theoretical analysis and experimental results show the effectiveness and practicability of the proposed method.
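The general idea behind prediction-based segmentation with exponential smoothing can be sketched as follows. The abstract gives no algorithmic details, so this is a simplified online variant and not the authors' sliding-window procedure: a one-step forecast is kept by simple exponential smoothing, and a segmentation point is declared wherever the forecast error exceeds a threshold. The function name, `alpha`, and `threshold` are all hypothetical.

```python
def exponential_smoothing_segments(series, alpha=0.5, threshold=1.0):
    """Return indices where the one-step forecast error exceeds `threshold`.

    `level` is the smoothed value and serves as the forecast for the next sample.
    After a segmentation point, smoothing restarts from the new sample.
    """
    if not series:
        return []
    level = series[0]
    cut_points = []
    for i in range(1, len(series)):
        error = abs(series[i] - level)
        if error > threshold:
            cut_points.append(i)
            level = series[i]  # start a new segment
        else:
            level = alpha * series[i] + (1 - alpha) * level
    return cut_points

# Toy signal with two abrupt level changes.
signal = [0.0, 0.1, 0.0, 0.1, 5.0, 5.1, 5.0, 0.0, 0.1]
cuts = exponential_smoothing_segments(signal, alpha=0.5, threshold=1.0)
```

The returned indices split the signal into segments, from which per-segment statistics such as mean, variance, or length could then be computed as localized features for a downstream classifier.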

