SGB-ELM: An Advanced Stochastic Gradient Boosting-Based Ensemble Scheme for Extreme Learning Machine

A novel ensemble scheme for the extreme learning machine (ELM), named Stochastic Gradient Boosting-based Extreme Learning Machine (SGB-ELM), is proposed in this paper. Rather than naively plugging the stochastic gradient boosting method into the ELM ensemble procedure, SGB-ELM constructs a sequence of weak ELMs in which each ELM is trained additively by optimizing a regularized objective. Specifically, we design an objective function based on the boosting mechanism and add a regularization term to alleviate overfitting. The formula for solving the output-layer weights of each weak ELM is then derived using second-order optimization. Because this formula is hard to evaluate analytically and the regularized objective favors simple functions, we take the output-layer weights learned from the current pseudo-residuals as an initial heuristic term and obtain the optimal output-layer weights by iteratively updating that term with the derived formula. Compared with several typical ELM ensemble methods, SGB-ELM achieves better generalization performance and prediction robustness, which demonstrates its feasibility and effectiveness.
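
The generic mechanism the abstract builds on (fit each weak learner to the current pseudo-residuals on a random subsample, then add it with shrinkage) can be sketched as follows. This is only an illustration under stated assumptions: one-split regression stumps stand in for the weak ELMs, the loss is plain squared error, and the regularized second-order weight update of SGB-ELM is not reproduced. All function names are hypothetical.

```python
import random


def fit_stump(xs, residuals):
    """Least-squares one-split stump on 1-D inputs (stand-in for a weak ELM)."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # No valid split (all inputs identical): fall back to a constant.
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_sgb(xs, ys, n_rounds=50, learning_rate=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting for squared loss."""
    rng = random.Random(seed)
    init = sum(ys) / len(ys)              # constant initial model
    preds = [init] * len(ys)
    stumps = []
    n_sub = max(2, int(subsample * len(xs)))
    for _ in range(n_rounds):
        # Stochastic step: draw a subsample without replacement.
        idx = rng.sample(range(len(xs)), n_sub)
        sub_x = [xs[i] for i in idx]
        # For squared loss the pseudo-residual is simply y - F(x).
        sub_r = [ys[i] - preds[i] for i in idx]
        stump = fit_stump(sub_x, sub_r)
        stumps.append(stump)
        # Additive update with shrinkage, applied to all training points.
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return init, learning_rate, stumps


def predict_sgb(model, x):
    init, learning_rate, stumps = model
    return init + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((predict_sgb(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)
```

Calling `fit_sgb(xs, ys)` on paired lists of inputs and targets returns a model tuple that `predict_sgb` can evaluate. The `subsample` fraction is what makes the procedure stochastic; setting it to 1.0 gives ordinary gradient boosting.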

Globally ncRNAs Expression Profiling of TNBC and Screening of Functional lncRNA

Frontiers in Bioengineering and Biotechnology ◽

10.3389/fbioe.2020.523127 ◽

2021 ◽

Vol 8 ◽

Author(s):

Aman Chandra Kaushik ◽

Aamir Mehmood ◽

Xiangeng Wang ◽

Dong-Qing Wei ◽

Xiaofeng Dai

Keyword(s):

Cancer Biology ◽

Area Under The Curve ◽

Stochastic Gradient ◽

Recursive Feature Elimination ◽

Gradient Boosting ◽

Cancer Subtypes ◽

Prognostic Features ◽

Stochastic Gradient Boosting ◽

Boosting Method ◽

Non Coding RNAs

Triple-negative breast cancer (TNBC) is one of the most well-known cancer subtypes worldwide and has a poor prognosis because of its aggressive biological behavior and the lack of therapeutic targets. Long non-coding RNAs (lncRNAs) exert profound biological functions and are widely applied as prognostic features in cancer. The current work aims to discover the expression profiles and possible roles of lncRNAs in TNBC via computational approaches, and to identify a prognostic lncRNA signature for TNBC. First, samples with inadequate tumor purity were filtered out, and the lncRNA expression data stored in the TANRIC catalog were retrieved. TNBC patients were divided into two prognostic classes according to their survival time (shorter or longer than 3 years). Random forest was used to select lncRNA features based on the differential expression of lncRNAs between the shorter- and longer-survival groups. The stochastic gradient boosting method was used to construct the predictive model. In total, 353 lncRNAs were differentially transcribed between the two groups. Using recursive feature elimination, two lncRNAs were further selected. With the model trained by stochastic gradient boosting, we reached a highest accuracy of 69.69% and an area under the curve of 0.6475. Our findings show that the two-lncRNA signature is a potential biomarker for the prognostic grouping of TNBC patients. Many lncRNAs were dysregulated in TNBC, and most of them likely play a role in cancer biology. Some of these lncRNAs were linked to TNBC prognosis, which makes them promising biomarker candidates.
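
The two figures reported above, accuracy and area under the (ROC) curve, can be computed from a binary classifier's outputs roughly as follows. This is a minimal sketch, not the authors' evaluation code: the function names are hypothetical, labels are assumed to be coded 0/1, and the AUC is computed through its pairwise-ranking interpretation (the share of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half).

```python
def accuracy(labels, predictions):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives some context for a value such as 0.6475.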

Comparison of Ensemble Machine Learning Methods for Soil Erosion Pin Measurements

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10010042 ◽

2021 ◽

Vol 10 (1) ◽

pp. 42

Author(s):

Kieu Anh Nguyen ◽

Walter Chen ◽

Bor-Shiun Lin ◽

Uma Seeboonruang

Keyword(s):

Machine Learning ◽

Soil Erosion ◽

Ensemble Methods ◽

Machine Learning Algorithms ◽

Multivariate Adaptive Regression Splines ◽

Gradient Boosting ◽

Support Vector ◽

Ensemble Machine Learning ◽

Boosting Method ◽

Bagging Method

Although machine learning has been extensively used in various fields, it has only recently been applied to soil erosion pin modeling. To improve upon previous methods of quantifying soil erosion based on erosion pin measurements, this study explored the possible application of ensemble machine learning algorithms to the Shihmen Reservoir watershed in northern Taiwan. Three categories of ensemble methods were considered in this study: (a) bagging, (b) boosting, and (c) stacking. The bagging method in this study refers to bagged multivariate adaptive regression splines (bagged MARS) and random forest (RF), and the boosting method includes Cubist and gradient boosting machine (GBM). Finally, the stacking method is an ensemble method that uses a meta-model to combine the predictions of base models. This study used RF and GBM as the meta-models, and decision tree, linear regression, artificial neural network, and support vector machine as the base models. The dataset used in this study was sampled using stratified random sampling to achieve a 70/30 split for the training and test data, and the process was repeated three times. The performance of six ensemble methods in three categories was analyzed based on the average of the three runs. It was found that GBM performed the best among the ensemble models, with the lowest root-mean-square error (RMSE = 1.72 mm/year), the highest Nash-Sutcliffe efficiency (NSE = 0.54), and the highest index of agreement (d = 0.81). This result was confirmed by the spatial comparison of the absolute differences (errors) between model predictions and observations using GBM and RF in the study area. In summary, the results show that as a group, the bagging method and the boosting method performed equally well, and the stacking method was third for the erosion pin dataset considered in this study.
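
For reference, the three goodness-of-fit measures quoted above can be computed as in the sketch below. It assumes the standard definitions (RMSE, Nash-Sutcliffe efficiency, and Willmott's index of agreement d); the abstract does not state which variant of d was used, so that choice is an assumption, and the function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def nash_sutcliffe(observed, predicted):
    """NSE = 1 - SSE / variance of observations; 1 is perfect, 0 matches the mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / total


def index_of_agreement(observed, predicted):
    """Willmott's d = 1 - SSE / potential error; ranges from 0 to 1."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    potential = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                    for o, p in zip(observed, predicted))
    return 1.0 - sse / potential
```

A model that always predicts the observed mean scores NSE = 0, so the reported NSE of 0.54 means GBM removed roughly half of the squared error of that baseline.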

Class point approach for software effort estimation using stochastic gradient boosting technique

ACM SIGSOFT Software Engineering Notes ◽

10.1145/2597716.2597725 ◽

2014 ◽

Vol 39 (3) ◽

pp. 1-6 ◽

Cited By ~ 7

Author(s):

Shashank Mouli Satapathy ◽

Barada Prasanna Acharya ◽

Santanu Kumar Rath

Keyword(s):

Stochastic Gradient ◽

Gradient Boosting ◽

Effort Estimation ◽

Software Effort Estimation ◽

Stochastic Gradient Boosting ◽

Boosting Technique ◽

Class Point

Enhanced prediction of vulnerable Web components using Stochastic Gradient Boosting Trees

International Journal of Web Information Systems ◽

10.1108/ijwis-05-2018-0041 ◽

2019 ◽

Vol 15 (2) ◽

pp. 201-214 ◽

Cited By ~ 1

Author(s):

Mahmoud Elish

Keyword(s):

Web Applications ◽

Prediction Models ◽

Stochastic Gradient ◽

Gradient Boosting ◽

Data Sets ◽

Content Type ◽

Stochastic Gradient Boosting ◽

Security Inspection ◽

Novel Model ◽

Efficient Software

Purpose: Effective and efficient software security inspection is crucial, as the existence of vulnerabilities represents severe risks to software users. The purpose of this paper is to empirically evaluate the potential application of Stochastic Gradient Boosting Trees (SGBT) as a novel model for enhanced prediction of vulnerable Web components compared to common, popular and recent machine learning models. Design/methodology/approach: An empirical study was conducted in which the SGBT and 16 other prediction models were trained, optimized and cross-validated using vulnerability data sets from multiple versions of two open-source Web applications written in PHP. The prediction performance of these models was evaluated and compared based on accuracy, precision, recall and F-measure. Findings: The results indicate that the SGBT models offer improved prediction over the other 16 models and thus are more effective and reliable in predicting vulnerable Web components. Originality/value: This paper proposed a novel application of SGBT for enhanced prediction of vulnerable Web components and showed its effectiveness.
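
The four evaluation measures named in this abstract can all be derived from the counts of a binary confusion matrix. The sketch below assumes "vulnerable" is the positive class and that F-measure means the balanced F1 score; the function name and dictionary keys are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts.

    tp: vulnerable components predicted vulnerable
    fp: non-vulnerable components predicted vulnerable
    fn: vulnerable components predicted non-vulnerable
    tn: non-vulnerable components predicted non-vulnerable
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

When vulnerable components are rare, accuracy alone can look high for a model that flags nothing, which is one reason precision, recall and F-measure are commonly reported alongside it.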

Stochastic Gradient Boosting Model for Twitter Spam Detection

Computer Systems Science and Engineering ◽

10.32604/csse.2022.020836 ◽

2022 ◽

Vol 41 (2) ◽

pp. 849-859

Author(s):

K. Kiruthika Devi ◽

G. A. Sathish Kumar

Keyword(s):

Stochastic Gradient ◽

Gradient Boosting ◽

Spam Detection ◽

Stochastic Gradient Boosting

Predictive Credit Risk Analytics Using Borrowers' Digital Footprint and Methods of Statistical Machine Learning

PROGRAMMNAYA INGENERIA ◽

10.17587/prin.12.358-372 ◽

2021 ◽

Vol 12 (7) ◽

pp. 358-372

Author(s):

E. V. Orlova ◽

Keyword(s):

Machine Learning ◽

Credit Risk ◽

Risk Profile ◽

Classification Model ◽

Gradient Boosting ◽

Suggested Approach ◽

Digital Footprint ◽

Credit Risks ◽

Stochastic Gradient Boosting ◽

Boosting Method

The article considers the problem of reducing banks' credit risks associated with the insolvency of individual borrowers, using financial and socio-economic factors together with additional data on borrowers' digital footprints. A critical analysis of existing approaches, methods, and models in this area was carried out, and a number of significant shortcomings that limit their application were identified. There is no comprehensive approach to assessing a borrower's creditworthiness based on information that includes data from social networks and search engines. A new methodological approach is proposed for assessing a borrower's risk profile, based on phased processing of quantitative and qualitative data and on modeling with methods of statistical analysis and machine learning. Machine learning methods are used to solve the clustering and classification problems. They make it possible to determine the data structure automatically and to make decisions through flexible, local training on the data. Hierarchical clustering and the k-means method are used to identify similar social, anthropometric, and financial indicators, as well as indicators characterizing borrowers' digital footprints, and to determine the borrowers' risk profile by group. The resulting homogeneous groups of borrowers, each with a unique risk profile, are then used for detailed data analysis in the predictive classification model. The classification model uses the stochastic gradient boosting method to predict the risk profile of a potential borrower. The suggested approach to assessing individuals' creditworthiness will reduce a bank's credit risks and increase its stability and profitability. The implementation results are of practical importance. A comparative analysis of the existing and proposed methodologies for assessing credit risk showed that the new methodology provides predictive analytics over heterogeneous information about a potential borrower and that its accuracy is higher. The proposed techniques form the core of a decision support system for justifying individuals' credit conditions and minimizing aggregate credit risk.
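
The grouping step described above relies partly on k-means. As a rough illustration of that step only, the sketch below implements plain Lloyd-style k-means on numeric feature tuples. The function name, parameters, and random initialization are assumptions made for this example; the article's actual features, preprocessing, and hierarchical clustering stage are not reproduced.

```python
import random


def kmeans(points, k, n_iter=20, seed=0):
    """Plain k-means on a list of equal-length numeric tuples.

    Returns (labels, centers), where labels[i] is the cluster index
    assigned to points[i].
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)      # pick k distinct points as initial centers
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers
```

In a pipeline like the one described, the resulting cluster labels would serve as the group-level risk profiles that the downstream classifier learns to predict; features on different scales would normally be standardized before clustering.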

Feasibility of Stochastic Gradient Boosting Approach for Evaluating Seismic Liquefaction Potential Based on SPT and CPT Case Histories

Journal of Performance of Constructed Facilities ◽

10.1061/(asce)cf.1943-5509.0001292 ◽

2019 ◽

Vol 33 (3) ◽

pp. 04019024 ◽

Cited By ~ 44

Author(s):

Jian Zhou ◽

Enming Li ◽

Mingzheng Wang ◽

Xin Chen ◽

Xiuzhi Shi ◽

...

Keyword(s):

Stochastic Gradient ◽

Gradient Boosting ◽

Case Histories ◽

Liquefaction Potential ◽

Seismic Liquefaction ◽

Stochastic Gradient Boosting

Mobile Phone Customer Type Discrimination via Stochastic Gradient Boosting

International Journal of Data Warehousing and Mining ◽

10.4018/jdwm.2007040104 ◽

2007 ◽

Vol 3 (2) ◽

pp. 32-53 ◽

Cited By ~ 1

Author(s):

Dan Steinberg ◽

Mikhaylo Golovnya ◽

Nicholas Scott Cardell

Keyword(s):

Mobile Phone ◽

Stochastic Gradient ◽

Gradient Boosting ◽

Stochastic Gradient Boosting ◽

Customer Type

Predicting 90-Day and 1-Year Mortality in Spinal Metastatic Disease: Development and Internal Validation

Neurosurgery ◽

10.1093/neuros/nyz070 ◽

2019 ◽

Vol 85 (4) ◽

pp. E671-E681 ◽

Cited By ~ 22

Author(s):

Aditya V Karhade ◽

Quirina C B S Thio ◽

Paul T Ogink ◽

Christopher M Bono ◽

Marco L Ferrone ◽

...

Keyword(s):

Metastatic Disease ◽

Web Application ◽

Performance Status ◽

Predictive Performance ◽

Operative Management ◽

Stochastic Gradient ◽

Patient Specific ◽

Gradient Boosting ◽

Support Vector ◽

Stochastic Gradient Boosting

BACKGROUND: Increasing prevalence of metastatic disease has been accompanied by increasing rates of surgical intervention. Current tools have poor to fair predictive performance for intermediate (90-d) and long-term (1-yr) mortality. OBJECTIVE: To develop predictive algorithms for spinal metastatic disease at these time points and to provide patient-specific explanations of the predictions generated by these algorithms. METHODS: Retrospective review was conducted at 2 large academic medical centers to identify patients undergoing initial operative management for spinal metastatic disease between January 2000 and December 2016. Five models (penalized logistic regression, random forest, stochastic gradient boosting, neural network, and support vector machine) were developed to predict 90-d and 1-yr mortality. RESULTS: Overall, 732 patients were identified with 90-d and 1-yr mortality rates of 181 (25.1%) and 385 (54.3%), respectively. The stochastic gradient boosting algorithm had the best performance for 90-d mortality and 1-yr mortality. On global variable importance assessment, albumin, primary tumor histology, and performance status were the 3 most important predictors of 90-d mortality. The final models were incorporated into an open access web application able to provide predictions as well as patient-specific explanations of the results generated by the algorithms. The application can be found at https://sorg-apps.shinyapps.io/spinemetssurvival/ CONCLUSION: Preoperative estimation of 90-d and 1-yr mortality was achieved with assessment of more flexible modeling techniques such as machine learning. Integration of these models into applications and patient-centered explanations of predictions represent opportunities for incorporation into healthcare systems as decision tools in the future.

Short term power demand prediction using stochastic gradient boosting

2016 5th International Conference on Electronic Devices, Systems and Applications (ICEDSA) ◽

10.1109/icedsa.2016.7818510 ◽

2016 ◽

Cited By ~ 3

Author(s):

Ali Bou Nassif

Keyword(s):

Stochastic Gradient ◽

Gradient Boosting ◽

Power Demand ◽

Short Term ◽

Demand Prediction ◽

Stochastic Gradient Boosting
