Bayesian Optimization based Gradient Boosting Method of Fault Detection in Oil-Immersed Transformer and Reactors

Telemarketing is a form of marketing conducted by telephone. Banks can use telemarketing to offer products such as term deposits. One of the most important strategies for successful telemarketing is selecting potential customers so that campaigns are effective. Machine learning can be used to predict the success of telemarketing. Gradient boosting is a machine learning method built on decision trees. Gradient boosting involves many classification trees, each of which improves on the previous tree. An optimal classification result depends on optimal hyperparameters. Hyperopt is a Python library that can tune hyperparameters effectively because it uses Bayesian optimization. Hyperopt uses prior distributions over the hyperparameters to find optimal values. The data in this study include 20 independent variables and a binary dependent variable with the classes ‘yes’ and ‘no’. The study showed that gradient boosting reached a classification accuracy of up to 90.39%, a precision of 94.91%, and an AUC of 0.939. These values indicate that the gradient boosting method is able to predict both classes, ‘yes’ and ‘no’, relatively accurately.
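The idea that each tree improves on the previous one can be illustrated with a minimal sketch. It is not the study's implementation: it assumes squared-error loss, a single numeric feature, and one-split trees (stumps), and all function names (`fit_stump`, `fit_gradient_boosting`, `predict`, `mse`) are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split tree to the residuals; return (threshold, left_value, right_value)."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue  # a split must leave samples on both sides
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    if best is None:  # all feature values identical: predict the mean residual
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to the current residuals (the negative
    gradient of squared-error loss) and adds a damped copy to the model."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data: a step function that a single damped stump cannot fit in one round.
xs = list(range(10))
ys = [0.0] * 5 + [1.0] * 5
one_round = fit_gradient_boosting(xs, ys, n_rounds=1)
fifty_rounds = fit_gradient_boosting(xs, ys, n_rounds=50)
```

On this toy data the training error shrinks as rounds are added, which is the behaviour the abstract describes. A classification setting such as the one in the study would use a different loss (for example log-loss) and deeper trees.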

Investigating the use of random forest, gradient boosting machine, support vector machine and their ensemble applied to fault detection

10.26678/abcm.cobem2017.cob17-1600 ◽

2017 ◽

Author(s):

Luis Felipe Nogoseke ◽

Gabriel Herman Bernardim Andrade ◽

Marco Boaretto ◽

Leandro Coelho

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Fault Detection ◽

Gradient Boosting ◽

Support Vector ◽

Gradient Boosting Machine

GRADIENT BOOSTING METHOD APPLICATION TO SUPPORT PROCESS DECISIONS IN THE ELECTRON-BEAM WELDING PROCESS

Siberian Journal of Science and Technology ◽

10.31772/2587-6066-2020-21-2-206-214 ◽

2020 ◽

Vol 21 (2) ◽

pp. 206-214

Author(s):

V. S. Tynchenko ◽

I. A. Golovenok ◽

V. E. Petrenko ◽

A. V. Milov ◽

...

Keyword(s):

Electron Beam ◽

Electron Beam Welding ◽

Welding Process ◽

Gradient Boosting ◽

Boosting Method

A Machine Learning Method for Predicting Vegetation Indices in China

Remote Sensing ◽

10.3390/rs13061147 ◽

2021 ◽

Vol 13 (6) ◽

pp. 1147

Author(s):

Xiangqian Li ◽

Wenping Yuan ◽

Wenjie Dong

Keyword(s):

Machine Learning ◽

Growing Season ◽

Crop Growth ◽

Spatiotemporal Distribution ◽

Coefficient Of Determination ◽

Gradient Boosting ◽

Severe Drought ◽

Vegetation Growth ◽

Extreme Gradient Boosting ◽

Boosting Method

To forecast the terrestrial carbon cycle and monitor food security, vegetation growth must be accurately predicted; however, current process-based ecosystem and crop-growth models are limited in their effectiveness. This study developed a machine learning model using the extreme gradient boosting method to predict vegetation growth throughout the growing season in China from 2001 to 2018. The model used satellite-derived vegetation data for the first month of each growing season, CO2 concentration, and several meteorological factors as data sources for the explanatory variables. Results showed that the model could reproduce the spatiotemporal distribution of vegetation growth as represented by the satellite-derived normalized difference vegetation index (NDVI). The predictive error for the growing season NDVI was less than 5% for more than 98% of vegetated areas in China; the model represented seasonal variations in NDVI well. The coefficient of determination (R2) between the monthly observed and predicted NDVI was 0.83, and more than 69% of vegetated areas had an R2 > 0.8. The effectiveness of the model was examined for a severe drought year (2009), and results showed that the model could reproduce the spatiotemporal distribution of NDVI even under extreme conditions. This model provides an alternative method for predicting vegetation growth and has great potential for monitoring vegetation dynamics and crop growth.

Proposing a machine-learning based method to predict stillbirth before and during delivery and ranking the features: nationwide retrospective cross-sectional study

BMC Pregnancy and Childbirth ◽

10.1186/s12884-021-03658-z ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Toktam Khatibi ◽

Elham Hanifi ◽

Mohammad Mehdi Sepehri ◽

Leila Allahqoli

Keyword(s):

Machine Learning ◽

External Validation ◽

Fetal Loss ◽

Null Distribution ◽

Training Dataset ◽

Gradient Boosting ◽

Support Vector ◽

Cross Sectional ◽

Boosting Method ◽

Demographic Features

Background: Stillbirth is defined by the WHO as fetal loss in pregnancy beyond 28 weeks. In this study, a machine-learning based method is proposed to distinguish stillbirth from livebirth, to discriminate stillbirth before delivery from stillbirth during delivery, and to rank the features. Method: A two-step stack ensemble (SE) classifier is proposed that classifies the instances into stillbirth and livebirth at the first step and then distinguishes stillbirth before delivery from stillbirth during labor at the second step. The proposed SE has two consecutive layers containing the same classifiers. The base classifiers in each layer are decision tree, gradient boosting classifier, logistic regression, random forest and support vector machines, which are trained independently and aggregated using the Vote boosting method. Moreover, a new feature ranking method is proposed in this study, based on mean decrease accuracy, Gini index and model coefficients, to find high-ranked features. Results: The IMAN registry dataset is used in this study, considering all births at or beyond the 28th gestational week from 2016/04/01 to 2017/01/01, including 1,415,623 live births and 5,502 stillbirth cases. A combination of maternal demographic features, clinical history, fetal properties, delivery descriptors, environmental features, healthcare service provider descriptors and socio-demographic features is considered. The experimental results show that the proposed SE outperforms the compared classifiers, with an average accuracy of 90%, sensitivity of 91% and specificity of 88%. The discrimination of the proposed SE is assessed, and an average AUC (±95% CI) of 90.51% ± 1.08 and 90% ± 1.12 is obtained on the training dataset for model development and on the test dataset for external validation, respectively. The proposed SE is calibrated using the isotonic nonparametric calibration method, with a score of 0.07. The process is repeated 10,000 times, and the AUCs of SE classifiers trained on different random training datasets are used as the null distribution. The p-value obtained to assess the specificity of the proposed SE is 0.0126, which shows the significance of the proposed SE. Conclusions: Gestational age and fetal height are the two most important features for discriminating livebirth from stillbirth. Moreover, hospital, province, delivery main cause, perinatal abnormality, miscarriage number and maternal age are the most important features for classifying stillbirth before and during delivery.

Comparison of Ensemble Machine Learning Methods for Soil Erosion Pin Measurements

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10010042 ◽

2021 ◽

Vol 10 (1) ◽

pp. 42

Author(s):

Kieu Anh Nguyen ◽

Walter Chen ◽

Bor-Shiun Lin ◽

Uma Seeboonruang

Keyword(s):

Machine Learning ◽

Soil Erosion ◽

Ensemble Methods ◽

Machine Learning Algorithms ◽

Multivariate Adaptive Regression Splines ◽

Gradient Boosting ◽

Support Vector ◽

Ensemble Machine Learning ◽

Boosting Method ◽

Bagging Method

Although machine learning has been extensively used in various fields, it has only recently been applied to soil erosion pin modeling. To improve upon previous methods of quantifying soil erosion based on erosion pin measurements, this study explored the possible application of ensemble machine learning algorithms to the Shihmen Reservoir watershed in northern Taiwan. Three categories of ensemble methods were considered in this study: (a) bagging, (b) boosting, and (c) stacking. The bagging method in this study refers to bagged multivariate adaptive regression splines (bagged MARS) and random forest (RF), and the boosting method includes Cubist and gradient boosting machine (GBM). Finally, the stacking method is an ensemble method that uses a meta-model to combine the predictions of base models. This study used RF and GBM as the meta-models, and decision tree, linear regression, artificial neural network, and support vector machine as the base models. The dataset used in this study was sampled using stratified random sampling to achieve a 70/30 split between the training and test data, and the process was repeated three times. The performance of the six ensemble methods in the three categories was analyzed based on the average of the three attempts. It was found that GBM performed the best among the ensemble models, with the lowest root-mean-square error (RMSE = 1.72 mm/year), the highest Nash-Sutcliffe efficiency (NSE = 0.54), and the highest index of agreement (d = 0.81). This result was confirmed by the spatial comparison of the absolute differences (errors) between model predictions and observations using GBM and RF in the study area. In summary, the results show that, as a group, the bagging method and the boosting method performed equally well, and the stacking method was third for the erosion pin dataset considered in this study.
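The three scores reported above can be computed as in the following sketch. It uses the standard textbook definitions of RMSE and Nash-Sutcliffe efficiency and assumes that the index of agreement is Willmott's d; the abstract does not state which variant the authors used, and the function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def nash_sutcliffe(observed, predicted):
    """NSE = 1 - SSE / (variation of observations around their mean).
    1 is a perfect fit; 0 means no better than predicting the mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / total


def index_of_agreement(observed, predicted):
    """Willmott's d (assumed variant): 1 - SSE / potential error, in [0, 1]."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    potential = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                    for o, p in zip(observed, predicted))
    return 1.0 - sse / potential


# Illustrative values only, not data from the study.
obs = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]
print(rmse(obs, shifted), nash_sutcliffe(obs, shifted), index_of_agreement(obs, shifted))
```

Lower RMSE is better, while NSE and d are better when closer to 1, which is why the abstract reports the lowest RMSE alongside the highest NSE and d for the best model.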

Combining Partial Least Squares and the Gradient-Boosting Method for Soil Property Retrieval Using Visible Near-Infrared Shortwave Infrared Spectra

Remote Sensing ◽

10.3390/rs9121299 ◽

2017 ◽

Vol 9 (12) ◽

pp. 1299 ◽

Cited By ~ 11

Author(s):

Lanfa Liu ◽

Min Ji ◽

Manfred Buchroithner

Keyword(s):

Least Squares ◽

Partial Least Squares ◽

Infrared Spectra ◽

Near Infrared ◽

Soil Property ◽

Gradient Boosting ◽

Boosting Method ◽

Shortwave Infrared

Passive Fetal Movement Recognition Approaches Using Hyperparameter Tuned LightGBM Model and Bayesian Optimization

Computational Intelligence and Neuroscience ◽

10.1155/2021/6252362 ◽

2021 ◽

Vol 2021 ◽

pp. 1-18

Author(s):

Sensong Liang ◽

Jiansheng Peng ◽

Yong Xu ◽

Hemin Ye

Keyword(s):

Frequency Domain ◽

Kalman Filtering ◽

Health Monitoring ◽

Fetal Movement ◽

Bayesian Optimization ◽

Gradient Boosting ◽

Wavelet Domain ◽

Prenatal Health ◽

Light Gradient ◽

Low Amplitude

Fetal movement is an important clinical indicator for assessing fetal growth and development status in the uterus. In recent years, noninvasive intelligent-sensing fetal movement detection systems that can monitor high-risk pregnancies at home have received a lot of attention in the field of wearable health monitoring. However, recovering fetal movement signals from a continuous low-amplitude background that is heavily contaminated with noise, and recognizing real fetal movements, is a challenging task. In this paper, fetal movement is efficiently recognized by combining the strengths of Kalman filtering, time domain, frequency domain and wavelet domain feature extraction, and a hyperparameter-tuned Light Gradient Boosting Machine (LightGBM) model. Firstly, the Kalman filtering (KF) algorithm is used to recover the fetal movement signal from a continuous low-amplitude background contaminated by noise. Secondly, the time domain, frequency domain, and wavelet domain (TFWD) features of the preprocessed fetal movement signal are extracted. Finally, the Bayesian Optimization algorithm (BOA) is used to optimize the LightGBM model to obtain the optimal hyperparameters. In this way, accurate prediction and recognition of fetal movement are achieved. In the performance analysis on the Zenodo fetal movement dataset, the recognition accuracy and F1-Score of the proposed KF + TFWD + BOA-LGBM approach reached 94.06% and 96.85%, respectively. Compared with 8 existing advanced methods for fetal movement signal recognition, the proposed method has better accuracy and robustness, indicating its potential medical application in wearable smart sensing systems for prenatal health monitoring.
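The first step, recovering a signal from noisy measurements with a Kalman filter, can be sketched for the simplest scalar case. This is an illustration only: it assumes a random-walk state model and made-up noise variances, the function name `kalman_filter_1d` and its parameters are hypothetical, and the paper's actual filter design is not given in the abstract.

```python
import random


def kalman_filter_1d(measurements, process_var=1e-4, measurement_var=0.25,
                     x0=0.0, p0=1.0):
    """Scalar Kalman filter for a slowly varying signal (random-walk model).

    process_var and measurement_var are assumed noise variances; x0 and p0
    are the initial state estimate and its variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, so only uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# Synthetic example: a constant level of 1.0 buried in Gaussian noise.
rng = random.Random(0)
noisy = [1.0 + rng.gauss(0.0, 0.5) for _ in range(300)]
smoothed = kalman_filter_1d(noisy)
```

With these settings the filtered estimates settle much closer to the underlying level than the raw measurements do. A real fetal movement signal is time-varying, so the state model and variances would need to be chosen for that application.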

4 - Novel gradient boosting method to identify feature importance for ROSC after Cardiac Arrest (RACA)

10.26226/morressier.5d24abdefcc52c7b9b69e4a4 ◽

2019 ◽

Author(s):

Anna Bichmann ◽

Krystyna Isakova ◽

Robert Spaight

Keyword(s):

Cardiac Arrest ◽

Gradient Boosting ◽

Feature Importance ◽

Boosting Method

Predictive Credit Risk Analytics Using Borrowers' Digital Footprint and Methods of Statistical Machine Learning

PROGRAMMNAYA INGENERIA ◽

10.17587/prin.12.358-372 ◽

2021 ◽

Vol 12 (7) ◽

pp. 358-372

Author(s):

E. V. Orlova ◽

Keyword(s):

Machine Learning ◽

Credit Risk ◽

Risk Profile ◽

Classification Model ◽

Gradient Boosting ◽

Suggested Approach ◽

Digital Footprint ◽

Credit Risks ◽

Stochastic Gradient Boosting ◽

Boosting Method

The article considers the problem of reducing banks' credit risks associated with the insolvency of individual borrowers, using financial and socio-economic factors and additional data about the borrowers' digital footprint. A critical analysis of existing approaches, methods and models in this area has been carried out, and a number of significant shortcomings that limit their application have been identified. There is no comprehensive approach to identifying a borrower's creditworthiness based on information that includes data from social networks and search engines. A new methodological approach is proposed for assessing the borrower's risk profile, based on phased processing of quantitative and qualitative data and on modeling with methods of statistical analysis and machine learning. Machine learning methods are used to solve clustering and classification problems. They make it possible to determine the data structure automatically and to make decisions through flexible and local training on the data. Hierarchical clustering and the k-means method are used to identify similar social, anthropometric and financial indicators, as well as indicators characterizing the digital footprint of borrowers, and to determine the borrowers' risk profile by group. The resulting homogeneous groups of borrowers, each with a unique risk profile, are then used for detailed data analysis in the predictive classification model. The classification model is based on the stochastic gradient boosting method and predicts the risk profile of a potential borrower. The suggested approach to assessing the creditworthiness of individuals will reduce a bank's credit risks and increase its stability and profitability. The implementation results are of practical importance. Comparative analysis of the effectiveness of the existing and the proposed methodologies for assessing credit risk showed that the new methodology provides predictive analytics of heterogeneous information about a potential borrower and that the accuracy of the analytics is higher. The proposed techniques form the core of a decision support system for justifying credit conditions for individuals and minimizing aggregate credit risks.
