gradient boosting machine Latest Research Papers

Student Profile Modeling Using Boosting Algorithms

International Journal of Web-Based Learning and Teaching Technologies ◽

10.4018/ijwltt.20220901.oa4 ◽

2022 ◽

Vol 17 (5) ◽

pp. 1-13

Author(s):

Touria Hamim ◽

Faouzia Benabbou ◽

Nawal Sael

Keyword(s):

Student Performance ◽

Information Gain ◽

Recursive Feature Elimination ◽

Gradient Boosting ◽

Fisher Score ◽

Student Profile ◽

Light Gradient ◽

Gradient Boosting Machine ◽

Boosting Algorithms ◽

Classification Prediction

The student profile has become an important component of education systems. Many system objectives, such as e-recommendation, e-orientation, e-recruitment and dropout prediction, rely on the profile for decision support. Machine learning plays an important role in this context, and several studies have been carried out for classification, prediction or clustering purposes. In this paper, the authors present a comparative study of different boosting algorithms that have been used successfully in many fields and for many purposes. In addition, the authors apply the feature selection methods Fisher Score and Information Gain combined with Recursive Feature Elimination to enhance the preprocessing task and the models’ performance. On a multi-label dataset used to predict the class of student performance in mathematics, the results show that the Light Gradient Boosting Machine (LightGBM) algorithm achieved the best performance among the boosting algorithms compared when using Information Gain with Recursive Feature Elimination.
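As a rough illustration of the Information Gain scoring mentioned in this abstract, the sketch below ranks categorical features by how much they reduce the entropy of the class label. It is not the authors' pipeline: the feature names, the toy data and the `rank_features` helper are hypothetical, and the Recursive Feature Elimination step is not shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def rank_features(columns, labels):
    """Return feature names sorted by decreasing information gain."""
    return sorted(columns, key=lambda name: information_gain(columns[name], labels), reverse=True)

# Hypothetical toy data: two categorical features and a pass/fail label.
columns = {
    "study_time": ["low", "low", "high", "high"],
    "school": ["A", "B", "A", "B"],
}
grades = ["fail", "fail", "pass", "pass"]
ranking = rank_features(columns, grades)
```

In this toy data `study_time` separates the two classes perfectly, so it ranks ahead of `school`, which carries no information about the label.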

Cardiovascular Applications of Artificial Intelligence in Research, Diagnosis, and Disease Management

10.4018/978-1-7998-8455-2.ch004 ◽

2022 ◽

pp. 80-127

Author(s):

Viswanathan Rajagopalan ◽

Houwei Cao

Keyword(s):

Artificial Intelligence ◽

Disease Management ◽

Heart Attack ◽

Congenital Heart ◽

The United States ◽

Disease Classification ◽

Gradient Boosting ◽

Support Vector ◽

Gradient Boosting Machine ◽

Infarction Heart

Despite significant advancements in diagnosis and disease management, cardiovascular (CV) disorders remain the No. 1 killer both in the United States and across the world, and innovative and transformative technologies such as artificial intelligence (AI) are increasingly employed in CV medicine. In this chapter, the authors introduce different AI and machine learning (ML) tools including support vector machine (SVM), gradient boosting machine (GBM), and deep learning models (DL), and their applicability to advance CV diagnosis and disease classification, and risk prediction and patient management. The applications include, but are not limited to, electrocardiogram, imaging, genomics, and drug research in different CV pathologies such as myocardial infarction (heart attack), heart failure, congenital heart disease, arrhythmias, valvular abnormalities, etc.

Gradient Boosting Machine and Deep Learning Approach in Big Data Analysis

Journal of Information Technology Research ◽

10.4018/jitr.2022010101 ◽

2022 ◽

Vol 15 (1) ◽

pp. 1-20

Author(s):

Ravinder Kumar ◽

Lokesh Kumar Shrivastav

Keyword(s):

Big Data ◽

Deep Learning ◽

Data Processing ◽

High Frequency ◽

Data Science ◽

Processing System ◽

High Frequency Data ◽

Gradient Boosting ◽

Frequency Data ◽

Gradient Boosting Machine

Designing a system for analytics of high-frequency data (big data) is a challenging and crucial task in data science. Big data analytics involves the development of efficient machine learning algorithms and big data processing techniques or frameworks. Today, data processing systems that can handle high-frequency data efficiently are in high demand. This paper proposes the processing and analytics of stochastic high-frequency stock market data using a modified version of a suitable Gradient Boosting Machine (GBM). The experimental results are compared with deep learning and Auto-Regressive Integrated Moving Average (ARIMA) methods. The modified GBM achieves the highest accuracy (R2 = 0.98) and the minimum error (RMSE = 0.85) compared to the other two approaches.

An Ensemble of Random Forest Gradient Boosting Machine and Deep Learning Methods for Stock Price Prediction

Journal of Information Technology Research ◽

10.4018/jitr.2022010102 ◽

2022 ◽

Vol 15 (1) ◽

pp. 1-19

Author(s):

Ravinder Kumar ◽

Lokesh Kumar Shrivastav

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Random Forest ◽

Stock Market ◽

Data Analytics ◽

Gradient Boosting ◽

Ensemble Model ◽

Market Data ◽

Stock Price Prediction ◽

Gradient Boosting Machine

Stochastic time series analysis of high-frequency stock market data is a very challenging task for analysts due to the lack of efficient tools and techniques for big data analytics. This has opened the door for developers and researchers to build intelligent, machine learning based tools and techniques for data analytics. This paper proposes an ensemble for stock market data prediction using three of the most prominent machine learning based techniques. The stock market dataset has a raw data size of 39364 KB with all attributes and a processed data size of 11826 KB, with 872435 instances. The proposed work implements an ensemble model comprising Deep Learning, Gradient Boosting Machine (GBM) and distributed Random Forest techniques of data analytics. The performance results of the ensemble model are compared with each of the individual methods, i.e. deep learning, Gradient Boosting Machine (GBM) and Random Forest. The ensemble model performs better and achieves the highest accuracy of 0.99 and lowest error (RMSE) of 0.1.

An Application of Machine Learning That Uses the Magnetic Resonance Imaging Metric, Mean Apparent Diffusion Coefficient, to Differentiate between the Histological Types of Ovarian Cancer

Journal of Clinical Medicine ◽

10.3390/jcm11010229 ◽

2021 ◽

Vol 11 (1) ◽

pp. 229

Author(s):

Heekyoung Song ◽

Seongeun Bak ◽

Imhyeon Kim ◽

Jae Yeon Woo ◽

Eui Jin Cho ◽

...

Keyword(s):

Magnetic Resonance Imaging ◽

Ovarian Cancer ◽

Disease Stage ◽

Gradient Boosting ◽

Serous Ovarian Cancer ◽

Resonance Imaging ◽

Solid Portion ◽

Histological Types ◽

Gradient Boosting Machine ◽

Apparent Diffusion

This retrospective single-center study included patients diagnosed with epithelial ovarian cancer (EOC) using preoperative pelvic magnetic resonance imaging (MRI). The apparent diffusion coefficient (ADC) of the axial MRI maps that included the largest solid portion of the ovarian mass was analysed. The mean ADC values (ADCmean) were derived from the regions of interest (ROIs) of each largest solid portion. Logistic regression and three types of machine learning (ML) applications were used to analyse the ADCs and clinical factors. Of the 200 patients, 103 had high-grade serous ovarian cancer (HGSOC), and 97 had non-HGSOC (endometrioid carcinoma, clear cell carcinoma, mucinous carcinoma, and low-grade serous ovarian cancer). The median ADCmean of patients with HGSOC was significantly lower than that of patients without HGSOCs. Low ADCmean and CA 19-9 levels were independent predictors for HGSOC over non-HGSOC. Compared to stage I disease, stage III disease was associated with HGSOC. Gradient boosting machine and extreme gradient boosting machine showed the highest accuracy in distinguishing between the histological findings of HGSOC versus non-HGSOC and between the five histological types of EOC. In conclusion, ADCmean, disease stage at diagnosis, and CA 19-9 level were significant factors for differentiating between EOC histological types.

Gradient Boosting Machine to Assess the Public Protest Impact on Urban Air Quality

Applied Sciences ◽

10.3390/app112412083 ◽

2021 ◽

Vol 11 (24) ◽

pp. 12083

Author(s):

Rasa Zalakeviciute ◽

Yves Rybarczyk ◽

Katiuska Alexandrino ◽

Santiago Bonilla-Bedoya ◽

Danilo Mejia ◽

...

Keyword(s):

Machine Learning ◽

Air Quality ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Urban Air ◽

Slight Reduction ◽

Main Stage ◽

Public Protest ◽

Gradient Boosting Machine ◽

Social Side

Political and economic protests build up due to the financial uncertainty and inequality spreading throughout the world. In 2019, Latin America took the main stage in a wave of protests. While the social side of protests is widely explored, the focus of this study is the evolution of gaseous urban air pollutants during and after one of these events. Changes in concentrations of NO2, CO, O3 and SO2 during and after the strike were studied in Quito, Ecuador using two approaches: (i) inter-period observational analysis; and (ii) comparison of the observations with a business-as-usual (BAU) scenario developed using a machine learning (ML) gradient boosting machine (GBM). During the strike, both methods showed a large reduction in the concentrations of NO2 (31.5–32.36%) and CO (15.55–19.85%) and a slight reduction for O3 and SO2. The GBM approach showed an exclusive potential, especially for a lengthier period of predictions, to estimate strike impact on air quality even after the strike was over. This advocates for the use of machine learning techniques to estimate an extended effect of changes in human activities on urban gaseous pollution.

Protein pKa prediction by tree-based machine learning

10.26434/chemrxiv-2021-4d420 ◽

2021 ◽

Author(s):

Ada Y. Chen ◽

Juyong Lee ◽

Ana Damjanovic ◽

Bernard R. Brooks

Keyword(s):

Machine Learning ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Pka Prediction ◽

Light Gradient ◽

Structure Database ◽

Gradient Boosting Machine ◽

Extreme Gradient Boosting ◽

Better Than ◽

Protein Pka

We present four tree-based machine learning models for protein pKa prediction. The four models, Random Forest, Extra Trees, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM), were trained on three experimental PDB and pKa datasets, two of which included a notable portion of internal residues. We observed similar performance among the four machine learning algorithms. The best model trained on the largest dataset performs 37% better than the widely used empirical pKa prediction tool PROPKA. The overall RMSE for this model is 0.69, with surface and buried RMSE values being 0.56 and 0.78, respectively, considering six residue types (Asp, Glu, His, Lys, Cys and Tyr), and 0.63 when considering Asp, Glu, His and Lys only. We provide pKa predictions for proteins in the human proteome from the AlphaFold Protein Structure Database and observe that 1% of Asp/Glu/Lys residues have highly shifted pKa values close to the physiological pH.

A stacked generalization ensemble model for optimization and prediction of the gas well rate of penetration: a case study in Xinjiang

Journal of Petroleum Exploration and Production Technology ◽

10.1007/s13202-021-01402-z ◽

2021 ◽

Author(s):

Naipeng Liu ◽

Hui Gao ◽

Zhen Zhao ◽

Yule Hu ◽

Longchen Duan

Keyword(s):

Pearson Correlation ◽

Gradient Boosting ◽

Support Vector ◽

Ensemble Model ◽

Rate Of Penetration ◽

Gas Drilling ◽

Light Gradient ◽

Stacked Generalization ◽

Gradient Boosting Machine ◽

Extreme Gradient Boosting

In gas drilling operations, the rate of penetration (ROP) parameter has an important influence on drilling costs. Prediction of ROP can optimize the drilling operational parameters and reduce the overall cost. To predict ROP with satisfactory precision, a stacked generalization ensemble model is developed in this paper. Drilling data were collected from a shale gas survey well in Xinjiang, northwestern China. First, Pearson correlation analysis is used for feature selection. Then, a Savitzky-Golay smoothing filter is used to reduce noise in the dataset. In the next stage, we propose a stacked generalization ensemble model that combines six machine learning models: support vector regression (SVR), extremely randomized trees (ET), random forest (RF), gradient boosting machine (GB), light gradient boosting machine (LightGBM) and extreme gradient boosting (XGB). The stacked model generates meta-data from the five models (SVR, ET, RF, GB, LightGBM) to compute ROP predictions using an XGB model. Then, the leave-one-out method is used to verify modeling performance. The performance of the stacked model is better than that of each single model, with R2 = 0.9568 and root mean square error = 0.4853 m/h achieved on the testing dataset. Hence, the proposed approach will be useful in optimizing gas drilling. Finally, the particle swarm optimization (PSO) algorithm is used to optimize the relevant ROP parameters.
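The Pearson-correlation feature selection step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the feature names, the toy values, the `select_features` helper and the 0.5 threshold are all hypothetical, and the smoothing and stacking stages are not shown.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation is undefined, treat as uninformative
    return cov / math.sqrt(var_x * var_y)

def select_features(features, target, threshold=0.5):
    """Keep features whose absolute correlation with the target meets the threshold."""
    return [name for name, values in features.items()
            if abs(pearson(values, target)) >= threshold]

# Hypothetical toy drilling records; the threshold of 0.5 is an assumption.
features = {
    "weight_on_bit": [1.0, 2.0, 3.0, 4.0],
    "rotary_speed": [4.0, 3.0, 2.0, 1.0],
    "noise": [1.0, -1.0, -1.0, 1.0],
}
rop = [2.0, 4.0, 6.0, 8.0]
selected = select_features(features, rop)
```

Using the absolute value keeps strongly negative correlations as well as positive ones, since either can be informative for a regression model.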

Robust Spatiotemporal Estimation of PM Concentrations Using Boosting-Based Ensemble Models

Sustainability ◽

10.3390/su132413782 ◽

2021 ◽

Vol 13 (24) ◽

pp. 13782

Author(s):

Soyoung Park ◽

Sanghun Son ◽

Jaegu Bae ◽

Doi Lee ◽

Jae-Jin Kim ◽

...

Keyword(s):

High Performance ◽

Mean Squared Error ◽

Spatial Prediction ◽

Absolute Error ◽

Air Pollutant ◽

Gradient Boosting ◽

Percentage Error ◽

Light Gradient ◽

Gradient Boosting Machine ◽

Daily Pm10

Particulate matter (PM) as an air pollutant is harmful to the human body as well as to the ecosystem. It is crucial to understand the spatiotemporal PM distribution in order to effectively implement reduction methods. However, ground-based air quality monitoring sites are limited in providing reliable concentration values owing to their patchy distribution. Here, we aimed to predict daily PM10 concentrations using boosting algorithms such as gradient boosting machine (GBM), extreme gradient boost (XGB), and light gradient boosting machine (LightGBM). The three models performed well in estimating the spatial contrasts and temporal variability in daily PM10 concentrations. In particular, the LightGBM model outperformed the GBM and XGB models, with an adjusted R2 of 0.84, a root mean squared error of 12.108 μg/m2, a mean absolute error of 8.543 μg/m2, and a mean absolute percentage error of 16%. Despite having high performance, the LightGBM model showed low spatial prediction accuracy near the southwest part of the study area. Additionally, temporal differences were found between the observed and predicted values at high concentrations. These outcomes indicate that such methods can provide intuitive and reliable PM10 concentration values for the management, prevention, and mitigation of air pollution. In the future, performance accuracy could be improved through consideration of different variables related to spatial and seasonal characteristics.
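For readers unfamiliar with the four evaluation metrics reported in this abstract, the sketch below computes them with their standard textbook definitions. The observed and predicted values are made-up numbers, not data from the study, and `n_features` (the number of predictors used for the adjustment) is an assumed parameter.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mape(observed, predicted):
    """Mean absolute percentage error, in percent; assumes no observed value is zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / len(observed)

def adjusted_r2(observed, predicted, n_features):
    """R squared penalized for the number of predictors."""
    n = len(observed)
    mean_o = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)

# Made-up daily concentrations, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
```

RMSE and MAE are in the units of the target variable, while MAPE and adjusted R squared are unitless, which is why the abstract reports the first two with concentration units and the others without.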

Intelligent Diagnosis of Rolling Bearing Fault Based on Improved Convolutional Neural Network and LightGBM

Shock and Vibration ◽

10.1155/2021/1205473 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Yanwei Xu ◽

Weiwei Cai ◽

Liuyang Wang ◽

Tancheng Xie

Keyword(s):

Neural Network ◽

Classification Accuracy ◽

Rolling Bearing ◽

Experimental Result ◽

Gradient Boosting ◽

Intelligent Diagnosis ◽

Generalization Ability ◽

Bearing Fault ◽

Light Gradient ◽

Gradient Boosting Machine

To address the weak generalization ability and long training time of most fault diagnosis models based on deep learning, such as support vector machines and random forest algorithms, an intelligent diagnosis method for rolling bearing faults based on an improved convolutional neural network and a light gradient boosting machine is proposed. First, the convolution layer is used to extract the features of the original signal. Second, the generalization ability of the model is improved by replacing the fully connected layer with a global average pooling layer. Then, the extracted features are classified by a light gradient boosting machine. Finally, a verification experiment is carried out. The experimental result shows that the average training and diagnosis times of the model are only 39.73 s and 0.09 s, respectively, and the average classification accuracy of the model is 99.72% and 95.62%, respectively, on the same and variable load test sets, which indicates that the diagnostic efficiency and classification accuracy of the proposed model are better than those of the other comparison models.
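The global average pooling step mentioned in this abstract reduces each feature map to a single number, so the downstream classifier receives one feature per channel. The sketch below assumes one-dimensional feature maps (one list of activations per channel); the layout and the toy values are assumptions, and the convolution and LightGBM stages are not shown.

```python
def global_average_pool(feature_maps):
    """Collapse each channel (a 1-D sequence of activations) to its mean value."""
    return [sum(channel) / len(channel) for channel in feature_maps]

# Hypothetical output of a convolution layer: 2 channels, 3 activations each.
feature_maps = [
    [1.0, 2.0, 3.0],
    [0.0, 0.0, 6.0],
]
pooled = global_average_pool(feature_maps)  # one value per channel
```

Unlike a fully connected layer, this operation has no trainable weights, which is the usual argument for why it reduces the risk of overfitting.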
