Gradient Boosting: Latest Research Papers

An interpretable probabilistic model for short-term solar power forecasting using natural gradient boosting

Applied Energy ◽

10.1016/j.apenergy.2021.118473 ◽

2022 ◽

Vol 309 ◽

pp. 118473

Author(s):

Georgios Mitrentsis ◽

Hendrik Lens

Keyword(s):

Probabilistic Model ◽

Solar Power ◽

Gradient Boosting ◽

Natural Gradient ◽

Short Term ◽

Solar Power Forecasting ◽

Power Forecasting

Student Profile Modeling Using Boosting Algorithms

International Journal of Web-Based Learning and Teaching Technologies ◽

10.4018/ijwltt.20220901.oa4 ◽

2022 ◽

Vol 17 (5) ◽

pp. 1-13

Author(s):

Touria Hamim ◽

Faouzia Benabbou ◽

Nawal Sael

Keyword(s):

Student Performance ◽

Information Gain ◽

Recursive Feature Elimination ◽

Gradient Boosting ◽

Fisher Score ◽

Student Profile ◽

Light Gradient ◽

Gradient Boosting Machine ◽

Boosting Algorithms ◽

Classification Prediction

The student profile has become an important component of education systems. Many system objectives, such as e-recommendation, e-orientation, e-recruitment and dropout prediction, rely on the profile for decision support. Machine learning plays an important role in this context, and several studies have been carried out for classification, prediction or clustering purposes. In this paper, the authors present a comparative study of different boosting algorithms that have been used successfully in many fields and for many purposes. In addition, the authors applied the feature selection methods Fisher Score and Information Gain, combined with Recursive Feature Elimination, to enhance the preprocessing task and the models’ performance. Using a multi-label dataset to predict the class of student performance in mathematics, the results show that the Light Gradient Boosting Machine (LightGBM) algorithm achieved the best performance when using Information Gain with Recursive Feature Elimination, compared to the other boosting algorithms.
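The abstract does not say how Information Gain was computed, so the following is only a minimal sketch of the standard entropy-based definition, ranking categorical features by how much they reduce label entropy. The feature names (`study_time`, `school`) and the toy data are hypothetical and are not taken from the paper's dataset; Recursive Feature Elimination and LightGBM are not shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels inside each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical toy data: two candidate features and a pass/fail label.
study_time = ["high", "high", "low", "low", "high", "low"]
school = ["A", "B", "A", "B", "A", "B"]
grade = ["pass", "pass", "fail", "fail", "pass", "fail"]

scores = {
    "study_time": information_gain(study_time, grade),
    "school": information_gain(school, grade),
}
# Features sorted from most to least informative.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy example `study_time` separates the two classes perfectly, so it ranks above `school`. A ranking like this could serve as a filter step before a wrapper method such as Recursive Feature Elimination.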

Examining non-linear built environment effects on injurious traffic collisions: A gradient boosting decision tree analysis

Journal of Transport & Health ◽

10.1016/j.jth.2021.101296 ◽

2022 ◽

Vol 24 ◽

pp. 101296

Author(s):

Rui An ◽

Zhaomin Tong ◽

Yimei Ding ◽

Bo Tan ◽

Zihao Wu ◽

...

Keyword(s):

Built Environment ◽

Decision Tree ◽

Gradient Boosting ◽

Decision Tree Analysis ◽

Tree Analysis ◽

Environment Effects ◽

Non Linear

Novel Machine Learning for Big Data Analytics in Intelligent Support Information Management Systems

ACM Transactions on Management Information Systems ◽

10.1145/3469890 ◽

2022 ◽

Vol 13 (1) ◽

pp. 1-21

Author(s):

Zhihan Lv ◽

Ranran Lou ◽

Hailin Feng ◽

Dongliang Chen ◽

Haibin Lv

Keyword(s):

Machine Learning ◽

Big Data ◽

Information Management ◽

Management System ◽

Information Management System ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Intelligent Support ◽

Light Gradient ◽

Support Information

Scientific information technology has developed rapidly. Here, the purposes are to make people's lives more convenient and to ensure information management and classification. The machine learning algorithm is improved to obtain the optimized Light Gradient Boosting Machine (LightGBM) algorithm. Then, an Android-based intelligent support information management system is designed based on LightGBM for the big data analysis and classification management of information in the intelligent support information management system. The system is designed with modules of employee registration and login, company announcement notice, attendance and attendance management, self-service, and daily tools, with the company as the subject. Furthermore, the performance of the constructed information management system is analyzed through simulations. Results demonstrate that the training time of the optimized LightGBM algorithm can stabilize at about 100 s, and the test time can stabilize at 0.68 s. Besides, its accuracy rate can reach 89.24%, which is at least 3.6% higher than other machine learning algorithms. Moreover, the acceleration efficiency analysis of each algorithm suggests that the optimized LightGBM algorithm is suitable for processing large amounts of data; its acceleration effect is more apparent, and its acceleration ratio is higher than that of other algorithms. Hence, the constructed intelligent support information management system can reach a high accuracy while controlling the error, with an apparent acceleration effect. Therefore, this model can provide an experimental reference for information classification and management in various fields.

A hybrid of convolutional neural network and long short-term memory network approach to predictive maintenance

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v12i1.pp721-730 ◽

2022 ◽

Vol 12 (1) ◽

pp. 721

Author(s):

Ahmed Nasser ◽

Huthaifa AL-Khazraji

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Short Term Memory ◽

Predictive Maintenance ◽

Fault Prediction ◽

Gradient Boosting ◽

Short Term ◽

Term Memory ◽

Memory Network ◽

Long Short Term Memory

Predictive maintenance (PdM) is a successful strategy used to reduce cost by minimizing breakdown stoppages and production loss. The massive amount of data that results from the integration between the physical and digital systems of the production process makes it possible for deep learning (DL) algorithms to be applied to fault prediction and diagnosis. This paper presents a hybrid convolutional neural network and long short-term memory network (CNN-LSTM) approach to a predictive maintenance problem. The proposed CNN-LSTM approach enhances the predictive accuracy and also reduces the complexity of the model. To evaluate the proposed model, two comparisons, with regular LSTM and with gradient boosting decision tree (GBDT) methods, have been made using a freely available dataset. The PdM model based on the CNN-LSTM method demonstrates better prediction accuracy than the regular LSTM: the average F-Score increases from 93.34% for the regular LSTM to 97.48% for the proposed CNN-LSTM. Compared to related works, the proposed hybrid CNN-LSTM PdM approach achieved better results in terms of accuracy.

Geographical and temporal encoding for improving the estimation of PM2.5 concentrations in China using end-to-end gradient boosting

Remote Sensing of Environment ◽

10.1016/j.rse.2021.112828 ◽

2022 ◽

Vol 269 ◽

pp. 112828

Author(s):

Naisen Yang ◽

Haoze Shi ◽

Hong Tang ◽

Xin Yang

Keyword(s):

Gradient Boosting ◽

Temporal Encoding ◽

End To End

Enriching Conventional Ensemble Learner with Deep Contextual Semantics to Detect Fake News in Urdu

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3461614 ◽

2022 ◽

Vol 21 (1) ◽

pp. 1-19

Author(s):

Ramsha Saeed ◽

Hammad Afzal ◽

Haider Abbas ◽

Maheen Fatima

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

English Language ◽

Information Dissemination ◽

Majority Voting ◽

Public Image ◽

Gradient Boosting ◽

Fake News ◽

Adaptive Boosting ◽

Low Resource

Increased connectivity has contributed greatly to facilitating rapid access to information and reliable communication. However, uncontrolled information dissemination has also resulted in the spread of fake news. Fake news might be spread by a group of people or organizations to serve ulterior motives such as political or financial gains or to damage a country’s public image. Given the importance of timely detection of fake news, the research area has intrigued researchers from all over the world. Most of the work on detecting fake news focuses on the English language. However, automated detection of fake news is important irrespective of the language used for spreading false information. Recognizing the importance of boosting research on fake news detection for low-resource languages, this work proposes a novel semantically enriched technique to effectively detect fake news in Urdu, a low-resource language. A model based on deep contextual semantics learned from the convolutional neural network is proposed. The features learned from the convolutional neural network are combined with other n-gram-based features and are fed to a conventional majority voting ensemble classifier fitted with three base learners: Adaptive Boosting, Gradient Boosting, and Multi-Layer Perceptron. Experiments are performed with different models, and results show that enriching the traditional ensemble learner with deep contextual semantics along with other standard features gives the best results and outperforms the state-of-the-art Urdu fake news detection model.
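The majority voting step described above can be illustrated with a small sketch. It assumes hard voting over the class labels produced by three base learners; the abstract does not state whether hard or soft voting was used, and the prediction lists below are hypothetical placeholders rather than outputs of the paper's models.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by hard voting."""
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest count; with three
        # voters and two classes a tie cannot occur.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical outputs of the three base learners on five articles.
adaboost_pred = ["fake", "real", "fake", "real", "real"]
gboost_pred = ["fake", "fake", "fake", "real", "fake"]
mlp_pred = ["real", "real", "fake", "real", "fake"]

ensemble_pred = majority_vote([adaboost_pred, gboost_pred, mlp_pred])
print(ensemble_pred)  # ['fake', 'real', 'fake', 'real', 'fake']
```

Using an odd number of base learners on a two-class problem avoids ties, which is one practical reason such ensembles often use three voters.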

Prediction and Clinically Important Factors of Acute Kidney Injury Non-recovery

Frontiers in Medicine ◽

10.3389/fmed.2021.789874 ◽

2022 ◽

Vol 8 ◽

Author(s):

Chien-Liang Liu ◽

You-Lin Tain ◽

Yun-Chun Lin ◽

Chien-Ning Hsu

Keyword(s):

Acute Kidney Injury ◽

Risk Model ◽

Healthcare Delivery ◽

Kidney Injury ◽

Gradient Boosting ◽

Healthcare Delivery System ◽

Post Discharge ◽

Electronic Health Record Data ◽

Temporal Validation ◽

Extreme Gradient Boosting

Objective: This study aimed to identify phenotypic clinical features associated with acute kidney injury (AKI) to predict non-recovery from AKI at hospital discharge using electronic health record data. Methods: Data for hospitalized patients in the AKI Recovery Evaluation Study were derived from a large healthcare delivery system in Taiwan between January 2011 and December 2017. Living patients with AKI non-recovery were used to derive and validate multiple predictive models. In total, 64 candidate variables, such as demographic characteristics, comorbidities, healthcare services utilization, laboratory values, and nephrotoxic medication use, were measured within 1 year before the index admission and during hospitalization for AKI. Results: Among the top 20 important features in the predictive model, 8 features had a positive effect on AKI non-recovery prediction: AKI during hospitalization, serum creatinine (SCr) level at admission, receipt of dialysis during hospitalization, baseline comorbidity of cancer, AKI at admission, baseline lymphocyte count, baseline potassium, and low-density lipoprotein cholesterol levels. The predicted AKI non-recovery risk model using the eXtreme Gradient Boosting (XGBoost) algorithm achieved an area under the receiver operating characteristic (AUROC) curve statistic of 0.807, discrimination with a sensitivity of 0.724, and a specificity of 0.738 in the temporal validation cohort. Conclusion: The machine learning model approach can accurately predict AKI non-recovery using routinely collected health data in clinical practice. These results suggest that multifactorial risk factors are involved in AKI non-recovery, requiring patient-centered risk assessments and promotion of post-discharge AKI care to prevent AKI complications.
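For readers less familiar with the three metrics reported above, here is a minimal sketch of how sensitivity, specificity, and AUROC can be computed from true labels and predicted risk scores. The data and the 0.5 decision threshold are hypothetical; they are not the study's cohort or its operating point.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = non-recovery) and predicted risks for eight patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
risk = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if r >= 0.5 else 0 for r in risk]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auroc(y_true, risk)
print(sens, spec, auc)  # 0.75 0.75 0.8125
```

Sensitivity and specificity depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why a paper typically reports all three.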

A Signature of 14 Long Non-Coding RNAs (lncRNAs) as a Step towards Precision Diagnosis for NSCLC

Cancers ◽

10.3390/cancers14020439 ◽

2022 ◽

Vol 14 (2) ◽

pp. 439

Author(s):

Anetta Sulewska ◽

Jacek Niklinski ◽

Radoslaw Charkiewicz ◽

Piotr Karabowicz ◽

Przemyslaw Biecek ◽

...

Keyword(s):

Lung Cancer ◽

Logistic Regression ◽

Small Cell Lung Cancer ◽

Cell Lung Cancer ◽

Medical Science ◽

Lung Squamous Cell Carcinoma ◽

Small Cell ◽

Gradient Boosting ◽

Small Cell Lung ◽

Auc Value

LncRNAs have arisen as new players in the world of non-coding RNA. Disrupted expression of these molecules can be tightly linked to the onset, promotion and progression of cancer. The present study estimated the usefulness of 14 lncRNAs (HAGLR, ADAMTS9-AS2, LINC00261, MCM3AP-AS1, TP53TG1, C14orf132, LINC00968, LINC00312, TP73-AS1, LOC344887, LINC00673, SOX2-OT, AFAP1-AS1, LOC730101) for early detection of non-small-cell lung cancer (NSCLC). The total RNA was isolated from paired fresh-frozen cancerous and noncancerous lung tissue from 92 NSCLC patients diagnosed with either adenocarcinoma (LUAD) or lung squamous cell carcinoma (LUSC). The expression level of lncRNAs was evaluated by quantitative real-time PCR (qPCR). Based on Ct and delta Ct values, logistic regression and gradient boosting decision tree classifiers were built. The latter is a novel, advanced machine learning algorithm with great potential in medical science. The established predictive models showed that a set of 14 lncRNAs accurately discriminates cancerous from noncancerous lung tissues (AUC value of 0.98 ± 0.01) and NSCLC subtypes (AUC value of 0.84 ± 0.09), although the expression of a few molecules was statistically insignificant (SOX2-OT, AFAP1-AS1 and LOC730101 for tumor vs. normal tissue; and TP53TG1, C14orf132, LINC00968 and LOC730101 for LUAD vs. LUSC). However, for subtype discrimination, the simplified logistic regression model based on four variables (delta Ct AFAP1-AS1, Ct SOX2-OT, Ct LINC00261, and delta Ct LINC00673) had even stronger diagnostic potential than the original one (AUC value of 0.88 ± 0.07). Our results demonstrate that the 14-lncRNA signature can be an auxiliary tool to endorse and complement the histological diagnosis of non-small-cell lung cancer.

Customer Churn in Retail E-Commerce Business: Spatial and Machine Learning Approach

Journal of theoretical and applied electronic commerce research ◽

10.3390/jtaer17010009 ◽

2022 ◽

Vol 17 (1) ◽

pp. 165-198

Author(s):

Kamil Matuszelański ◽

Katarzyna Kopczewska

Keyword(s):

Machine Learning ◽

Urban Areas ◽

Latent Dirichlet Allocation ◽

Demographic Data ◽

Numerical Data ◽

Gradient Boosting ◽

Modern Approach ◽

Rural And Urban Areas ◽

Customer Churn ◽

Extreme Gradient Boosting

This study presents a comprehensive and modern approach to predicting customer churn, using the example of an e-commerce retail store operating in Brazil. Our approach consists of three stages in which we combine and use three different datasets: numerical data on orders, textual after-purchase reviews and socio-geo-demographic data from the census. At the pre-processing stage, we find topics from text reviews using Latent Dirichlet Allocation, Dirichlet Multinomial Mixture and Gibbs sampling. In the spatial analysis, we apply DBSCAN to get rural/urban locations and analyse neighbourhoods of customers located with zip codes. At the modelling stage, we apply extreme gradient boosting and logistic regression. The quality of the models is verified with area-under-curve and lift metrics. Explainable artificial intelligence, represented by permutation-based variable importance and a partial dependence profile, helps to discover the determinants of churn. We show that customers’ propensity to churn depends on: (i) payment value for the first order, number of items bought and shipping cost; (ii) categories of the products bought; (iii) demographic environment of the customer; and (iv) customer location. At the same time, customers’ propensity to churn is not influenced by: (i) population density in the customer’s area and division into rural and urban areas; (ii) quantitative review of the first purchase; and (iii) qualitative review summarised as a topic.
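Permutation-based variable importance, mentioned above, can be sketched in a few lines: shuffle one feature column, re-score the model, and record how much the quality metric drops. The sketch below uses accuracy rather than the area-under-curve metric from the abstract, and `toy_model`, the feature names, and the data are hypothetical stand-ins for a fitted churn classifier, not the authors' model.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)

def permutation_importance(model, rows, labels, feature, repeats=10, seed=0):
    """Mean drop in accuracy when the values of one feature are shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Copy each row, replacing only the shuffled feature.
        shuffled = [{**r, feature: v} for r, v in zip(rows, column)]
        drops.append(baseline - accuracy(model, shuffled, labels))
    return sum(drops) / repeats

def toy_model(row):
    """Hypothetical stand-in for a fitted classifier: uses shipping cost only."""
    return 1 if row["shipping_cost"] > 20 else 0

pairs = [(5, 100), (35, 900), (12, 450), (40, 120),
         (8, 800), (28, 300), (15, 650), (50, 50)]
rows = [{"shipping_cost": c, "population_density": d} for c, d in pairs]
labels = [toy_model(r) for r in rows]  # toy labels the model fits perfectly

importance = {
    f: permutation_importance(toy_model, rows, labels, f)
    for f in ("shipping_cost", "population_density")
}
print(importance)
```

Because the toy model ignores `population_density`, shuffling that column changes nothing and its importance is exactly zero, while shuffling `shipping_cost` degrades accuracy. This mirrors the kind of "influences / does not influence" conclusion reported in the abstract.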
