Solar Flare Prediction Using Two-tier Ensemble with Deep Learning and Gradient Boosting Machine

Designing a system for analytics of high-frequency data (big data) is a challenging and crucial task in data science. Big data analytics involves developing efficient machine learning algorithms together with big data processing techniques or frameworks, and systems that process high-frequency data efficiently are in high demand. This paper proposes the processing and analytics of stochastic high-frequency stock market data using a modified Gradient Boosting Machine (GBM). The experimental results are compared with deep learning and Auto-Regressive Integrated Moving Average (ARIMA) methods. The modified GBM achieves the highest accuracy (R² = 0.98) and the lowest error (RMSE = 0.85) of the three approaches.
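
The two figures quoted in this abstract, R² and RMSE, are standard regression metrics. As a minimal sketch of how they are computed (the price and forecast values below are invented for illustration and do not come from the paper):

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical prices and forecasts, for illustration only.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 102.0, 104.0]

print("RMSE:", rmse(actual, predicted))
print("R^2:", r_squared(actual, predicted))
```

A lower RMSE and an R² closer to 1 both indicate a closer fit, which is the sense in which the abstract ranks the three methods.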

Rain garden infiltration rate modeling using gradient boosting machine and deep learning techniques

Water Science & Technology ◽

10.2166/wst.2021.444 ◽

2021 ◽

Author(s):

Sandeep Kumar ◽

K. K. Singh

Keyword(s):

Deep Learning ◽

Ground Surface ◽

Correlation Coefficients ◽

Infiltration Rate ◽

Gradient Boosting ◽

Water Runoff ◽

Rain Garden ◽

Rain Gardens ◽

Infiltration Model ◽

Gradient Boosting Machine

Abstract Rain gardens are effective in reducing stormwater runoff, and their efficiency depends on several parameters such as soil type, vegetation and meteorological factors. Rain gardens have been evaluated by various researchers. However, knowledge for the sound design of rain gardens is still very limited, particularly regarding accurate modeling of the infiltration rate and how much it differs from infiltration through a natural ground surface. The present study uses experimentally observed infiltration rates of rain gardens with different types of vegetation (grass, candytuft, marigold and daisy with different plant densities) and flow conditions. The infiltration rate is then modeled with a popular infiltration model, Philip's model (which is valid for a natural ground surface), and with the soft computing tools Gradient Boosting Machine (GBM) and Deep Learning (DL). Results suggest a promising performance (in terms of CC, RMSE, MAE, MSE and NSE) by GBM and DL in comparison to the relation proposed by Philip's model (1957). Most of the values predicted by both GBM and DL are within scatter limits of ±5%, whereas the values from Philip's model fall within the ±25% error lines or even outside them. GBM performs better than DL, as its correlation coefficient and Nash-Sutcliffe model efficiency (NSE) coefficient are the highest and its root mean square error is the lowest. The results of the study will be useful in selecting plant types and plant densities for rain gardens in urban areas.
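
The abstract does not write out Philip's model or the evaluation measures. The sketch below assumes the standard two-term form of Philip's infiltration rate, f(t) = ½·S·t^(−1/2) + A, together with the usual definition of NSE and a simple "within ±x%" scatter check. The function names and all numeric values are hypothetical, not taken from the study.

```python
def philip_infiltration_rate(t, sorptivity, a):
    """Philip's two-term infiltration rate: f(t) = 0.5 * S * t**-0.5 + A."""
    return 0.5 * sorptivity * t ** -0.5 + a

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

def fraction_within(observed, simulated, tolerance):
    """Share of predictions whose relative error is at most `tolerance`."""
    hits = sum(
        1 for o, s in zip(observed, simulated) if abs(s - o) <= tolerance * abs(o)
    )
    return hits / len(observed)

# Hypothetical observation times and parameters (S and A are made up).
times = [1.0, 4.0, 9.0, 16.0]
philip_rates = [philip_infiltration_rate(t, sorptivity=2.0, a=1.0) for t in times]

# Hypothetical observed rates, for illustration only.
observed = [2.1, 1.45, 1.40, 1.20]
print("NSE:", nse(observed, philip_rates))
print("share within +/-5%:", fraction_within(observed, philip_rates, 0.05))
```

An NSE of 1 means a perfect match, while values at or below 0 mean the model is no better than predicting the observed mean.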

Deep learning under H2O framework: A novel approach for quantitative analysis of discharge coefficient in sluice gates

Journal of Hydroinformatics ◽

10.2166/hydro.2020.003 ◽

2020 ◽

Vol 22 (6) ◽

pp. 1603-1619

Author(s):

Mohammad Ali Ghorbani ◽

Farzin Salmasi ◽

Mandeep Kaur Saggi ◽

Amandeep Singh Bhatia ◽

Ercan Kahya ◽

...

Keyword(s):

Deep Learning ◽

Discharge Coefficient ◽

Point Of View ◽

Gradient Boosting ◽

Hydraulic Radius ◽

Sluice Gate ◽

Novel Approach ◽

Gradient Boosting Machine ◽

The University ◽

Positive Effect

Abstract Gates in dams and irrigation canals are used to control discharge or regulate the water surface. To compute the discharge under a gate, the discharge coefficient (Cd) must first be determined precisely. From a novel point of view, this study investigates the effect of sill shape under a vertical sluice gate on Cd, using four artificial intelligence methods to estimate Cd: (i) random forest (RF), (ii) deep learning (DL), (iii) gradient boosting machine (GBM), and (iv) generalized linear model (GLM). A sluice gate along with twelve different forms of sills was fabricated and tested at the University of Tabriz, Iran. Different flow rates were considered in the hydraulic laboratory with four gate openings, giving a total of 180 runs. The results showed that installing a sill under the vertical gate has a positive effect on flow discharge. Sill shapes can be characterized by their hydraulic radius (Rs). Sensitivity analysis among the dimensionless parameters showed that Rs/G (the ratio of the hydraulic radius of the sill to the gate opening) plays a significant role in determining Cd. A semi-circular sill shape increases Cd more than the other shapes.
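
The abstract does not state which discharge relation the authors used. As a rough illustration of why Cd matters, the sketch below assumes the commonly used free-flow sluice gate form Q = Cd·b·G·√(2·g·H0), where b is the gate width, G the gate opening and H0 the upstream depth. The function names and the numbers are hypothetical.

```python
import math

G_ACCEL = 9.81  # gravitational acceleration, m/s^2

def sluice_discharge(cd, width, gate_opening, upstream_depth):
    """Assumed free-flow relation: Q = Cd * b * G * sqrt(2 * g * H0), in m^3/s."""
    return cd * width * gate_opening * math.sqrt(2.0 * G_ACCEL * upstream_depth)

def discharge_coefficient(q, width, gate_opening, upstream_depth):
    """Back out Cd from a measured discharge by inverting the same relation."""
    return q / (width * gate_opening * math.sqrt(2.0 * G_ACCEL * upstream_depth))

# Hypothetical laboratory values, for illustration only.
q = sluice_discharge(cd=0.6, width=1.0, gate_opening=0.1, upstream_depth=0.5)
cd = discharge_coefficient(q, width=1.0, gate_opening=0.1, upstream_depth=0.5)
print("Q (m^3/s):", round(q, 4))
print("recovered Cd:", round(cd, 4))
```

Under this relation, any error in Cd carries over proportionally to the computed discharge.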

An Ensemble of Random Forest Gradient Boosting Machine and Deep Learning Methods for Stock Price Prediction

Journal of Information Technology Research ◽

10.4018/jitr.2022010102 ◽

2022 ◽

Vol 15 (1) ◽

pp. 1-19

Author(s):

Ravinder Kumar ◽

Lokesh Kumar Shrivastav

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Random Forest ◽

Stock Market ◽

Data Analytics ◽

Gradient Boosting ◽

Ensemble Model ◽

Market Data ◽

Stock Price Prediction ◽

Gradient Boosting Machine

Stochastic time series analysis of high-frequency stock market data is a very challenging task for analysts because efficient tools and techniques for big data analytics are lacking. This has opened opportunities for developers and researchers to build intelligent, machine learning based tools and techniques for data analytics. This paper proposes an ensemble for stock market data prediction using three prominent machine learning techniques. The stock market dataset has a raw data size of 39364 KB with all attributes and a processed data size of 11826 KB, with 872435 instances. The proposed work implements an ensemble model comprising Deep Learning, Gradient Boosting Machine (GBM) and distributed Random Forest techniques. The performance of the ensemble model is compared with that of each individual method, i.e. deep learning, GBM and Random Forest. The ensemble model performs best, achieving the highest accuracy of 0.99 and the lowest error (RMSE) of 0.1.
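
The abstract does not say how the three models are combined. One simple possibility, assumed here purely for illustration, is to average their predictions and compare the ensemble's RMSE with each member's. All prediction values below are invented, and averaging is not guaranteed to beat the best individual model.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def average_ensemble(*model_predictions):
    """Average predictions position by position across models."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]

# Hypothetical targets and per-model predictions, for illustration only.
actual = [10.0, 12.0, 11.0]
dl_pred = [11.0, 13.0, 12.0]
gbm_pred = [9.0, 11.5, 10.0]
rf_pred = [10.5, 12.5, 11.5]

ensemble_pred = average_ensemble(dl_pred, gbm_pred, rf_pred)
individual_rmse = {
    "deep learning": rmse(actual, dl_pred),
    "GBM": rmse(actual, gbm_pred),
    "random forest": rmse(actual, rf_pred),
}
ensemble_rmse = rmse(actual, ensemble_pred)

for name, value in individual_rmse.items():
    print(name, round(value, 4))
print("ensemble", round(ensemble_rmse, 4))
```

In this toy data the individual errors partly cancel, so the averaged prediction has the lowest RMSE.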

Predicting the 10-year risk of cataract surgery using machine learning techniques on questionnaire data: findings from the 45 and Up Study

British Journal of Ophthalmology ◽

10.1136/bjophthalmol-2020-318609 ◽

2021 ◽

pp. bjophthalmol-2020-318609

Author(s):

Wei Wang ◽

Xiaotong Han ◽

Jiaqing Zhang ◽

Xianwen Shang ◽

Jason Ha ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Cataract Surgery ◽

Logistic Model ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Questionnaire Data ◽

Gradient Boosting Machine ◽

Logistic Regression Method ◽

Baseline Information

Background/aims: To investigate the feasibility and accuracy of using machine learning (ML) techniques on self-reported questionnaire data to predict the 10-year risk of cataract surgery, and to identify meaningful predictors of cataract surgery in middle-aged and older Australians. Methods: Baseline information regarding demographic, socioeconomic, medical history and family history, lifestyle, dietary and self-rated health status were collected as risk factors. Cataract surgery events were confirmed by the Medicare Benefits Schedule Claims dataset. Three ML algorithms (random forests [RF], gradient boosting machine and deep learning) and one traditional regression algorithm (logistic model) were compared on the accuracy of their predictions for the risk of cataract surgery. The performance was assessed using 10-fold cross-validation. The main outcome measures were areas under the receiver operating characteristic curves (AUCs). Results: In total, 207 573 participants, aged 45 years and above without a history of cataract surgery at baseline, were recruited from the 45 and Up Study. The performance of gradient boosting machine (AUC 0.790, 95% CI 0.785 to 0.795), RF (AUC 0.785, 95% CI 0.780 to 0.790) and deep learning (AUC 0.781, 95% CI 0.775 to 0.786) were robust and outperformed the traditional logistic regression method (AUC 0.767, 95% CI 0.762 to 0.773, all p<0.05). Age, self-rated eye vision and health insurance were consistently identified as important predictors in all models. Conclusions: The study demonstrated that ML modelling was able to reasonably accurately predict the 10-year risk of cataract surgery based on questionnaire data alone and was marginally superior to the conventional logistic model.
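
The two evaluation ingredients named here, AUC and 10-fold cross-validation, can be sketched in a few lines. The AUC below uses its pairwise definition: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The fold splitter is a simplified, unshuffled version, and both helper names and the sample data are hypothetical rather than the study's pipeline.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a win. This is O(P * N), fine for small examples.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous test folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

# Hypothetical labels (1 = had surgery) and predicted risks.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print("AUC:", auc(labels, scores))
print("folds:", k_fold_indices(10, 3))
```

In k-fold cross-validation each fold serves once as the test set while the model is trained on the remaining folds, and the per-fold AUCs are then summarized. In practice the samples would normally be shuffled before splitting.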

Investigating the use of random forest, gradient boosting machine, support vector machine and their ensemble applied to fault detection

10.26678/abcm.cobem2017.cob17-1600 ◽

2017 ◽

Author(s):

Luis Felipe Nogoseke ◽

Gabriel Herman Bernardim Andrade ◽

Marco Boaretto ◽

Leandro Coelho

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Fault Detection ◽

Gradient Boosting ◽

Support Vector ◽

Gradient Boosting Machine

An Ensemble Model for Short-Term Wind Power Forecasting using Deep Learning and Gradient Boosting Algorithms

2020 21st National Power Systems Conference (NPSC) ◽

10.1109/npsc49263.2020.9331902 ◽

2020 ◽

Author(s):

Devesh Kumar ◽

Rishabh Abhinav ◽

Naran Pindoriya

Keyword(s):

Deep Learning ◽

Wind Power ◽

Gradient Boosting ◽

Ensemble Model ◽

Short Term ◽

Wind Power Forecasting ◽

Boosting Algorithms ◽

Power Forecasting

Estimation of modified expansive soil CBR with multivariate adaptive regression splines, random forest and gradient boosting machine

Innovative Infrastructure Solutions ◽

10.1007/s41062-021-00568-z ◽

2021 ◽

Vol 6 (4) ◽

Author(s):

Chijioke Christopher Ikeagwuani

Keyword(s):

Random Forest ◽

Expansive Soil ◽

Multivariate Adaptive Regression Splines ◽

Gradient Boosting ◽

Regression Splines ◽

Gradient Boosting Machine ◽

Adaptive Regression ◽

Adaptive Regression Splines

Diagnosing breast cancer tumors using stacked ensemble model

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219176 ◽

2021 ◽

pp. 1-9

Author(s):

Ahmet Haşim Yurttakal ◽

Hasan Erbay ◽

Türkan İkizceli ◽

Seyhan Karaçavuş ◽

Cenker Biçer

Keyword(s):

Breast Cancer ◽

Deep Learning ◽

Medical Imaging ◽

Early Stage ◽

False Negative ◽

Gradient Boosting ◽

Physical Sign ◽

Ensemble Model ◽

Learning Methods ◽

Dce Mri

Breast cancer, which develops from cells in the breast tissue, is the most common cancer among women. Early-stage detection could reduce death rates significantly, and the stage at detection determines the treatment process. Mammography is used to discover breast cancer at an early stage, before any physical sign appears. However, mammography may return a false negative, so a biopsy is recommended whenever a lesion is suspected of having a greater than two percent chance of being cancerous. Only about 30 percent of biopsies reveal malignancy, which means the rate of unnecessary biopsies is high. To reduce unnecessary biopsies, Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) has recently been used to detect breast cancer because of its excellent capability in soft tissue imaging. DCE-MRI is now a highly recommended method not only for identifying breast cancer but also for monitoring its development and interpreting tumorous regions. However, in addition to being time-consuming, its accuracy depends on the radiologist's experience. Radiomic data, on the other hand, are used in medical imaging and have the potential to capture disease characteristics that cannot be seen by the naked eye. Radiomics are hard-coded features and provide crucial information about the imaged disease. Conversely, deep learning methods such as convolutional neural networks (CNNs) learn features automatically from the dataset. In medical imaging especially, CNNs perform better than methods based on hard-coded features. However, combining the two types of features increases accuracy significantly, which is especially critical in medicine. Herein, a stacked ensemble of gradient boosting and deep learning models was developed to classify breast tumors using DCE-MRI images. The model makes use of radiomics acquired from pixel information in breast DCE-MRI images. Prior to training the model, factor analysis was applied to the radiomics to refine the feature set and eliminate unhelpful features. The performance metrics, as well as comparisons with some well-known machine learning methods, indicate that the ensemble model outperforms its counterparts. The ensemble model's accuracy is 94.87% and its AUC value is 0.9728. The recall and precision are 1.0 and 0.9130, respectively, whereas the F1-score is 0.9545.
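
The reported accuracy, precision, recall and F1-score are linked through the confusion matrix, so they can be checked against one another. The abstract does not give the underlying counts. The counts below are a hypothetical confusion matrix chosen only because it reproduces the reported figures, and should not be read as the study's actual test set.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }

# Hypothetical counts (assumption): 21 true positives, 2 false positives,
# 0 false negatives, 16 true negatives, i.e. 39 cases in total.
metrics = classification_metrics(tp=21, fp=2, fn=0, tn=16)
for name, value in metrics.items():
    print(name, round(value, 4))
```

With a recall of 1.0 the model misses no malignant case, and the F1-score follows directly as 2PR/(P + R) = 2 × 0.9130 / 1.9130 ≈ 0.9545, matching the reported value.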
