Prediction of home energy consumption based on gradient boosting regression tree

2021 ◽  
Vol 7 ◽  
pp. 1246-1255
Author(s):  
Peng Nie ◽  
Michele Roccotelli ◽  
Maria Pia Fanti ◽  
Zhengfeng Ming ◽  
Zhiwu Li
2019 ◽  
Vol 21 (9) ◽  
pp. 662-669 ◽  
Author(s):  
Junnan Zhao ◽  
Lu Zhu ◽  
Weineng Zhou ◽  
Lingfeng Yin ◽  
Yuchen Wang ◽  
...  

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade and is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study predicts the Ki values of thrombin inhibitors from a large data set using machine learning methods. Because machine learning can find non-intuitive regularities in high-dimensional datasets, it can be used to build effective predictive models. A total of 6554 descriptors were collected for each compound, and an efficient descriptor selection method was used to find the appropriate descriptors. Four methods, multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM), were implemented to build prediction models with the selected descriptors. Results: The SVM model was the best of these methods, with R2=0.84, MSE=0.55 for the training set and R2=0.83, MSE=0.56 for the test set. Several validation methods, such as the y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is helpful for designing novel thrombin inhibitors.
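
The y-randomization check named in the abstract can be illustrated with a minimal sketch. It assumes a one-variable least-squares line as a stand-in for the study's models and uses synthetic data; the function names (`fit_line`, `r_squared`, `y_randomization`) are hypothetical and this is not the authors' code. The idea is that a model which has learned a real relationship should score a much lower R2 once the targets are shuffled.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def y_randomization(xs, ys, n_shuffles=20, seed=0):
    """Refit the model on shuffled targets and collect the resulting R2 values."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_shuffles):
        shuffled = ys[:]
        rng.shuffle(shuffled)  # break the x-to-y relationship
        slope, intercept = fit_line(xs, shuffled)
        scores.append(r_squared(xs, shuffled, slope, intercept))
    return scores

# Synthetic data (an assumption for illustration): a line plus alternating noise.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]

slope, intercept = fit_line(xs, ys)
real_r2 = r_squared(xs, ys, slope, intercept)
shuffled_r2 = y_randomization(xs, ys)

print(f"R2 on real targets: {real_r2:.3f}")
print(f"mean R2 on shuffled targets: {sum(shuffled_r2) / len(shuffled_r2):.3f}")
```

If the R2 values on shuffled targets were close to the R2 on the real targets, the original fit would be suspect as a chance correlation.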


Processes ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. 655
Author(s):  
Huanhuan Zhang ◽  
Jigeng Li ◽  
Mengna Hong

With the global energy crisis and environmental pollution intensifying, tissue papermaking enterprises urgently need to save energy, and an energy consumption model is essential for doing so on tissue paper machines. The energy consumption of a tissue paper machine is very complicated, so establishing its energy consumption model from a mechanism model involves a very large workload and great difficulty. This article therefore aims to build an empirical energy consumption model for tissue paper machines, covering both electricity consumption and steam consumption. Since the process parameters have a great influence on the energy consumption of tissue paper machines, this study uses three methods, linear regression, artificial neural network and extreme gradient boosting tree, to establish the relationship between process parameters and power consumption and between process parameters and steam consumption. The best power consumption model and the best steam consumption model are then selected from the models established by the three methods and combined into the energy consumption model of the tissue paper machine. Finally, the models established by the three methods are evaluated. The experimental results show that the power consumption model and steam consumption model established by the extreme gradient boosting tree are better than the models established by linear regression and artificial neural network. The mean absolute percentage errors of the electricity consumption model and the steam consumption model built by the extreme gradient boosting tree are approximately 2.72 and 1.87, respectively, and the root mean square errors of these two models are about 4.74 and 0.03, respectively. The results also indicate that using an empirical model for tissue paper machine energy consumption modeling is feasible and that the extreme gradient boosting tree is an efficient method for modeling the energy consumption of tissue paper machines.
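
The two error measures quoted in the abstract can be computed as in the following minimal sketch. It assumes the usual definitions of mean absolute percentage error (reported here in percent) and root mean square error; the abstract does not state its units, and the sample values are hypothetical, not the paper's data.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

def rmse(actual, predicted):
    """Root mean square error, in the same units as the inputs."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical measured and predicted consumption values.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(f"MAPE: {mape(actual, predicted):.2f} %")
print(f"RMSE: {rmse(actual, predicted):.2f}")
```

Because RMSE carries the units of the target while MAPE is relative, the two models' RMSE values are not directly comparable when electricity and steam consumption are measured on different scales.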


Metals ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 833
Author(s):  
Irene Mirandola ◽  
Guido A. Berti ◽  
Roberto Caracciolo ◽  
Seungro Lee ◽  
Naksoo Kim ◽  
...  

This research provides insight into the performance of machine learning (ML)-based algorithms for estimating the energy consumption of metal forming processes and applies them to the radial-axial ring rolling process. To define the mutual influence of ring geometry, process settings, and ring rolling mill geometry on the resulting energy consumption, measured in terms of the force integral over the processing time (FIOT), FEM simulations have been implemented in the commercial software Simufact Forming 15. A total of 380 finite element simulations, with rings in the range 650 mm < DF < 2000 mm, have been implemented and constitute the bulk of the training and validation datasets. Both the finite element simulation settings (input) and the FI (output) have been used to train eight machine learning models, implemented with Python scripts. The results show that the Gradient Boosting (GB) method is the most reliable for FIOT prediction in forming processes, with maximum and average errors of 9.03% and 3.18%, respectively. The trained ML models have also been applied to the authors' own and literature experimental cases, showing maximum and average errors of 8.00% and 5.70%, respectively, which again supports their reliability.


2020 ◽  
Vol 2020 ◽  
pp. 1-19
Author(s):  
De-Cheng Feng ◽  
Bo Fu

In this paper, an intelligent modeling approach is presented to predict the shear strength of internal reinforced concrete (RC) beam-column joints and to analyze the sensitivity of the shear strength to its influencing factors. The proposed approach is based on a boosting-family ensemble machine learning (ML) algorithm, the gradient boosting regression tree (GBRT), which generates a strong predictive model by integrating several weak predictors obtained from well-known individual ML algorithms, e.g., DT, ANN, and SVM. The strong model is boosted in that each weak predictor has its own weight in the final combination according to its performance. Compared with conventional mechanics-driven shear strength models, e.g., the well-known modified compression field theory (MCFT), the proposed model avoids the complicated derivation of the shear mechanism and the calibration of the empirical parameters involved; thus, it provides a more convenient, fast, and robust alternative for predicting the shear strength of internal RC joints. To train and test the GBRT model, a total of 86 internal RC joint specimens are collected from the literature, and four traditional ML models and the MCFT model are employed for comparison. The results indicate that the GBRT model is superior to both the traditional ML models and the MCFT model, as its degree of fit is the highest and its prediction dispersion is the lowest. Finally, the model is used to investigate the influence of different parameters on the shear strength of the internal RC joint, and the sensitivity and importance of the corresponding parameters are obtained.
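
The idea of combining many weak predictors into a strong one can be sketched as follows. This is a minimal illustration of standard least-squares gradient boosting with depth-one regression trees (stumps) on a single feature, not the authors' configuration; the function names, the synthetic data, and the hyperparameters (`n_rounds`, `learning_rate`) are assumptions.

```python
def fit_stump(xs, residuals):
    """Find the threshold split on xs that minimises squared error on residuals.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def fit_gbrt(xs, ys, n_rounds=50, learning_rate=0.1):
    """Fit stumps sequentially, each one to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict(model, x):
    """Sum the shrunken stump outputs on top of the base prediction."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps)

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

# Synthetic step-shaped data (an assumption for illustration).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

model = fit_gbrt(xs, ys)
baseline_mse = mse(ys, [sum(ys) / len(ys)] * len(ys))
model_mse = mse(ys, [predict(model, x) for x in xs])
print(f"baseline MSE: {baseline_mse:.4f}, boosted MSE: {model_mse:.4f}")
```

In this sketch every stump receives the same shrinkage factor; the per-predictor weighting described in the abstract is not reproduced here.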


Author(s):  
Ricardo Nascimento dos Santos ◽  
Sami Yamouni ◽  
Beatriz Albiero ◽  
Estevão Uyrá ◽  
Ramon Vilarino ◽  
...  

SPE Journal ◽  
2018 ◽  
Vol 23 (04) ◽  
pp. 1075-1089 ◽  
Author(s):  
Jared Schuetter ◽  
Srikanta Mishra ◽  
Ming Zhong ◽  
Randy LaFollette (ret.)

Summary Considerable amounts of data are being generated during the development and operation of unconventional reservoirs. Statistical methods that can provide data-driven insights into production performance are gaining in popularity. Unfortunately, the application of advanced statistical algorithms remains somewhat of a mystery to petroleum engineers and geoscientists. The objective of this paper is to provide some clarity to this issue, focusing on how to build robust predictive models and how to develop decision rules that help identify factors separating good wells from poor performers. The data for this study come from wells completed in the Wolfcamp Shale Formation in the Permian Basin. Data categories used in the study included well location and assorted metrics capturing various aspects of well architecture, well completion, stimulation, and production. Predictive models for the production metric of interest are built using simple regression and other advanced methods such as random forests (RFs), support-vector regression (SVR), gradient-boosting machine (GBM), and multidimensional Kriging. The data-fitting process involves splitting the data into a training set and a test set, building a regression model on the training set and validating it with the test set. Repeated application of a “cross-validation” procedure yields valuable information regarding the robustness of each regression-modeling approach. Furthermore, decision rules that can identify extreme behavior in production wells (i.e., top x% of the wells vs. bottom x%, as ranked by the production metric) are generated using the classification and regression-tree algorithm. The resulting decision tree (DT) provides useful insights regarding what variables (or combinations of variables) can drive production performance into such extreme categories. 
The main contributions of this paper are to provide guidelines on how to build robust predictive models, and to demonstrate the utility of DTs for identifying factors responsible for good vs. poor wells.
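
The train/test splitting and repeated cross-validation described in the summary can be sketched as below. This is a minimal illustration that assumes repeated k-fold splitting and uses a one-variable least-squares line as a stand-in for the regression methods named in the paper; the function names, fold count, repeat count, and synthetic data are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def repeated_kfold_mse(xs, ys, k=5, repeats=10, seed=0):
    """Return the held-out MSE of every fold across all repeats."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    scores = []
    for _ in range(repeats):
        rng.shuffle(indices)  # a new random partition on each repeat
        folds = [indices[i::k] for i in range(k)]
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            slope, intercept = fit_line([xs[i] for i in train_idx],
                                        [ys[i] for i in train_idx])
            errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]
            scores.append(sum(errors) / len(errors))
    return scores

# Synthetic data (an assumption for illustration): a line plus alternating noise.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]

scores = repeated_kfold_mse(xs, ys)
mean_score = sum(scores) / len(scores)
spread = max(scores) - min(scores)
print(f"mean held-out MSE: {mean_score:.3f}, spread across folds: {spread:.3f}")
```

The spread of held-out scores across repeats is what gives a rough picture of how robust a modeling approach is to the particular split.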


2018 ◽  
Vol 7 (4) ◽  
pp. 12
Author(s):  
Sagayaraj S ◽  
Vetrivelan N

In recent years, air pollution has introduced various biological molecules, particulates, and other harmful materials that affect human health and activities, so air quality must be maintained to avoid these problems. To manage air quality, meteorological data are first collected from Ariyalur, including the condition of the air, the collection date, high and low temperature, wind speed, wind direction, and relative humidity. The collected data are preprocessed by applying normalization and data mining techniques, and the preprocessed data are used to predict the pollutants and the concentration levels of pollutants such as sulfur dioxide (SO2), carbon monoxide (CO), nitrogen dioxide (NO2), and nitric oxide (NO). The particulate matter level in the air is then predicted by Gradient Boosting based Hierarchical Temporal Memory Neural Networks (BHTMNN). From the predicted value, the strength of the pollutants is classified using the Fuzzy based Classification based Regression Tree (FCART), which is used to recognize diseases arising in the human respiratory system. Finally, the performance of the proposed system is evaluated using the mean square error, classification accuracy, sensitivity, and specificity.

