Determining residuary resistance per unit weight of displacement with Symbolic Regression and Gradient Boosted Tree algorithms

Pomorstvo ◽  
2021 ◽  
Vol 35 (2) ◽  
pp. 287-296
Author(s):  
Sandi Baressi Šegota ◽  
Ivan Lorencin ◽  
Mario Šercer ◽  
Zlatan Car

Determining the residuary resistance per unit weight of displacement is one of the key factors in the design of vessels. In this paper, the authors utilize two novel methods, Symbolic Regression (SR) and Gradient Boosted Trees (GBT), to obtain a model which can be used to calculate the value of residuary resistance per unit weight of displacement from the longitudinal position of the center of buoyancy, prismatic coefficient, length-displacement ratio, beam-draught ratio, length-beam ratio, and Froude number. The data consist of the results of 308 experiments provided as part of a publicly available dataset. The results are evaluated using the coefficient of determination (R2) and Mean Absolute Percentage Error (MAPE). Pre-processing, in the form of correlation analysis combined with variable elimination and variable scaling, is applied to the dataset. The results show that while both methods produce regression models, the regression performance of SR is relatively poor in comparison to GBT. Both methods provide slightly poorer, but comparable, results to previous research focussing on the use of “black-box” methods, such as neural networks. The elimination of variables does not show a high influence on the modeling performance in the presented case, while variable scaling does achieve better results compared to the models trained with the non-scaled dataset.
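The two evaluation metrics named in this abstract, R2 and MAPE, recur throughout the entries below. As a minimal sketch of their standard definitions (the function names and the sample values are illustrative assumptions, not taken from the paper or its dataset):

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no true value is zero."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Made-up measurements and predictions, for illustration only.
measured = [1.0, 2.0, 4.0, 8.0]
predicted = [1.1, 1.9, 4.2, 7.6]
print(r2_score(measured, predicted), mape(measured, predicted))
```

Note that MAPE is undefined when a measured value is zero, so a dataset containing zero targets would need a different error measure or a guard.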

2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Shengpu Li ◽  
Yize Sun

Ink transfer rate (ITR) is a reference index to measure the quality of 3D additive printing. In this study, an ink transfer rate prediction model is proposed by applying the least squares support vector machine (LSSVM). In addition, enhanced garden balsam optimization (EGBO) is used for selection and optimization of hyperparameters that are embedded in the LSSVM model. 102 sets of experimental sample data have been collected from the production line to train and test the hybrid prediction model. Experimental results show that the coefficient of determination (R2) for the introduced model is equal to 0.8476, the root-mean-square error (RMSE) is 6.6 × 10⁻³, and the mean absolute percentage error (MAPE) is 1.6502 × 10⁻³ for the ink transfer rate of 3D additive printing.


2021 ◽  
Vol 149 ◽  
Author(s):  
Junwen Tao ◽  
Yue Ma ◽  
Xuefei Zhuang ◽  
Qiang Lv ◽  
Yaqiong Liu ◽  
...  

Abstract This study proposed a novel ensemble analysis strategy to improve hand, foot and mouth disease (HFMD) prediction by integrating environmental data. The approach began by establishing a vector autoregressive model (VAR). Then, a dynamic Bayesian networks (DBN) model was used for variable selection of environmental factors. Finally, a VAR model with constraints (CVAR) was established for predicting the incidence of HFMD in Chengdu city from 2011 to 2017. DBN showed that temperature was related to HFMD at lags 1 and 2. Humidity, wind speed, sunshine, PM10, SO2 and NO2 were related to HFMD at lag 2. Compared with the autoregressive integrated moving average model with external variables (ARIMAX), the CVAR model had a higher coefficient of determination (R2, average difference: + 2.11%; t = 6.2051, P = 0.0003 < 0.05), a lower root mean-squared error (−24.88%; t = −5.2898, P = 0.0007 < 0.05) and a lower mean absolute percentage error (−16.69%; t = −4.3647, P = 0.0024 < 0.05). The accuracy of predicting the time-series shape was 88.16% for the CVAR model and 86.41% for ARIMAX. The CVAR model performed better in terms of variable selection, model interpretation and prediction. Therefore, it could be used by health authorities to identify potential HFMD outbreaks and develop disease control measures.


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4655
Author(s):  
Dariusz Czerwinski ◽  
Jakub Gęca ◽  
Krzysztof Kolano

In this article, the authors propose two models for BLDC motor winding temperature estimation using machine learning methods. For the purposes of the research, measurements were made for over 160 h of motor operation and then preprocessed. The algorithms of linear regression, ElasticNet, stochastic gradient descent regressor, support vector machines, decision trees, and AdaBoost were used for predictive modeling. The ability of the models to generalize was achieved by hyperparameter tuning with the use of cross-validation. The conducted research led to promising results of the winding temperature estimation accuracy. In the case of sensorless temperature prediction (model 1), the mean absolute percentage error (MAPE) was below 4.5% and the coefficient of determination (R2) was above 0.909. In addition, the extension of the model with the temperature measurement on the casing (model 2) allowed reducing the error value to about 1% and increasing R2 to 0.990. The results obtained for the first proposed model show that the overheating protection of the motor can be ensured without direct temperature measurement. In addition, the introduction of a simple casing temperature measurement system allows for an estimation with accuracy suitable for compensating the motor output torque changes related to temperature.


Processes ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 1166
Author(s):  
Bashir Musa ◽  
Nasser Yimen ◽  
Sani Isah Abba ◽  
Humphrey Hugh Adun ◽  
Mustafa Dagbasi

The prediction accuracy of support vector regression (SVR) is highly influenced by a kernel function. However, its performance suffers on large datasets, and this could be attributed to the computational limitations of kernel learning. To tackle this problem, this paper combines SVR with the emerging Harris hawks optimization (HHO) and particle swarm optimization (PSO) algorithms to form two hybrid SVR algorithms, SVR-HHO and SVR-PSO. Both proposed algorithms and traditional SVR were applied to load forecasting in four different states of Nigeria. The correlation coefficient (R), coefficient of determination (R2), mean square error (MSE), root mean square error (RMSE), and mean absolute percentage error (MAPE) were used as indicators to evaluate the prediction accuracy of the algorithms. The results reveal that there is an increase in performance for both SVR-HHO and SVR-PSO over traditional SVR. SVR-HHO has the highest R2 values of 0.9951, 0.8963, 0.9951, and 0.9313, the lowest MSE values of 0.0002, 0.0070, 0.0002, and 0.0080, and the lowest MAPE values of 0.1311, 0.1452, 0.0599, and 0.1817, respectively, for Kano, Abuja, Niger, and Lagos State. The results of SVR-HHO also prove more advantageous than those of SVR-PSO in all the states concerning load forecasting skills. This paper also designed a hybrid renewable energy system (HRES) that consists of solar photovoltaic (PV) panels, wind turbines, and batteries. As inputs, the system used solar radiation, temperature, wind speed, and the predicted load demands by SVR-HHO in all the states. The system was optimized by using the PSO algorithm to obtain the optimal configuration of the HRES that will satisfy all constraints at the minimum cost.


2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Boluwaji M. Olomiyesan ◽  
Onyedi D. Oyedum

In this study, the performance of three global solar radiation models and the accuracy of global solar radiation data derived from three sources were compared. Twenty-two years (1984–2005) of surface meteorological data consisting of monthly mean daily sunshine duration, minimum and maximum temperatures, and global solar radiation collected from the Nigerian Meteorological Agency (NIMET), Oshodi, Lagos, and the National Aeronautics and Space Administration (NASA) for three locations in the North-Western region of Nigeria were used. A new model incorporating the Garcia model into the Angstrom-Prescott model was proposed for estimating global radiation in Nigeria. The performances of the models used were determined by using mean bias error (MBE), mean percentage error (MPE), root mean square error (RMSE), and coefficient of determination (R2). Based on the statistical error indices, the proposed model was found to have the best accuracy, with the lowest RMSE values (0.376 for Sokoto, 0.463 for Kaduna, and 0.449 for Kano) and the highest R2 values of 0.922, 0.938, and 0.961 for Sokoto, Kano, and Kaduna, respectively. Also, the comparative study result indicates that the estimated global radiation from the proposed model has a better error range and fits the ground measured data better than the satellite-derived data.
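For orientation, the classic Angstrom-Prescott relation estimates global radiation H from extraterrestrial radiation H0 and relative sunshine duration as H = H0 (a + b S/S0). The sketch below shows only that classic form together with MBE and RMSE; it is not the combined Garcia model proposed in the paper, and the coefficients a = 0.25 and b = 0.50 are placeholder defaults that would normally be calibrated per site. All function names are illustrative.

```python
import math


def angstrom_prescott(h0, sunshine_hours, day_length, a=0.25, b=0.50):
    """Classic Angstrom-Prescott estimate: H = H0 * (a + b * S / S0).

    a and b are placeholder coefficients (assumption), not the paper's values.
    """
    return h0 * (a + b * sunshine_hours / day_length)


def mean_bias_error(measured, estimated):
    """Average of (estimated - measured); positive means overestimation."""
    return sum(e - m for m, e in zip(measured, estimated)) / len(measured)


def rmse(measured, estimated):
    """Root mean square error between measured and estimated values."""
    squared = sum((e - m) ** 2 for m, e in zip(measured, estimated))
    return math.sqrt(squared / len(measured))


# Made-up example: H0 = 40 MJ/m^2/day, 6 h of sunshine in a 12 h day.
print(angstrom_prescott(40.0, 6.0, 12.0))
```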


2020 ◽  
Vol 9 (3) ◽  
pp. 674 ◽  
Author(s):  
Mohammed A. A. Al-qaness ◽  
Ahmed A. Ewees ◽  
Hong Fan ◽  
Mohamed Abd El Aziz

In December 2019, a novel coronavirus, called COVID-19, was discovered in Wuhan, China, and has spread to different cities in China as well as to 24 other countries. The number of confirmed cases is increasing daily and reached 34,598 on 8 February 2020. In the current study, we present a new forecasting model to estimate and forecast the number of confirmed cases of COVID-19 in the upcoming ten days based on the previously confirmed cases recorded in China. The proposed model is an improved adaptive neuro-fuzzy inference system (ANFIS) using a flower pollination algorithm (FPA) enhanced by the salp swarm algorithm (SSA). In general, SSA is employed to improve FPA to avoid its drawbacks (i.e., getting trapped at the local optima). The main idea of the proposed model, called FPASSA-ANFIS, is to improve the performance of ANFIS by determining the parameters of ANFIS using FPASSA. The FPASSA-ANFIS model is evaluated using the World Health Organization (WHO) official data of the outbreak of COVID-19 to forecast the confirmed cases of the upcoming ten days. Moreover, the FPASSA-ANFIS model is compared to several existing models, and it showed better performance in terms of Mean Absolute Percentage Error (MAPE), Root Mean Squared Relative Error (RMSRE), coefficient of determination (R2), and computing time. Furthermore, we tested the proposed model using two different datasets of weekly influenza confirmed cases in two countries, namely the USA and China. The outcomes also showed good performances.


2017 ◽  
Vol 24 (1) ◽  
pp. 21-39 ◽  
Author(s):  
Richard Ohene Asiedu ◽  
Nana Kena Frempong ◽  
Hans Wilhelm Alfen

Purpose Being able to predict the likelihood of a project overrunning its cost before the contract signing phase is crucial in developing the required mitigating measures to avert it. Known parameters that permit the timely prediction of cost overrun provide the basis for such predictions. Therefore, the purpose of this paper is to develop a model for forecasting cost overruns. Design/methodology/approach Ten predictive variables known before the contract signing phase of a project are identified. Based on a survey approach, information on 321 completed educational projects is compiled. A multiple linear regression analysis is adopted for the model development. Findings Five variables (initial contract sum, gross floor area, number of storeys, source of funds and contractors’ financial classification) are observed to influence cost overruns. The model, however, yields a fairly weak coefficient of determination and mean absolute percentage error of 30.22 and 138 per cent, respectively. Research limitations/implications The model developed focussed only on data from educational projects sampled from three out of the ten administrative regions in Ghana based on a purposive sampling approach. Practical implications Policy makers and construction managers working on public projects stand to gain tremendous assistance in formulating and strengthening their own in-house cost forecasting at the precontract phase based on “what if” analysis to generate various alternative predictions of cost overruns. Originality/value Considering the innate nature of cost overruns within the Ghanaian construction industry, often resulting in project abandonment, this research presents a unique dimension for tackling cost overruns based on a predictive approach.


2019 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Eman Khorsheed

Purpose The purpose of this study is to present a hybrid approach to model and predict long-term energy peak load using Bayesian and Holt–Winters (HW) exponential smoothing techniques. Design/methodology/approach Bayesian inference is administered by Markov chain Monte Carlo (MCMC) sampling techniques. Machine learning tools are used to calibrate the values of the HW model parameters. Hybridization is conducted to reduce modeling uncertainty. The technique is applied to real load data. Monthly peak load forecasts are calculated as weighted averages of HW and MCMC estimates. Mean absolute percentage error and the coefficient of determination (R2) indices are used to evaluate forecasts. Findings The developed hybrid methodology offers advantages over both individual combined techniques and reveals more accurate and impressive results with R2 above 0.97. The new technique can be used to assist energy networks in planning and implementing production projects that can ensure access to reliable and modern energy services to meet the sustainable development goal in this sector. Originality/value This is original research.
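The abstract states that monthly peak load forecasts are weighted averages of the HW and MCMC estimates but does not give the weights. A minimal sketch of such a combination, assuming a single fixed weight (the function name, the weight value and the sample numbers are all hypothetical):

```python
def combine_forecasts(hw_forecast, mcmc_forecast, weight_hw=0.5):
    """Element-wise weighted average of two forecast series.

    weight_hw is a hypothetical fixed weight; the paper's actual weighting
    scheme is not described in the abstract.
    """
    if not 0.0 <= weight_hw <= 1.0:
        raise ValueError("weight_hw must lie in [0, 1]")
    if len(hw_forecast) != len(mcmc_forecast):
        raise ValueError("forecast series must have the same length")
    return [
        weight_hw * hw + (1.0 - weight_hw) * mc
        for hw, mc in zip(hw_forecast, mcmc_forecast)
    ]


# Made-up monthly peak loads from the two component models.
print(combine_forecasts([100.0, 200.0], [120.0, 180.0], weight_hw=0.25))
```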


2020 ◽  
Vol 12 (11) ◽  
pp. 1814
Author(s):  
Phamchimai Phan ◽  
Nengcheng Chen ◽  
Lei Xu ◽  
Zeqiang Chen

Tea is a cash crop that improves the quality of life for people in the Tanuyen District of Laichau Province, Vietnam. Tea yield, however, has stagnated in recent years, due to changes in temperature, precipitation, the age of the tea bushes, and diseases. Developing an approach for monitoring tea bushes by remote sensing and Geographic Information Systems (GIS) might be a way to alleviate this problem. Using multi-temporal remote sensing data, the paper details an investigation of the changes in tea health and yield forecasting through the normalized difference vegetation index (NDVI). In this study, we used NDVI as a support tool to demonstrate the temporal and spatial changes in NDVI by extracting the tea NDVI values and calculating the mean NDVI value. The results of the study showed that the minimum NDVI value was 0.42 during January 2013 and February 2015 and 2016. The maximum NDVI value was in August 2015 and June 2017. We indicate that the linear relationship between NDVI value and mean temperature was strong, with R2 = 0.79. Our results confirm that the combination of meteorological data and NDVI data can achieve a high performance of yield prediction. Three models to predict tea yield were evaluated: support vector machine (SVM), random forest (RF), and the traditional linear regression model (TLRM). For the period 2009 to 2018, tea yield prediction by the RF model was the best, with R2 = 0.73, compared with 0.66 for SVM and 0.57 for the TLRM. Three evaluation indicators were used to consider accuracy: the coefficient of determination (R2), root-mean-square error (RMSE), and percentage error of tea yield (PETY). The highest accuracy for the three models was in 2015, with R2 ≥ 0.87, RMSE < 50 kg/ha, and PETY below 3%. In the other years, the prediction accuracy was higher in the SVM and RF models. Meanwhile, the RF algorithm had a better PETY (≤10%), and its root mean square error was significantly lower (≤80 kg/ha). RMSE and PETY showed relatively good values in the TLRM model, with an RMSE from 80 to 100 kg/ha and a PETY from 8 to 15%.
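NDVI, as used in the abstract above, is conventionally computed per pixel from near-infrared and red reflectance as (NIR - Red) / (NIR + Red). A small sketch of that formula and of a mean over a set of pixels follows; the function names and reflectance values are illustrative assumptions, not the study's data or processing chain.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    denominator = nir + red
    if denominator == 0:
        raise ValueError("NIR + Red must be non-zero")
    return (nir - red) / denominator


def mean_ndvi(pixels):
    """Mean NDVI over an iterable of (nir, red) reflectance pairs."""
    values = [ndvi(nir, red) for nir, red in pixels]
    return sum(values) / len(values)


# Made-up reflectance pairs for a few pixels of a tea plot.
sample_pixels = [(0.50, 0.10), (0.45, 0.15), (0.40, 0.20)]
print(mean_ndvi(sample_pixels))
```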


2017 ◽  
Vol 52 (12) ◽  
pp. 1158-1166
Author(s):  
Adriana Ferreira de Moraes-Oliveira ◽  
Lucas Eduardo de Oliveira Aparecido ◽  
Sérgio Rangel Fernandes Figueira

Abstract: The objective of this work was to estimate the coffee supply by calibrating statistical models with economic and climatic variables for the main producing regions of the state of São Paulo, Brazil. The regions were Batatais, Caconde, Cássia dos Coqueiros, Cristais Paulista, Espírito Santo do Pinhal, Marília, Mococa, and Osvaldo Cruz. Data on coffee supply, economic variables (rural credit, rural agricultural credit, and production value), and climatic variables (air temperature, rainfall, potential evapotranspiration, water deficit, and water surplus) for each region, during the period from 2000-2014, were used. The models were calibrated using multiple linear regression, and all possible combinations were tested for selecting the variables. Coffee supply was the dependent variable, and the other ones were considered independent. The accuracy and precision of the models were assessed by the mean absolute percentage error and the adjusted coefficient of determination, respectively. The variables that most affect coffee supply are production value and air temperature. Coffee supply can be estimated with multiple linear regressions using economic and climatic variables. The most accurate models are those calibrated to estimate coffee supply for the regions of Cássia dos Coqueiros and Osvaldo Cruz.
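The adjusted coefficient of determination mentioned here penalizes R2 for the number of predictors, which matters when many variable combinations are tested. A minimal sketch of the standard formula, with illustrative inputs (15 observations would correspond to one value per year for 2000 to 2014; the R2 value and predictor count are made up):

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Hypothetical fit: R^2 = 0.90 with 2 predictors over 15 yearly observations.
print(adjusted_r2(0.90, 15, 2))
```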

