Using data mining methods to improve discharge coefficient prediction in Piano Key and Labyrinth weirs

Author(s):  
Mahdi Majedi-Asl ◽  
Mehdi Foladipanah ◽  
Venkat Arun ◽  
Ravi Prakash Tripathi

Abstract The discharge coefficient (Cd) plays an important role in determining the passing capacity of weirs. In this work, the support vector machine (SVM) and gene expression programming (GEP) algorithms were assessed for predicting the Cd of piano key weirs (PKW), rectangular labyrinth weirs (RLW), and trapezoidal labyrinth weirs (TLW) using a gathered experimental data set. Using dimensional analysis, various combinations of hydraulic and geometric non-dimensional parameters were extracted for the simulations. The superior model for the SVM and the GEP predictor for PKW, RLW, and TLW included , and respectively. The results showed that both algorithms have potential for predicting the discharge coefficient, but the performance criteria (RMSE, R2, Cd(DDR)max) illustrated the superiority of GEP over the SVM. The sensitivity analysis determined that the most effective parameters for predicting the discharge coefficient of PKW, RLW, and TLW are , , and Fr, respectively.

2016 ◽  
Vol 16 (4) ◽  
pp. 1002-1016 ◽  
Author(s):  
Hazi Mohammad Azamathulla ◽  
Amir Hamzeh Haghiabi ◽  
Abbas Parsaie

Side weirs have many possible applications in hydraulic engineering and are considered an important structure in hydro systems. In this study, the support vector machine (SVM) technique was employed to predict the side weir discharge coefficient. The performance of the SVM was compared with that of other soft computing techniques, namely artificial neural networks (ANN) and adaptive neuro-fuzzy inference systems (ANFIS). While the ANN and ANFIS models provided good prediction performance, the SVM model with a radial basis function kernel outperformed them. The best SVM model was developed with a gamma coefficient of 15 and an epsilon of 0.3. The SVM yielded a coefficient of determination (R2) of 0.96 for the training data and 0.93 for the testing data. Sensitivity analyses of the ANN, ANFIS, and SVM models showed that the Froude number and the ratio of weir length to the flow depth upstream of the weir are the most effective parameters for predicting the discharge coefficient.
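
The two hyperparameters quoted in the abstract above can be illustrated with a minimal sketch. It assumes the "gamma coefficient" is the width parameter of the radial basis function kernel and that epsilon is the half-width of the epsilon-insensitive tube used in support vector regression. The abstract does not state which software convention was used, so both readings are assumptions, and the input vectors are made up for illustration.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors smaller than epsilon cost nothing; larger ones grow linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

GAMMA = 15.0   # value from the abstract; its role as kernel width is an assumption
EPSILON = 0.3  # value from the abstract

# Hypothetical input vectors (e.g. Froude number and a length ratio); not from the paper
a = (0.40, 0.50)
b = (0.45, 0.55)

similarity = rbf_kernel(a, b, GAMMA)                          # between 0 and 1
loss_inside = epsilon_insensitive_loss(0.60, 0.75, EPSILON)   # error inside the tube
loss_outside = epsilon_insensitive_loss(0.60, 1.10, EPSILON)  # error outside the tube
```

A larger gamma makes the kernel similarity fall off faster with distance, while a larger epsilon tolerates bigger residuals before they contribute to the loss.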


Author(s):  
Umar Sidiq ◽  
Syed Mutahar Aaqib ◽  
Rafi Ahmad Khan

Classification is one of the most important supervised learning data mining techniques, used to classify predefined data sets. It is mainly used in the healthcare sector for decision making, diagnosis systems, and giving better treatment to patients. In this work, the data set is taken from a recognized lab in Kashmir. The entire research work is carried out with ANACONDA3-5.2.0, an open source platform, under a Windows 10 environment. An experimental study is carried out using classification techniques such as k-nearest neighbors, support vector machine, decision tree, and naïve Bayes. The decision tree obtained the highest accuracy, 98.89%, among the classification techniques.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ruolan Zeng ◽  
Jiyong Deng ◽  
Limin Dang ◽  
Xinliang Yu

Abstract A three-descriptor quantitative structure–activity/toxicity relationship (QSAR/QSTR) model was developed for the skin permeability of a sufficiently large data set of 274 compounds by applying a support vector machine (SVM) together with a genetic algorithm. The optimal SVM model has a coefficient of determination R2 of 0.946 and a root mean square (rms) error of 0.253 for the training set of 139 compounds, and an R2 of 0.872 and an rms error of 0.302 for the test set of 135 compounds. Compared with other models reported in the literature, our SVM model shows better statistical performance while dealing with more samples in the test set. Applying an SVM algorithm to develop a nonlinear QSAR model for skin permeability was therefore successful.


Materials ◽  
2022 ◽  
Vol 15 (2) ◽  
pp. 489
Author(s):  
Fadi Almohammed ◽  
Parveen Sihag ◽  
Saad Sh. Sammen ◽  
Krzysztof Adam Ostrowski ◽  
Karan Singh ◽  
...  

In this investigation, the potential of the M5P, Random Tree (RT), Reduced Error Pruning Tree (REP Tree), Random Forest (RF), and Support Vector Regression (SVR) techniques has been evaluated and compared with a multiple linear regression (MLR) model for predicting the compressive strength of bacterial concrete. For this purpose, 128 experimental observations were collected. The data set was divided into two segments: training (87 observations) and testing (41 observations). The process of data set separation was arbitrary. Cement, aggregate, sand, water-to-cement ratio, curing time, percentage of bacteria, and type of sand were the input variables, and the compressive strength of bacterial concrete was the target. Seven performance evaluation indices, namely Correlation Coefficient (CC), Coefficient of determination (R2), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Bias, Nash-Sutcliffe Efficiency (NSE), and Scatter Index (SI), were used to evaluate the developed models. The indices indicate that the polynomial kernel based SVR model works better than the other developed models, with CC of 0.9919 and 0.9901, R2 of 0.9839 and 0.9803, NSE of 0.9832 and 0.9800, RMSE of 1.5680 and 1.9384, MAE of 0.7854 and 1.5155, Bias of 0.2353 and 0.1350, and SI of 0.0347 and 0.0414 for the training and testing stages, respectively. The sensitivity investigation shows that curing time (T) is the most important input variable affecting the prediction of the compressive strength of bacterial concrete for this data set.
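
The seven indices named in the abstract above can be computed as in the following sketch. The abstract does not give the formulas, so several conventions here are assumptions: R2 is taken as the square of CC (which is consistent with the reported pairs, since 0.9919² ≈ 0.9839 and 0.9901² ≈ 0.9803), Bias is the mean of predicted minus observed, and SI is RMSE divided by the mean observed value. The function name and sample values are hypothetical.

```python
import math

def evaluate(obs, pred):
    """Return CC, R2, MAE, RMSE, Bias, NSE and SI for paired observations and predictions."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    errors = [p - o for o, p in zip(obs, pred)]

    sse = sum(e * e for e in errors)                  # sum of squared errors
    sst = sum((o - mean_obs) ** 2 for o in obs)       # spread of the observations
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))

    cc = cov / math.sqrt(sst * var_pred)
    rmse = math.sqrt(sse / n)
    return {
        "CC": cc,
        "R2": cc ** 2,                 # assumption: R2 reported as squared CC
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": rmse,
        "Bias": sum(errors) / n,       # assumption: predicted minus observed
        "NSE": 1.0 - sse / sst,
        "SI": rmse / mean_obs,         # assumption: RMSE normalised by mean observation
    }

# Hypothetical strength values for illustration only; not data from the paper
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
scores = evaluate(observed, predicted)
```

In this toy case the predictions are perfectly correlated with the observations (CC = 1) but offset by a constant, so NSE drops well below 1. This is why correlation-based and error-based indices are usually reported together.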


Water ◽  
2019 ◽  
Vol 11 (3) ◽  
pp. 582 ◽  
Author(s):  
Sultan Noman Qasem ◽  
Saeed Samadianfard ◽  
Hamed Sadri Nahand ◽  
Amir Mosavi ◽  
Shahaboddin Shamshirband ◽  
...  

In the current study, the ability of three data-driven methods, Gene Expression Programming (GEP), M5 model tree (M5), and Support Vector Regression (SVR), was investigated for modeling and estimating the dew point temperature (DPT) at Tabriz station, Iran. For this purpose, the meteorological parameters of daily average temperature (T), relative humidity (RH), actual vapor pressure (Vp), wind speed (W), and sunshine hours (S) were obtained from the meteorological organization of East Azerbaijan province, Iran, for the period 1998 to 2016. The methods were then examined with 15 different input combinations of the meteorological parameters. Root mean square error (RMSE) and the coefficient of determination (R2) were used to assess the accuracy of the proposed methods. The results showed that the best performers in each family were GEP-10, using the three input parameters T, RH, and S, with RMSE of 0.96°; SVR-5, using the two input parameters T and RH, with RMSE of 0.44; and M5-15, using the five input parameters T, RH, Vp, W, and S, with RMSE of 0.37. In conclusion, M5-15 is recommended as the most precise of the considered models for DPT estimation, and the results demonstrate the high capability of the proposed M5 models.


2019 ◽  
Vol 21 (6) ◽  
pp. 1014-1029 ◽  
Author(s):  
Kiyoumars Roushangar ◽  
Ghazaleh Nasssaji Matin ◽  
Roghayeh Ghasempour ◽  
Seyed Mahdi Saghebian

Abstract Energy dissipation in culverts is a complex phenomenon due to the nonlinearity and uncertainties of the process. In the current study, the capability of Gaussian process regression (GPR) and support vector machine (SVM) as kernel-based approaches and the gene expression programming (GEP) method was assessed in predicting energy losses in culverts. Two types of loss were considered: bend loss in rectangular culverts and entrance loss in circular culverts with different inlet end treatments. Various input combinations were developed and tested using experimental data. The OAT (one-at-a-time) and factorial sensitivity analyses and Monte Carlo uncertainty analysis were used to select the effective parameters in modeling. The performance criteria proved the capability of the applied methods (i.e. high correlation coefficient (R) and determination coefficient (DC) and low root mean square error (RMSE)). For rectangular culverts, the model with parameters Fr (Froude number) and θ (bend angle), and for circular culverts, the model with parameters Fr and Hw/D (depth ratio), were the superior models. Using the bend downstream Froude number increased model efficiency. Among the four inlet end treatments, the inlet mitered flush to a 1.5:1 fill slope yielded the most accurate prediction. The sensitivity and uncertainty analyses showed that θ and Hw/D had the most significant impact on modeling, and Fr had the highest uncertainty.
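
A one-at-a-time (OAT) sensitivity analysis of the kind mentioned in the abstract above can be sketched as follows: each input is perturbed around a baseline while the others are held fixed, and the resulting change in output is recorded. The ±10% perturbation, the function names, and the toy loss model are all assumptions made for illustration; they are not the paper's model or settings.

```python
def oat_sensitivity(model, baseline, delta=0.10):
    """Vary each input by +/- delta (relative) with the others fixed at baseline.

    Returns, per input, the output swing as a fraction of the baseline output.
    """
    base_out = model(**baseline)
    scores = {}
    for name, value in baseline.items():
        low = dict(baseline, **{name: value * (1.0 - delta)})
        high = dict(baseline, **{name: value * (1.0 + delta)})
        scores[name] = abs(model(**high) - model(**low)) / abs(base_out)
    return scores

def toy_loss_model(Fr, theta):
    """Hypothetical stand-in for a trained predictor; not the paper's model."""
    return 0.2 * Fr + 0.01 * theta

# Hypothetical baseline: Froude number and bend angle in degrees
baseline = {"Fr": 0.5, "theta": 45.0}
scores = oat_sensitivity(toy_loss_model, baseline)
ranking = sorted(scores, key=scores.get, reverse=True)  # most influential first
```

OAT is cheap and easy to interpret, but it ignores interactions between inputs. That limitation is one reason to pair it with a factorial analysis, as the abstract describes.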


2013 ◽  
Vol 16 (3) ◽  
pp. 671-689 ◽  
Author(s):  
Daniel J. Karran ◽  
Efrat Morin ◽  
Jan Adamowski

Despite the popularity of data-driven non-linear methods for forecasting streamflow, there has been no exploration of how well such models perform in climate regimes with differing hydrological characteristics, nor has the performance of these models, coupled with wavelet transforms, been compared for lead times of less than 1 month. This study compares four models, namely artificial neural networks (ANNs), support vector regression (SVR), wavelet-ANN, and wavelet-SVR, in a Mediterranean, an Oceanic, and a Hemiboreal watershed. Model performance was tested for 1, 2, and 3 day forecasting lead times and measured by fractional standard error, the coefficient of determination, Nash–Sutcliffe model efficiency, multiplicative bias, probability of detection, and false alarm rate. SVR-based models performed best overall, but no one model outperformed the others in more than one watershed, suggesting that some models may be more suitable for certain types of data. Overall model performance varied greatly between climate regimes, suggesting that higher persistence and slower hydrological processes (i.e. snowmelt, glacial runoff, and subsurface flow) support reliable forecasting using daily and multi-day lead times.
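
Two of the categorical scores listed in the abstract above, probability of detection and false alarm rate, can be computed from a simple contingency count, as sketched below. The abstract does not define them, so this sketch assumes an event is a flow at or above a chosen threshold, and it uses the false alarm ratio form (false alarms divided by all forecast events). The threshold and flow values are hypothetical.

```python
def detection_scores(observed, forecast, threshold):
    """Return (POD, FAR) for events defined as values >= threshold."""
    hits = misses = false_alarms = 0
    for obs, fc in zip(observed, forecast):
        obs_event = obs >= threshold
        fc_event = fc >= threshold
        if obs_event and fc_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif fc_event:
            false_alarms += 1
    # Guard against empty categories to avoid division by zero
    pod = hits / (hits + misses) if (hits + misses) else float("nan")
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) else float("nan")
    return pod, far

# Hypothetical daily flows for illustration only
observed_flow = [5, 12, 20, 8, 15]
forecast_flow = [6, 11, 9, 14, 16]
pod, far = detection_scores(observed_flow, forecast_flow, threshold=10)
```

With these made-up values there are two hits, one miss, and one false alarm, so two of the three observed events are detected and one of the three forecast events is a false alarm.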


Jurnal Segara ◽  
2020 ◽  
Vol 16 (3) ◽  
Author(s):  
Arip Rahman

Shallow water bathymetry estimation from remote sensing data has become increasingly widespread as an alternative to traditional bathymetry measurement, which is hampered by technical and logistical problems. Bathymetry data were derived from Sentinel-2A images at visible wavelengths (blue, green, and red) with 10 meter spatial resolution around the waters of Kemujan Island, Karimunjawa National Park, Central Java. A total of 1280 sounding points were used as the training data set and 854 points as the test data set. Dark Object Subtraction (DOS) was used to atmospherically correct the Sentinel-2A images. Several algorithms were applied to derive bathymetry data: linear transform, ratio transform, and support vector machine (SVM). The highest correlation between predicted and observed depth resulted from the SVM algorithm, with a coefficient of determination (R2) of 0.71 (training data) and 0.56 (test data). In the accuracy assessment of the three methods using RMSE and MAE, the SVM algorithm had the smallest values (< 1 m). This indicates that the SVM algorithm is more accurate than the other two methods. The bathymetry map derived from Sentinel-2A imagery cannot be used as a reference for navigation.


Author(s):  
Hossein Bonakdari ◽  
Isa Ebtehaj ◽  
Bahram Gharabaghi ◽  
Ali Sharifi ◽  
Amir Mosavi

This paper proposes a model based on gene expression programming (GEP) for predicting the discharge coefficient of triangular labyrinth weirs. The parameters influencing discharge coefficient prediction were first examined and presented as dimensionless parameters: the ratio of crest height to the head over the crest of the weir (p/y), crest length of water to channel width (L/W), crest length of water to the head over the crest of the weir (L/y), Froude number (F=V/√(gy)), and vertex angle (). Different models were then presented using sensitivity analysis in order to examine each of these dimensionless parameters. In addition, an equation obtained through nonlinear regression (NLR) was presented for comparison with GEP. The results, evaluated with different statistical indexes, indicated that GEP is more capable than NLR: GEP predicts the discharge coefficient with an average relative error of approximately 2.5%, and the predicted values have less than 5% relative error in the worst model.
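
The dimensionless inputs listed in the abstract above, and the relative error used to report accuracy, can be computed as in this sketch. The Froude number follows the formula given in the abstract, F = V/√(gy). The function names, the value of g, and all sample measurements are assumptions for illustration, and the vertex angle is omitted because its symbol is missing from the text.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed SI units)

def dimensionless_inputs(p, y, L, W, V):
    """Build the ratios named in the abstract from crest height p, head y,
    crest length L, channel width W and flow velocity V."""
    return {
        "p/y": p / y,
        "L/W": L / W,
        "L/y": L / y,
        "F": V / math.sqrt(G * y),  # Froude number, F = V / sqrt(g * y)
    }

def mean_relative_error_percent(observed, predicted):
    """Average of |predicted - observed| / |observed|, expressed in percent."""
    n = len(observed)
    return 100.0 * sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted)) / n

# Hypothetical measurements in metres and m/s; not data from the paper
inputs = dimensionless_inputs(p=0.10, y=0.05, L=1.2, W=0.6, V=0.35)

# Hypothetical observed and predicted discharge coefficients
mre = mean_relative_error_percent([0.50, 0.60], [0.51, 0.585])
```

Working with ratios such as p/y and L/W rather than raw dimensions lets a model fitted on laboratory flume data be applied across geometric scales, which is the usual motivation for the dimensional analysis step.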

