Comparative Analysis of Artificial Intelligence Models for Accurate Estimation of Groundwater Nitrate Concentration

Prediction of groundwater nitrate concentration is of utmost importance for pollution control and water resource management. This research aims to model the spatial groundwater nitrate concentration in the Marvdasht watershed, Iran, using four machine learning models: support vector machine (SVM), Cubist, random forest (RF), and Bayesian artificial neural network (Bayesian-ANN). For this purpose, 11 independent variables affecting groundwater nitrate changes were prepared for the study area: elevation, slope, plan curvature, profile curvature, rainfall, piezometric depth, distance from the river, distance from residential areas, sodium (Na), potassium (K), and topographic wetness index (TWI). Nitrate levels were also measured in 67 wells and used as the dependent variable for modeling. The data were divided into training (70%) and testing (30%) sets. The coefficient of determination (R2), mean absolute error (MAE), root mean square error (RMSE), and Nash–Sutcliffe efficiency (NSE) were used to evaluate the performance of the models. The results of modeling groundwater nitrate concentration showed that the RF model (R2 = 0.89, RMSE = 4.24, NSE = 0.87) performed better than the Cubist (R2 = 0.87, RMSE = 5.18, NSE = 0.81), SVM (R2 = 0.74, RMSE = 6.07, NSE = 0.74), and Bayesian-ANN (R2 = 0.79, RMSE = 5.91, NSE = 0.75) models. The zoning of groundwater nitrate concentration showed that the northern parts of the study area have the highest nitrate levels, with concentrations higher in these agricultural areas than elsewhere.
The most important cause of nitrate pollution in these areas is agricultural activity. Wells close to agricultural areas supply groundwater for crop irrigation, and chemical fertilizers are used indiscriminately; irrigation water or rainwater then washes these fertilizers into the ground, where they penetrate the groundwater and pollute the aquifer.
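
The four evaluation criteria named in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and R2 is assumed here to be the squared Pearson correlation between observed and predicted values, a convention the abstract does not confirm. Under that assumption, R2 ignores a constant bias in the predictions while NSE penalizes it, so the two criteria can differ for the same model.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / (sum of squares about the observed mean)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r2_pearson(obs, pred):
    """Squared Pearson correlation (ASSUMED definition of R2)."""
    n = len(obs)
    mean_obs, mean_pred = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov ** 2 / (var_obs * var_pred)


# Made-up values: every prediction is exactly 1 unit too high.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(f"MAE  = {mae(observed, predicted):.2f}")         # 1.00
print(f"RMSE = {rmse(observed, predicted):.2f}")        # 1.00
print(f"NSE  = {nse(observed, predicted):.2f}")         # 0.20
print(f"R2   = {r2_pearson(observed, predicted):.2f}")  # 1.00
```

In this made-up example the predictions track the observations perfectly apart from a constant offset, so the correlation-based R2 is 1 while NSE drops to 0.2.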

Forecasting Daily River Flow Using an Artificial Flora–Support Vector Machine Hybrid Modeling Approach (Case Study: Karkheh Catchment, Iran)

Air Soil and Water Research ◽

10.1177/1178622120969659 ◽

2020 ◽

Vol 13 ◽

pp. 117862212096965

Author(s):

Reza Dehghani ◽

Hassan Torabi Poudeh ◽

Hojatolah Younesi ◽

Babak Shahinejad

Keyword(s):

Support Vector Machine ◽

River Flow ◽

Evaluation Criteria ◽

Absolute Error ◽

Coefficient Of Determination ◽

Support Vector ◽

Flow Modeling ◽

Machine Model ◽

Daily Discharge

In this study, a hybrid support vector machine–artificial flora algorithm method was developed, and its results were compared with those of the support vector–wave vector machine model. The Karkheh catchment was used as a case study to estimate river flow rates from daily discharge records taken at hydrometric stations upstream of the dam over the period 2008 to 2018. The coefficient of determination, root mean square error (RMSE), mean absolute error (MAE), and Nash–Sutcliffe coefficient were used to evaluate and compare the models. The results showed that the combined structures provided acceptable results for river flow modeling. A comparison of the models based on the evaluation criteria and the Taylor diagram also demonstrated that the proposed hybrid method, with a coefficient of determination of R2 = 0.924 to 0.974, RMSE = 0.022 to 0.066 m3/s, MAE = 0.011 to 0.034 m3/s, and Nash–Sutcliffe (NS) coefficient = 0.947 to 0.986, outperformed the other methods in estimating daily river flow rates.
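
A Taylor diagram summarizes a model with three linked statistics: the standard deviations of the observed and predicted series, their correlation, and the centered root-mean-square difference. The sketch below computes these quantities and checks the identity that lets one diagram show all of them at once. It is only an illustration with made-up numbers; the function name `taylor_stats` is hypothetical and nothing here comes from the paper's data.

```python
import math


def taylor_stats(obs, pred):
    """Return (std_obs, std_pred, correlation, centered RMS difference)."""
    n = len(obs)
    mean_obs, mean_pred = sum(obs) / n, sum(pred) / n
    dev_obs = [o - mean_obs for o in obs]
    dev_pred = [p - mean_pred for p in pred]
    # Population standard deviations (divide by n), as the identity requires.
    std_obs = math.sqrt(sum(d * d for d in dev_obs) / n)
    std_pred = math.sqrt(sum(d * d for d in dev_pred) / n)
    corr = sum(a * b for a, b in zip(dev_obs, dev_pred)) / (n * std_obs * std_pred)
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_obs, dev_pred)) / n)
    return std_obs, std_pred, corr, crmsd


# Made-up daily flows, for illustration only.
observed = [1.0, 2.0, 4.0, 3.0]
predicted = [1.5, 1.8, 3.5, 3.6]
s_o, s_p, r, e = taylor_stats(observed, predicted)

# Law-of-cosines identity behind the diagram:
#   e^2 = s_o^2 + s_p^2 - 2 * s_o * s_p * r
lhs = e ** 2
rhs = s_o ** 2 + s_p ** 2 - 2 * s_o * s_p * r
print(f"identity holds: {math.isclose(lhs, rhs, rel_tol=1e-9)}")
```

Because of this identity, a single point on the diagram encodes how closely a model matches both the variability and the pattern of the observations.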

Assessment of Soft Computing Techniques for the Prediction of Compressive Strength of Bacterial Concrete

Materials ◽

10.3390/ma15020489 ◽

2022 ◽

Vol 15 (2) ◽

pp. 489

Author(s):

Fadi Almohammed ◽

Parveen Sihag ◽

Saad Sh. Sammen ◽

Krzysztof Adam Ostrowski ◽

Karan Singh ◽

...

Keyword(s):

Compressive Strength ◽

Performance Evaluation ◽

Absolute Error ◽

Coefficient Of Determination ◽

Polynomial Kernel ◽

Support Vector ◽

Curing Time ◽

Data Set ◽

Soft Computing Techniques ◽

Bacterial Concrete

In this investigation, the potential of the M5P, Random Tree (RT), Reduced Error Pruning Tree (REP Tree), Random Forest (RF), and Support Vector Regression (SVR) techniques was evaluated and compared with a multiple linear regression (MLR) model for predicting the compressive strength of bacterial concrete. For this purpose, 128 experimental observations were collected. The data set was divided into two segments: training (87 observations) and testing (41 observations). The data set was separated arbitrarily. Cement, aggregate, sand, water-to-cement ratio, curing time, percentage of bacteria, and type of sand were the input variables, and the compressive strength of bacterial concrete was the target. Seven performance evaluation indices were used to evaluate the developed models: Correlation Coefficient (CC), Coefficient of Determination (R2), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Bias, Nash–Sutcliffe Efficiency (NSE), and Scatter Index (SI). These indices indicate that the SVR model with a polynomial kernel function works better than the other developed models, with CC values of 0.9919 and 0.9901, R2 values of 0.9839 and 0.9803, NSE values of 0.9832 and 0.9800, RMSE values of 1.5680 and 1.9384, MAE values of 0.7854 and 1.5155, Bias values of 0.2353 and 0.1350, and SI values of 0.0347 and 0.0414 for the training and testing stages, respectively. The sensitivity investigation shows that curing time (T) is the key input variable affecting the prediction of the compressive strength of bacterial concrete with this data set.
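
The 87/41 partition described above can be sketched as a shuffled index split. This is an assumption-laden illustration: "arbitrary" is read here as "random", the seed is a made-up value chosen only for reproducibility, and `split_indices` is a hypothetical helper rather than anything from the paper.

```python
import random


def split_indices(n_total, n_train, seed=0):
    """Shuffle 0..n_total-1 and return (train, test) index lists."""
    if not 0 < n_train < n_total:
        raise ValueError("n_train must be between 1 and n_total - 1")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # seeded for repeatable splits
    return sorted(indices[:n_train]), sorted(indices[n_train:])


# 128 observations split into 87 for training and 41 for testing.
train_idx, test_idx = split_indices(128, 87)
print(len(train_idx), len(test_idx))  # 87 41
```

Fixing the seed makes the partition repeatable, which matters when several models are compared on the same training and testing sets.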

Heuristic methods applied in reference evapotranspiration modeling

Ciência e Agrotecnologia ◽

10.1590/1413-70542018423006818 ◽

2018 ◽

Vol 42 (3) ◽

pp. 314-324 ◽

Cited By ~ 5

Author(s):

Daniel Althoff ◽

Helizani Couto Bazame ◽

Roberto Filgueiras ◽

Santos Henrique Brant Dias

Keyword(s):

Reference Evapotranspiration ◽

Performance Criteria ◽

Superior Performance ◽

Weather Data ◽

Coefficient Of Determination ◽

Accurate Estimation ◽

Support Vector ◽

Computational Techniques ◽

Heuristic Methods ◽

Scarce Data

The importance of the precise estimation of evapotranspiration is directly related to sustainable water usage. Since agriculture represents 70% of Brazil’s water consumption, adequate and efficient application of water may reduce conflicts over water use among multiple users. Considering the importance of accurate estimation of evapotranspiration, the objective of the present study was to model and compare reference evapotranspiration obtained from different heuristic methodologies. The standard Penman-Monteith method was used as the reference for evapotranspiration. To evaluate the heuristic methodologies with scarce data, two widely known methods, Priestley-Taylor and Thornthwaite, had their performances assessed in relation to Penman-Monteith. The computational techniques Stepwise Regression (SWR), Random Forest (RF), Cubist (CB), Bayesian Regularized Neural Network (BRNN) and Support Vector Machines (SVM) were used to estimate evapotranspiration with scarce and full meteorological data. The results show the robustness of the heuristic methods in predicting evapotranspiration. For full weather data, the performance criteria of the machine learning methods ranged from 0.14 to 0.22 mm d-1 for mean absolute error (MAE), from 0.21 to 0.29 mm d-1 for root mean squared error (RMSE), and from 0.95 to 0.99 for the coefficient of determination (r²). The computational techniques showed superior performance to methods established in the literature, even in scenarios with scarce variables. The BRNN presented the best performance overall.
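
Of the scarce-data methods mentioned, Thornthwaite needs only monthly mean air temperature. The sketch below uses the commonly cited form of the Thornthwaite equation for a standard month (30 days, 12 hours of daylight), without the day-length correction. It is an illustration under those assumptions: the paper's exact implementation is not given in the abstract, the temperatures are made up, and the function name is hypothetical.

```python
def thornthwaite_unadjusted(monthly_temps_c):
    """Unadjusted Thornthwaite potential evapotranspiration in mm/month.

    Takes 12 monthly mean temperatures in degrees Celsius. Months at or
    below 0 degrees C contribute nothing and are assigned zero PET.
    """
    if len(monthly_temps_c) != 12:
        raise ValueError("expected 12 monthly mean temperatures")
    # Annual heat index from the positive monthly temperatures.
    heat_index = sum((t / 5.0) ** 1.514 for t in monthly_temps_c if t > 0)
    if heat_index == 0:
        return [0.0] * 12
    # Empirical exponent as a cubic polynomial in the heat index.
    exponent = (6.75e-7 * heat_index ** 3
                - 7.71e-5 * heat_index ** 2
                + 1.792e-2 * heat_index
                + 0.49239)
    return [16.0 * (10.0 * t / heat_index) ** exponent if t > 0 else 0.0
            for t in monthly_temps_c]


# Made-up monthly mean temperatures (January to December).
temps = [5, 6, 9, 12, 16, 20, 23, 22, 19, 14, 9, 6]
pet = thornthwaite_unadjusted(temps)
print([round(v, 1) for v in pet])
```

In practice each monthly value is usually scaled by the actual month length and mean daylight hours, which this sketch leaves out.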

Estimation of Oil Recovery Factor for Water Drive Sandy Reservoirs through Applications of Artificial Intelligence

Energies ◽

10.3390/en12193671 ◽

2019 ◽

Vol 12 (19) ◽

pp. 3671 ◽

Cited By ~ 8

Author(s):

Ahmed Mahmoud ◽

Salaheldin Elkatatny ◽

Weiqing Chen ◽

Abdulazeez Abdulraheem

Keyword(s):

Artificial Intelligence ◽

Oil Recovery ◽

Water Saturation ◽

Data Availability ◽

Recovery Factor ◽

Coefficient Of Determination ◽

Support Vector ◽

Subtractive Clustering ◽

Oil Viscosity ◽

Inference System

Hydrocarbon reserve evaluation is the major concern for all oil and gas operating companies. Nowadays, the estimation of oil recovery factor (RF) could be achieved through several techniques. The accuracy of these techniques depends on data availability, which is strongly dependent on the reservoir age. In this study, 10 parameters accessible in the early reservoir life are considered for RF estimation using four artificial intelligence (AI) techniques. These parameters are the net pay (effective reservoir thickness), stock-tank oil initially in place, original reservoir pressure, asset area (reservoir area), porosity, Lorenz coefficient, effective permeability, API gravity, oil viscosity, and initial water saturation. The AI techniques used are the artificial neural networks (ANNs), radial basis neuron networks, adaptive neuro-fuzzy inference system with subtractive clustering, and support vector machines. AI models were trained using data collected from 130 water drive sandstone reservoirs; then, an empirical correlation for RF estimation was developed based on the trained ANN model’s weights and biases. Data collected from another 38 reservoirs were used to test the predictability of the suggested AI models and the ANNs-based correlation; then, performance of the ANNs-based correlation was compared with three of the currently available empirical equations for RF estimation. The developed ANNs-based equation outperformed the available equations in terms of all the measures of error evaluation considered in this study, and also has the highest coefficient of determination of 0.94 compared to only 0.55 obtained from Gulstad correlation, which is one of the most accurate correlations currently available.

Improvement of SVR-Based Drought Forecasting Models using Wavelet Pre-Processing Technique

E3S Web of Conferences ◽

10.1051/e3sconf/20186507007 ◽

2018 ◽

Vol 65 ◽

pp. 07007 ◽

Cited By ~ 3

Author(s):

Kit Fai Fung ◽

Yuk Feng Huang ◽

Chai Hoon Koo

Keyword(s):

Support Vector Regression ◽

River Basin ◽

Absolute Error ◽

Processing Technique ◽

Coefficient Of Determination ◽

Support Vector ◽

Drought Risk ◽

Drought Forecasting ◽

Drought Prediction ◽

Langat River

Drought is a damaging natural hazard caused by precipitation falling short of the expected amount over a period of time. Mitigation measures are required to reduce its impact. Because the onset and end of droughts are difficult to determine, accurate drought forecasting approaches are required for drought risk management. Given the growing use of machine learning in the field, Wavelet-Boosting Support Vector Regression (W-BS-SVR) was proposed for drought forecasting in the Langat River Basin, Malaysia. Monthly rainfall, mean temperature and evapotranspiration for the years 1976 to 2015 were used to compute the Standardized Precipitation Evapotranspiration Index (SPEI) in this study, producing SPEI-1, SPEI-3 and SPEI-6. The 1-month lead time SPEI forecasting capability of the W-BS-SVR model was compared with that of the Support Vector Regression (SVR) and Boosting-Support Vector Regression (BS-SVR) models using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), coefficient of determination (R2) and Adjusted R2. The results demonstrated that W-BS-SVR provides higher accuracy for drought prediction in the Langat River Basin.
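
Wavelet pre-processing splits an input series into a smoother approximation and a detail component before the series is passed to the regression model. The abstract does not name the wavelet family or the number of decomposition levels, so the sketch below assumes a single-level Haar transform purely for illustration; the function names and the sample values are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_step(signal):
    """One level of the Haar wavelet transform on an even-length series."""
    if len(signal) % 2:
        raise ValueError("series length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]  # smoothed trend
    detail = [(a - b) / SQRT2 for a, b in pairs]  # local fluctuation
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original series from one level of Haar coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out


# Made-up index values, for illustration only.
series = [0.4, 0.6, -0.2, -0.8, 1.1, 0.9, 0.0, 0.2]
approx, detail = haar_step(series)
rebuilt = haar_inverse(approx, detail)
print(all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(series, rebuilt)))
```

The transform loses no information, so each component can be modeled separately and the forecasts recombined, which is one common way wavelet-coupled regression models are built.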

Deep-learning GIS hybrid approach in precipitation modeling based on spatio-temporal variables in the coastal zone of Turkey

Climate Research ◽

10.3354/cr01612 ◽

2020 ◽

Vol 81 ◽

pp. 149-165

Author(s):

H Apaydin ◽

MT Sattari

Keyword(s):

Artificial Intelligence ◽

Deep Learning ◽

Water Resource Management ◽

Short Term Memory ◽

Hybrid Approach ◽

Support Vector ◽

Monthly Precipitation ◽

Artificial Intelligence Methods ◽

Temporal Variables ◽

Spatio Temporal

It is well known that precipitation is essential for fauna and flora. Studies have shown that location and temporal factors affect precipitation. Accurate prediction of precipitation is very important for water resource management, and artificial intelligence methods are frequently used to make such predictions. In this study, a hybrid deep-learning and geographic information system (GIS) approach based on spatio-temporal variables was applied to model the amount of precipitation on Turkey’s coastline. Information about latitude, longitude, altitude, distance to the sea, and aspect was taken from meteorological stations, and these factors were used as spatial variables. The change in monthly precipitation was taken into account as a temporal variable. Artificial intelligence methods such as Gaussian process regression, support vector regression, the Broyden-Fletcher-Goldfarb-Shanno artificial neural network, M5, random forest, and long short-term memory (LSTM) were used. According to the results of the study, in which different input variable alternatives were also evaluated, LSTM was the most successful method for predicting precipitation, with an R value of 0.93. The study shows that, at points where no measurement is performed, the amount of precipitation can be estimated and a distribution map can be drawn using spatio-temporal data and the hybrid deep-learning and GIS method.

Improving Soil Thickness Estimations Based on Multiple Environmental Variables with Stacking Ensemble Methods

Remote Sensing ◽

10.3390/rs12213609 ◽

2020 ◽

Vol 12 (21) ◽

pp. 3609

Author(s):

Xinchuan Li ◽

Juhua Luo ◽

Xiuliang Jin ◽

Qiaoning He ◽

Yun Niu

Keyword(s):

Machine Learning ◽

Soil Properties ◽

Environmental Variables ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Accurate Estimation ◽

Support Vector ◽

Topographic Wetness Index ◽

Soil Thickness ◽

Extreme Gradient Boosting

Spatially continuous soil thickness data at large scales are usually not readily available and are often difficult and expensive to acquire. Various machine learning algorithms have become very popular in digital soil mapping to predict and map the spatial distribution of soil properties. Identifying the controlling environmental variables of soil thickness and selecting suitable machine learning algorithms are vitally important in modeling. In this study, 11 quantitative and four qualitative environmental variables were selected to explore the main variables that affect soil thickness. Four commonly used machine learning algorithms (multiple linear regression (MLR), support vector regression (SVR), random forest (RF), and extreme gradient boosting (XGBoost)) were evaluated as individual models to separately predict and obtain a soil thickness distribution map in Henan Province, China. In addition, two stacking ensemble models, using the least absolute shrinkage and selection operator (LASSO) and the generalized boosted regression model (GBM), were tested and applied to build the most reliable and accurate estimation model. The results showed that variable selection was a very important part of soil thickness modeling. Topographic wetness index (TWI), slope, elevation, land use and enhanced vegetation index (EVI) were the most influential environmental variables in soil thickness modeling. Comparative results showed that the XGBoost model outperformed the MLR, RF and SVR models. Importantly, the two stacking models achieved higher performance than the single models, especially when using GBM. In terms of accuracy, the proposed stacking method explained 64.0% of the variation in soil thickness. The results of our study provide useful alternative approaches for mapping soil thickness, with potential for use with other soil properties.

Application of a Hybrid ARIMA–SVR Model Based on the SPI for the Forecast of Drought—A Case Study in Henan Province, China

Journal of Applied Meteorology and Climatology ◽

10.1175/jamc-d-19-0270.1 ◽

2020 ◽

Vol 59 (7) ◽

pp. 1239-1259

Author(s):

Dehe Xu ◽

Qi Zhang ◽

Yan Ding ◽

Huiping Huang

Keyword(s):

Time Scale ◽

Interpolation Method ◽

Moving Average ◽

Arima Model ◽

Absolute Error ◽

Data Driven ◽

Coefficient Of Determination ◽

Kriging Interpolation ◽

Support Vector ◽

Kriging Interpolation Method

Drought forecasts could effectively reduce the risk of drought. Data-driven models are suitable forecast tools because of their minimal information requirements. The motivation for this study is that most data-driven models, such as autoregressive integrated moving average (ARIMA) models, can capture linear relationships but not nonlinear ones, which makes them insufficient for long-term prediction. The hybrid ARIMA–support vector regression (SVR) model proposed in this paper is based on the advantages of a linear model and a nonlinear model. The multiscale standard precipitation indices (SPI: SPI1, SPI3, SPI6, and SPI12) were forecast and compared using the ARIMA model and the hybrid ARIMA–SVR model. The performance of all models was compared using measures of persistence, such as the coefficient of determination, root-mean-square error, mean absolute error, and Nash–Sutcliffe coefficient, together with the kriging interpolation method in the ArcGIS software. The results show that the prediction accuracies of the multiscale SPI of the combined ARIMA–SVR model and the single ARIMA model were related to the time scale of the index, and they gradually increase with an increase in time scale. The predicted value decreases with an increase in lead time. Comparing the measured data with the predicted data from the model shows that the combined ARIMA–SVR model had higher prediction accuracy than the single ARIMA model and that the predicted results 1–2 months ahead show reasonably good agreement with the actual data.

Estimation of municipal waste generation of Turkey using socio-economic indicators by Bayesian optimization tuned Gaussian process regression

Waste Management & Research ◽

10.1177/0734242x20906877 ◽

2020 ◽

Vol 38 (8) ◽

pp. 840-850 ◽

Cited By ~ 5

Author(s):

Zeynep Ceylan

Keyword(s):

Gaussian Process ◽

Gaussian Process Regression ◽

Kernel Functions ◽

Machine Learning Algorithms ◽

Economic Indicators ◽

Bayesian Optimization ◽

Superior Performance ◽

Coefficient Of Determination ◽

Accurate Estimation ◽

Support Vector

Accurate estimation of municipal solid waste (MSW) generation has become a crucial task in decision-making processes for MSW planning and management systems. In this study, a Gaussian process regression (GPR) model tuned by Bayesian optimization was used to forecast the MSW generation of Turkey. The Bayesian optimization method, which can efficiently optimize the hyperparameters of kernel functions in machine learning algorithms, was applied to reduce computational redundancy and enhance the estimation performance of the models. Four socio-economic indicators (population, gross domestic product per capita, inflation rate, and unemployment rate) were used as input variables. The performance of the Bayesian GPR (BGPR) model was compared with that of the multiple linear regression (MLR) and Bayesian support vector regression (BSVR) models. Mean absolute deviation (MAD), root mean square error (RMSE), and coefficient of determination (R2) values were used to evaluate the performance of the models. The exponential-GPR model tuned by Bayesian optimization showed superior performance, with minimum MAD (0.0182) and RMSE (0.0203) and high R2 (0.9914) values in the training phase and minimum MAD (0.0342) and RMSE (0.0463) and high R2 (0.9841) values in the testing phase. The results of this study can help decision-makers to be aware of the socio-economic factors associated with waste management and to ensure optimal usage of their resources in future planning.

Development of Hybrid Artificial Intelligence Approaches and a Support Vector Machine Algorithm for Predicting the Marshall Parameters of Stone Matrix Asphalt

Applied Sciences ◽

10.3390/app9153172 ◽

2019 ◽

Vol 9 (15) ◽

pp. 3172 ◽

Cited By ~ 24

Author(s):

Hoang-Long Nguyen ◽

Thanh-Hai Le ◽

Cao-Thang Pham ◽

Tien-Thinh Le ◽

Lanh Si Ho ◽

...

Keyword(s):

Artificial Intelligence ◽

Support Vector Machine ◽

Mean Squared Error ◽

Fuzzy Inference ◽

Absolute Error ◽

Support Vector ◽

Inference System ◽

Stone Matrix Asphalt ◽

Hybrid Artificial Intelligence ◽

Stone Matrix

The main objective of this study is to develop and compare hybrid Artificial Intelligence (AI) approaches, namely the Adaptive Network-based Fuzzy Inference System (ANFIS) optimized by Genetic Algorithm (GAANFIS) and by Particle Swarm Optimization (PSOANFIS), and the Support Vector Machine (SVM), for predicting the Marshall Stability (MS) of Stone Matrix Asphalt (SMA) materials. Other important properties of the SMA, namely Marshall Flow (MF) and Marshall Quotient (MQ), were also predicted using the best model found. To that end, SMA samples were fabricated in a local laboratory and used to generate datasets for the modeling. The input parameters considered were coarse and fine aggregates, bitumen content and cellulose. The predicted targets were the Marshall Parameters MS, MF and MQ. Model performance was assessed using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and correlation coefficient (R). A Monte Carlo approach with 1000 simulations was used to derive the statistical results for assessing the performance of the three proposed AI models. The results showed that the SVM is the best predictor with regard to the converged statistical criteria and the probability density functions of RMSE, MAE and R. The results of this study represent a contribution towards the selection of a suitable AI approach to quickly and accurately determine the Marshall Parameters of SMA mixtures.
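
The Monte Carlo assessment mentioned above can be pictured as repeating a random train/test split many times and examining the distribution of the error metric. The sketch below is a loose illustration only: the data are synthetic, a simple least-squares line stands in for the paper's AI models, and the 70/30 split fraction, the seeds, and every function name are assumptions not taken from the abstract.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in for an AI model)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def monte_carlo_rmse(xs, ys, n_runs=1000, train_frac=0.7, seed=0):
    """Test-set RMSE for n_runs random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_train = round(train_frac * len(idx))
    scores = []
    for _ in range(n_runs):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        pred = [a + b * xs[i] for i in test]
        scores.append(rmse([ys[i] for i in test], pred))
    return scores


# Synthetic data: a noisy straight line, for illustration only.
data_rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2.0 + 3.0 * x + data_rng.gauss(0, 0.5) for x in xs]

scores = monte_carlo_rmse(xs, ys)
print(f"mean RMSE = {statistics.fmean(scores):.3f}, "
      f"std = {statistics.stdev(scores):.3f}")
```

Comparing such distributions across models, rather than a single split, shows whether one model's advantage holds up regardless of which samples happen to land in the test set.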
