Using machine learning methods for supporting GR2M model in runoff estimation in an ungauged basin

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Pakorn Ditthakit ◽  
Sirimon Pinthong ◽  
Nureehan Salaeh ◽  
Fadilah Binnui ◽  
Laksanara Khwanchum ◽  
...  

Abstract. Estimating monthly runoff variation, especially in ungauged basins, is essential for water resource planning and management. The present study aimed to evaluate regionalization methods for determining the regional parameters of a rainfall-runoff model (the GR2M model). Two classes of regionalization methods were investigated: regression-based methods and distance-based methods. Three regression-based methods were selected, namely Multiple Linear Regression (MLR), Random Forest (RF), and M5 Model Tree (M5), while the two distance-based methods were the Spatial Proximity Approach (SPA) and the Physical Similarity Approach (PSA). Hydrological data and the basins' physical attributes were analyzed for 37 runoff stations in Thailand's southern basin. The results showed that using hydrological data for estimating the GR2M model parameters is better than using the basins' physical attributes. RF was the most accurate in estimating the regional GR2M model parameters, giving the lowest error, followed by M5, MLR, SPA, and PSA. The regional parameters were then applied to estimate monthly runoff using the GR2M model, and their performance was evaluated using three criteria: Nash–Sutcliffe Efficiency (NSE), Correlation Coefficient (r), and Overall Index (OI). The regionalized monthly runoff with RF performed best, followed by SPA, M5, MLR, and PSA. A Taylor diagram was also used to evaluate the results graphically; it indicated that RF provided the products closest to the GR2M results, followed by SPA, M5, PSA, and MLR. Our findings reveal the applicability of machine learning for estimating monthly runoff in ungauged basins. However, SPA would be recommended in areas lacking the basin's physical attributes and hydrological information.
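As an illustration of the regression-based regionalization idea, the sketch below fits a random forest that maps catchment descriptors to calibrated GR2M parameters and then predicts parameters for an ungauged basin. The descriptor set, parameter ranges, and data are hypothetical placeholders, not the study's dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical training table: one row per gauged basin.
# Columns: catchment descriptors (e.g. area, slope, mean rainfall, a soil index).
descriptors = rng.random((37, 4))
# Targets: calibrated GR2M parameters per basin (X1 in mm, X2 dimensionless).
gr2m_params = rng.random((37, 2)) * np.array([1500.0, 5.0])

model = RandomForestRegressor(n_estimators=500, random_state=0)
model.fit(descriptors, gr2m_params)          # multi-output regression

# Transfer parameters to an ungauged basin from its descriptors alone.
ungauged = rng.random((1, 4))
x1, x2 = model.predict(ungauged)[0]
print(f"Regionalized GR2M parameters: X1 = {x1:.1f} mm, X2 = {x2:.2f}")
```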

Water ◽  
2021 ◽  
Vol 13 (16) ◽  
pp. 2294
Author(s):  
Ying Zhu ◽  
Lingxue Liu ◽  
Fangling Qin ◽  
Li Zhou ◽  
Xing Zhang ◽  
...  

In the post-PUB era (2013 onwards), ten years after the Predictions in Ungauged Basins (PUB) initiative was put forward, reducing uncertainty in hydrological prediction in ungauged basins still receives considerable attention. The integration or optimization of traditional regionalization approaches is an effective way to improve river discharge simulation in ungauged basins. For the Jialing River in southwestern China, regression equations between hydrological model parameters and watershed characteristic factors were first established, based on the block-wise use of TOPMODEL (BTOP). This paper then explored the application of twelve regionalization approaches, combining the spatial proximity, physical similarity, integration similarity, and regression-augmented approaches, in five ungauged target basins. The results showed that the spatial proximity approach performs best in the river discharge simulation of the studied basins, while the regression-augmented regionalization approach is satisfactory as well, indicating good potential for application in ungauged basins. However, for the regression-augmented approach, the number of watershed characteristic factors considered in the regression equation affects the simulation results, implying that determining the optimal set of watershed characteristic factors for the model parameter regression equations is crucial for this approach; the regression strength may also be an influencing factor. These findings provide meaningful information for establishing parameter transfer equations, as well as a reference for applying the BTOP model in data-sparse regions. Future research should classify donor basins by the spatial distance between the reference and target basins, and build model parameter regression equations for regression-augmented regionalization within each classification group, to further explore this approach's potential.
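For reference, a minimal sketch of the spatial proximity idea discussed above: the ungauged target basin borrows the complete parameter set of the geographically nearest gauged donor basin. The coordinates and BTOP-like parameter values below are purely illustrative assumptions.

```python
import numpy as np

# Hypothetical donor (gauged) basin centroids (lon, lat) and calibrated
# parameter sets; neither matches the study's actual basins or BTOP values.
donor_centroids = np.array([[106.1, 30.8], [105.4, 31.6], [107.0, 29.9]])
donor_params = [{"m": 0.02, "T0": 5.0},
                {"m": 0.04, "T0": 8.0},
                {"m": 0.03, "T0": 6.5}]

# The ungauged target basin adopts the full parameter set of the nearest donor.
target_centroid = np.array([106.4, 30.3])
nearest = int(np.argmin(np.linalg.norm(donor_centroids - target_centroid, axis=1)))
print(f"Donor basin {nearest} selected; transferred parameters: {donor_params[nearest]}")
```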


2021 ◽  
Vol 40 (5) ◽  
pp. 9361-9382 ◽  
Author(s):  
Naeem Iqbal ◽  
Rashid Ahmad ◽  
Faisal Jamil ◽  
Do-Hyeun Kim

Quality prediction plays an essential role in the business outcome of a product and, given its commercial relevance, has been studied extensively in recent years. With advances in machine learning (ML) techniques and the advent of robust and sophisticated ML algorithms, it has become feasible to analyze the factors influencing the success of movies. This paper presents a hybrid features prediction model based on pre-released and social media data features, using multiple ML techniques to predict the quality of pre-released movies for effective business resource planning. This study integrates pre-released and social media data features to form a hybrid features-based movie quality prediction (MQP) model. The proposed model comprises two experimental setups: (i) predicting movie quality using the original set of features, and (ii) predicting the movie success class using a feature subset derived with principal component analysis (PCA). This work employs and implements different ML-based classification models, namely Decision Tree (DT), Support Vector Machines with linear and quadratic kernels (L-SVM and Q-SVM), Logistic Regression (LR), Bagged Tree (BT), and Boosted Tree (BOT), to predict movie quality. Several performance measures, namely Accuracy (AC), Precision (PR), Recall (RE), and F-Measure (FM), are used to evaluate the proposed classification models. The experimental results reveal that the BT and BOT classifiers produced higher accuracy than the other classifiers (DT, LR, L-SVM, and Q-SVM). The BT and BOT classifiers achieved accuracies of 90.1% and 89.7%, respectively, demonstrating the efficiency of the proposed MQP model compared with other state-of-the-art techniques. The proposed work is also compared with existing prediction models, and the results indicate that the MQP model performs slightly better. These results can help the movie industry to plan business resources effectively, such as investment, number of screens, and release dates.
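A hedged sketch of the two experimental setups described above, using synthetic placeholder data: (i) classification on the full hybrid feature set and (ii) classification after PCA-based feature reduction, with scikit-learn's bagged and gradient-boosted trees standing in for the BT and BOT classifiers.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.random((500, 20))        # hybrid pre-released + social-media features (synthetic)
y = rng.integers(0, 2, 500)      # movie success class (synthetic)

for name, clf in [("Bagged Tree", BaggingClassifier(random_state=1)),
                  ("Boosted Tree", GradientBoostingClassifier(random_state=1))]:
    acc_full = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    X_pca = PCA(n_components=10).fit_transform(X)   # experiment (ii): PCA feature subset
    acc_pca = cross_val_score(clf, X_pca, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: accuracy (full features) = {acc_full:.3f}, (PCA) = {acc_pca:.3f}")
```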


2021 ◽  
Vol 11 (15) ◽  
pp. 6704
Author(s):  
Jingyong Cai ◽  
Masashi Takemoto ◽  
Yuming Qiu ◽  
Hironori Nakajo

Despite being heavily used in the training of deep neural networks (DNNs), multipliers are resource-intensive and often in short supply in many hardware scenarios. Previous work has shown the benefits of computing activation functions, such as the sigmoid, with shift-and-add operations, although such methods fail to remove multiplications from training altogether. In this paper, we propose an innovative approach that converts all multiplications in the forward and backward inferences of DNNs into shift-and-add operations. Because the model parameters and backpropagated errors of a large DNN model are typically clustered around zero, these values can be approximated by their sine values. Multiplications between the weights and error signals are thus transferred to multiplications of their sine values, which are replaceable with simpler operations with the help of the product-to-sum formula. In addition, a rectified sine activation function is used to further convert layer inputs into sine values. In this way, the original multiplication-intensive operations can be computed through simple shift-and-add operations. This trigonometric approximation method provides an efficient training and inference alternative for devices with insufficient hardware multipliers. Experimental results demonstrate that the method achieves performance close to that of classical training algorithms. The proposed approach sheds new light on future hardware customization research for machine learning.
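The key identity is the product-to-sum formula, sin(a)·sin(b) = ½[cos(a − b) − cos(a + b)], which turns the weight-by-error product into additions, subtractions, and cosine evaluations that lend themselves to shift-and-add hardware. The tiny numerical check below uses illustrative values; it is not the authors' implementation.

```python
import numpy as np

w, e = 0.013, -0.021   # a weight and a backpropagated error, clustered near zero

# Since sin(x) is approximately x near zero, w * e is approximately sin(w) * sin(e),
# and the product-to-sum identity rewrites that product without multiplying variables:
approx = 0.5 * (np.cos(w - e) - np.cos(w + e))   # equals sin(w) * sin(e) exactly

print(w * e, approx)   # the two values agree closely for small w and e
```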


Information ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 109 ◽  
Author(s):  
Iman Rahimi ◽  
Amir H. Gandomi ◽  
Panagiotis G. Asteris ◽  
Fang Chen

The novel coronavirus disease, also known as COVID-19, is a disease outbreak that was first identified in Wuhan, a Central Chinese city. In this report, a short analysis focusing on Australia, Italy, and the UK is conducted. The analysis includes confirmed and recovered cases and deaths, the growth rate in Australia compared with that in Italy and the UK, and the trend of the disease in different Australian regions. Mathematical approaches based on susceptible–infected–recovered (SIR) and susceptible–exposed–infected–quarantined–recovered (SEIQR) models are proposed to predict the epidemiology in the above-mentioned countries. Since the performance of the classic forms of SIR and SEIQR depends on parameter settings, several optimization algorithms, namely Broyden–Fletcher–Goldfarb–Shanno (BFGS), conjugate gradients (CG), limited-memory bound-constrained BFGS (L-BFGS-B), and Nelder–Mead, are used to optimize the parameters and improve the predictive capabilities of the SIR and SEIQR models. The results of the optimized SIR and SEIQR models were compared with those of two well-known machine learning algorithms, i.e., the Prophet algorithm and the logistic function. The results demonstrate the different behaviors of these algorithms in different countries as well as the better performance of the improved SIR and SEIQR models. Moreover, the Prophet algorithm was found to provide better prediction performance than the logistic function, and better prediction performance for Italy and the UK than for Australia. Therefore, the Prophet algorithm appears to be suitable for data with an increasing trend in the context of a pandemic. Optimizing the SIR and SEIQR model parameters yielded a significant improvement in the prediction accuracy of the models. Despite the availability of several algorithms for trend prediction in this pandemic, no single algorithm is optimal for all cases.
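As a sketch of the parameter optimization step, the snippet below fits the SIR transmission and recovery rates to a case curve with SciPy's L-BFGS-B, one of the optimizers listed above. The population size, initial conditions, and "observed" series are synthetic assumptions, not the study's data.

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import minimize

N = 1_000_000                         # assumed population size
t = np.arange(60)                     # days since the first reported cases
observed = 50.0 * np.exp(0.12 * t)    # placeholder "active cases" curve

def sir(y, t, beta, gamma):
    S, I, R = y
    return [-beta * S * I / N, beta * S * I / N - gamma * I, gamma * I]

def loss(params):
    beta, gamma = params
    sol = odeint(sir, [N - 50.0, 50.0, 0.0], t, args=(beta, gamma))
    return np.mean((sol[:, 1] - observed) ** 2)   # misfit on the infected compartment

res = minimize(loss, x0=[0.3, 0.1], method="L-BFGS-B",
               bounds=[(1e-4, 2.0), (1e-4, 1.0)])
print("Fitted beta, gamma:", res.x)
```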


2021 ◽  
pp. 1-18
Author(s):  
Gisela Vanegas ◽  
John Nejedlik ◽  
Pascale Neff ◽  
Torsten Clemens

Summary. Forecasting production from hydrocarbon fields is challenging because of the large number of uncertain model parameters and the multitude of observed data that are measured. The large number of model parameters leads to uncertainty in the production forecast from hydrocarbon fields. Changing operating conditions [e.g., implementation of improved oil recovery or enhanced oil recovery (EOR)] results in model parameters becoming sensitive in the forecast that were not sensitive during the production history. Hence, simulation approaches need to be able to address uncertainty in model parameters as well as conditioning numerical models to a multitude of different observed data. Sampling from distributions of various geological and dynamic parameters allows for the generation of an ensemble of numerical models that could be falsified using principal-component analysis (PCA) for different observed data. If the numerical models are not falsified, machine-learning (ML) approaches can be used to generate a large set of parameter combinations that can be conditioned to the different observed data. The data conditioning is followed by a final step ensuring that parameter interactions are covered. The methodology was applied to a sandstone oil reservoir with more than 70 years of production history containing dozens of wells. The resulting ensemble of numerical models is conditioned to all observed data. Furthermore, the resulting posterior-model parameter distributions are only modified from the prior-model parameter distributions if the observed data are informative for the model parameters. Hence, changes in operating conditions can be forecast under uncertainty, which is essential if nonsensitive parameters in the history are sensitive in the forecast.
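A minimal sketch of the PCA-based falsification idea under stated assumptions: the ensemble's simulated responses and the observed data are projected into a low-dimensional principal-component space, and the ensemble is considered consistent only if the observations fall within its spread. The data and the three-standard-deviation criterion are illustrative choices, not the authors' workflow.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
simulated = rng.normal(size=(200, 120))   # 200 ensemble members x 120 observed-data points
observed = rng.normal(size=(1, 120))      # the field measurements (placeholder)

pca = PCA(n_components=2).fit(simulated)
sim_pc = pca.transform(simulated)
obs_pc = pca.transform(observed)[0]

# Illustrative falsification criterion: if the observations fall outside three
# ensemble standard deviations in any retained component, the prior ensemble
# (and hence its parameter distributions) would be considered falsified.
consistent = bool(np.all(np.abs(obs_pc - sim_pc.mean(axis=0)) <= 3.0 * sim_pc.std(axis=0)))
print("Ensemble consistent with observations:", consistent)
```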


2011 ◽  
Vol 8 (4) ◽  
pp. 7017-7053 ◽  
Author(s):  
Z. Bao ◽  
J. Liu ◽  
J. Zhang ◽  
G. Fu ◽  
G. Wang ◽  
...  

Abstract. Equifinality is unavoidable when transferring model parameters from gauged catchments to ungauged catchments for predictions in ungauged basins (PUB). A framework for estimating the three baseflow parameters of the variable infiltration capacity (VIC) model directly from soil and topography properties is presented. With the new parameter-setting methodology, the number of parameters that need to be calibrated is reduced from six to three, which leads to a decrease in equifinality and uncertainty. This is validated by Monte Carlo simulations in 24 hydro-climatic catchments in China. Using the new parameter estimation approach, the model parameters become more sensitive and the extent of the parameter space is smaller when a goodness-of-fit threshold is given. This means that parameter uncertainty is reduced with the new parameter-setting methodology. In addition, the uncertainty of the model simulations is estimated with the generalised likelihood uncertainty estimation (GLUE) methodology. The results indicate that the uncertainty of the streamflow simulations, i.e., the confidence interval, is lower with the new parameter estimation methodology than with the original calibration methodology. The new baseflow parameter estimation framework could be applied in the VIC model and other appropriate models for PUB.
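To make the GLUE step concrete, here is a minimal sketch under simplifying assumptions: Monte Carlo sampling of the remaining calibrated parameters, a behavioural threshold on the Nash–Sutcliffe efficiency, and retention of the behavioural sets for uncertainty bounds. The toy "model" and data are placeholders rather than VIC itself.

```python
import numpy as np

rng = np.random.default_rng(3)
obs = rng.gamma(2.0, 5.0, 365)                 # placeholder daily streamflow record

def run_model(params):
    # Stand-in for a VIC run; the calibrated parameters are reduced here to a
    # two-parameter toy transformation of the observed series.
    return obs * params[0] + params[1]

# Monte Carlo sampling of the parameter space within assumed prior ranges.
samples = rng.uniform(low=[0.5, -2.0], high=[1.5, 2.0], size=(5000, 2))
nse = np.array([1.0 - np.sum((run_model(p) - obs) ** 2) /
                      np.sum((obs - obs.mean()) ** 2) for p in samples])

# Behavioural sets: parameter samples whose goodness of fit exceeds a threshold.
behavioural = samples[nse > 0.6]
print(f"{len(behavioural)} behavioural sets; retained parameter ranges:",
      behavioural.min(axis=0), behavioural.max(axis=0))
```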


2020 ◽  
Author(s):  
Murad Megjhani ◽  
Kalijah Terilli ◽  
Ayham Alkhachroum ◽  
David J. Roh ◽  
Sachin Agarwal ◽  
...  

Abstract. Objective: To develop a machine learning-based tool, using routine vital signs, to assess delayed cerebral ischemia (DCI) risk over time. Methods: In this retrospective analysis, physiologic data for 540 consecutive acute subarachnoid hemorrhage patients were collected and annotated as part of a prospective observational cohort study between May 2006 and December 2014. Patients were excluded if (i) no physiologic data were available, (ii) they expired prior to the DCI onset window (< post-bleed day 3), or (iii) early angiographic vasospasm was detected on the admitting angiogram. DCI was prospectively labeled by consensus of treating physicians. Occurrence of DCI was classified using various machine learning approaches including logistic regression, random forest, support vector machine (linear and kernel), and an ensemble classifier, trained on vital-sign and subject-characteristic features. Hourly risk scores were generated as the posterior probability at time t. We performed five-fold nested cross-validation to tune the model parameters and to report the accuracy. All classifiers were evaluated for good discrimination using the area under the receiver operating characteristic curve (AU-ROC) and confusion matrices. Results: Of the 310 patients included in our final analysis, 101 (32.6%) developed DCI. We achieved a maximal classification performance of 0.81 [0.75-0.82] AU-ROC. We also predicted 74.7% of all DCI events 12 hours before typical clinical detection, with a ratio of 3 true alerts for every 2 false alerts. Conclusion: A data-driven machine learning-based detection tool offered hourly assessments of DCI risk and incorporated new physiologic information over time.
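A hedged sketch of the five-fold nested cross-validation described above: an inner loop tunes hyperparameters with a grid search and an outer loop reports AU-ROC. A random forest stands in for the ensemble classifier, and synthetic features replace the vital-sign data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

rng = np.random.default_rng(4)
X = rng.random((310, 24))          # synthetic stand-in for vital-sign features
y = rng.integers(0, 2, 310)        # DCI vs. no-DCI label (synthetic)

inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=4)   # tuning folds
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=5)   # reporting folds

tuned = GridSearchCV(RandomForestClassifier(random_state=4),
                     param_grid={"n_estimators": [100, 300], "max_depth": [3, None]},
                     cv=inner, scoring="roc_auc")
auc = cross_val_score(tuned, X, y, cv=outer, scoring="roc_auc")
print("Nested-CV AU-ROC: %.2f +/- %.2f" % (auc.mean(), auc.std()))
```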


2021 ◽  
Vol 25 (11) ◽  
pp. 5805-5837
Author(s):  
Oscar M. Baez-Villanueva ◽  
Mauricio Zambrano-Bigiarini ◽  
Pablo A. Mendoza ◽  
Ian McNamara ◽  
Hylke E. Beck ◽  
...  

Abstract. Over the past decades, novel parameter regionalisation techniques have been developed to predict streamflow in data-scarce regions. In this paper, we examined how the choice of gridded daily precipitation (P) products affects the relative performance of three well-known parameter regionalisation techniques (spatial proximity, feature similarity, and parameter regression) over 100 near-natural catchments with diverse hydrological regimes across Chile. We set up and calibrated a conceptual semi-distributed HBV-like hydrological model (TUWmodel) for each catchment, using four P products (CR2MET, RF-MEP, ERA5, and MSWEPv2.8). We assessed the ability of these regionalisation techniques to transfer the parameters of a rainfall-runoff model, implementing a leave-one-out cross-validation procedure for each P product. Despite differences in the spatio-temporal distribution of P, all products provided good performance during calibration (median Kling–Gupta efficiencies (KGE′s) > 0.77), two independent verification periods (median KGE′s > 0.70 and 0.61 for near-normal and dry conditions, respectively), and regionalisation (median KGE′s for the best method ranging from 0.56 to 0.63). We show how model calibration is able to compensate, to some extent, for differences between P forcings by adjusting model parameters and thus the water balance components. Overall, feature similarity provided the best results, followed by spatial proximity, while parameter regression resulted in the worst performance, reinforcing the importance of transferring complete model parameter sets to ungauged catchments. Our results suggest that (i) merging P products and ground-based measurements does not necessarily translate into improved hydrological model performance; (ii) the spatial resolution of P products does not substantially affect regionalisation performance; (iii) the P product that provides the best individual model performance during calibration and verification does not necessarily yield the best performance in terms of parameter regionalisation; and (iv) the model parameters and the performance of regionalisation methods are affected by the hydrological regime, with the best results for spatial proximity and feature similarity obtained for rain-dominated catchments with a minor snowmelt component.
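For reference, a small sketch of the modified Kling–Gupta efficiency (KGE′, Kling et al., 2012) used as the evaluation metric here, computed from the correlation, bias ratio, and variability ratio; the simulated and observed series are synthetic placeholders.

```python
import numpy as np

def kge_prime(sim, obs):
    """Modified Kling-Gupta efficiency (Kling et al., 2012)."""
    r = np.corrcoef(sim, obs)[0, 1]                               # linear correlation
    beta = sim.mean() / obs.mean()                                # bias ratio
    gamma = (sim.std() / sim.mean()) / (obs.std() / obs.mean())   # variability ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)

rng = np.random.default_rng(5)
obs = rng.gamma(2.0, 3.0, 1000)                  # placeholder daily streamflow
sim = 0.9 * obs + rng.normal(0.0, 1.0, 1000)     # placeholder simulation
print(f"KGE' = {kge_prime(sim, obs):.2f}")
```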


2018 ◽  
Author(s):  
Fidele Karamage ◽  
Yuanbo Liu ◽  
Xingwang Fan ◽  
Meta Francis Justine ◽  
Guiping Wu ◽  
...  

Abstract. Lack of sufficient and reliable hydrological information is a key hindrance to water resource planning and management in Africa. Hence, the objective of this research is to examine the relationship between precipitation and runoff at three spatial scales: the whole continent, 25 major basins, and 55 countries. For this purpose, the long-term monthly runoff coefficient (Rc) was estimated using long-term monthly runoff data (R) calculated from Global Runoff Data Centre (GRDC) streamflow records and Global Precipitation Climatology Centre (GPCC) precipitation datasets for the period 1901–2017. Subsequently, the observed Rc data were interpolated to estimate Rc over the ungauged basins under the guidance of key runoff-controlling factors, including land-surface temperature (T), precipitation (P), and the potential runoff coefficient (Co) inferred from land use and land cover, slope, and soil texture information. The results show that 16% of the annual mean precipitation (672.52 mm) becomes runoff (105.72 mm), giving a runoff coefficient of 0.16, while the remaining 84% (566.80 mm) is lost to evapotranspiration over the continent during 1901–2017. Spatial analysis reveals that the precipitation–runoff relationship varies significantly among basins and countries, depending mainly on climatic conditions and their inter-annual variability. Generally, high runoff depths and runoff coefficients are observed over humid tropical basins and countries with high precipitation intensity compared with those located in subtropical and temperate drylands.
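The runoff coefficient itself is a simple ratio, Rc = R / P; the short check below reproduces the continental figures quoted above (the two input values are taken from the abstract, everything else is arithmetic).

```python
# Values taken from the abstract above; everything else is arithmetic.
annual_precip_mm = 672.52       # long-term annual mean precipitation, P
annual_runoff_mm = 105.72       # long-term annual mean runoff, R

rc = annual_runoff_mm / annual_precip_mm            # runoff coefficient Rc = R / P
et_mm = annual_precip_mm - annual_runoff_mm         # water lost to evapotranspiration
print(f"Rc = {rc:.2f}, ET = {et_mm:.2f} mm")        # approximately 0.16 and 566.80 mm
```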


Author(s):  
Jakub Gęca

The consequences of failures and unscheduled maintenance are the reasons why engineers have been trying to increase the reliability of industrial equipment for years. In modern solutions, predictive maintenance is a frequently used method: it makes it possible to forecast failures and to raise alerts about their likelihood. This paper presents a summary of the machine learning algorithms that can be used in predictive maintenance and a comparison of their performance. The analysis was based on a data set from the Microsoft Azure AI Gallery. The paper presents a comprehensive approach to the problem, including feature engineering, preprocessing, dimensionality reduction techniques, and tuning of model parameters to obtain the highest possible performance. The research showed that, in the analysed case, the best algorithm achieved 99.92% accuracy on over 122 thousand test data records. In conclusion, predictive maintenance based on machine learning represents the future of machine reliability in industry.
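A hedged sketch of the kind of pipeline described above, combining preprocessing, dimensionality reduction, and model parameter tuning; the synthetic sensor features and failure labels are stand-ins for the Azure AI Gallery data, and the random forest is an assumed example classifier rather than the paper's best model.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)
X = rng.random((5000, 30))            # synthetic sensor/telemetry features
y = rng.integers(0, 2, 5000)          # synthetic failure label
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=6)

pipe = Pipeline([("scale", StandardScaler()),          # preprocessing
                 ("pca", PCA(n_components=15)),        # dimensionality reduction
                 ("clf", RandomForestClassifier(random_state=6))])
search = GridSearchCV(pipe, {"clf__n_estimators": [100, 300]}, cv=3)   # parameter tuning
search.fit(X_tr, y_tr)
print("Held-out test accuracy:", search.score(X_te, y_te))
```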

