Application of Machine Learning Models to Predict Maximum Event Water Fractions in Streamflow

2021, Vol 3. Author(s): Amir Sahraei, Alejandro Chamorro, Philipp Kraft, Lutz Breuer

Estimating the maximum event water fraction, at which the event water contribution to streamflow reaches its peak value during a precipitation event, gives insight into runoff generation mechanisms and hydrological response characteristics of a catchment. Stable isotopes of water are ideal tracers for accurate estimation of maximum event water fractions using isotopic hydrograph separation techniques. However, sampling and measuring stable isotopes of water is laborious, cost-intensive, and often not feasible under difficult spatiotemporal conditions. Therefore, there is a need for a predictive model that can estimate maximum event water fractions even at times when no direct sampling and measurements of stable isotopes of water are available. The behavior of the maximum event water fraction at the event scale is highly dynamic, and its relationships with the catchment drivers are complex and non-linear. In the last two decades, machine learning algorithms have become increasingly popular in various branches of hydrology due to their ability to represent complex and non-linear systems without any a priori assumptions about the structure of the data or knowledge of the underlying physical processes. Despite the advantages of machine learning, its potential in the field of isotope hydrology has rarely been investigated. The present study investigates the applicability of Artificial Neural Network (ANN) and Support Vector Machine (SVM) algorithms to predict maximum event water fractions in streamflow in the Schwingbach Environmental Observatory (SEO), Germany, using precipitation, soil moisture, and air temperature as a set of explanatory input features that are more straightforward and less expensive to measure than stable isotopes of water. The influence of hyperparameter configurations on model performance and a comparison of prediction performance between the optimized ANN and the optimized SVM are further investigated. The performances of the models are evaluated using mean absolute error (MAE), root mean squared error (RMSE), coefficient of determination (R2), and Nash-Sutcliffe Efficiency (NSE). For the ANN, the results showed that an appropriate number of hidden nodes and a proper activation function enhanced model performance, whereas changes in the learning rate did not have a major impact. For the SVM, the polynomial kernel achieved the best performance, whereas the linear kernel yielded the weakest performance among the kernel functions. The results showed that the maximum event water fraction could be successfully predicted using only precipitation, soil moisture, and air temperature. The optimized ANN showed a satisfactory prediction performance with an MAE of 10.27%, RMSE of 12.91%, R2 of 0.70, and NSE of 0.63. The optimized SVM was superior to the ANN, with an MAE of 7.89%, RMSE of 9.43%, R2 of 0.83, and NSE of 0.78. The SVM could better capture the dynamics of maximum event water fractions across events, and its predictions were generally closer to the corresponding observed values. The ANN tended to underestimate events with high maximum event water fractions and to overestimate events with low maximum event water fractions. Machine learning can thus be a promising approach for predicting variables that cannot always be estimated directly due to a lack of routine measurements.
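
To make the workflow above concrete, here is a minimal, hypothetical sketch (not the authors' code) of how an ANN and an SVM regressor could be tuned and compared on precipitation, soil moisture, and air temperature using scikit-learn; the synthetic data, parameter grids, and train/test split are illustrative assumptions only.

```python
# Hypothetical sketch: tune and compare an ANN (MLP) and an SVM for predicting
# maximum event water fractions (%) from precipitation, soil moisture, and air
# temperature. Data, grids, and split are placeholders, not the study's setup.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(42)
X = rng.random((120, 3))        # columns: precipitation, soil moisture, air temperature
y = rng.random(120) * 100       # maximum event water fraction in %

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

ann = GridSearchCV(
    make_pipeline(StandardScaler(), MLPRegressor(max_iter=5000, random_state=42)),
    {"mlpregressor__hidden_layer_sizes": [(2,), (5,), (10,)],
     "mlpregressor__activation": ["logistic", "tanh", "relu"]},
    cv=5)
svm = GridSearchCV(
    make_pipeline(StandardScaler(), SVR()),
    {"svr__kernel": ["linear", "poly", "rbf"], "svr__C": [1, 10, 100]},
    cv=5)

for name, model in [("ANN", ann), ("SVM", svm)]:
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    # Nash-Sutcliffe efficiency: 1 minus ratio of error variance to observation variance
    nse = 1 - np.sum((y_test - pred) ** 2) / np.sum((y_test - y_test.mean()) ** 2)
    print(name, model.best_params_,
          f"MAE={mean_absolute_error(y_test, pred):.2f}",
          f"RMSE={np.sqrt(mean_squared_error(y_test, pred)):.2f}",
          f"R2={r2_score(y_test, pred):.2f}",
          f"NSE={nse:.2f}")
```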

2020. Author(s): Alejandro Chamorro, Amirhossein Sahraei, Tobias Houska, Lutz Breuer

Abstract In recent years, stable isotopes of water have become a well-known tool to investigate runoff generation processes. The proper estimation of stable water isotope concentration dynamics based on a set of independent multivariate variables would allow the quantification of the event water fraction in stream water even at times when no direct measurements of isotopes are available. Here we estimate stable water isotope concentrations and derived event water fractions in stream water over 40 precipitation events. A mobile field laboratory was set up to measure high-resolution (20 min) stable isotopes of water by laser spectrometry. Artificial neural networks (ANN) were established to model the same information. We consider precipitation and antecedent wetness hydrometrics such as precipitation depth, precipitation intensity, and soil moisture at different depths as independent variables measured at the same high temporal resolution. An important issue is the reduction of the deviation between observations and simulations in both the training and testing sets of the network. In order to minimize this difference, various combinations of variables, dimensionalities of the training and testing sets, and ANN architectures are studied. A k-fold cross-validation analysis is performed to find the best solution. Further constraints in the iteration procedure are considered to avoid overfitting. The study was carried out in the Schwingbach Environmental Observatory (SEO), Germany. Results indicate good performance of the optimized model, which captured the dynamics of the isotope concentrations and the estimated event water fractions in the stream water. The ANN-based model clearly outperformed a multivariate linear model, showing the smallest deviation. The optimum network consists of 2 hidden nodes with a 5-dimensional input set. This strongly suggests that ANN-based models can be used to estimate and even forecast the dynamics of isotope concentrations and event water fractions for future precipitation events.
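
The architecture search described above could be organized roughly as in the following sketch, assuming hypothetical predictors and placeholder isotope values; the k-fold cross-validation loop and the early-stopping guard against overfitting are illustrative choices, not the authors' exact setup.

```python
# Hypothetical sketch: k-fold cross-validation over small ANN architectures for
# estimating stable water isotope concentrations from hydrometric predictors.
# Predictors, target values, and candidate sizes are placeholders.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.random((200, 5))          # e.g. precipitation depth, intensity, soil moisture at three depths
y = rng.normal(-8.0, 1.0, 200)    # placeholder delta-18O values (per mil)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for n_hidden in (2, 4, 8):
    rmses = []
    for train_idx, test_idx in kf.split(X):
        model = make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(n_hidden,),
                         early_stopping=True,   # simple guard against overfitting
                         max_iter=5000, random_state=0))
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        rmses.append(np.sqrt(np.mean((y[test_idx] - pred) ** 2)))
    print(f"{n_hidden} hidden nodes: mean cross-validated RMSE = {np.mean(rmses):.3f}")
```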


2007, Vol 46 (10), pp. 1587-1605. Author(s): J-F. Miao, D. Chen, K. Borne

Abstract In this study, the performance of two advanced land surface models (LSMs; Noah LSM and Pleim–Xiu LSM) coupled with the fifth-generation Pennsylvania State University–National Center for Atmospheric Research Mesoscale Model (MM5), version 3.7.2, in simulating the near-surface air temperature in the greater Göteborg area in Sweden is evaluated and compared using the GÖTE2001 field campaign data. Further, the effects of different planetary boundary layer schemes [Eta and Medium-Range Forecast (MRF) PBLs] for the Noah LSM and of soil moisture initialization approaches for the Pleim–Xiu LSM are investigated. The investigation focuses on the evaluation and comparison of diurnal cycle intensity and maximum and minimum temperatures, as well as the urban heat island during daytime and nighttime, under clear-sky and cloudy/rainy weather conditions for the different experimental schemes. The results indicate that 1) there is an evident difference between the Noah LSM and the Pleim–Xiu LSM in simulating the near-surface air temperature, especially in the modeled urban heat island; 2) there is no evident difference in model performance between the Eta PBL and MRF PBL coupled with the Noah LSM; and 3) soil moisture initialization is of crucial importance for model performance in the Pleim–Xiu LSM. In addition, owing to the recent release of MM5, version 3.7.3, some experiments done with version 3.7.2 were repeated to reveal the effects of the modifications in the Noah LSM and Pleim–Xiu LSM. The modification to longwave radiation parameterizations in the Noah LSM significantly improves model performance, while the adjustment of emissivity, one of the vegetation properties, affects Pleim–Xiu LSM performance to a larger extent. The study suggests that improvements both in Noah LSM physics and in Pleim–Xiu LSM initialization of soil moisture and parameterization of vegetation properties are important.


2003, Vol 17 (6), pp. 1073-1092. Author(s): A. Sugimoto, D. Naito, N. Yanagisawa, K. Ichiyanagi, N. Kurita, ...

2021, Vol 4 (1). Author(s): Fatma-Elzahraa Eid, Haitham A. Elmarakeby, Yujia Alina Chan, Nadine Fornelos, Mahmoud ElHefnawi, ...

Abstract Biases in data used to train machine learning (ML) models can inflate their prediction performance and confound our understanding of how and what they learn. Although biases are common in biological data, systematic auditing of ML models to identify and eliminate these biases is not a common practice when applying ML in the life sciences. Here we devise a systematic, principled, and general approach to audit ML models in the life sciences. We use this auditing framework to examine biases in three ML applications of therapeutic interest and identify unrecognized biases that hinder the ML process and result in substantially reduced model performance on new datasets. Ultimately, we show that ML models tend to learn primarily from data biases when there is insufficient signal in the data to learn from. We provide detailed protocols, guidelines, and examples of code to enable tailoring of the auditing framework to other biomedical applications.
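
The paper provides its own protocols and code; purely as a generic illustration of one audit step that is often useful in this setting (and not the authors' framework), the sketch below compares cross-validated AUROC on real labels against a permuted-label baseline to check whether apparent skill exceeds what bias or chance alone would produce. All data here are synthetic placeholders.

```python
# Generic illustration (not the authors' auditing framework): compare cross-validated
# AUROC on real labels with a permuted-label baseline. If the real-label score is not
# clearly above the permuted scores, apparent skill may stem from bias or chance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import permutation_test_score

rng = np.random.default_rng(1)
X = rng.random((300, 20))        # placeholder features
y = rng.integers(0, 2, 300)      # placeholder binary labels

clf = RandomForestClassifier(n_estimators=200, random_state=1)
score, perm_scores, p_value = permutation_test_score(
    clf, X, y, scoring="roc_auc", cv=5, n_permutations=100, random_state=1)

print(f"AUROC on real labels:          {score:.3f}")
print(f"mean AUROC on permuted labels: {perm_scores.mean():.3f} (p = {p_value:.3f})")
```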


2021. Author(s): Emy Alerskans, Joachim Nyborg, Morten Birk, Eigil Kaas

Numerical weather prediction (NWP) models are known to exhibit systematic errors, especially for near-surface variables such as air temperature. This is partly due to deficiencies in the physical formulation of the model dynamics and the inability of these models to successfully handle sub-grid phenomena. Forecasts that better match the locally observed weather can be obtained by post-processing NWP model output using local meteorological observations. Here, we have implemented a non-linear post-processing model based on machine learning techniques with the aim of post-processing near-surface air temperature forecasts from a global coarse-resolution model in order to produce localized forecasts. The model is trained on observations from a network of private weather stations and forecast data from the global coarse-resolution NWP model. Independent data are used to assess the performance of the model, and the results are compared with the performance of the raw NWP model output. Overall, the non-linear machine learning post-processing method reduces the bias and the standard deviation compared to the raw NWP forecast and produces a forecast that better matches the locally observed weather.
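
A minimal sketch of this kind of post-processing, under the assumption of synthetic station observations and a gradient-boosting regressor standing in for whichever non-linear learner was actually used, could look like this:

```python
# Hypothetical sketch: non-linear post-processing of raw NWP 2 m temperature forecasts
# against station observations. All data are synthetic; the learner is an assumption.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n = 1000
nwp_t2m = rng.normal(10.0, 8.0, n)                  # raw NWP forecast temperature (degC)
extra = rng.random((n, 3))                          # e.g. forecast wind, humidity, hour of day
obs_t2m = nwp_t2m + 1.5 + rng.normal(0.0, 1.0, n)   # synthetic observations with a systematic bias

X = np.column_stack([nwp_t2m, extra])
X_train, X_test, y_train, y_test = train_test_split(X, obs_t2m, test_size=0.3, random_state=7)

model = GradientBoostingRegressor(random_state=7).fit(X_train, y_train)

# Compare error statistics of the raw forecast and the post-processed forecast.
for name, pred in [("raw NWP", X_test[:, 0]), ("post-processed", model.predict(X_test))]:
    err = pred - y_test
    print(f"{name}: bias = {err.mean():+.2f} K, std = {err.std():.2f} K")
```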


2021. Author(s): Doris Duethmann, Aaron Smith, Lukas Kleine, Chris Soulsby, Doerthe Tetzlaff

It is widely acknowledged that calibrating and evaluating hydrological models only against streamflow may lead to inconsistencies of internal model states and large parameter uncertainties. Soil moisture is a key variable for the energy and water balance, which affects the partitioning of solar radiation into latent and sensible heat as well as the partitioning of precipitation into direct runoff and catchment storage. In contrast to ground-based measurements, satellite-derived soil moisture (SDSM) data are widely available, and new data products benefit from improved spatio-temporal resolutions. Here we use a soil water index product based on data fusion of microwave data from METOP ASCAT and Sentinel 1 CSAR for calibrating the process-based ecohydrological model EcH2O-iso in the 66 km² Demnitzer Millcreek catchment in NE Germany. Available field measurements in and close to this intensively monitored catchment include soil moisture data from 74 sensors and water stable isotopes in precipitation, stream and soil water. Water stable isotopes provide information on flow pathways, storage dynamics, and the partitioning of evapotranspiration into evaporation and transpiration. Accounting for water stable isotopes in the ecohydrological model therefore provides further insights regarding the consistency of internal processes. We first compare the SDSM data to the ground-based measurements. Based on a Monte Carlo approach, we then investigate the trade-off between model performance in terms of soil moisture and streamflow. In situ soil moisture and water stable isotopes are further used to evaluate the internal consistency of the model. Overall, we find relatively good agreement between satellite-derived and ground-based soil moisture dynamics. Preliminary results suggest that including SDSM in the model calibration can improve the simulation of internal processes, but uncertainties of the SDSM data should be accounted for. The findings of this study are relevant for reliable ecohydrological modelling in catchments that lack detailed field measurements for model evaluation.
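
The Monte Carlo trade-off analysis mentioned above could be organized roughly as in the sketch below; `run_model` is a stand-in for an actual EcH2O-iso run (it is not part of that model's interface), and the observations, parameter ranges, and objective functions are placeholder assumptions.

```python
# Illustrative sketch only: Monte Carlo parameter sampling with a two-objective
# evaluation (streamflow vs. soil moisture). `run_model` is a stand-in for an
# ecohydrological model run and is NOT part of EcH2O-iso; all data are placeholders.
import numpy as np

rng = np.random.default_rng(3)
obs_q = rng.random(365)      # placeholder observed streamflow
obs_sm = rng.random(365)     # placeholder satellite-derived soil water index

def nse(obs, sim):
    """Nash-Sutcliffe efficiency."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def run_model(params):
    # Placeholder: pretend the parameter set controls how well each variable is simulated.
    sim_q = obs_q + rng.normal(0.0, params[0], obs_q.size)
    sim_sm = obs_sm + rng.normal(0.0, params[1], obs_sm.size)
    return sim_q, sim_sm

results = []
for _ in range(1000):
    params = rng.uniform(0.01, 0.5, size=2)          # placeholder parameter ranges
    sim_q, sim_sm = run_model(params)
    results.append((nse(obs_q, sim_q), nse(obs_sm, sim_sm)))

# Inspect the trade-off: the best fit for one variable rarely gives the best fit for the other.
best_q = max(results, key=lambda r: r[0])
best_sm = max(results, key=lambda r: r[1])
print(f"best streamflow NSE {best_q[0]:.2f} comes with soil moisture NSE {best_q[1]:.2f}")
print(f"best soil moisture NSE {best_sm[1]:.2f} comes with streamflow NSE {best_sm[0]:.2f}")
```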


2019, Vol 11 (3), pp. 284. Author(s): Linglin Zeng, Shun Hu, Daxiang Xiang, Xiang Zhang, Deren Li, ...

Soil moisture mapping at a regional scale is commonplace since these data are required in many applications, such as hydrological and agricultural analyses. The use of remotely sensed data for the estimation of deep soil moisture at a regional scale has received far less emphasis. The objective of this study was to map the 500-m, 8-day average and daily soil moisture at different soil depths in Oklahoma from remotely sensed and ground-measured data using the random forest (RF) method, which is one of the machine-learning approaches. In order to investigate the estimation accuracy of the RF method at both a spatial and a temporal scale, two independent soil moisture estimation experiments were conducted using data from 2010 to 2014: a year-to-year experiment (with a root mean square error (RMSE) ranging from 0.038 to 0.050 m³/m³) and a station-to-station experiment (with an RMSE ranging from 0.044 to 0.057 m³/m³). Then, the data requirements, importance factors, and spatial and temporal variations in estimation accuracy were discussed based on the results using the training data selected by iterated random sampling. The highly accurate estimations of both the surface and the deep soil moisture for the study area reveal the potential of RF methods when mapping soil moisture at a regional scale, especially when considering the high heterogeneity of land-cover types and topography in the study area.
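
As an illustration of the year-to-year experiment described above (not the study's actual pipeline), the sketch below trains a random forest on four years and tests on the held-out fifth year; the features, sample counts, and soil moisture values are synthetic placeholders.

```python
# Hypothetical sketch of the "year-to-year" experiment: train a random forest on all
# years but one and test on the held-out year. Features and values are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
years = np.repeat(np.arange(2010, 2015), 200)      # five years of samples
X = rng.random((years.size, 6))                    # e.g. reflectance bands, LST, NDVI, precipitation
y = rng.uniform(0.05, 0.45, years.size)            # soil moisture in m³/m³

for held_out in np.unique(years):
    train, test = years != held_out, years == held_out
    rf = RandomForestRegressor(n_estimators=300, random_state=5).fit(X[train], y[train])
    rmse = np.sqrt(np.mean((rf.predict(X[test]) - y[test]) ** 2))
    print(f"held-out year {held_out}: RMSE = {rmse:.3f} m³/m³")
```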


Author(s): Fahad Kamran, Kathryn Harrold, Jonathan Zwier, Wendy Carender, Tian Bao, ...

Abstract
Background: Recently, machine learning techniques have been applied to data collected from inertial measurement units to automatically assess balance, but these approaches rely on hand-engineered features. We explore the utility of machine learning to automatically extract important features from inertial measurement unit data for balance assessment.
Findings: Ten participants with balance concerns performed multiple balance exercises in a laboratory setting while wearing an inertial measurement unit on their lower back. Physical therapists watched video recordings of participants performing the exercises and rated balance on a 5-point scale. We trained machine learning models using different representations of the unprocessed inertial measurement unit data to estimate physical therapist ratings. On a held-out test set, we compared these learned models to one another, to participants' self-assessments of balance, and to models trained using hand-engineered features. Utilizing the unprocessed kinematic data from the inertial measurement unit provided significant improvements over both self-assessments and models using hand-engineered features (AUROC of 0.806 vs. 0.768 and 0.665, respectively).
Conclusions: Unprocessed data from an inertial measurement unit used as input to a machine learning model produced accurate estimates of balance performance. The ability to learn from unprocessed data presents a potentially generalizable approach for assessing balance without the need for labor-intensive feature engineering, while maintaining comparable model performance.
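
The comparison of unprocessed versus hand-engineered representations could be set up roughly as follows; this sketch is not the study's pipeline, and the data, window length, binarized labels, and classifier choice are all assumptions made for illustration.

```python
# Illustrative sketch (not the study's pipeline): compare AUROC of a classifier trained
# on flattened raw IMU windows versus simple hand-engineered summary features.
# The data, window length, and labels are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(11)
n_trials, window = 400, 150
raw = rng.normal(size=(n_trials, window, 6))     # 6-axis IMU samples per exercise trial
labels = rng.integers(0, 2, n_trials)            # placeholder binarized balance ratings

X_raw = raw.reshape(n_trials, -1)                                 # unprocessed representation
X_feat = np.column_stack([raw.mean(axis=1), raw.std(axis=1)])     # hand-engineered summaries

for name, X in [("raw windows", X_raw), ("hand-engineered features", X_feat)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=11)
    clf = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)
    auroc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"{name}: AUROC = {auroc:.3f}")
```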


2021, Vol 186 (Supplement_1), pp. 445-451. Author(s): Yifei Sun, Navid Rashedi, Vikrant Vaze, Parikshit Shah, Ryan Halter, ...

ABSTRACT
Introduction: Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performances.
Materials and Methods: Five classification methods including K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply linear regression to predict the continuous MAP values over the next 60 minutes.
Results: Support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves the performance significantly, with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP 60 minutes in the future with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg. After converting continuous MAP predictions into AHE binary predictions, we achieve a 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence.
Conclusion: We were able to predict AHE with precision and recall above 80% 30 minutes in advance with the large real-world dataset. The predictions of the regression model can provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.
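
One way to convert continuous MAP forecasts into binary AHE alarms, as described in the Results above, is simple thresholding. The sketch below is illustrative only: the 65 mmHg threshold, the synthetic MAP values, and the roughly 10 mmHg error level are assumptions, not details taken from the study.

```python
# Illustrative sketch (not the study's code): convert continuous MAP predictions into
# binary AHE alarms by thresholding and score recall/precision. The 65 mmHg threshold
# and the synthetic MAP values are assumptions made for this example only.
import numpy as np
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(13)
map_true = rng.normal(80.0, 15.0, 500)               # "observed" future MAP values (mmHg)
map_pred = map_true + rng.normal(0.0, 10.0, 500)     # regression predictions, roughly 10 mmHg RMSE

AHE_THRESHOLD = 65.0                                  # hypotension threshold (assumption)
ahe_true = map_true < AHE_THRESHOLD
ahe_pred = map_pred < AHE_THRESHOLD

print(f"recall    = {recall_score(ahe_true, ahe_pred):.2f}")
print(f"precision = {precision_score(ahe_true, ahe_pred):.2f}")
```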

