scholarly journals Parsimonious statistical learning models for low flow estimation

2021 ◽  
Author(s):  
Johannes Laimighofer ◽  
Michael Melcher ◽  
Gregor Laaha

Abstract. Statistical learning methods offer a promising approach for low flow regionalization. We examine seven statistical learning models (lasso, linear and non-linear model based boosting, sparse partial least squares, principal component regression, random forest, and support vector machine regression) for the prediction of winter and summer low flow based on a hydrological diverse dataset of 260 catchments in Austria. In order to produce sparse models we adapt the recursive feature elimination for variable preselection and propose to use three different variable ranking methods (conditional forest, lasso and linear model based boosting) for each of the prediction models. Results are evaluated for the low flow characteristic Q95 (Pr(Q>Q95) = 0.95) standardized by catchment area using a repeated nested cross validation scheme. We found a generally high prediction accuracy for winter (R2CV of 0.66 to 0.7) and summer (R2CV of 0.83 to 0.86). The models perform similar or slightly better than a Top-kriging model that constitutes the current benchmark for the study area. The best performing models are support vector machine regression (winter) and non-linear model based boosting (summer), but linear models exhibit similar prediction accuracy. The use of variable preselection can significantly reduce the complexity of all models with only a small loss of performance. The so obtained learning models are more parsimonious, thus easier to interpret and more robust when predicting at ungauged sites. A direct comparison of linear and non-linear models reveals that non-linear relationships can be sufficiently captured by linear learning models, so there is no need to use more complex models or to add non-liner effects. When performing low flow regionalization in a seasonal climate, the temporal stratification into summer and winter low flows was shown to increase the predictive performance of all learning models, offering an alternative to catchment grouping that is recommended otherwise.

2022 ◽  
Vol 26 (1) ◽  
pp. 129-148
Author(s):  
Johannes Laimighofer ◽  
Michael Melcher ◽  
Gregor Laaha

Abstract. Statistical learning methods offer a promising approach for low-flow regionalization. We examine seven statistical learning models (Lasso, linear, and nonlinear-model-based boosting, sparse partial least squares, principal component regression, random forest, and support vector regression) for the prediction of winter and summer low flow based on a hydrologically diverse dataset of 260 catchments in Austria. In order to produce sparse models, we adapt the recursive feature elimination for variable preselection and propose using three different variable ranking methods (conditional forest, Lasso, and linear model-based boosting) for each of the prediction models. Results are evaluated for the low-flow characteristic Q95 (Pr(Q>Q95)=0.95) standardized by catchment area using a repeated nested cross-validation scheme. We found a generally high prediction accuracy for winter (RCV2 of 0.66 to 0.7) and summer (RCV2 of 0.83 to 0.86). The models perform similarly to or slightly better than a top-kriging model that constitutes the current benchmark for the study area. The best-performing models are support vector regression (winter) and nonlinear model-based boosting (summer), but linear models exhibit similar prediction accuracy. The use of variable preselection can significantly reduce the complexity of all the models with only a small loss of performance. The so-obtained learning models are more parsimonious and thus easier to interpret and more robust when predicting at ungauged sites. A direct comparison of linear and nonlinear models reveals that nonlinear processes can be sufficiently captured by linear learning models, so there is no need to use more complex models or to add nonlinear effects. When performing low-flow regionalization in a seasonal climate, the temporal stratification into summer and winter low flows was shown to increase the predictive performance of all learning models, offering an alternative to catchment grouping that is recommended otherwise.


2022 ◽  
Vol 2161 (1) ◽  
pp. 012053
Author(s):  
B P Ashwini ◽  
R Sumathi ◽  
H S Sudhira

Abstract Congested roads are a global problem, and increased usage of private vehicles is one of the main reasons for congestion. Public transit modes of travel are a sustainable and eco-friendly alternative for private vehicle usage, but attracting commuters towards public transit mode is a mammoth task. Commuters expect the public transit service to be reliable, and to provide a reliable service it is necessary to fine-tune the transit operations and provide well-timed necessary information to commuters. In this context, the public transit travel time is predicted in Tumakuru, a tier-2 city of Karnataka, India. As this is one of the initial studies in the city, the performance comparison of eight Machines Learning models including four linear namely, Linear Regression, Ridge Regression, Least Absolute Shrinkage and Selection Operator Regression, and Support Vector Regression; and four non-linear models namely, k-Nearest Neighbors, Regression Trees, Random Forest Regression, and Gradient Boosting Regression Trees is conducted to identify a suitable model for travel time predictions. The data logs of one month (November 2020) of the Tumakuru city service, provided by Tumakuru Smart City Limited are used for the study. The time-of-the-day (trip start time), day-of-the-week, and direction of travel are used for the prediction. Travel time for both upstream and downstream are predicted, and the results are evaluated based on the performance metrics. The results suggest that the performance of non-linear models is superior to linear models for predicting travel times, and Random Forest Regression was found to be a better model as compared to other models.


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 27789-27801 ◽  
Author(s):  
Hongxin Xue ◽  
Yanping Bai ◽  
Hongping Hu ◽  
Ting Xu ◽  
Haijian Liang

1992 ◽  
Vol 2 (3) ◽  
pp. 145-153 ◽  
Author(s):  
Suhas K. Mahuli ◽  
R. Russell Rhinehart ◽  
James B. Riggs

Author(s):  
Pratyush Kaware

In this paper a cost-effective sensor has been implemented to read finger bend signals, by attaching the sensor to a finger, so as to classify them based on the degree of bent as well as the joint about which the finger was being bent. This was done by testing with various machine learning algorithms to get the most accurate and consistent classifier. Finally, we found that Support Vector Machine was the best algorithm suited to classify our data, using we were able predict live state of a finger, i.e., the degree of bent and the joints involved. The live voltage values from the sensor were transmitted using a NodeMCU micro-controller which were converted to digital and uploaded on a database for analysis.


Sign in / Sign up

Export Citation Format

Share Document