Parsimonious statistical learning models for low flow estimation
Abstract. Statistical learning methods offer a promising approach for low flow regionalization. We examine seven statistical learning models (lasso, linear and non-linear model-based boosting, sparse partial least squares, principal component regression, random forest, and support vector machine regression) for the prediction of winter and summer low flow, based on a hydrologically diverse dataset of 260 catchments in Austria. To produce sparse models, we adapt recursive feature elimination for variable preselection and propose three different variable ranking methods (conditional forest, lasso, and linear model-based boosting) for each of the prediction models. Results are evaluated for the low flow characteristic Q95 (Pr(Q>Q95) = 0.95), standardized by catchment area, using a repeated nested cross-validation scheme. We find generally high prediction accuracy for winter (R2CV of 0.66 to 0.7) and summer (R2CV of 0.83 to 0.86). The models perform similarly to, or slightly better than, a Top-kriging model that constitutes the current benchmark for the study area. The best performing models are support vector machine regression (winter) and non-linear model-based boosting (summer), but linear models exhibit similar prediction accuracy. Variable preselection can significantly reduce the complexity of all models at only a small loss of performance. The resulting models are more parsimonious, and thus easier to interpret and more robust when predicting at ungauged sites. A direct comparison of linear and non-linear models reveals that non-linear relationships can be sufficiently captured by linear learning models, so there is no need to use more complex models or to add non-linear effects.
When performing low flow regionalization in a seasonal climate, temporal stratification into summer and winter low flows was shown to increase the predictive performance of all learning models, offering an alternative to the catchment grouping that is otherwise recommended.
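The evaluation scheme described above, with variable preselection nested inside a repeated cross-validation of the prediction model, can be sketched as follows. This is an illustrative sketch only: it uses synthetic data in place of the Austrian catchment descriptors, scikit-learn's recursive feature elimination with a lasso-based ranking, and a lasso prediction model; the paper's conditional-forest and boosting rankers, its other six model types, and its exact fold and repetition configuration are not reproduced here.

```python
# Sketch of nested cross-validation with recursive feature elimination (RFE)
# for a sparse low flow regression model. Synthetic descriptors stand in for
# the real catchment characteristics; all settings here are assumptions.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline

# 260 "catchments" with 30 candidate descriptors, of which only a few matter.
X, y = make_regression(n_samples=260, n_features=30, n_informative=8,
                       noise=10.0, random_state=0)

# Inner loop: jointly tune the number of retained descriptors (sparsity)
# and the lasso penalty. Outer loop: estimate prediction accuracy (R2_cv)
# on catchments held out from both selection and fitting.
pipe = Pipeline([
    ("rfe", RFE(Lasso(alpha=1.0, max_iter=10000))),
    ("model", Lasso(max_iter=10000)),
])
grid = {
    "rfe__n_features_to_select": [5, 10, 15],
    "model__alpha": [0.1, 1.0, 10.0],
}
inner = GridSearchCV(pipe, grid, cv=KFold(5, shuffle=True, random_state=0))
scores = cross_val_score(inner, X, y,
                         cv=KFold(5, shuffle=True, random_state=1),
                         scoring="r2")
print(f"R2_cv: {scores.mean():.2f} +/- {scores.std():.2f}")
```

Keeping the feature elimination inside the inner loop is the essential point: ranking and discarding variables on the full dataset before cross-validating would leak information from the test folds and overstate the accuracy of the resulting parsimonious model.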