Data-driven runoff forecasting for Minjiang River: a case study

2020 ◽  
Vol 20 (6) ◽  
pp. 2284-2295
Author(s):  
Yuqiang Wu ◽  
Qinhui Wang ◽  
Ge Li ◽  
Jidong Li

Abstract Long-term runoff forecasting covers a long forecast period and is widely applied in environmental protection, hydropower operation, flood prevention and waterlogging management, water transport management, and optimal allocation of water resources. Many models and methods are currently used for runoff prediction, and data-driven models are now the mainstream approach, but their prediction accuracy does not yet meet the needs of production departments. To address this, the present research builds on a support vector machine (SVM) and introduces ant colony optimization (ACO) to optimize its penalty coefficient C, kernel function parameter g, and insensitivity coefficient p, yielding a data-driven ACO-SVM model. The validity of the method is confirmed by taking the Minjiang River Basin as an example. The results show that the runoff predicted by ACO-SVM is more accurate than that predicted by the default-parameter SVM and the Bayesian method.
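
The idea of letting ants search for the three SVM parameters can be sketched roughly as follows. This is a simplified, discrete ant-colony-style search and not the authors' implementation: the candidate grid, the function names, and `toy_validation_error` (a stand-in for the validation error of a trained SVM) are all assumptions made for illustration.

```python
import random


def aco_grid_search(objective, candidates, n_ants=10, n_iters=30,
                    evaporation=0.5, seed=0):
    """Ant-colony-style search over a discrete grid of parameter values.

    objective:  maps a tuple of parameter values to an error to minimize
    candidates: one list of candidate values per parameter
    """
    rng = random.Random(seed)
    # One pheromone level per candidate value, all equal at the start.
    pheromone = [[1.0] * len(values) for values in candidates]
    best_params, best_error = None, float("inf")
    for _ in range(n_iters):
        trails = []
        for _ in range(n_ants):
            # Each ant picks one value per parameter, biased by pheromone.
            choice = [rng.choices(range(len(values)), weights=pheromone[k])[0]
                      for k, values in enumerate(candidates)]
            params = tuple(candidates[k][i] for k, i in enumerate(choice))
            error = objective(params)
            trails.append((error, choice))
            if error < best_error:
                best_params, best_error = params, error
        # Evaporate, then reinforce the choices of the best ant of this round.
        pheromone = [[(1.0 - evaporation) * p for p in row] for row in pheromone]
        _, best_choice = min(trails, key=lambda t: t[0])
        for k, i in enumerate(best_choice):
            pheromone[k][i] += 1.0
    return best_params, best_error


def toy_validation_error(params):
    # Hypothetical stand-in for an SVM validation error (not real data).
    c, g, p = params
    return (c - 10.0) ** 2 + (g - 0.1) ** 2 + (p - 0.01) ** 2


# Hypothetical candidate values for the three parameters named in the abstract.
grid = [[0.1, 1.0, 10.0, 100.0],   # penalty coefficient C
        [0.001, 0.01, 0.1, 1.0],   # kernel function parameter g
        [0.001, 0.01, 0.1]]        # insensitivity coefficient p

best, err = aco_grid_search(toy_validation_error, grid)
print(best, err)
```

In a real setting the objective would train an SVM with the given (C, g, p) and return its error on held-out runoff data; the search loop itself would stay the same.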

2012 ◽  
Vol 226-228 ◽  
pp. 2303-2307
Author(s):  
Li Xue Wang ◽  
Li Na Wang ◽  
Guo Feng Li ◽  
Ce Luan ◽  
Fei Fei Sun

Mid- and long-term hydrologic forecasting is a typical small-sample, limited-data problem, a class of problem that the Support Vector Machine (SVM) handles well. This paper introduces the basic optimization procedure and the PSO-SVM modeling procedure. The PSO-SVM model was applied to forecasting the monthly runoff of Dahuofang reservoir. The comparison between PSO-SVM and the non-optimized SVM indicated that PSO-SVM has a fast convergence speed and strong generalization capability, and that the associated error decreased from 15.5% to 11.9%.
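
A basic optimization loop of the kind PSO performs (reading PSO as particle swarm optimization) might look like the sketch below. It is a generic textbook form under assumed settings, not the procedure from the paper: the inertia and acceleration constants, the bounds, and the toy objective are illustrative choices, and in PSO-SVM the objective would be the forecast error of an SVM trained with the candidate parameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity mixes momentum, pull to own best, pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(x):
    # Hypothetical smooth error surface standing in for an SVM forecast error.
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


best_pos, best_val = pso_minimize(toy_error, bounds=[(-10.0, 10.0), (-10.0, 10.0)])
print(best_pos, best_val)
```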


2020 ◽  
Vol 20 (8) ◽  
pp. 3658-3664
Author(s):  
Chen Shijun ◽  
Wei Qin ◽  
Zhu Yanmei ◽  
Ma Guangwen ◽  
Han Xiaoyan ◽  
...  

Abstract Medium- and long-term runoff forecasting is closely related to the generation capacity forecasting of cascade hydropower stations, which is of great significance to power plants when arranging production plans and assisting market decisions. In order to improve the accuracy of runoff forecasting, an attempt was made to use random forest regression (RFR) to model the medium- and long-term runoff forecasting and to further make a verification based on the actual monthly runoff data of Mupo and Chuntangba stations. By comparison with the forecast results attained through a support vector machine (SVM) and an integrated autoregressive moving average model (IARMA), the results showed that the RFR model had the lowest mean square error (MSE) among the three methods. In addition, the coefficients of determination R² of the RFR for the two stations increased by 0.0261 and 0.0295 compared with the SVM model, and the R² rose by 0.1134 and 0.1332 compared with the IARMA model. The comparison of the three methods showed that the RFR had higher forecasting accuracy as well as stronger reliability and practicability than the IARMA model and the SVM model, so the RFR provided a new idea and method for the study of runoff forecasting.
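
For reference, the two scores used in this comparison can be computed as in the short sketch below. It uses the common definition of the coefficient of determination, 1 - SS_res / SS_tot; whether the paper used this form or the squared correlation is not stated in the abstract, and the sample values are made up.

```python
def mean_squared_error(observed, predicted):
    """Average of the squared forecast errors."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly runoff values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(mean_squared_error(observed, predicted), r_squared(observed, predicted))
```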


2021 ◽  
Vol 13 (5) ◽  
pp. 949
Author(s):  
Salman Qureshi ◽  
Saman Nadizadeh Shorabeh ◽  
Najmeh Neysani Samany ◽  
Foad Minaei ◽  
Mehdi Homaee ◽  
...  

Due to irregular and uncontrolled expansion of cities in developing countries, currently operational landfill sites cannot be used in the long-term, as people will be living in proximity to these sites and be exposed to unhygienic circumstances. Hence, this study aims at proposing an integrated approach for determining suitable locations for landfills while considering their physical expansion. The proposed approach utilizes the fuzzy analytical hierarchy process (FAHP) to weigh the sets of identified landfill location criteria. Furthermore, the weighted linear combination (WLC) approach was applied for the elicitation of the proper primary locations. Finally, the support vector machine (SVM) and cellular automation-based Markov chain method were used to predict urban growth. To demonstrate the applicability of the developed approach, it was applied to a case study, namely the city of Mashhad in Iran, where suitable sites for landfills were identified considering the urban growth in different geographical directions for this city by 2048. The proposed approach could be of use for policymakers, urban planners, and other decision-makers to minimize uncertainty arising from long-term resource allocation.
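
The weighted linear combination step reduces to a weighted sum of normalized criterion scores per candidate cell. A minimal sketch, assuming the weights have already been obtained (for example from FAHP) and that each criterion is scaled to the range 0 to 1; the criterion names and numbers here are hypothetical:

```python
def weighted_linear_combination(scores, weights):
    """Suitability = sum of weight_i * score_i, with weights rescaled to sum to 1."""
    total = sum(weights.values())
    return sum(weights[name] / total * scores[name] for name in weights)


# Hypothetical criteria, weights, and normalized scores for one candidate cell.
weights = {"distance_to_city": 2.0, "slope": 1.0, "distance_to_water": 1.0}
cell = {"distance_to_city": 1.0, "slope": 0.5, "distance_to_water": 0.0}
print(weighted_linear_combination(cell, weights))  # 0.625
```

Cells would then be ranked by this score, and the highest-scoring ones kept as primary candidate locations.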


Water ◽  
2018 ◽  
Vol 10 (11) ◽  
pp. 1618 ◽  
Author(s):  
Dan Ma ◽  
Hongyu Duan ◽  
Xin Cai ◽  
Zhenhua Li ◽  
Qiang Li ◽  
...  

Water inrush hazards can be effectively reduced by a reasonable and accurate soft-measuring method for the water inrush quantity from the mine floor, which is important for safe mining. However, there is a highly nonlinear relationship between the water outburst from coal seam floors and geological structure, hydrogeology, aquifer, water pressure, water-resisting strata, mining damage, faults and other factors. Therefore, it is difficult to establish a suitable model by traditional methods to forecast the water inrush quantity from the mine floor. Modeling methods developed in other fields can provide adequate models for rock behavior on water inrush. In this study, a new forecast system for water inrush prediction is proposed, based on a hybrid of a genetic algorithm (GA) and the support vector machine (SVM) algorithm, in which the model structure and the related parameters are determined simultaneously. With the advantages of powerful global optimization functions, implicit parallelism and high stability of the GA, the penalty coefficient, insensitivity coefficient and kernel function parameter of the SVM model are automatically determined to be approximately optimal in the spatial dimension. All of these characteristics greatly improve the accuracy and usable range of the SVM model. Testing results show that the GA is effective at finding optimal parameters of an SVM model. The performance of the GA-optimized SVM (GA-SVM) is superior to that of the SVM model. The GA-SVM enables the prediction of water inrush and provides a promising solution to the predictive problem for relevant industries.


Author(s):  
Junwei Ma ◽  
Xiao Liu ◽  
Xiaoxu Niu ◽  
Yankun Wang ◽  
Tao Wen ◽  
...  

Data-driven models have been extensively employed in landslide displacement prediction. However, predictive uncertainty, which consists of input uncertainty, parameter uncertainty, and model uncertainty, is usually disregarded in deterministic data-driven modeling, and point estimates are separately presented. In this study, a probability-scheme combination ensemble prediction that employs quantile regression neural networks and kernel density estimation (QRNNs-KDE) is proposed for robust and accurate prediction and uncertainty quantification of landslide displacement. In the ensemble model, QRNNs serve as base learning algorithms to generate multiple base learners. Final ensemble prediction is obtained by integration of all base learners through a probability combination scheme based on KDE. The Fanjiaping landslide in the Three Gorges Reservoir area (TGRA) was selected as a case study to explore the performance of the ensemble prediction. Based on long-term (2006–2018) and near real-time monitoring data, a comprehensive analysis of the deformation characteristics was conducted for fully understanding the triggering factors. The experimental results indicate that the QRNNs-KDE approach can perform predictions with perfect performance and outperform the traditional backpropagation (BP), radial basis function (RBF), extreme learning machine (ELM), support vector machine (SVM) methods, bootstrap-extreme learning machine-artificial neural network (bootstrap-ELM-ANN), and Copula-kernel-based support vector machine quantile regression (Copula-KSVMQR). The proposed QRNNs-KDE approach has significant potential in medium-term to long-term horizon forecasting and quantification of uncertainty.
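
How a set of base-learner outputs can be merged through a kernel density estimate is sketched below. The abstract does not give the details of the probability combination scheme, so this is only one plausible reading: a Gaussian KDE is fitted to the base predictions and its mode is taken as the ensemble point estimate. The bandwidth, grid step, and prediction values are assumptions.

```python
import math


def kde_density(samples, bandwidth):
    """Return a Gaussian kernel density estimate built from the samples."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                   for s in samples) / norm

    return density


def combine_predictions(base_predictions, bandwidth, grid_step=0.01):
    """Combine base-learner predictions into a point estimate (the KDE mode)."""
    density = kde_density(base_predictions, bandwidth)
    lo = min(base_predictions) - 3.0 * bandwidth
    hi = max(base_predictions) + 3.0 * bandwidth
    n_steps = int((hi - lo) / grid_step) + 1
    grid = [lo + i * grid_step for i in range(n_steps)]
    # The grid point with the highest estimated density is the mode.
    return max(grid, key=density)


# Hypothetical displacement predictions (mm) from five base learners.
base = [12.1, 12.4, 12.3, 12.6, 15.0]
print(combine_predictions(base, bandwidth=0.3))
```

Taking the mode makes the combined estimate follow the cluster of agreeing learners rather than the single outlying one, and the same density can be used to read off prediction intervals.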


Author(s):  
Yi Ji ◽  
Hong-Tao Dong ◽  
Zhen-Xiang Xing ◽  
Ming-xin Sun ◽  
Qiang Fu ◽  
...  

Abstract Middle and long-term runoff forecasting has always been a problem, especially in flood seasons. The forecasting performance can be improved using complementary ensemble empirical mode decomposition (CEEMD) to produce clearer signals as model inputs. In the forecasting models based on CEEMD, the entire time series is decomposed into several sub-series, each sub-series is divided into training and validation datasets and forecasted by a common model, such as the least-squares support vector machine (LSSVM), and finally an ensemble forecasting result is obtained by summing the forecasted results of each sub-series. This model is applied to forecast the inflow runoff of the Shitouxia Reservoir (STX Reservoir). The forecasting results show that the Nash efficiency coefficient of the LSSVM model is 0.815, and the Nash efficiency coefficient of the CEEMD-LSSVM model is 0.954, an increase of 0.139 (13.9 percentage points). The root mean square error value is reduced from 20.654 to 10.235, a decrease of 50.4%. The runoff forecasting performance can be improved effectively by applying the CEEMD-LSSVM model. When analyzing the annual runoff forecasting results month by month, it was found that the forecasting results from November to April of the following year were unsatisfactory compared with the nearest neighbor bootstrapping regressive (NNBR) model, which is more suitable in the dry season, but the forecasting results from May to October improved significantly. This also shows that the CEEMD-LSSVM model has a great advantage in the forecasting of inflow runoff during the flood season. In the optimized operation of reservoirs, the forecasting result of inflow runoff in flood season is more important than in dry season. Therefore, when forecasting annual runoff month by month, it is recommended to adopt the CEEMD-LSSVM model in the flood season and the NNBR model in the dry season, that is, the combination of the two models is applied to the forecasting of the inflow runoff of the STX Reservoir.
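
The recommended combination amounts to choosing a model by calendar month, and the reported improvements can be checked with simple arithmetic. The sketch below assumes the flood season is May to October, as the abstract implies; the function names are illustrative. Note that the figure quoted for the Nash efficiency coefficient corresponds to the absolute difference between 0.954 and 0.815, while the RMSE figure is a relative reduction.

```python
FLOOD_MONTHS = range(5, 11)  # May to October, as implied by the abstract


def combined_forecast(month, ceemd_lssvm_value, nnbr_value):
    """Use the CEEMD-LSSVM forecast in flood months and NNBR otherwise."""
    return ceemd_lssvm_value if month in FLOOD_MONTHS else nnbr_value


def relative_change(old, new):
    """Change from old to new as a fraction of old."""
    return (new - old) / old


# Figures quoted in the abstract.
nse_gain = 0.954 - 0.815                       # absolute gain in Nash efficiency
rmse_drop = -relative_change(20.654, 10.235)   # relative reduction in RMSE
print(round(nse_gain, 3), round(rmse_drop * 100, 1))
```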


2018 ◽  
Vol 9 (1) ◽  
pp. 104 ◽  
Author(s):  
Kejun Long ◽  
Wukai Yao ◽  
Jian Gu ◽  
Wei Wu ◽  
Lee Han

Freeway travel time is influenced by many factors including traffic volume, adverse weather, accidents, traffic control, and so on. We employ the multiple source data-mining method to analyze freeway travel time. We collected toll data, weather data, traffic accident disposal logs, and other historical data from Freeway G5513 in Hunan Province, China. Using the Support Vector Machine (SVM), we proposed the travel time predicting model founded on these databases. The new SVM model can simulate the nonlinear relationship between travel time and those factors. In order to improve the precision of the SVM model, we applied the Artificial Fish Swarm algorithm to optimize the SVM model parameters, which include the kernel parameter σ, non-sensitive loss function parameter ε, and penalty parameter C. We compared the new optimized SVM model with the Back Propagation (BP) neural network and a common SVM model, using the historical data collected from freeway G5513. The results show that the accuracy of the optimized SVM model is 17.27% and 16.44% higher than those of the BP neural network model and the common SVM model, respectively.


Author(s):  
Allan Fong ◽  
Nicholas Scoulios ◽  
H. Joseph Blumenthal ◽  
Ryan E. Anderson

Abstract Background and Objective The prevalence of value-based payment models has led to an increased use of the electronic health record to capture quality measures, necessitating additional documentation requirements for providers. Methods This case study uses text mining and natural language processing techniques to identify the timely completion of diabetic eye exams (DEEs) from 26,203 unique clinician notes for reporting as an electronic clinical quality measure (eCQM). Logistic regression and support vector machine (SVM) using unbalanced and balanced datasets, using the synthetic minority over-sampling technique (SMOTE) algorithm, were evaluated on precision, recall, sensitivity, and f1-score for classifying records positive for DEE. We then integrate a high precision DEE model to evaluate free-text clinical narratives from our clinical EHR system. Results Logistic regression and SVM models had comparable f1-score and specificity metrics with models trained and validated with no oversampling favoring precision over recall. SVM with and without oversampling resulted in the best precision, 0.96, and recall, 0.85, respectively. These two SVM models were applied to the unannotated 31,585 text segments representing 24,823 unique records and 13,714 unique patients. The number of records classified as positive for DEE using the SVM models ranged from 667 to 8,935 (2.7–36% out of 24,823, respectively). Unique patients classified as positive for DEE ranged from 3.5 to 41.8% highlighting the potential utility of these models. Discussion We believe the impact of oversampling on SVM model performance to be caused by the potential of overfitting of the SVM SMOTE model on the synthesized data and the data synthesis process. However, the specificities of SVM with and without SMOTE were comparable, suggesting both models were confident in their negative predictions. 
By prioritizing the SVM model with higher precision over sensitivity or recall in the categorization of DEEs, we can provide a highly reliable pool of results that can be documented through automation, reducing the burden of secondary review. Although the focus of this work was on completed DEEs, this method could be applied to completing other necessary documentation by extracting information from natural language in clinician notes. Conclusion By enabling the capture of data for eCQMs from documentation generated by usual clinical practice, this work represents a case study in how such techniques can be leveraged to drive quality without increasing clinician work.
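
The trade-off between precision and recall discussed here follows directly from the confusion-matrix counts. A small sketch with made-up counts (not taken from the study) shows how the four reported metrics relate:

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)      # share of predicted positives that are correct
    recall = tp / (tp + fn)         # share of true positives found (sensitivity)
    specificity = tn / (tn + fp)    # share of true negatives correctly rejected
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=85, fp=5, fn=15, tn=95)
print(metrics)
```

A model tuned for high precision produces few false positives, which is why its positive predictions can be documented automatically with little secondary review.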


Author(s):  
Jacobus Daniel van der Walt ◽  
Eric Scheepbouwer ◽  
Bryan Pidwerbesky ◽  
Brian Guo ◽  
Max Ferguson ◽  
...  

With the advancement of digital technology, the collection of pavement performance data has become commonplace. The improvement of tools to extract useful information from pavement databases has become a priority to justify expenditures. This paper presents a case study of PaveMD, a tool that integrates multi-dimensional data structures with a data-driven fuzzy approach to identify good performing pavement sections. Combining this tool with an innovative paradigm where the focus is on repeating success can bring additional value to existing pavement databases. The case study shows that PaveMD can identify pavement sections that are performing well by comparing performance measures for the New Zealand context. In this paper, PaveMD's development is described, and its implementation is showcased using data from the New Zealand Long-Term Pavement Performance (LTPP) database. It is recommended that this approach be further developed and extended to other infrastructure databases internationally.

