Interpolation of Missing Precipitation Data Using Kernel Estimations for Hydrologic Modeling

Precipitation is the main factor that drives hydrologic modeling; therefore, missing precipitation data can cause malfunctions in hydrologic modeling. Although interpolation of missing precipitation data is recognized as an important research topic, only a few methods follow a regression approach. In this study, daily precipitation data were interpolated using five different kernel functions, namely, Epanechnikov, Quartic, Triweight, Tricube, and Cosine, to estimate missing precipitation data. This study also presents an assessment that compares estimation of missing precipitation data throughKth nearest neighborhood (KNN) regression to the five different kernel estimations and their performance in simulating streamflow using the Soil Water Assessment Tool (SWAT) hydrologic model. The results show that the kernel approaches provide higher quality interpolation of precipitation data compared with theKNN regression approach, in terms of both statistical data assessment and hydrologic modeling performance.

Download Full-text

Impact of model complexity and precipitation data products on modeled streamflow

Journal of Hydroinformatics ◽

10.2166/hydro.2013.056 ◽

2013 ◽

Vol 16 (3) ◽

pp. 588-599 ◽

Cited By ~ 2

Author(s):

Kenneth J. Tobin ◽

Marvin E. Bennett

Keyword(s):

Assessment Tool ◽

Rain Gauge ◽

Hydrologic Model ◽

South Texas ◽

Model Complexity ◽

Precipitation Data ◽

Hydrologic Models ◽

Rain Gauges ◽

Sensing Platforms ◽

Southern Georgia

With the proliferation of remote sensing platforms as well as numerous ground products based on weather radar estimation, there are now multiple options for precipitation data beyond traditional rain gauges for which most hydrologic models were originally designed. This study evaluates four precipitation products as input for generating streamflow simulations using two hydrologic models that significantly vary in complexity. The four precipitation products include two ground products from the National Weather Service: the Multi-sensor Precipitation Estimator (MPE) and rain gauge data. The two satellite products come from NASA's Tropical Rainfall Measurement Mission (TRMM) and include the TRMM 3B42 Research Version 6, which has a built-in ground bias correction, and the real-time TRMM Multi-Satellite Precipitation Analysis. The two hydrologic models utilized include the Soil and Water Assessment Tool (SWAT) and Gridded Surface and Subsurface Hydrologic Analysis (GSSHA). Simulations were conducted in three, moderate- to large-sized basins across the southern United States, the San Casimiro (South Texas), Skuna (northern Mississippi), Alapaha (southern Georgia), and were run for over 2 years. This study affirms the realization that input precipitation is at least as important as the choice of hydrologic model.

Download Full-text

Calibrating GPM IMERG Late-Run product using ground-based CPC daily precipitation data: a case study in the Beijing-Tianjin-Hebei urban agglomeration

Remote Sensing Letters ◽

10.1080/2150704x.2021.1942576 ◽

2021 ◽

Vol 12 (9) ◽

pp. 848-858

Author(s):

Jintao Xu ◽

Siyu Zhu ◽

Ziqiang Ma ◽

Hui Liu ◽

Yulin Shangguan ◽

...

Keyword(s):

Daily Precipitation ◽

Urban Agglomeration ◽

Precipitation Data ◽

Daily Precipitation Data

Download Full-text

Assessing Bias Corrected Daily Precipitation Data with Special Focus on Precipitation Climate Indices Using an Ensemble of High-Resolution EURO-CORDEX and Med-CORDEX Simulations for the Carpathian Region

10.1002/essoar.10500442.1 ◽

2019 ◽

Author(s):

Csaba Zsolt Torma ◽

Anna Kis ◽

Rita Pongracz

Keyword(s):

High Resolution ◽

Daily Precipitation ◽

Climate Indices ◽

Special Focus ◽

Precipitation Data ◽

Carpathian Region ◽

Daily Precipitation Data

Download Full-text

A Downscaling–Merging Scheme for Improving Daily Spatial Precipitation Estimates Based on Random Forest and Cokriging

Remote Sensing ◽

10.3390/rs13112040 ◽

2021 ◽

Vol 13 (11) ◽

pp. 2040

Author(s):

Xin Yan ◽

Hua Chen ◽

Bingru Tian ◽

Sheng Sheng ◽

Jinxing Wang ◽

...

Keyword(s):

High Resolution ◽

Random Forest ◽

Spatial Resolution ◽

Daily Precipitation ◽

Machine Learning Algorithms ◽

Precipitation Data ◽

Large Area ◽

Tree Model ◽

Daily Precipitation Data ◽

Precipitation Estimates

High-spatial-resolution precipitation data are of great significance in many applications, such as ecology, hydrology, and meteorology. Acquiring high-precision and high-resolution precipitation data in a large area is still a great challenge. In this study, a downscaling–merging scheme based on random forest and cokriging is presented to solve this problem. First, the enhanced decision tree model, which is based on random forest from machine learning algorithms, is used to reduce the spatial resolution of satellite daily precipitation data to 0.01°. The downscaled satellite-based daily precipitation is then merged with gauge observations using the cokriging method. The scheme is applied to downscale the Global Precipitation Measurement Mission (GPM) daily precipitation product over the upstream part of the Hanjiang Basin. The experimental results indicate that (1) the downscaling model based on random forest can correctly spatially downscale the GPM daily precipitation data, which retains the accuracy of the original GPM data and greatly improves their spatial details; (2) the GPM precipitation data can be downscaled on the seasonal scale; and (3) the merging method based on cokriging greatly improves the accuracy of the downscaled GPM daily precipitation data. This study provides an efficient scheme for generating high-resolution and high-quality daily precipitation data in a large area.

Download Full-text

Quality assessment of linked Canadian clinical administrative hospital and vital statistics death data

International Journal for Population Data Science ◽

10.23889/ijpds.v3i4.978 ◽

2018 ◽

Vol 3 (4) ◽

Author(s):

Nancy Rodrigues ◽

Maureen Kelly ◽

Tobi Henderson

Keyword(s):

Assessment Tool ◽

Vital Statistics ◽

Data Availability ◽

Supplementary Information ◽

Five Dimensions ◽

Tracking Process ◽

Data Assessment ◽

Hospital Deaths ◽

And Performance ◽

Discharge From Hospital

IntroductionThree Canadian clinical-administrative hospital databases were linked to the Canadian Vital Statistics Death Database (CVSD) to provide information about patients who died following discharge from hospital as well as supplementary information about patients that died in-hospital. Quality was assessed using a guided approach and through feedback from initial users. Objectives and ApproachThe linked datasets were created to develop and validate health care indicators and performance measures and perform outcome analyses. It is therefore imperative to evaluate the data’s fitness for use. Quality was assessed by calculating coverage of deaths for all linked contributors, creating a profile of the linked dataset and analyzing issues that were identified by users. These analyses were guided by an existing Data Source Assessment Tool, which provides a set of criteria that allow for assessment across five dimensions of quality, thus allowing for appropriate determination of a given set of data’s fitness for use. ResultsDeterministic linkage of the datasets resulted in linkage rates that ranged from 66.9% to 90.9% depending on the dataset or data year. Linkage rates also varied by Canadian jurisdictions and patient cohort. Variables had good data availability with rates of 95% or higher. Initial users identified a significant number of duplicate records that were flagged to and corrected by the data supplier. 1.4\% of acute hospital deaths had discrepancies in the death date captured in the two linked sources; the vast majority had a difference of only one day. A user group and issue tracking process were created to share information about the linked data and guarantee that issues are triaged to the appropriate party and allow for timely follow up with the data supplier. Conclusion/ImplicationsDocumentation provided by the data supplier was vital to understanding the linkage methodology and its impact on linkage rates. A guided data assessment ensured that strengths and limitations were identified and shared to support appropriate use. Feedback to the data supplier is supporting ongoing improvements to the linkage methodology.

Download Full-text

Identifying the Ideal Number Components of the Bayesian Principal Component Analysis Model for Missing Daily Precipitation Data Treatment

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.30.21992 ◽

2018 ◽

Vol 7 (4.30) ◽

pp. 5 ◽

Cited By ~ 1

Author(s):

Zun Liang Chuan ◽

Azlyna Senawi ◽

Wan Nur Syahidah Wan Yusoff ◽

Noriszura Ismail ◽

Tan Lit Ken ◽

...

Keyword(s):

Time Series ◽

Daily Precipitation ◽

Principal Component ◽

Precipitation Data ◽

Analysis Model ◽

Principal Component Analysis Model ◽

Data Treatment ◽

Daily Precipitation Data ◽

Ideal Number ◽

The Ideal

The grassroots of the presence of missing precipitation data are due to the malfunction of instruments, error of recording and meteorological extremes. Consequently, an effective imputation algorithm is indeed much needed to provide a high quality complete time series in assessing the risk of occurrence of extreme precipitation tragedy. In order to overcome this issue, this study desired to investigate the effectiveness of various Q-components of the Bayesian Principal Component Analysis model associates with Variational Bayes algorithm (BPCAQ-VB) in missing daily precipitation data treatment, which the ideal number of Q-components is identified by using The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) algorithm. The effectiveness of BPCAQ-VB algorithm in missing daily precipitation data treatment is evaluated by using four distinct precipitation time series, including two monitoring stations located in inland and coastal regions of Kuantan district, respectively. The analysis results rendered the BPCA5-VB is superior in missing daily precipitation data treatment for the coastal region time series compared to the single imputation algorithms proposed in previous studies. Contrarily, the single imputation algorithm is superior in missing daily precipitation data treatment for an inland region time series rather than the BPCAQ-VB algorithm.

Download Full-text

Global rainfall erosivity changes between 1980 and 2017 based on an erosivity model using daily precipitation data

CATENA ◽

10.1016/j.catena.2020.104768 ◽

2020 ◽

Vol 194 ◽

pp. 104768 ◽

Cited By ~ 1

Author(s):

Yue Liu ◽

Wenwu Zhao ◽

Yanxu Liu ◽

Paulo Pereira

Keyword(s):

Daily Precipitation ◽

Rainfall Erosivity ◽

Precipitation Data ◽

Daily Precipitation Data

Download Full-text

Integrated numerical model for irrigated area water resources management

Journal of Water and Climate Change ◽

10.2166/wcc.2019.042 ◽

2019 ◽

Vol 11 (4) ◽

pp. 980-991 ◽

Cited By ~ 4

Author(s):

Aidi Huo ◽

Xiaofan Wang ◽

Yan Liang ◽

Cheng Jiang ◽

Xiaolu Zheng

Keyword(s):

River Basin ◽

Water Resource Management ◽

Assessment Tool ◽

Swat Model ◽

Hydrologic Model ◽

Heihe River Basin ◽

Irrigation District ◽

Heihe River ◽

Model Parameters ◽

Long Term Effects

Abstract The likelihood of future global water shortages is increasing and further development of existing operational hydrologic models is needed to maintain sustainable development of the ecological environment and human health. In order to quantitatively describe the water balance factors and transformation relations, the objective of this article is to develop a distributed hydrologic model that is capable of simulating the surface water (SW) and groundwater (GW) in irrigation areas. The model can be used as a tool for evaluating the long-term effects of water resource management. By coupling the Soil and Water Assessment Tool (SWAT) and MODFLOW models, a comprehensive hydrological model integrating SW and GW is constructed. The hydrologic response units for the SWAT model are exchanged with cells in the MODFLOW model. Taking the Heihe River Basin as the study area, 10 years of historical data are used to conduct an extensive sensitivity analysis on model parameters. The developed model is run for a 40-year prediction period. The application of the developed coupling model shows that since the construction of the Heihe reservoir, the average GW level in the study area has declined by 6.05 m. The model can accurately simulate and predict the dynamic changes in SW and GW in the downstream irrigation area of Heihe River Basin and provide a scientific basis for water management in an irrigation district.

Download Full-text

Estimating extremely large amounts of missing precipitation data

Journal of Hydroinformatics ◽

10.2166/hydro.2020.127 ◽

2020 ◽

Vol 22 (3) ◽

pp. 578-592

Author(s):

Héctor Aguilera ◽

Carolina Guardiola-Albert ◽

Carmen Serrano-Hidalgo

Keyword(s):

Missing Values ◽

Daily Precipitation ◽

Learning Algorithm ◽

Accurate Estimation ◽

Precipitation Data ◽

Rain Gauges ◽

Error Measures ◽

Predictive Mean Matching ◽

Daily Precipitation Data ◽

Spatio Temporal

Abstract Accurate estimation of missing daily precipitation data remains a difficult task. A wide variety of methods exists for infilling missing values, but the percentage of gaps is one of the main factors limiting their applicability. The present study compares three techniques for filling in large amounts of missing daily precipitation data: spatio-temporal kriging (STK), multiple imputation by chained equations through predictive mean matching (PMM), and the random forest (RF) machine learning algorithm. To our knowledge, this is the first time that extreme missingness (>90%) has been considered. Different percentages of missing data and missing patterns are tested in a large dataset drawn from 112 rain gauges in the period 1975–2017. The results show that both STK and RF can handle extreme missingness, while PMM requires larger observed sample sizes. STK is the most robust method, suitable for chronological missing patterns. RF is efficient under random missing patterns. Model evaluation is usually based on performance and error measures. However, this study outlines the risk of just relying on these measures without checking for consistency. The RF algorithm overestimated daily precipitation outside the validation period in some cases due to the overdetection of rainy days under time-dependent missing patterns.

Download Full-text