Detecting dominant changes in irregularly sampled multivariate water quality data sets

2018 ◽  
Author(s):  
Christian Lehr ◽  
Ralf Dannowski ◽  
Thomas Kalettka ◽  
Christoph Merz ◽  
Boris Schröder ◽  
...  

Abstract. Time series of catchment water quality often exhibit substantial temporal and spatial variability which can rarely be traced back to single causal factors. Numerous anthropogenic and natural drivers influence groundwater and stream water quality, especially in regions with high land use intensity. In addition, typical existing monitoring data sets, e.g. from environmental agencies, are usually characterized by relatively low sampling frequency and irregular sampling in space and/or time. This complicates the differentiation between anthropogenic influence and natural variability as well as the detection of changes in water quality which indicate changes of single drivers. Detecting such changes is of fundamental interest for water management purposes as well as for scientific analyses. We suggest the new term dominant changes for changes in multivariate water quality data that concern (1) more than a single variable, (2) more than one single site and (3) more than short-term fluctuations or single events and present an exploratory framework for the detection of such dominant changes in multivariate water quality data sets with irregular sampling in space and time. Firstly, we used a non-linear dimension reduction technique to derive multivariate water quality components. The components provide a sparse description of the dominant spatiotemporal dynamics in the multivariate water quality data set. In addition, they can be used to derive hypotheses on the dominant drivers influencing water quality. Secondly, different sampling sites were compared with respect to median component values. Thirdly, time series of the components at single sites were analysed for seasonal patterns and linear and non-linear trends. Spatial and temporal heterogeneities are efficiently used as a source of information rather than being considered as noise. Besides, non-linearities are considered explicitly. 
The approach is especially recommended for the exploratory assessment of existing long term low frequency multivariate water quality monitoring data. We tested the approach with a large data set of stream water and groundwater quality consisting of sixteen hydrochemical variables sampled with a spatially and temporally irregular sampling scheme at 29 sites in the Uckermark region in northeast Germany from 1998 to 2009. Four components were derived and interpreted as (1) the agriculturally induced enhancement of the natural background level of solute concentration, (2) the redox sequence from reducing conditions in deep groundwater to post oxic conditions in shallow groundwater and oxic conditions in stream water, (3) the mixing ratio of deep and shallow groundwater to the streamflow and (4) sporadic events of slurry application in the agricultural practice. Dominant changes were observed for the first two components. The changing intensity of the 1st component during the course of the observation period was interpreted as response to the temporal variability of the thickness of the unsaturated zone. A steady increase of the 2nd component throughout the monitoring period at most stream water sites pointed towards progressing depletion of the denitrification capacity of the deep aquifer.
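The third step above, trend analysis of a component's time series at a single site, can be illustrated with a robust slope estimate that tolerates uneven sampling. The sketch below uses the Theil-Sen estimator (median of pairwise slopes). This is an assumption chosen for illustration: the abstract does not name the trend method, and the estimator only captures a monotonic linear tendency, not the non-linear trends the authors also consider. The times and scores are hypothetical.

```python
from itertools import combinations
from statistics import median


def theil_sen_slope(times, values):
    """Median of all pairwise slopes; valid for unevenly spaced sampling times."""
    slopes = [
        (v2 - v1) / (t2 - t1)
        for (t1, v1), (t2, v2) in combinations(zip(times, values), 2)
        if t2 != t1  # skip samples taken at the same time
    ]
    if not slopes:
        raise ValueError("need at least two samples at distinct times")
    return median(slopes)


# Hypothetical component scores at one site, sampled irregularly (decimal years).
times = [1998.1, 1998.9, 2000.3, 2003.0, 2003.2, 2007.5, 2009.4]
scores = [-1.2, -1.0, -0.7, -0.1, 0.0, 0.8, 1.1]

slope = theil_sen_slope(times, scores)
print(f"estimated trend: {slope:.3f} score units per year")
```

Because every pair of samples contributes one slope regardless of spacing, no interpolation onto a regular grid is needed.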

2018 ◽  
Vol 22 (8) ◽  
pp. 4401-4424
Author(s):  
Christian Lehr ◽  
Ralf Dannowski ◽  
Thomas Kalettka ◽  
Christoph Merz ◽  
Boris Schröder ◽  
...  

Abstract. Time series of groundwater and stream water quality often exhibit substantial temporal and spatial variability, whereas typical existing monitoring data sets, e.g. from environmental agencies, are usually characterized by relatively low sampling frequency and irregular sampling in space and/or time. This complicates the differentiation between anthropogenic influence and natural variability as well as the detection of changes in water quality which indicate changes in single drivers. We suggest the new term “dominant changes” for changes in multivariate water quality data which concern (1) multiple variables, (2) multiple sites and (3) long-term patterns, and present an exploratory framework for the detection of such dominant changes in data sets with irregular sampling in space and time. Firstly, a non-linear dimension-reduction technique was used to summarize the dominant spatiotemporal dynamics in the multivariate water quality data set in a few components. Those were used to derive hypotheses on the dominant drivers influencing water quality. Secondly, different sampling sites were compared with respect to median component values. Thirdly, time series of the components at single sites were analysed for long-term patterns. We tested the approach with a joint stream water and groundwater quality data set consisting of 1572 samples, each comprising sixteen variables, sampled with a spatially and temporally irregular sampling scheme at 29 sites in northeast Germany from 1998 to 2009. The first four components were interpreted as (1) an agriculturally induced enhancement of the natural background level of solute concentration, (2) a redox sequence from reducing conditions in deep groundwater to post-oxic conditions in shallow groundwater and oxic conditions in stream water, (3) a mixing ratio of deep and shallow groundwater to the streamflow and (4) sporadic events of slurry application in the agricultural practice. 
Dominant changes were observed for the first two components. The changing intensity of the first component was interpreted as response to the temporal variability of the thickness of the unsaturated zone. A steady increase in the second component at most stream water sites pointed towards progressing depletion of the denitrification capacity of the deep aquifer.
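The second step of the framework, comparing sampling sites by their median component values, can be sketched as a simple grouping operation. The site names and scores below are hypothetical, and the function name is an illustrative choice, not taken from the paper. The median is computed per site from however many samples that site happens to have, which is why it suits irregular sampling.

```python
from collections import defaultdict
from statistics import median


def median_score_by_site(samples):
    """samples: iterable of (site, component_score) pairs -> {site: median score}."""
    grouped = defaultdict(list)
    for site, score in samples:
        grouped[site].append(score)
    return {site: median(scores) for site, scores in grouped.items()}


# Hypothetical scores of one component; sites have different sample counts.
samples = [
    ("stream_A", 0.4), ("stream_A", 0.9), ("stream_A", 0.6),
    ("well_B", -1.1), ("well_B", -0.7),
]

site_medians = median_score_by_site(samples)
ranked = sorted(site_medians, key=site_medians.get)  # lowest median first
print(site_medians, ranked)
```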


2018 ◽  
Vol 22 (2) ◽  
pp. 1175-1192 ◽  
Author(s):  
Qian Zhang ◽  
Ciaran J. Harman ◽  
James W. Kirchner

Abstract. River water-quality time series often exhibit fractal scaling, which here refers to autocorrelation that decays as a power law over some range of scales. Fractal scaling presents challenges to the identification of deterministic trends because (1) fractal scaling has the potential to lead to false inference about the statistical significance of trends and (2) the abundance of irregularly spaced data in water-quality monitoring networks complicates efforts to quantify fractal scaling. Traditional methods for estimating fractal scaling – in the form of spectral slope (β) or other equivalent scaling parameters (e.g., Hurst exponent) – are generally inapplicable to irregularly sampled data. Here we consider two types of estimation approaches for irregularly sampled data and evaluate their performance using synthetic time series. These time series were generated such that (1) they exhibit a wide range of prescribed fractal scaling behaviors, ranging from white noise (β  =  0) to Brown noise (β  =  2) and (2) their sampling gap intervals mimic the sampling irregularity (as quantified by both the skewness and mean of gap-interval lengths) in real water-quality data. The results suggest that none of the existing methods fully account for the effects of sampling irregularity on β estimation. First, the results illustrate the danger of using interpolation for gap filling when examining autocorrelation, as the interpolation methods consistently underestimate or overestimate β under a wide range of prescribed β values and gap distributions. Second, the widely used Lomb–Scargle spectral method also consistently underestimates β. A previously published modified form, using only the lowest 5 % of the frequencies for spectral slope estimation, has very poor precision, although the overall bias is small. 
Third, a recent wavelet-based method, coupled with an aliasing filter, generally has the smallest bias and root-mean-squared error among all methods for a wide range of prescribed β values and gap distributions. The aliasing method, however, does not itself account for sampling irregularity, and this introduces some bias in the result. Nonetheless, the wavelet method is recommended for estimating β in irregular time series until improved methods are developed. Finally, all methods' performances depend strongly on the sampling irregularity, highlighting that the accuracy and precision of each method are data specific. Accurately quantifying the strength of fractal scaling in irregular water-quality time series remains an unresolved challenge for the hydrologic community and for other disciplines that must grapple with irregular sampling.
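For orientation, the plain Lomb-Scargle approach discussed above can be sketched in a few lines: compute the periodogram directly on the irregular sampling times, then fit a straight line to log power against log frequency and take β as the negative slope. This is a minimal illustration of the classic formulation, not the authors' implementation, and the synthetic random walk, seed, and frequency grid are assumptions. According to the abstract, this estimator consistently underestimates β on irregular data, so the sketch shows the mechanics only.

```python
import math
import random


def lomb_scargle(times, values, freqs):
    """Classic Lomb-Scargle periodogram (unnormalised) for uneven sampling."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    power = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # Time offset tau makes the sine and cosine terms orthogonal.
        tau = math.atan2(sum(math.sin(2 * w * t) for t in times),
                         sum(math.cos(2 * w * t) for t in times)) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        cc = sum(a * a for a in c)
        ss = sum(a * a for a in s)
        yc = sum(a * b for a, b in zip(y, c))
        ys = sum(a * b for a, b in zip(y, s))
        p = 0.0
        if cc > 0:
            p += yc * yc / cc
        if ss > 0:
            p += ys * ys / ss
        power.append(0.5 * p)
    return power


def spectral_slope(freqs, power):
    """beta = negative least-squares slope of log(power) versus log(frequency)."""
    pairs = [(math.log(f), math.log(p)) for f, p in zip(freqs, power) if f > 0 and p > 0]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(v for _, v in pairs) / n
    sxy = sum((x - mx) * (v - my) for x, v in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    return -sxy / sxx


# Synthetic demo: a random walk (nominally Brown noise) with 70 % of samples dropped.
rng = random.Random(0)
walk, level = [], 0.0
for _ in range(500):
    level += rng.gauss(0.0, 1.0)
    walk.append(level)
keep = sorted(rng.sample(range(500), 150))
times = [float(i) for i in keep]
values = [walk[i] for i in keep]
freqs = [k / 500 for k in range(1, 76)]

beta = spectral_slope(freqs, lomb_scargle(times, values, freqs))
print(f"estimated beta: {beta:.2f}")
```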


2004 ◽  
Vol 8 (3) ◽  
pp. 503-520 ◽  
Author(s):  
C. Neal ◽  
B. Reynolds ◽  
M. Neal ◽  
H. Wickham ◽  
L. Hill ◽  
...  

Abstract. Results for long term water quality monitoring are described for the headwaters of the principal headwater stream of the River Severn, the Afon Hafren. The results are linked to within-catchment information to describe the influence of conifer harvesting on stream and shallow groundwater quality. A 19-year record of water quality data for the Hafren (a partially spruce forested catchment with podzolic soil) shows the classic patterns of hydrochemical change in relation to concentration and flow responses for upland forested systems. Progressive felling of almost two-thirds of the forest over the period of study resulted in little impact from harvesting and replanting in relation to stream water quality. However, at the local scale, a six years’ study of felling indicated significant release of nitrate into both surface and groundwater; this persisted for two or three years before declining. The study has shown two important features. Firstly, phased felling has led to minimal impacts on stream water. This contrasts with the results of an experimental clear fell for the adjacent catchment of the Afon Hore where a distinct water quality deterioration was observed for a few years. Secondly, there are localised zones with varying hydrology that link to groundwater sources with fracture flow properties. This variability makes extrapolation to the catchment scale difficult without very extensive monitoring. The implications of these findings are discussed in relation to strong support for the use of phased felling-based management of catchments and the complexities of within catchment processes. Keywords: deforestation, water quality, acidification, pH, nitrate, alkalinity, ANC, aluminium, dissolved organic carbon, Plynlimon, forest, spruce, Afon Hafren, podzol


2002 ◽  
Vol 6 (3) ◽  
pp. 421-432 ◽  
Author(s):  
C. Neal

Abstract. A method for examining the impacts of disturbance on stream water quality based on paired catchment "control" and "response" water quality time series is described in relation to diagrams of cumulative flux and cumulative flux difference. The paper describes the equations used and illustrates the patterns expected for idealised flux changes followed by an application to stream water quality data for a spruce forested catchment, the Hore, subjected to clear fell. The water quality determinands examined are sodium, chloride, nitrate, calcium and acid neutralisation capacity. The anticipated effects of felling are shown in relation to reduction in mist capture and nitrate release with felling as well as to the influence of weathering and cation exchange mechanisms, but in a much clearer way than observed previously using other approaches. Keywords: Plynlimon, stream, Hore, acid neutralisation capacity, calcium, chloride, nitrate, sodium, cumulative flux, flux
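The abstract does not reproduce the paper's equations, so the following is only a minimal sketch of the general idea, assuming that flux per sampling interval is concentration times flow times interval length and that the "control" and "response" series share the same sampling times. All values are hypothetical and in arbitrary units.

```python
from itertools import accumulate


def cumulative_flux(concentrations, flows, dt=1.0):
    """Running sum of concentration * flow * dt over the record."""
    return list(accumulate(c * q * dt for c, q in zip(concentrations, flows)))


def cumulative_flux_difference(response, control):
    """Element-wise difference between response and control cumulative fluxes."""
    return [r - c for r, c in zip(response, control)]


# Hypothetical paired series: the response catchment doubles its concentration
# from the third sample onward (e.g. after a disturbance).
flows = [2.0, 2.0, 2.0, 2.0]
control_conc = [1.0, 1.0, 1.0, 1.0]
response_conc = [1.0, 1.0, 2.0, 2.0]

control = cumulative_flux(control_conc, flows)
response = cumulative_flux(response_conc, flows)
diff = cumulative_flux_difference(response, control)
print(diff)
```

In this idealised case the difference stays flat while the two catchments behave alike and begins to climb once the response catchment departs from the control, which is the kind of pattern a cumulative flux difference diagram is meant to expose.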


2017 ◽  
Author(s):  
Qian Zhang ◽  
Ciaran J. Harman ◽  
James W. Kirchner

Abstract. River water-quality time series often exhibit fractal scaling, which here refers to autocorrelation that decays as a power law over some range of scales. Fractal scaling presents challenges to the identification of deterministic trends, but traditional methods for estimating spectral slope (β) or other equivalent scaling parameters (e.g., Hurst exponent) are generally inapplicable to irregularly sampled data. Here we consider two types of estimation approaches for irregularly sampled data and evaluate their performance using synthetic time series. These time series were generated such that (1) they exhibit a wide range of prescribed fractal scaling behaviors, ranging from white noise (β = 0) to Brown noise (β = 2), and (2) their sampling gap intervals mimic the sampling irregularity (as quantified by both the skewness and mean of gap-interval lengths) in real water-quality data. The results suggest that none of the existing methods fully account for the effects of sampling irregularity on β estimation. First, the results illustrate the danger of using interpolation for gap filling when examining auto-correlation, as the interpolation methods consistently under-estimate or over-estimate β under a wide range of prescribed β values and gap distributions. Second, the long-established Lomb-Scargle spectral method also consistently under-estimates β. A modified form, using only the lowest 5 % of the frequencies for spectral slope estimation, has very poor precision, although the overall bias is small. Third, a recent wavelet-based method, coupled with an aliasing filter, generally has the smallest bias and root-mean-squared error among all methods for a wide range of prescribed β values and gap distributions. The aliasing method, however, does not itself account for sampling irregularity, and this introduces some bias in the result. Nonetheless, the wavelet method is recommended for estimating β in irregular time series until improved methods are developed. 
Finally, all methods' performances depend strongly on the sampling irregularity, highlighting that the accuracy and precision of each method are data-specific. Accurately quantifying the strength of fractal scaling in irregular water-quality time series remains an unresolved challenge for the hydrologic community and for other disciplines that must grapple with irregular sampling.
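The abstract quantifies sampling irregularity by the mean and skewness of gap-interval lengths. A minimal sketch of those two statistics follows, assuming the population (biased) form of skewness, which may differ from the definition used in the paper. The sampling times are hypothetical.

```python
from statistics import fmean


def gap_statistics(sample_times):
    """Return (mean gap, skewness of gaps) for a list of sampling times."""
    ordered = sorted(sample_times)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three sampling times")
    mean = fmean(gaps)
    m2 = sum((g - mean) ** 2 for g in gaps) / len(gaps)  # second central moment
    m3 = sum((g - mean) ** 3 for g in gaps) / len(gaps)  # third central moment
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0  # perfectly regular sampling -> 0
    return mean, skew


# Hypothetical record: mostly daily samples followed by one long gap.
mean_gap, skew_gap = gap_statistics([0, 1, 2, 3, 10])
print(mean_gap, skew_gap)
```

A strongly positive skewness indicates a record dominated by short gaps with a few long interruptions, a pattern typical of monitoring data.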


2019 ◽  
Vol 6 (2) ◽  
pp. 75-82
Author(s):  
Maryam Ravanbakhsh ◽  
Yaser Tahmasebi Birgani ◽  
Maryam Dastoorpoor ◽  
Kambiz Ahmadi Angali

Discriminant analysis (DA) and principal component analysis (PCA), as multivariate statistical techniques, are used to interpret large complex water quality data and assess their temporal and spatial variation in the basin of the Zohreh river. In this study, data sets of 16 water quality parameters collected from 1966 to 2013 at 4 stations (1554 observations for each parameter) were analyzed. PCA for data sets of Kheirabad, Poleflour, Chambostan and Dehmolla stations resulted in 4, 4, 4, and 3 latent factors accounting for 88.985%, 93.828%, 88.648%, and 88.68% of the total variance in water quality parameters, respectively. It is indicated that total dissolved solids (TDS), electrical conductivity (EC), chlorides (Cl−), sodium (Na), sodium adsorption ratio (SAR), and %Na were responsible for water quality variations which are mainly related to natural and anthropogenic pollution sources including climate effects, gypsum, and salt crystals in the supratidal of Zohreh river delta, fault zones of Chamshir I and II, drainage of sugarcane fields, and domestic and industrial wastewaters discharge into the river. DA reduced the data set to only seven parameters (discharge, temperature, electrical conductivity, HCO3−, Cl−, %Na, and T-Hardness), affording more than 58.5% correct assignations in temporal evaluations and describing responsible parameters for large variations in the quality of the Zohreh river.
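Figures such as "4 latent factors accounting for 88.985% of the total variance" come from the eigenvalues of the correlation (or covariance) matrix of the measured parameters. The sketch below shows that calculation under stated assumptions: the data are standardised first, the eigenvalues are obtained with a small Jacobi rotation solver written here for self-containment, and the three-variable data set is hypothetical. It is not the software or the data used in the study.

```python
import math
from itertools import accumulate


def correlation_matrix(rows):
    """Correlation matrix of the columns of `rows` (list of equal-length lists)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)] for a in range(p)]


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical samples of three parameters (e.g. EC, chloride, temperature).
rows = [
    [410.0, 35.0, 12.1], [520.0, 48.0, 14.3], [630.0, 61.0, 11.8],
    [480.0, 40.0, 16.0], [700.0, 72.0, 13.2], [560.0, 50.0, 15.1],
]
eigenvalues = jacobi_eigenvalues(correlation_matrix(rows))
explained = [ev / sum(eigenvalues) for ev in eigenvalues]
cumulative = list(accumulate(explained))
print([f"{100 * c:.1f}%" for c in cumulative])
```

The number of retained components is then the smallest count whose cumulative share reaches whatever threshold the analyst has chosen.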


2021 ◽  
Author(s):  
Tijmen Willard ◽  
Reza Pramana ◽  
Saket Pande ◽  
Boris van Breukelen ◽  
Maurits Ertsen

Water quality in the rivers and tributaries of the Brantas catchment (about 12,000 km²; East Java, Indonesia), which is deteriorating due to various reasons, is measured by different agencies involved in water resource development and management. We discuss how different time series of water quality data from three local agencies in the Brantas basin (differing in specific parameters and measurement frequency) have been used to provide recommendations on the improvement of (using) the different measurement strategies (in policy recommendations). In general, monthly to quarterly data were available from 2009 until 2019 at 104 locations. Data were analyzed with Principal Component Analysis (PCA) to show which parameters vary significantly across the catchment. Preliminary results suggested how parameters were related, based on series of box plots of the PCA scores. This provided insights on the first order processes that control the physical-chemical status of the Brantas River, of each agency and for all the data sets combined. Applying Python and QGIS to separate the parameters and map the hot spots in terms of eigen functions allowed relating water levels with hot spots to estimate the fluctuations in the concentrations of different parameters in time and space. These data elaborations allow improving the different measurement campaigns, and to address specific policy questions related to water quality monitoring more efficiently.


2014 ◽  
Vol 46 (3) ◽  
pp. 377-388 ◽  
Author(s):  
Matias Bonansea ◽  
Claudia Ledesma ◽  
Claudia Rodriguez ◽  
Lucio Pinotti

Water quality monitoring programs generate complex multidimensional data sets. In this study, multivariate statistical techniques were employed as an effective tool for the analysis and interpretation of these water quality data sets. Principal component analysis (PCA) and cluster analysis (CA) were applied to evaluate spatial and temporal variation of water quality in Río Tercero Reservoir (Argentina). Six sampling sites were surveyed each climatic season for 21 parameters during 2003–2010. The results revealed that PCA showed the existence of four significant principal components (PCs) which account for 96.7% of the total variance of the data set. The first PC was assigned to mineralization whereas the other PCs were built from variables indicative of pollution. Hierarchical CA grouped the six monitoring sites into three clusters and classified the different climatic seasons into two clusters based on similarities in water quality characteristics.


2020 ◽  
Vol 12 (14) ◽  
pp. 5500 ◽  
Author(s):  
Yu Song ◽  
Xiaodong Song ◽  
Guofan Shao

Intense human activities and drastic land use changes in rapidly urbanized areas may cause serious water quality degradation. In this study, we explored the effects of land use on water quality from a landscape perspective. We took a rapidly urbanized area in Hangzhou City, China, as a case study, and collected stream water quality data and algae biomass in a field campaign. The results showed that built-up lands had negative effects on water quality and were the primary cause of stream water pollution. The concentration of total phosphorus significantly correlated with the areas of residential, industrial, road, and urban greenspace, and the concentration of chlorophyll a also significantly correlated with the areas of these land uses, except residential land. At a landscape level, the correlation analysis showed that the landscape indices, e.g., dominance, shape complexity, fragmentation, aggregation, and diversity, all had significant correlations with water quality parameters. From the perspective of land use, the redundancy analysis results showed that the percentages of variation in water quality explained by the built-up, forest and wetland, cropland, and bareland decreased in turn. The spatial composition of the built-up lands was the main factor causing stream water pollution, while the shape complexities of the forest and wetland patches were negatively correlated with stream water pollution.


2016 ◽  
Vol 47 (5) ◽  
pp. 1069-1085 ◽  
Author(s):  
Yung-Chia Chiu ◽  
Chih-Wei Chiang ◽  
Tsung-Yu Lee

The adaptive neuro fuzzy inference system (ANFIS) has been proposed to model the time series of water quality data in this study. The biochemical oxygen demand data collected at the upstream catchment of Feitsui Reservoir in Taiwan for more than 20 years are selected as the target water quality variable. The classical statistical technique of the Box-Jenkins method is applied for the selection of appropriate input variables and data pre-processing of using differencing is implemented during the model development. The time series data obtained by ANFIS models are compared to those obtained by autoregressive integrated moving average (ARIMA) and artificial neural networks (ANNs). The results show that the ANFIS model identified at each sampling station is superior to the respective ARIMA and ANN models. The R values at all sampling stations of the training and testing datasets are 0.83–0.98 and 0.81–0.89, respectively, except at Huang-ju-pi-liao station. ANFIS models can provide accurate predictions for complex hydrological processes, and can be extended to other areas to improve the understanding of river pollution trends. The procedure of input selection and the pre-processing of input data proposed in this study can stimulate the usage of ANFIS in other related studies.
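Two of the building blocks mentioned above are easy to illustrate: differencing as a pre-processing step, and a correlation coefficient for comparing predictions with observations. The sketch assumes first-order differencing and assumes that the reported R is the Pearson correlation coefficient. Neither detail is stated in the abstract, and the series below are hypothetical.

```python
import math


def difference(series, lag=1):
    """Differenced series: x[t] - x[t - lag]."""
    return [b - a for a, b in zip(series, series[lag:])]


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    return cov / (so * sp)


# Hypothetical BOD observations and model predictions (mg/L).
observed = [1.2, 1.5, 1.1, 1.9, 2.4, 2.0]
predicted = [1.3, 1.4, 1.2, 1.7, 2.2, 2.1]

print(difference(observed))
print(f"R = {pearson_r(observed, predicted):.2f}")
```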

