Non-linear visualization and analysis of large water quality data sets: a model-free basis for efficient monitoring and risk assessment

Gunnar Lischeid

doi:10.1007/s00477-008-0266-y

A national approach to risk assessment for drinking water catchments in Australia

Water Science & Technology Water Supply ◽

10.2166/ws.2005.0029 ◽

2005 ◽

Vol 5 (2) ◽

pp. 123-134 ◽

Cited By ~ 2

Author(s):

R. Miller ◽

B. Whitehill ◽

D. Deere

Keyword(s):

Water Quality ◽

Risk Assessment ◽

Land Use ◽

Drinking Water ◽

Water Supply ◽

Supply System ◽

Drinking Water Quality ◽

Water Utilities ◽

Quality Data ◽

Water Quality Data

This paper comments on the strengths and weaknesses of different methodologies for risk assessment, appropriate for utilisation by Australian Water Utilities in risk assessment for drinking water source protection areas. It is intended that a suggested methodology be recommended as a national approach to catchment risk assessment. Catchment risk management is a process for setting priorities for protecting drinking water quality in source water areas. It is structured through a series of steps for identifying water quality hazards, assessing the threat posed, and prioritizing actions to address the threat. Water management organisations around Australia are at various stages of developing programs for catchment risk management. While much conceptual work has been done on the individual components of catchment risk management, work on these components has not previously been combined to form a management tool for source water protection. A key driver for this project has been the requirements of the National Health and Medical Research Council Framework for the Management of Drinking Water Quality (DWQMF) included in the draft 2002 Australian Drinking Water Guidelines (ADWG). The Framework outlines a quality management system of steps for the Australian water industry to follow with checks and balances to ensure water quality is protected from catchment to tap. 
Key steps in the Framework that relate to this project are as follows:

Element 2: Assessment of the Drinking Water Supply System
• Water Supply System analysis
• Review of Water Quality Data
• Hazard Identification and Risk Assessment

Element 3: Preventive Measures for Drinking Water Quality Management
• Preventive Measures and Multiple Barriers
• Critical Control Points

This paper provides an evaluation of the following risk assessment techniques: Hazard Analysis and Critical Control Points (HACCP); World Health Organisation Water Safety Plans; Australian Standard AS 4360; and the Australian Drinking Water Guidelines – Drinking Water Quality Management Framework. These methods were selected for assessment in this report because they cover the different approaches being used across Australia by water utilities of varying: scale of water management organisation; types of water supply system management; and land use and activity-based risks in the catchment area of the source. Initially, different risk assessment methodologies were identified and reviewed. Then examples of applications of those methods were assessed, based on several key water utilities across Australia and overseas. Strengths and weaknesses of each approach were identified. In general, the approaches can be grouped into those that: cover the full catchment-to-tap drinking water system; cover just the catchment area of the source and do not recognise downstream barriers or processes; use water quality data or land use risks as a key driving component; and are based primarily on the hazard, in contrast to others that are based on a hazardous event. It is considered that an initial process of screening water quality data is very valuable in determining key water quality issues and guiding the risk assessment, and adds to the overall understanding of the catchment and water source area, allowing consistency with the intentions behind the ADWG DWQM Framework.
As such, it is suggested that the recommended national risk assessment approach has two key introductory steps: initial screening of key issues via water quality data, and land use or activity scenario and event-based HACCP-style risk assessment. In addition, the importance of recognising the roles that uncertainty and bias play in risk assessments was highlighted. It was therefore deemed necessary to develop and integrate uncertainty guidelines for information used in the risk assessment process. A hybrid risk assessment methodology was developed, based on the HACCP approach, but with some key additions and modifications to make it applicable to varying catchment risks, water supply operation needs and environmental management processes.

Water Quality Database Offers New Tools to Study Aquatic Systems

Eos ◽

10.1029/2017eo069005 ◽

2017 ◽

Author(s):

Lily Strelich

Keyword(s):

Water Quality ◽

Aquatic Systems ◽

Quality Data ◽

Data Sets ◽

Web Portal ◽

Water Quality Data

Researchers assess the federal Water Quality Portal, a Web portal that unites disparate water quality data sets and resources.

Health Risk Assessment of Household Drinking Water in a District in the UAE

Water ◽

10.3390/w10121726 ◽

2018 ◽

Vol 10 (12) ◽

pp. 1726 ◽

Cited By ~ 5

Author(s):

Mohammed Mahmoud ◽

Mohamed Hamouda ◽

Ruwaya Al Kendi ◽

Mohamed Mohamed

Keyword(s):

Heavy Metals ◽

Water Quality ◽

Risk Assessment ◽

Drinking Water ◽

Health Risk ◽

Health Risk Assessment ◽

Quality Data ◽

Water Tank ◽

Water Quality Data ◽

Sampling Points

The quality of household drinking water in a community of 30 houses in a district in Abu Dhabi, United Arab Emirates (UAE) was assessed over a period of one year (January to November 2015). Standard analytical techniques were used to screen for water quality parameters and contaminants of concern. Water quality was evaluated in the 30 households at four sampling points: kitchen faucet, bathroom faucet, household water tank, and main water pipe. The sampling points were chosen to help identify the source when an elevated level of a particular contaminant is observed. Water quality data was interpreted by utilizing two main techniques: spatial variation analysis and multivariate statistical techniques. Initial analysis showed that many households had As, Cd, and Pb concentrations that were higher than the maximum allowable level set by UAE drinking water standards. In addition, the water main samples had the highest concentration of the heavy metals compared to other sampling points. Health risk assessment results indicated that approximately 30%, 55%, and 15% of the houses studied had a high, moderate, and low risk from the prolonged exposure to heavy metals, respectively. The analysis can help with planning a spatially focused sampling plan to confirm the study findings and set an appropriate course of action.
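
The idea of using several sampling points to narrow down where a contaminant enters can be sketched as follows. This is only an illustration, not the authors' procedure: the upstream-to-downstream ordering in `FLOW_ORDER`, the function name `most_upstream_exceedance`, the readings and the limit of 10.0 are all hypothetical, and the limit is not a UAE standard value.

```python
# Assumed order in which water reaches each sampling point (hypothetical).
FLOW_ORDER = [
    "main water pipe",
    "household water tank",
    "kitchen faucet",
    "bathroom faucet",
]

def most_upstream_exceedance(readings, limit):
    """Return the first sampling point, in assumed flow order, whose
    concentration exceeds the limit, or None if no point exceeds it."""
    for point in FLOW_ORDER:
        if readings.get(point, 0.0) > limit:
            return point
    return None

# Hypothetical concentrations for one house, in the same unit as the limit.
house = {
    "main water pipe": 4.0,
    "household water tank": 12.0,
    "kitchen faucet": 11.5,
    "bathroom faucet": 12.2,
}

HYPOTHETICAL_LIMIT = 10.0  # placeholder, not a regulatory value
source = most_upstream_exceedance(house, HYPOTHETICAL_LIMIT)
print(source)
```

In this made-up case the main pipe is below the limit while the tank and both faucets exceed it, which would point to the tank as the place to investigate first.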

Characterizing water quality monitoring visualization with Hadoop and Google Maps

Water Practice & Technology ◽

10.2166/wpt.2017.093 ◽

2017 ◽

Vol 12 (4) ◽

pp. 882-893 ◽

Cited By ~ 1

Author(s):

Weijian Huang ◽

Xinfei Zhao ◽

Yuanbin Han ◽

Wei Du ◽

Yao Cheng

Keyword(s):

Water Quality ◽

Water Environment ◽

Large Data ◽

Water Quality Monitoring ◽

Quality Monitoring ◽

Quality Data ◽

Data Sets ◽

Google Maps ◽

Water Quality Data ◽

Computer Clusters

Abstract. In water quality monitoring, the complexity and abstraction of water environment data make it difficult for staff to monitor the data efficiently and intuitively. Visualization of water quality data is an important part of the monitoring and analysis of water quality. Because water quality data have geographic features, their visualization can be realized using maps, which not only provide intuitive visualization, but also reflect the relationship between water quality and geographical position. For this study, the heat map provided by Google Maps was used for water quality data visualization. However, as the amount of data increases, the computational efficiency of traditional development models cannot keep up with the computing workload. Effective storage, extraction and analysis of large water data sets has become a problem that urgently needs a solution. Hadoop is an open source software framework running on computer clusters that can store and process large data sets efficiently, and it was used in this study to store and process water quality data. Through reasonable analysis and experiment, an efficient and convenient information platform can be provided for water quality monitoring.

Impact of non-detects in water quality data on estimation of constituent mass loading

Water Science & Technology ◽

10.2166/wst.2002.0243 ◽

2002 ◽

Vol 45 (9) ◽

pp. 219-225 ◽

Cited By ~ 14

Author(s):

M. Kayhanian ◽

A. Singh ◽

S. Meyer

Keyword(s):

Water Quality ◽

Data Analysis ◽

Censored Data ◽

Quality Data ◽

Data Sets ◽

Mass Loading ◽

Water Quality Data ◽

Mean Values ◽

Constituent Mass

Often, fractions of stormwater constituents are not detected above laboratory reporting limits and are reported as non-detect (ND), or censored data. Analysts and stormwater modelers represent these NDs in stormwater data sets using a variety of methods. Application of these different methods results in different estimates of constituent mean concentrations that will, in turn, affect mass loading computations. In this paper, different methods of data analysis were introduced to determine constituent mean concentrations from water quality datasets that include ND values. Depending on the number of NDs and the method of data analysis, differences ranging from 1 to 70 percent have been observed in mean values. Differences in mean values were, as shown by simulation, found to have significant impacts on estimations of constituent mass loading.
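
The sensitivity of the mean to how NDs are represented can be illustrated with a small sketch. The abstract does not name the methods it compares, so the three simple substitution rules below (ND replaced by zero, by half the reporting limit, or by the full reporting limit) are an assumption, and the sample values and the helper `mean_with_substitution` are hypothetical.

```python
from statistics import mean

# Each sample is (measured_value, reporting_limit); measured_value is None
# when the lab reported a non-detect (ND). All numbers are hypothetical.
samples = [(None, 1.0), (2.0, 1.0), (None, 1.0), (4.0, 1.0)]

def mean_with_substitution(samples, fraction):
    """Mean concentration after replacing each ND with fraction * reporting limit."""
    filled = [fraction * limit if value is None else value
              for value, limit in samples]
    return mean(filled)

means = {
    "ND = 0": mean_with_substitution(samples, 0.0),
    "ND = limit / 2": mean_with_substitution(samples, 0.5),
    "ND = limit": mean_with_substitution(samples, 1.0),
}

# Spread between the lowest and highest estimate, relative to the highest.
low, high = min(means.values()), max(means.values())
spread_percent = 100.0 * (high - low) / high

for label, value in means.items():
    print(f"{label}: mean = {value:.2f}")
print(f"Spread between methods: {spread_percent:.0f}% of the highest mean")
```

With half of the samples censored, the choice of substitution rule alone moves the estimated mean between 1.5 and 2.0 in this toy data set, and any mass load computed as concentration times flow volume would shift by the same proportion.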

Identification and characterization of water quality transients using wavelet analysis. II. Application to electronic water quality data

Water Science & Technology ◽

10.2166/wst.1997.0232 ◽

1997 ◽

Vol 36 (5) ◽

pp. 337-348 ◽

Cited By ~ 3

Author(s):

Paul H. Whitfield ◽

Kathleen Dohan

Keyword(s):

Water Quality ◽

Quality Data ◽

Data Sets ◽

Water Quality Data ◽

Periodic Processes ◽

Small Streams ◽

Transient Events ◽

Identification And Characterization

Two wavelet transform techniques for identifying water quality transients are applied to example data sets from two small streams. Temperature and conductance represent the range of properties from periodic processes to transient events. Both methods were successful in identifying the location, duration and magnitude of the transient events in these data sets. The methods may be refined to automate the detection and classification of transient events.
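
The abstract does not say which two wavelet techniques were used. As a rough illustration of the general idea only, the sketch below uses level-1 undecimated Haar detail coefficients, which amount to a scaled first difference, and flags positions where the coefficient magnitude exceeds a threshold. The function names, the threshold and the conductance series are hypothetical.

```python
import math

def haar_details(signal):
    """Level-1 undecimated Haar detail coefficients (a scaled first difference)."""
    return [(signal[i + 1] - signal[i]) / math.sqrt(2.0)
            for i in range(len(signal) - 1)]

def flag_transients(signal, threshold):
    """Indices i where the jump between samples i and i + 1 exceeds the threshold."""
    return [i for i, d in enumerate(haar_details(signal)) if abs(d) > threshold]

# Hypothetical conductance record: steady baseline with one short spike.
conductance = [10.0, 10.0, 10.0, 10.0, 10.0, 14.0, 10.0, 10.0]

flagged = flag_transients(conductance, threshold=1.0)
print(flagged)
```

The two flagged positions mark the onset and the end of the spike, so their spacing gives its duration, and the coefficient magnitude times the square root of 2 recovers the size of the jump. A smooth periodic signal such as a daily temperature cycle would produce only small coefficients at this scale.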

Review of “Detecting dominant changes in irregularly sampled multivariate water quality data sets (Lehr et al.)”

10.5194/hess-2018-39-rc1 ◽

2018 ◽

Author(s):

Anonymous

Keyword(s):

Water Quality ◽

Quality Data ◽

Data Sets ◽

Water Quality Data

Detecting dominant changes in irregularly sampled multivariate water quality data sets

Hydrology and Earth System Sciences ◽

10.5194/hess-22-4401-2018 ◽

2018 ◽

Vol 22 (8) ◽

pp. 4401-4424

Author(s):

Christian Lehr ◽

Ralf Dannowski ◽

Thomas Kalettka ◽

Christoph Merz ◽

Boris Schröder ◽

...

Keyword(s):

Water Quality ◽

Stream Water ◽

Shallow Groundwater ◽

Irregular Sampling ◽

Quality Data ◽

Data Sets ◽

Water Quality Data ◽

Data Set ◽

Sampling In Space

Abstract. Time series of groundwater and stream water quality often exhibit substantial temporal and spatial variability, whereas typical existing monitoring data sets, e.g. from environmental agencies, are usually characterized by relatively low sampling frequency and irregular sampling in space and/or time. This complicates the differentiation between anthropogenic influence and natural variability as well as the detection of changes in water quality which indicate changes in single drivers. We suggest the new term “dominant changes” for changes in multivariate water quality data which concern (1) multiple variables, (2) multiple sites and (3) long-term patterns and present an exploratory framework for the detection of such dominant changes in data sets with irregular sampling in space and time. Firstly, a non-linear dimension-reduction technique was used to summarize the dominant spatiotemporal dynamics in the multivariate water quality data set in a few components. Those were used to derive hypotheses on the dominant drivers influencing water quality. Secondly, different sampling sites were compared with respect to median component values. Thirdly, time series of the components at single sites were analysed for long-term patterns. We tested the approach with a joint stream water and groundwater quality data set consisting of 1572 samples, each comprising sixteen variables, sampled with a spatially and temporally irregular sampling scheme at 29 sites in northeast Germany from 1998 to 2009. The first four components were interpreted as (1) an agriculturally induced enhancement of the natural background level of solute concentration, (2) a redox sequence from reducing conditions in deep groundwater to post-oxic conditions in shallow groundwater and oxic conditions in stream water, (3) a mixing ratio of deep and shallow groundwater to the streamflow and (4) sporadic events of slurry application in the agricultural practice. 
Dominant changes were observed for the first two components. The changing intensity of the first component was interpreted as response to the temporal variability of the thickness of the unsaturated zone. A steady increase in the second component at most stream water sites pointed towards progressing depletion of the denitrification capacity of the deep aquifer.
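
The second step of the framework, comparing sites by their median component values, can be sketched in a few lines. The sketch assumes that component scores have already been computed by some dimension-reduction step, which is not shown here. The site names, the scores and the helper `median_by_site` are hypothetical.

```python
from collections import defaultdict
from statistics import median

def median_by_site(records):
    """Group (site, component_score) pairs by site and take the median per site."""
    grouped = defaultdict(list)
    for site, score in records:
        grouped[site].append(score)
    return {site: median(scores) for site, scores in grouped.items()}

# Hypothetical scores on one component; sites may have different sample counts.
records = [
    ("stream_A", 0.8), ("stream_A", 1.2), ("stream_A", 1.0),
    ("well_B", -0.5), ("well_B", -0.9),
]

site_medians = median_by_site(records)
ranking = sorted(site_medians, key=site_medians.get, reverse=True)
print(site_medians)
print(ranking)
```

The median is a natural summary here because it does not require the same number of samples at every site and is not dominated by a single extreme sample, both of which matter for irregularly sampled data.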

Evaluation of significantly modified water bodies in Vojvodina by using multivariate statistical techniques

Hemijska industrija ◽

10.2298/hemind121002007v ◽

2013 ◽

Vol 67 (5) ◽

pp. 823-833

Author(s):

Svetlana Vujovic ◽

Srdjan Kolakovic ◽

Milena Becelic-Tomin

Keyword(s):

Water Quality ◽

Water Bodies ◽

Total Variance ◽

Quality Data ◽

Statistical Techniques ◽

Data Sets ◽

Multivariate Statistical Techniques ◽

Multivariate Statistical ◽

Water Quality Data ◽

Cluster Properties

This paper illustrates the utility of multivariate statistical techniques for the analysis and interpretation of water quality data sets and the identification of pollution sources/factors, with a view to obtaining better information about water quality and designing a monitoring network for effective management of water resources. Multivariate statistical techniques, such as factor analysis (FA)/principal component analysis (PCA) and cluster analysis (CA), were applied to evaluate variations in, and to interpret, a water quality data set for natural water bodies obtained during the 2010 monitoring year, covering 13 parameters at 33 different sites. FA/PCA attempts to explain the correlations between the observations in terms of underlying factors, which are not directly observable. Factor analysis is applied to physico-chemical parameters of natural water bodies with the aim of classification and data summarization, as well as segmentation of heterogeneous data sets into smaller homogeneous subsets. Factor loadings were categorized as strong or moderate, corresponding to absolute loading values of >0.75 and 0.75-0.50, respectively. Four principal factors were obtained with eigenvalues >1, together explaining more than 78% of the total variance in the water data sets, which is adequate to give good prior information regarding data structure. Each factor that is significantly related to specific variables represents a different dimension of water quality. The first factor, F1, accounts for 28% of the total variance and represents the hydrochemical dimension of water quality. The second factor, F2, accounts for 18% of the total variance and may be taken as a factor of water eutrophication. The third factor, F3, accounts for 17% of the total variance and represents the influence of point sources of pollution on water quality. The fourth factor, F4, accounts for 13% of the total variance and may be taken as an ecological dimension of water quality. 
Cluster analysis (CA) is an objective technique for identifying natural groupings in a set of data. CA divides a large number of objects into a smaller number of homogeneous groups on the basis of their correlation structure. CA combines data objects into natural groups of objects with similar cluster properties and separates objects with different cluster properties. CA showed similarities and dissimilarities among the sampling sites and explained the observed clustering in terms of affected conditions. Using FA/PCA and CA, the water bodies under the highest pressure were identified. By factor, these water bodies are: F1 (Plazovic, Bosut, Studva, Zlatica, Stari Begej, Krivaja), F2 (Krivaja, Keres), F3 (Studva, Zlatica, Tamis, Krivaja and Keres) and F4 (Studva, Zlatica, Krivaja, Keres).
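
The two selection rules stated in the abstract, retaining factors with eigenvalues above 1 and grading loadings by absolute value, can be written down as a minimal sketch. The eigenvalues and loadings below are hypothetical and are not the paper's results. The label "weak" for loadings below 0.50 is an assumption, since the abstract names only the strong and moderate categories.

```python
def retained_factors(eigenvalues):
    """Keep factors with eigenvalue > 1 and report their share of total variance."""
    kept = [ev for ev in eigenvalues if ev > 1.0]
    return kept, sum(kept) / sum(eigenvalues)

def loading_strength(loading):
    """Grade a factor loading by its absolute value."""
    magnitude = abs(loading)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    return "weak"  # assumed label; the abstract names only strong and moderate

# Hypothetical eigenvalues of a correlation matrix, largest first.
eigenvalues = [2.4, 1.5, 0.7, 0.4]
kept, explained = retained_factors(eigenvalues)
print(kept, f"{100 * explained:.0f}% of total variance")

# Hypothetical loadings of three parameters on one factor.
loadings = {"conductivity": 0.91, "nitrate": -0.62, "pH": 0.20}
grades = {name: loading_strength(value) for name, value in loadings.items()}
print(grades)
```

The sign of a loading is ignored when grading because a strongly negative loading ties a parameter to the factor just as closely as a strongly positive one.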

Evaluation of Temporal and Spatial Variations of Water Quality Parameters in Zohreh River, Iran

Avicenna Journal of Environmental Health Engineering ◽

10.34172/ajehe.2019.10 ◽

2019 ◽

Vol 6 (2) ◽

pp. 75-82

Author(s):

Maryam Ravanbakhsh ◽

Yaser Tahmasebi Birgani ◽

Maryam Dastoorpoor ◽

Kambiz Ahmadi Angali

Keyword(s):

Water Quality ◽

Electrical Conductivity ◽

Quality Parameters ◽

Water Quality Parameters ◽

Quality Data ◽

Sodium Absorption ◽

Data Sets ◽

Water Quality Data ◽

Data Set ◽

Temporal And Spatial

Discriminant analysis (DA) and principal component analysis (PCA), as multivariate statistical techniques, are used to interpret large, complex water quality data and assess their temporal and spatial variation in the basin of the Zohreh river. In this study, data sets of 16 water quality parameters (collected from 1966 to 2013) at 4 stations (1554 observations for each parameter) were analyzed. PCA for data sets of Kheirabad, Poleflour, Chambostan and Dehmolla stations resulted in 4, 4, 4, and 3 latent factors accounting for 88.985%, 93.828%, 88.648%, and 88.68% of the total variance in water quality parameters, respectively. The results indicate that total dissolved solids (TDS), electrical conductivity (EC), chlorides (Cl−), sodium (Na), sodium absorption ratio (SAR), and %Na were responsible for water quality variations, which are mainly related to natural and anthropogenic pollution sources including climate effects, gypsum, and salt crystals in the supratidal of Zohreh river delta, fault zones of Chamshir I and II, drainage of sugarcane fields, and domestic and industrial wastewater discharge into the river. DA reduced the data set to only seven parameters (discharge, temperature, electrical conductivity, HCO3-, Cl-, %Na, and T-Hardness), affording more than 58.5% correct assignations in temporal evaluations and describing the parameters responsible for large variations in the quality of the Zohreh river.
