A framework for automated anomaly detection in high frequency water-quality data from in situ sensors

2019 ◽  
Vol 664 ◽  
pp. 885-898 ◽  
Author(s):  
Catherine Leigh ◽  
Omar Alsibai ◽  
Rob J. Hyndman ◽  
Sevvandi Kandanaarachchi ◽  
Olivia C. King ◽  
...  
2019 ◽  
Author(s):  
Catherine Leigh ◽  
Sevvandi Kandanaarachchi ◽  
James M. McGree ◽  
Rob J. Hyndman ◽  
Omar Alsibai ◽  
...  

Abstract. Water-quality monitoring in rivers often focuses on the concentrations of sediments and nutrients, constituents that can smother biota and cause eutrophication. However, the physical and economic constraints of manual sampling prohibit data collection at the frequency required to adequately capture the variation in concentrations through time. Here, we developed models to predict total suspended solids (TSS) and oxidized nitrogen (NOx) concentrations based on high-frequency time series of turbidity, conductivity and river level data from in situ sensors in rivers flowing into the Great Barrier Reef lagoon. We fit generalized-linear mixed-effects models with continuous first-order autoregressive correlation structures to water-quality data collected by manual sampling at two freshwater sites and one estuarine site and used the fitted models to predict TSS and NOx from the in situ sensor data. These models described the temporal autocorrelation in the data and handled observations collected at irregular frequencies, characteristics typical of water-quality monitoring data. Turbidity proved a useful and generalizable surrogate of TSS, with high predictive ability in the estuarine and freshwater sites. Turbidity, conductivity and river level served as combined surrogates of NOx. However, the relationship between NOx and the covariates was more complex than that between TSS and turbidity, and consequently the ability to predict NOx was lower and less generalizable across sites than for TSS. Furthermore, prediction intervals tended to increase during events, for both TSS and NOx models, highlighting the need to include measures of uncertainty routinely in water-quality reporting. Our study also highlights that surrogate-based models used to predict sediments and nutrients need to better incorporate temporal components if variance estimates are to be unbiased and model inference meaningful. The transferability of models across sites, and potentially regions, will become increasingly important as organizations move to automated sensing for water-quality monitoring throughout catchments.
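
A continuous first-order autoregressive structure, as named in the abstract above, ties the correlation between two observations to the time elapsed between them rather than to their position in the series, which is what lets it cope with irregular sampling. The following is a minimal sketch of that idea, assuming a correlation of phi raised to the absolute time difference. The function name, the value of phi, and the sampling times are illustrative assumptions, not values from the study.

```python
def car1_correlation(times, phi):
    """Correlation matrix for a continuous AR(1) structure.

    Entry [i][j] is phi ** |times[i] - times[j]|, so observations that are
    closer together in time are more strongly correlated.
    """
    if not 0.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between 0 and 1")
    return [[phi ** abs(ti - tj) for tj in times] for ti in times]


# Hypothetical sampling times in days, deliberately unevenly spaced.
sample_times = [0.0, 1.0, 1.5, 4.0]
corr = car1_correlation(sample_times, phi=0.5)

for row in corr:
    print(" ".join(f"{value:.3f}" for value in row))
```

Samples taken half a day apart stay more strongly correlated than samples taken a full day apart, regardless of how many other samples fall between them.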


2019 ◽  
Vol 55 (11) ◽  
pp. 8547-8568 ◽  
Author(s):  
Priyanga Dilini Talagala ◽  
Rob J. Hyndman ◽  
Catherine Leigh ◽  
Kerrie Mengersen ◽  
Kate Smith‐Miles

Author(s):  
A. Manuel ◽  
A. C. Blanco ◽  
A. M. Tamondong ◽  
R. Jalbuena ◽  
O. Cabrera ◽  
...  

Abstract. Laguna Lake, the Philippines’ largest freshwater lake, has always been historically, economically, and ecologically significant to the people living near it. However, as it lies at the center of urban development in Metro Manila, it suffers from water quality degradation. Water quality sampling by current field methods is not enough to assess the spatial and temporal variations of water quality in the lake. Regular water quality monitoring is advised, and remote sensing addresses the need for synchronized and frequent observation and provides an efficient way to obtain bio-optical water quality parameters. Bio-optical models must be optimized because local parameters change regionally and seasonally, which requires calibration. Field spectral measurements and in situ water quality data taken during a simultaneous satellite overpass were used to calibrate the bio-optical modelling tool WASI-2D to get estimates of chlorophyll-a concentration from the corresponding Landsat-8 images. The initial output values for chlorophyll-a concentration, which range from 10–40 μg/L, have an RMSE of up to 10 μg/L when compared with in situ data. Further refinements in the initial and constant parameters of the model resulted in improved chlorophyll-a concentration retrieval from the Landsat-8 images. The outputs provided a chlorophyll-a concentration range of 5–12 μg/L, well within the usual range of measured values in the lake, with an RMSE of 2.28 μg/L compared to in situ data.


2020 ◽  
Vol 8 (3) ◽  
pp. 172-185
Author(s):  
Juan G. Arango ◽  
Brandon K. Holzbauer-Schweitzer ◽  
Robert W. Nairn ◽  
Robert C. Knox

The focus of this study was to develop true reflectance surfaces in the visible portion of the electromagnetic spectrum from small unmanned aerial system (sUAS) images obtained over large bodies of water when no ground control points were available. The goal of the research was to produce true reflectance surfaces from which reflectance values could be extracted and used to estimate optical water quality parameters utilizing limited in-situ water quality analyses. Multispectral imagery was collected using a sUAS equipped with a multispectral sensor, capable of obtaining information in the blue (0.475 μm), green (0.560 μm), red (0.668 μm), red edge (0.717 μm), and near infrared (0.840 μm) portions of the electromagnetic spectrum. To develop a reliable and repeatable protocol, a five-step methodology was implemented: (i) image and water quality data collection, (ii) image processing, (iii) reflectance extraction, (iv) statistical interpolation, and (v) data validation. Results indicate that the created protocol generates geolocated and radiometrically corrected true reflectance surfaces from sUAS missions flown over large bodies of water. Subsequently, relationships between true reflectance values and in-situ water quality parameters were developed.


2015 ◽  
Author(s):  
Jeffrey W Hollister ◽  
W. Bryan Milstead ◽  
Betty J. Kreakie

Productivity of lentic ecosystems is well studied and it is widely accepted that as nutrient inputs increase, productivity increases and lakes transition from lower trophic states (e.g. oligotrophic) to higher trophic states (e.g. eutrophic). These broad trophic state classifications are good predictors of ecosystem condition, services, and disservices (e.g. recreation, aesthetics, and harmful algal blooms). While the relationship between nutrients and trophic state provides reliable predictions, it requires in situ water quality data in order to parameterize the model. This limits the application of these models to lakes with existing and, more importantly, available water quality data. To address this, we take advantage of the availability of a large national lakes water quality database (i.e. the National Lakes Assessment), land use/land cover data, lake morphometry data, and other universally available data, and apply data mining approaches to predict trophic state. Using these data and random forests, we first model chlorophyll a, then classify the resultant predictions into trophic states. The full model estimates chlorophyll a with both in situ and universally available data. The mean squared error and adjusted R² of this model were 0.09 and 0.8, respectively. The second model uses universally available GIS data only. The mean squared error was 0.22 and the adjusted R² was 0.48. The accuracy of the trophic state classifications derived from the chlorophyll a predictions was 69% for the full model and 49% for the “GIS only” model. Random forests extend the usefulness of the class predictions by providing prediction probabilities for each lake. This allows us to make trophic state predictions and also indicate the level of uncertainty around those predictions. For the full model, these predicted class probabilities ranged from 0.42 to 1. For the GIS only model, they ranged from 0.33 to 0.96. It is our conclusion that in situ data are required for better predictions, yet GIS and universally available data provide trophic state predictions, with estimated uncertainty, that still have the potential for a broad array of applications. The source code and data for this manuscript are available from https://github.com/USEPA/LakeTrophicModelling.
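
The two-stage idea described above (predict chlorophyll a, then bin the prediction into a trophic state and report a class probability) can be sketched as follows. This is an illustration only and not the authors' code, which is in the linked repository. The chlorophyll a cut-offs of 2, 7 and 30 μg/L, the treatment of values that fall exactly on a cut-off, the function names, and the per-tree predictions are all assumptions made for the example. The vote-share calculation is one simple way an ensemble could yield class probabilities and is not necessarily the procedure used in the study.

```python
from bisect import bisect_right
from collections import Counter

# Assumed chlorophyll a cut-offs in ug/L; the abstract does not state them.
CUTOFFS = [2.0, 7.0, 30.0]
STATES = ["oligotrophic", "mesotrophic", "eutrophic", "hypereutrophic"]


def trophic_state(chla):
    """Map a chlorophyll a concentration (ug/L) to a trophic state label.

    A value exactly on a cut-off is assigned to the higher class.
    """
    if chla < 0:
        raise ValueError("chlorophyll a concentration cannot be negative")
    return STATES[bisect_right(CUTOFFS, chla)]


def class_probabilities(member_predictions):
    """Share of ensemble members whose prediction falls in each state."""
    if not member_predictions:
        raise ValueError("at least one prediction is required")
    votes = Counter(trophic_state(value) for value in member_predictions)
    return {state: votes[state] / len(member_predictions) for state in STATES}


# Hypothetical per-tree chlorophyll a predictions (ug/L) for a single lake.
per_tree = [5.1, 6.4, 8.2, 6.9, 12.0]
probs = class_probabilities(per_tree)
predicted = max(probs, key=probs.get)
print(predicted, probs[predicted])
```

Reporting the winning share alongside the label conveys how strongly the ensemble agrees, which is the kind of per-lake uncertainty the abstract refers to.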


2020 ◽  
Author(s):  
Devanshi Pathak ◽  
Michael Hutchins ◽  
François Edwards

River phytoplankton provide food for primary consumers, and are a major source of oxygen in many rivers. However, high phytoplankton concentrations can hamper river water quality and ecosystem functioning, making it crucial to predict and prevent harmful phytoplankton growth in rivers. In this study, we modify an existing mechanistic water quality model to simulate sub-daily changes in water quality, and present its application in the River Thames catchment. So far, modelling studies in the River Thames have focused on daily to weekly time-steps, and have shown limited predictive ability in modelling phytoplankton concentrations. With the availability of high-frequency water quality data, modelling tools can be improved to better understand process interactions for phytoplankton growth in dynamic rivers. The modified model in this study uses high-frequency water quality data along a 62 km stretch in the lower Thames to simulate river flows, water temperature, nutrients, and phytoplankton concentrations at sub-daily time-steps for 2013-14. Model performance is judged by percentage error in mean and Nash-Sutcliffe Efficiency (NSE) statistics. The model satisfactorily simulates the observed diurnal variability and transport of phytoplankton concentrations within the river stretch, with NSE values greater than 0.7 at all calibration sites. Phytoplankton blooms develop within an optimum range of flows (16-81 m³/s) and temperature (11-18 °C), and are largely influenced by phytoplankton growth and death rate parameters. We find that phytoplankton growth in the lower Thames is mainly limited by physical controls such as residence time, light, and water temperature, and shows some nutrient limitation arising from phosphorus depletion in summer. The model is tested under different future scenarios to evaluate the impact of changes in climate and management conditions on primary production and its controls. Our findings provide support for the argument that the sub-daily modelling of phytoplankton is a step forward in better prediction and management of phytoplankton dynamics in river systems.
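
The two performance statistics named above can be computed as in the sketch below. The NSE follows its standard definition, one minus the ratio of the squared model error to the variance of the observations about their mean. The sign convention for percentage error in the mean (simulated minus observed, relative to observed) is an assumption, as are the function names and the sample values, none of which come from the study.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when observations are constant")
    return 1.0 - squared_error / spread


def percent_error_in_mean(observed, simulated):
    """Assumed convention: 100 * (mean(sim) - mean(obs)) / mean(obs)."""
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    return 100.0 * (mean_sim - mean_obs) / mean_obs


# Hypothetical observed and simulated concentrations at one site.
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 18.0, 33.0, 39.0]

nse = nash_sutcliffe_efficiency(obs, sim)
bias = percent_error_in_mean(obs, sim)
print(f"NSE = {nse:.3f}, error in mean = {bias:.1f}%")
```

An NSE of 1 indicates a perfect match, while 0 means the model predicts no better than the observed mean, so the 0.7 threshold reported above sits well within the useful range.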



