Integration of long-term collocated ecological datasets: examples from the UK Environmental Change Network (ECN)

Author(s):  
Chak-Hau Michael Tso ◽  
Aaron Lowther ◽  
Don Monteith ◽  
Linsay Flynn Banin ◽  
William Simm ◽  
...  

<p>It is increasingly recognized that a whole-system approach is needed to address many challenging environmental research questions. Although this approach is increasingly adopted by integrating data and models from various sub-systems, applying it more widely across the environmental sciences requires infrastructure, methodologies, and a culture shift that together facilitate seamless collaboration and re-deployment of workflows. </p><p>We report our recent progress in addressing some of these issues. We focus our examples here on work related to the UK Environmental Change Network (ECN, an eLTER member network). A transdisciplinary project team comprising environmental scientists, statisticians, and computer scientists collaborated through a virtual research platform (DataLabs). Within the DataLabs platform, all data and analysis code are stored centrally via a cloud service and are easily accessible via an internet browser from any operating system. Access to cloud computing resources for analyses is also available. More importantly, all users have access to the same versions of the data and software running on the same hardware throughout the collaboration process.</p><p>Such close collaboration allows us to co-develop statistical/data science algorithms that are suitable for a wide range of environmental data. These algorithms are not domain-specific and are generic enough to be used on any environmental dataset. Here we demonstrate how they are used to highlight periods of data with significant change. The first example is a "state tagging" algorithm, in which each point in time of a dataset is classified as belonging to an arbitrary state based on clustering of covariates. Confidence intervals are then computed from the statistics of each state, and any data points that lie outside them are flagged for further investigation. 
A second example is the development of an algorithm for identifying changepoints across multiple time series that have different sampling frequencies or misaligned sampling times. Existing multivariate changepoint algorithms assume that all time series are sampled at the same times (a situation not commonly applicable to environmental data). Our method removes this assumption, and emerged after consultation and collaboration with domain scientists. It has many potential applications, such as confirming whether changepoints occur across sites, across multiple variables within sites, or combinations thereof. In the final example, we show how DataLabs can facilitate the acquisition and application of third-party data to improve understanding of ECN atmospheric deposition chemistry data. Specifically, it allows users to take advantage of cloud computing and storage and to collaborate seamlessly: collaborators are not required to maintain independent versions of software and data, which saves time and effort. </p><p>The developments reported herein highlight the benefits of collaborative research using DataLabs to advance the integration of data, models, and methods across the environmental sciences. DataLabs provides the infrastructure, data, and culture to allow scientists to work more closely together. This in turn allows rapid incorporation of novel data science methods. It also allows the data integration workflows developed to be more readily applied elsewhere, while stakeholders can view and manipulate the resultant data products.</p>
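The notion of a changepoint shared by series with misaligned sampling times, as in the second example above, can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It is a brute-force least-squares search for a single shared change in mean, and the function names (`sse`, `shared_changepoint`), the `min_points` parameter, and the toy data are all assumptions made for illustration.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def shared_changepoint(series, min_points=2):
    """Find one change time shared by several irregularly sampled series.

    series: list of series, each a list of (time, value) pairs.
    Each series keeps its own timestamps; no resampling or alignment is done.
    Returns (tau, cost), where tau minimises the total within-segment
    squared error when every series is split into t < tau and t >= tau.
    """
    # Candidate change times: every timestamp observed in any series.
    candidates = sorted({t for s in series for t, _ in s})
    best_tau, best_cost = None, float("inf")
    for tau in candidates:
        cost = 0.0
        valid = True
        for s in series:
            before = [v for t, v in s if t < tau]
            after = [v for t, v in s if t >= tau]
            # Require enough points on both sides in every series.
            if len(before) < min_points or len(after) < min_points:
                valid = False
                break
            cost += sse(before) + sse(after)
        if valid and cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau, best_cost


# Hypothetical data: a daily series and a sparser series sampled at
# different times, both shifting in mean at t = 10.
daily = [(float(t), 0.0 if t < 10 else 5.0) for t in range(20)]
sparse = [(t, 2.0 if t < 10 else -1.0) for t in (1.5, 4.5, 7.5, 10.5, 13.5, 16.5)]

tau, cost = shared_changepoint([daily, sparse])
print(tau, cost)
```

Because each series is split using its own timestamps, the sparse series does not need an observation at the change time itself. A practical method would also need to handle multiple changepoints, noise, and a penalty for model complexity, none of which this sketch attempts.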

2021 ◽  
Vol 257 ◽  
pp. 83-100
Author(s):  
Andrew Harvey

This article shows how new time series models can be used to track the progress of an epidemic, forecast key variables and evaluate the effects of policies. The univariate framework of Harvey and Kattuman (2020, Harvard Data Science Review, Special Issue 1—COVID-19, https://hdsr.mitpress.mit.edu/pub/ozgjx0yn) is extended to model the relationship between two or more series, and the role of common trends is discussed. Data on daily deaths from COVID-19 in Italy and the UK provide an example of leading indicators under balanced growth. When growth is not balanced, the model can be extended by including a non-stationary component in one of the series. The viability of this model is investigated by examining the relationship between new cases and deaths in the Florida second wave of summer 2020. The balanced growth framework is then used as the basis for policy evaluation by showing how some variables can serve as control groups for a target variable. This approach is used to investigate the consequences of Sweden’s soft lockdown coronavirus policy in the spring of 2020.


2019 ◽  
Author(s):  
Susannah Rennie ◽  
Chris Andrews ◽  
Sarah Atkinson ◽  
Deborah Beaumont ◽  
Sue Benham ◽  
...  

Abstract. Long-term datasets of co-located, integrated environmental variables are relatively rare. The UK Environmental Change Network (ECN) was launched in 1992 and provides the UK with its only long-term integrated environmental monitoring and research network for the assessment of the causes and consequences of environmental change. Measurements covering a wide range of physical, chemical and biological "driver" and "response" variables are made in close proximity at ECN terrestrial sites using protocols incorporating standard quality control procedures. This paper describes the nineteen published ECN datasets for these co-located measurements, which contain over twenty years of data (1993–2015). The data and supporting documentation are freely available from the NERC Environmental Information Data Centre under the terms of the Open Government Licence using the following DOIs:

Meteorology
Meteorology: https://doi.org/10.5285/fc9bcd1c-e3fc-4c5a-b569-2fe62d40f2f5 (Rennie et al., 2017a)

Biogeochemistry
Atmospheric nitrogen chemistry: https://doi.org/10.5285/baf51776-c2d0-4e57-9cd3-30cd6336d9cf (Rennie et al., 2017b)
Precipitation chemistry: https://doi.org/10.5285/18b7c387-037d-4949-98bc-e8db5ef4264c (Rennie et al., 2017c)
Soil solution chemistry: https://doi.org/10.5285/b330d395-68f2-47f1-8d59-3291dc02923b (Rennie et al., 2017d)
Stream water chemistry: https://doi.org/10.5285/fd7ca5ef-460a-463c-ad2b-5ad48bb4e22e (Rennie et al., 2017e)
Stream water discharge: https://doi.org/10.5285/8b58c86b-0c2a-4d48-b25a-7a0141859004 (Rennie et al., 2017f)

Invertebrates
Moths: https://doi.org/10.5285/a2a49f47-49b3-46da-a434-bb22e524c5d2 (Rennie et al., 2017g)
Butterflies: https://doi.org/10.5285/5aeda581-b4f2-4e51-b1a6-890b6b3403a3 (Rennie et al., 2017h)
Carabid beetle: https://doi.org/10.5285/8385f864-dd41-410f-b248-028f923cb281 (Rennie et al., 2017i)
Spittle bugs: https://doi.org/10.5285/aff433be-0869-4393-b765-9e6faad2a12b (Rennie et al., 2018)

Vegetation
Baseline: https://doi.org/10.5285/a7b49ac1-24f5-406e-ac8f-3d05fb583e3b (Rennie et al., 2016a)
Coarse grain: https://doi.org/10.5285/d349babc-329a-4d6e-9eca-92e630e1be3f (Rennie et al., 2016b)
Woodland: https://doi.org/10.5285/94aef007-634e-42db-bc52-9aae86adbd33 (Rennie et al., 2017j)
Fine grain: https://doi.org/10.5285/b98efec8-6de0-4e0c-85dc-fe4cdf01f086 (Rennie et al., 2017k)

Vertebrates
Frogs: https://doi.org/10.5285/4d8c7dd9-8248-46ca-b988-c1fc38e51581 (Rennie et al., 2017l)
Birds (Breeding bird survey): https://doi.org/10.5285/5886c3ba-1fa5-49c0-8da8-40e69a10d2b5 (Rennie et al., 2017m)
Birds (Common bird census): https://doi.org/10.5285/8582a02c-b28c-45d2-afa1-c1e85fba023d (Rennie et al., 2017n)
Bats: https://doi.org/10.5285/2588ee91-6cbd-4888-86fc-81858d1bf085 (Rennie et al., 2017o)
Rabbits and deer: https://doi.org/10.5285/0be0aed3-f205-4f1f-a65d-84f8cfd8d50f (Rennie et al., 2017p)


2020 ◽  
Author(s):  
Michael Tso ◽  
Peter Henrys ◽  
Susannah Rennie ◽  
John Watkins

<p>Long-term monitoring data that cover a wide array of environmental variables provide key insights into environmental change, because responses of ecosystem functions and services to environmental drivers are inherently long-term and strongly interlinked. To ensure that the data are reliable for analysis and interpretation, they must undergo quality assurance procedures. However, the expected or acceptable range of data values varies greatly as the state of the ecosystem changes. Current quality assurance procedures for environmental data take no account of the system state in which each measurement is made, and provide the user with little contextual information on the probable cause when a measurement is flagged as out of range. We propose the use of data science techniques to tag each measurement with an identified system state. The term “state” is defined loosely here; states are identified using k-means clustering, an unsupervised machine learning method, and their meaning is open to specialist interpretation. Once the states are identified, state-dependent prediction intervals can be calculated for each observational variable. This approach gives the user more contextual information with which to resolve out-of-range flags, and yields prediction intervals for observational variables that account for changes in system state. Our highly flexible and efficient approach is applicable to any point data time series in earth and environmental sciences, regardless of sub-discipline. This advantage is particularly relevant when conducting simultaneous analysis of multiple processes and feedbacks, where a wide variety of data are used.</p><p>We illustrate our approach using the moth and butterfly data from the UK Environmental Change Network (ECN), where meteorological variables are used to define system states. 
A web application is publicly available to allow users to explore the method on various ECN sites, while a generic version is also available for users to upload their own data files. Our work contributes to the ongoing development of a better data science framework that allows researchers and other stakeholders to find and use the data they need more readily and reliably.</p>
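The state-tagging idea described above (cluster the covariates with k-means, then compute an interval per state and flag values outside it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the interval is a simple normal approximation (mean ± 1.96 standard deviations), the covariates are left unscaled, the k-means initialisation is an arbitrary deterministic choice, and all function names and data values are hypothetical.

```python
from statistics import fmean, stdev


def dist2(p, q):
    """Squared Euclidean distance between two covariate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm returning one state label per point.

    Centroids start at k evenly spaced rows of the input, an arbitrary
    deterministic choice made for this sketch.
    """
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        new = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            new.append(tuple(fmean(c) for c in zip(*members)) if members else centroids[j])
        if new == centroids:
            break
        centroids = new
    return labels


def tag_and_flag(covariates, observations, k=2, z=1.96):
    """Tag each observation with a state and flag values outside that state's interval."""
    labels = kmeans(covariates, k)
    bounds = {}
    for j in set(labels):
        vals = [y for y, lab in zip(observations, labels) if lab == j]
        m = fmean(vals)
        s = stdev(vals) if len(vals) > 1 else 0.0
        bounds[j] = (m - z * s, m + z * s)  # assumed normal-approximation interval
    flags = [not (bounds[lab][0] <= y <= bounds[lab][1]) for y, lab in zip(observations, labels)]
    return labels, bounds, flags


# Hypothetical covariates (temperature in deg C, sunshine hours): ten cool days, ten warm days.
cool = [(2.0, 1.0), (3.0, 2.0), (4.0, 1.5), (5.0, 2.5), (6.0, 3.0),
        (2.5, 1.0), (3.5, 2.0), (4.5, 1.5), (5.5, 2.5), (6.0, 2.0)]
warm = [(18.0, 6.0), (19.0, 7.0), (20.0, 8.0), (21.0, 9.0), (22.0, 8.5),
        (18.5, 6.5), (19.5, 7.5), (20.5, 8.0), (21.5, 9.0), (22.0, 7.0)]
covariates = cool + warm

# Hypothetical counts: a count of 5 is ordinary on a cool day but unusual on a warm one.
counts = [4, 6, 5, 5, 4, 6, 5, 4, 6, 5,
          48, 52, 50, 49, 51, 50, 47, 53, 50, 5]

labels, bounds, flags = tag_and_flag(covariates, counts)
print([i for i, f in enumerate(flags) if f])
```

In this toy data the final observation is flagged only because its state is taken into account: the same count falls inside the interval for the cool state. In practice the covariates would normally be standardised before clustering and the number of states chosen with care.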


2011 ◽  
Vol 22 (5) ◽  
pp. 639-648 ◽  
Author(s):  
J. Dick ◽  
C. Andrews ◽  
D. A. Beaumont ◽  
S. Benham ◽  
D. R. Brooks ◽  
...  
