Faculty Opinions recommendation of Population time series: process variability, observation errors, missing values, lags, and hidden states.

Author(s):  
Mark Rees
Ecology ◽  
2004 ◽  
Vol 85 (11) ◽  
pp. 3140-3150 ◽  
Author(s):  
James S. Clark ◽  
Ottar N. Bjørnstad

Hydrology ◽  
2018 ◽  
Vol 5 (4) ◽  
pp. 63 ◽  
Author(s):  
Benjamin Nelsen ◽  
D. Williams ◽  
Gustavious Williams ◽  
Candace Berrett

Complete and accurate data are necessary for analyzing and understanding trends in time-series datasets; however, many of the available time-series datasets have gaps that affect the analysis, especially in the earth sciences. As most available data have missing values, researchers use various interpolation methods or ad hoc approaches to data imputation. Because analysis based on inaccurate data can lead to inaccurate conclusions, more accurate data imputation methods support more reliable analysis. We present a spatial-temporal data imputation method using Empirical Mode Decomposition (EMD) based on spatial correlations. We call this method EMD-spatial data imputation or EMD-SDI. Though this method is applicable to other time-series data sets, here we demonstrate the method using temperature data. The EMD algorithm decomposes data into periodic components called intrinsic mode functions (IMFs) and exactly reconstructs the original signal by summing these IMFs. EMD-SDI initially decomposes the data from the target station and other stations in the region into IMFs. EMD-SDI evaluates each IMF from the target station in turn and selects the IMF from other stations in the region whose periodic behavior is most correlated with the target IMF. EMD-SDI then replaces a section of missing data in the target station IMF with the section from the most closely correlated IMF from the regional stations. We found that EMD-SDI selects the IMFs used for reconstruction from different stations throughout the region, not necessarily the station closest in the geographic sense. EMD-SDI accurately filled data gaps from 3 months to 5 years in length in our tests and compares favorably with a simple temporal method. EMD-SDI leverages regional correlation and the fact that different stations can be subject to different periodic behaviors. In addition to data imputation, the EMD-SDI method provides IMFs that can be used to better understand regional correlations and processes.
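The donor-selection step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the decomposition into components has already been done, uses synthetic sinusoids as stand-ins for one IMF per station, and the names `fill_from_best_donor`, `station_a`, and `station_b` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def fill_from_best_donor(target, donors):
    """Fill None entries of `target` from the donor most correlated with it.

    target: list of floats with None marking the gap.
    donors: dict mapping a station name to a complete list of the same length.
    Correlation is computed only where the target was observed.
    """
    observed = [i for i, v in enumerate(target) if v is not None]
    target_obs = [target[i] for i in observed]

    def score(name):
        return pearson(target_obs, [donors[name][i] for i in observed])

    best = max(donors, key=score)
    filled = [donors[best][i] if v is None else v for i, v in enumerate(target)]
    return best, filled

# Synthetic stand-ins for a single component per station (assumed data).
steps = range(60)
target = [math.sin(2 * math.pi * k / 12) for k in steps]
for k in range(20, 30):
    target[k] = None  # simulate a gap at the target station

donors = {
    "station_a": [math.sin(2 * math.pi * k / 12 + 0.1) for k in steps],  # similar period
    "station_b": [math.sin(2 * math.pi * k / 7) for k in steps],         # different period
}

best, filled = fill_from_best_donor(target, donors)
print(best, filled[20:23])
```

In a full pipeline this step would be repeated for every component of the target station, and the filled components summed to rebuild the series; the decomposition itself would come from an EMD implementation, which is not shown here.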


Author(s):  
Andrei Vorobev ◽  
Vyacheslav Pilipenko ◽  
Gulnara Vorobeva ◽  
Olga Khristodulo

Introduction: Magnetic stations are one of the main tools for observing the geomagnetic field. However, gaps and anomalies in time series of geomagnetic data, which often exceed 30% of the number of recorded values, negatively affect the effectiveness of the implemented approach and complicate the application of mathematical tools which require that the information signal is continuous. Besides, the missing values add extra uncertainty in computer simulation of dynamic spatial distribution of geomagnetic variations and related parameters. Purpose: To develop a methodology for improving the efficiency of technical means for observing the geomagnetic field. Method: Creation of problem-oriented digital twins of magnetic stations, and their integration into the collection and preprocessing of geomagnetic data, in order to simulate the functioning of their physical prototypes with a certain accuracy. Results: Using Kilpisjärvi magnetic station (Finland) as an example, it is shown that the use of digital twins, whose information environment is made up of geomagnetic data from adjacent stations, can provide the opportunity for reconstruction (retrospective forecast) of geomagnetic variation parameters with a mean square error in the auroral zone of up to 11.5 nT. The integration of problem-oriented digital twins of magnetic stations into the processes of collecting and registering geomagnetic data can provide automatic identification and replacement of missing and abnormal values, increasing, due to the redundancy effect, the fault tolerance of the magnetic station as a data source object. For example, the digital twin of Kilpisjärvi station recovers 99.55% of annual information, and 86.73% of it has an error not exceeding 12 nT.
Discussion: Due to the spatial anisotropy of geomagnetic field parameters, the error at the digital twin output will be different in each specific case, depending on the geographic location of the magnetic station, as well as on the number of the surrounding magnetic stations and the distance to them. However, this problem can be minimized by integrating geomagnetic data from satellites into the information environment of the digital twin. Practical relevance: The proposed methodology provides the opportunity for automated diagnostics of time series of geomagnetic data for outliers and anomalies, as well as restoration of missing values and identification of small-scale disturbances.
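The kind of evaluation reported here (a reconstruction error, plus the share of recovered values within a tolerance) can be illustrated with a small sketch. The abstract does not specify the model inside the digital twin, so a one-neighbour least-squares line is used purely as a stand-in; all function names and the toy numbers are assumptions.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x (assumes x is not constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def reconstruct(target, neighbour):
    """Replace None entries of `target` with values predicted from `neighbour`."""
    pairs = [(n, t) for n, t in zip(neighbour, target) if t is not None]
    a, b = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    return [a + b * n if t is None else t for n, t in zip(neighbour, target)]

def rmse(pred, truth):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

def share_within(pred, truth, tol):
    """Fraction of predictions whose absolute error does not exceed `tol`."""
    hits = sum(1 for p, t in zip(pred, truth) if abs(p - t) <= tol)
    return hits / len(truth)

# Toy values in nT (assumed, not real station data).
neighbour = [100.0, 120.0, 90.0, 150.0, 130.0, 110.0]
truth = [205.0, 238.0, 181.0, 304.0, 262.0, 219.0]
target = [205.0, 238.0, None, 304.0, None, 219.0]  # two missing records

filled = reconstruct(target, neighbour)
gap = [i for i, v in enumerate(target) if v is None]
pred_gap = [filled[i] for i in gap]
true_gap = [truth[i] for i in gap]
print(rmse(pred_gap, true_gap), share_within(pred_gap, true_gap, tol=12.0))
```

Scoring only the positions that were missing, as done with `gap` above, keeps the observed values from flattering the error figures.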


2018 ◽  
Vol 17 (02) ◽  
pp. 1850017 ◽  
Author(s):  
Mahdi Kalantari ◽  
Masoud Yarmohammadi ◽  
Hossein Hassani ◽  
Emmanuel Sirimal Silva

Missing values in time series data are a well-known and important problem that many researchers have studied extensively in various fields. In this paper, a new nonparametric approach for missing value imputation in time series is proposed. The main novelty of this research is applying the L1 norm-based version of Singular Spectrum Analysis (SSA), namely L1-SSA, which is robust against outliers. The performance of the new imputation method has been compared with many other established methods. The comparison is done by applying them to various real and simulated time series. The obtained results confirm that the SSA-based methods, especially L1-SSA, can provide better imputation than the other methods.
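Outlier robustness of the kind claimed above is typical of absolute-error (L1) criteria. The toy example below is not SSA and not the authors' method; it only shows the underlying effect on the simplest possible model, a single constant fitted to data: the squared-error (L2) fit is the mean, the absolute-error (L1) fit is the median, and one outlier moves the former far more than the latter. The data values are made up for illustration.

```python
import statistics

def l1_cost(c, data):
    """Sum of absolute deviations from the constant c."""
    return sum(abs(v - c) for v in data)

def l2_cost(c, data):
    """Sum of squared deviations from the constant c."""
    return sum((v - c) ** 2 for v in data)

clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = [1.0, 2.0, 3.0, 4.0, 100.0]  # last value replaced by an outlier

# The mean minimises l2_cost; the median minimises l1_cost.
l2_fit = statistics.fmean(contaminated)
l1_fit = statistics.median(contaminated)

print("clean mean / median:", statistics.fmean(clean), statistics.median(clean))
print("contaminated L2 fit (mean):", l2_fit)
print("contaminated L1 fit (median):", l1_fit)
```

On the clean data both fits agree; after contamination only the L2 fit is dragged toward the outlier, which is the behaviour an imputation method would want to avoid propagating into filled values.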


2021 ◽  
Author(s):  
Christopher Kadow ◽  
David Hall ◽  
Uwe Ulbrich

Historical temperature measurements are the basis of global climate datasets like HadCRUT4. This dataset contains many missing values, particularly for periods before the mid-twentieth century, although recent years are also incomplete. Here we demonstrate that artificial intelligence can skilfully fill these observational gaps when combined with numerical climate model data. We show that recently developed image inpainting techniques perform accurate monthly reconstructions via transfer learning using either 20CR (Twentieth-Century Reanalysis) or the CMIP5 (Coupled Model Intercomparison Project Phase 5) experiments. The resulting global annual mean temperature time series exhibit high Pearson correlation coefficients (≥0.9941) and low root mean squared errors (≤0.0547 °C) as compared with the original data. These techniques also provide advantages relative to state-of-the-art kriging interpolation and principal component analysis-based infilling. When applied to HadCRUT4, our method restores a missing spatial pattern of the documented El Niño from July 1877. With respect to the global mean temperature time series, a HadCRUT4 reconstruction by our method points to a cooler nineteenth century, a less apparent hiatus in the twenty-first century, an even warmer 2016 being the warmest year on record and a stronger global trend between 1850 and 2018 relative to previous estimates. We propose image inpainting as an approach to reconstruct missing climate information and thereby reduce uncertainties and biases in climate records.

From: Kadow, C., Hall, D.M. & Ulbrich, U. Artificial intelligence reconstructs missing climate information. Nature Geoscience 13, 408–413 (2020). https://doi.org/10.1038/s41561-020-0582-5

The presentation will recount the journey of turning an image AI into a climate research application.
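Comparing a reconstruction "with the original data" implies hiding values that are actually known and scoring the infill on exactly those cells. A minimal sketch of that validation loop follows. The neighbour-mean filler is a deliberately trivial stand-in for the neural inpainting model, which is not reproduced here, and the grid values and function names are assumptions.

```python
import math

def neighbour_mean_fill(grid):
    """Fill None cells with the mean of their observed 4-neighbours (single pass).

    Cells with no observed neighbour stay None.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is None:
                vals = [
                    grid[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc] is not None
                ]
                if vals:
                    out[r][c] = sum(vals) / len(vals)
    return out

def masked_rmse(filled, truth, mask):
    """RMSE over the masked (artificially hidden) cells only."""
    errs = [(filled[r][c] - truth[r][c]) ** 2 for r, c in mask]
    return math.sqrt(sum(errs) / len(errs))

# A complete toy temperature-anomaly field (assumed values).
truth = [
    [0.1, 0.2, 0.3, 0.4],
    [0.2, 0.3, 0.4, 0.5],
    [0.3, 0.4, 0.5, 0.6],
]
mask = [(1, 1), (0, 3)]  # cells hidden to imitate observational gaps

observed = [row[:] for row in truth]
for r, c in mask:
    observed[r][c] = None

filled = neighbour_mean_fill(observed)
print(masked_rmse(filled, truth, mask))
```

Swapping `neighbour_mean_fill` for any other infilling routine leaves the scoring unchanged, which is what makes side-by-side comparisons of methods possible.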

