A Synthesis of the Information Given by Temporal Data Series: The Representative Day

2021 ◽  
Author(s):  
Tiziano Tirabassi ◽  
Daniela Buske

The recording of air pollution concentrations involves the collection of a large volume of data, and statistics generally provides automatic tools for selecting and summarizing them. The use of the Representative Day allows the compilation of large amounts of data into a compact format that still supplies meaningful information on the whole data set. The Representative Day (RD) is the real day that best represents, in the least-squares sense, the set of daily trends of the considered time series. The Least Representative Day (LRD), on the contrary, is the real day that least well represents, in the same least-squares sense, the set of daily trends of that time series. The identification of the RD and LRD can prove to be a very important tool for identifying both anomalous and standard behaviors of pollutants within the selected period and for establishing measures of prevention, limitation and control. Two application examples, in two different areas, are presented, relating to meteorological data and to SO2 and O3 concentration data sets.
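As an illustration of the least-squares selection, the following minimal sketch identifies the RD and LRD from a matrix of daily profiles; the data and function name are hypothetical, not taken from the paper.

```python
import numpy as np

def representative_days(profiles):
    """Identify the Representative Day (RD) and Least Representative Day (LRD).

    profiles : array of shape (n_days, n_hours), one daily trend per row.
    Returns (rd_index, lrd_index): the day minimizing / maximizing the total
    squared distance to all other daily trends (least-squares sense).
    """
    # Pairwise squared differences between every pair of daily profiles
    diffs = profiles[:, None, :] - profiles[None, :, :]
    cost = (diffs ** 2).sum(axis=(1, 2))
    return int(np.argmin(cost)), int(np.argmax(cost))

# Example: 30 days of hourly SO2 concentrations (synthetic values)
rng = np.random.default_rng(0)
so2 = rng.lognormal(mean=2.0, sigma=0.3, size=(30, 24))
rd, lrd = representative_days(so2)
print(f"Representative Day: index {rd}, Least Representative Day: index {lrd}")
```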

2017 ◽  
Author(s):  
Gerrit de Leeuw ◽  
Larisa Sogacheva ◽  
Edith Rodriguez ◽  
Konstantinos Kourtidis ◽  
Aristeidis K. Georgoulias ◽  
...  

Abstract. The retrieval of aerosol properties from satellite observations provides their spatial distribution over a wide area in cloud-free conditions. As such, they complement ground-based measurements by providing information over sparsely instrumented areas, albeit that significant differences may exist in both the type of information obtained and the temporal information from satellite and ground-based observations. In this paper, information from different types of satellite-based instruments is used to provide a 3-D climatology of aerosol properties over mainland China, i.e. vertical profiles of extinction coefficients from CALIOP, a lidar flying on board the CALIPSO satellite, and the column-integrated extinction (AOD), available from three radiometers: ESA’s ATSR-2, AATSR (together referred to as ATSR) and NASA's MODIS/Terra, together spanning the period 1995–2015. AOD data are retrieved from ATSR using the ADV v2.31 algorithm while for MODIS the Collection 6 (C6) DTDB merged AOD data set is used. These data sets are validated and differences are compared using AERONET version 2 L2.0 AOD data as reference. The results show that, over China, MODIS slightly overestimates the AOD and ATSR slightly underestimates the AOD. Consequently, MODIS AOD is overall higher than that from ATSR, and the difference increases with increasing AOD. The comparison also shows that neither the ATSR nor the MODIS AOD data set is better than the other everywhere. However, ATSR ADV has limitations over the bright surfaces for which the MODIS DB was designed. To allow for comparison of MODIS C6 results with previous analyses where MODIS Collection 5.1 (C5.1) data were used, the difference between the C6 and C5.1 DTDB merged data sets from MODIS/Terra over China is also briefly discussed. The AOD data sets show strong seasonal differences and the seasonal features vary with latitude and longitude across China. Two-decadal AOD time series, averaged over the whole mainland China, are presented and briefly discussed. Using the 17 years of ATSR data as the basis and MODIS/Terra to follow the temporal evolution in recent years when ENVISAT was lost requires a comparison of the data sets for the overlapping period to show their complementarity. ATSR precedes the MODIS time series between 1995 and 2000 and shows a distinct increase in the AOD over this period. The two data series show similar variations during the overlapping period between 2000 and 2011, with minima and maxima in the same years. MODIS extends this time series beyond the end of the ENVISAT period in 2012, showing decreasing AOD.
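The validation step described above amounts to computing difference statistics over collocated satellite and AERONET AOD values. A minimal sketch, with hypothetical column names and sample values:

```python
import numpy as np
import pandas as pd

def aod_validation_stats(df):
    """Compare collocated satellite and AERONET AOD values.

    df : DataFrame with hypothetical columns 'aod_sat' and 'aod_aeronet',
         one row per spatio-temporal collocation.
    """
    diff = df["aod_sat"] - df["aod_aeronet"]
    return {
        "bias": diff.mean(),                 # >0: satellite overestimates
        "rmse": np.sqrt((diff ** 2).mean()),
        "r": df["aod_sat"].corr(df["aod_aeronet"]),
        "n": len(df),
    }

# Hypothetical collocated samples
df = pd.DataFrame({"aod_sat": [0.31, 0.52, 0.18],
                   "aod_aeronet": [0.28, 0.47, 0.20]})
print(aod_validation_stats(df))
```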


2021 ◽  
Author(s):  
Annette Dietmaier ◽  
Thomas Baumann

The European Water Framework Directive (WFD) commits EU member states to achieve a good qualitative and quantitative status of all their water bodies. The WFD provides a list of actions to be taken to achieve the goal of good status. However, this list disregards the specific conditions under which deep (> 400 m b.g.l.) groundwater aquifers form and exist. In particular, deep groundwater fluid composition is influenced by interaction with the rock matrix and other geofluids, and may assume a bad status without any anthropogenic influence. Thus, a new concept for monitoring and modelling this specific kind of aquifer is needed, and its status evaluation must be based on the effects induced by its exploitation. Here, we analyze long-term real-life production data series to detect changes in the hydrochemical deep groundwater characteristics which might be triggered by balneological and geothermal exploitation. We aim to use these insights to design a set of criteria with which the status of deep groundwater aquifers can be quantitatively and qualitatively determined. Our analysis is based on a unique long-term hydrochemical data set, taken from 8 balneological and geothermal sites in the molasse basin of Lower Bavaria, Germany, and Upper Austria, and is focused on a predefined set of annual hydrochemical concentration values. The data record dates back to 1937. Our methods include developing threshold corridors, within which a good status can be assumed, as well as cluster, correlation, and Piper diagram analyses. We observed strong fluctuations in the hydrochemical characteristics of the molasse basin deep groundwater during the last decades. Special interest is paid to fluctuations that seem to have a clear start and end date and to be correlated with other exploitation activities in the region. For example, during the period between 1990 and 2020, bicarbonate and sodium values displayed a clear increase, followed by a distinct dip to below-average values and a subsequent return to average values at site F. During the same time, these values showed striking irregularities at site B. Furthermore, we observed fluctuations in several locations which come close to disqualifying quality thresholds commonly used in German balneology. Our preliminary results demonstrate the importance of long-term (multiple decades) time series analysis for quality and quantity assessments of deep groundwater bodies: most fluctuations would stay undetected within a < 5 year time series window, but become a distinct irregularity when viewed in the context of multiple decades. In the next steps, a quality assessment matrix and threshold corridors will be developed which take into account methods to identify these fluctuations. This will ultimately aid in assessing the sustainability of deep groundwater exploitation and reservoir management for balneological and geothermal uses.
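As a simple illustration of the threshold corridor idea, the sketch below flags annual concentration values that leave an assumed corridor; the series, corridor centre and width are hypothetical.

```python
import numpy as np

def corridor_flags(values, center, half_width):
    """Flag values outside a threshold corridor [center - hw, center + hw].

    values : 1-D array of annual concentration values (e.g., sodium, mg/L).
    center, half_width : corridor definition, e.g., a long-term mean and a
    tolerance derived from pre-exploitation variability (assumed here).
    """
    values = np.asarray(values, dtype=float)
    # True where the value leaves the 'good status' corridor
    return np.abs(values - center) > half_width

# Hypothetical sodium series (mg/L) with a temporary excursion
na = [412, 415, 409, 468, 471, 418, 414]
print(corridor_flags(na, center=np.mean(na[:3]), half_width=25.0))
```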


Water ◽  
2018 ◽  
Vol 10 (10) ◽  
pp. 1477 ◽  
Author(s):  
Davide De Luca ◽  
Luciano Galasso

This study tests stationary and non-stationary approaches for modelling data series of hydro-meteorological variables. Specifically, the authors considered annual maximum rainfall accumulations observed in the Calabria region (southern Italy), and attention was focused on time series characterized by heavy rainfall events that occurred from 1 January 2000 onwards in the study area. This choice is justified by the need to check whether or not the recent rainfall events of the new century can be considered very different from the events that occurred in the past. In detail, the whole data set of each considered time series (with a sample size N > 40) was analyzed, in order to compare recent and past rainfall accumulations which occurred at a specific site. All the proposed models were based on the Two-Component Extreme Value (TCEV) probability distribution, which is frequently applied to annual maximum time series in Calabria. The authors discussed the possible sources of uncertainty related to each framework and remarked on the crucial role played by ergodicity. In fact, if the process is assumed to be non-stationary, then ergodicity cannot hold, and thus possible trends should be derived from external sources, different from the time series of interest: in this work, Regional Climate Models’ (RCMs) outputs were considered in order to assess possible trends of the TCEV parameters. From the obtained results, it does not seem essential to adopt non-stationary models, as significant trends do not appear in the observed data, owing to the considerable number of heavy events which also occurred in the central part of the last century.
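For reference, the standard TCEV distribution function can be written and inverted numerically as below; the parameter values are hypothetical and purely illustrative, not those estimated in the study.

```python
import numpy as np

def tcev_cdf(x, lam1, theta1, lam2, theta2):
    """Two-Component Extreme Value (TCEV) cumulative distribution function.

    F(x) = exp(-lam1 * exp(-x/theta1) - lam2 * exp(-x/theta2)),
    where the first component describes ordinary events and the second,
    rarer but more intense, outlying events.
    """
    x = np.asarray(x, dtype=float)
    return np.exp(-lam1 * np.exp(-x / theta1) - lam2 * np.exp(-x / theta2))

# Hypothetical parameters; T-year quantiles of annual maximum daily rainfall
lam1, theta1, lam2, theta2 = 15.0, 18.0, 0.8, 45.0
grid = np.linspace(1.0, 600.0, 60000)
cdf = tcev_cdf(grid, lam1, theta1, lam2, theta2)   # monotone in x
for T in (10, 50, 100):
    # Numerical inversion of F(x) = 1 - 1/T on the grid (illustrative only)
    x_T = grid[np.searchsorted(cdf, 1 - 1 / T)]
    print(f"T = {T:3d} yr: x ≈ {x_T:.0f} mm")
```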


Geomatics ◽  
2021 ◽  
Vol 1 (4) ◽  
pp. 464-495
Author(s):  
Desi Suyamto ◽  
Lilik Prasetyo ◽  
Yudi Setiawan ◽  
Arief Wijaya ◽  
Kustiyo Kustiyo ◽  
...  

This article demonstrated an easily applicable method for measuring the similarity between a pair of point patterns, applicable to spatial or temporal data sets. The measurement was performed using similarity-based pattern analysis as an alternative to conventional approaches, which typically utilize straightforward point-to-point matching. Using our approach, two geometric features (i.e., the distance and angle from the centroid) were calculated for each point data set and represented as probability density functions (PDFs). The PDF similarity of each geometric feature was measured using nine metrics, with values ranging from zero (very contrasting) to one (exactly the same). The overall similarity was defined as the average of the distance and angle similarities. In terms of sensibility, the method was shown to be capable of measuring the similarity of two pairs of hypothetical patterns at a level consistent with human visual sensing, producing reasonable results. Meanwhile, in terms of sensitivity to both spatial and temporal displacements from the hypothetical origin, the method was also shown to measure the similarity of spatial and temporal patterns consistently. The application of the method to assess both spatial and temporal pattern similarities between two deforestation data sets with different resolutions was also discussed.
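A minimal sketch of the described pipeline, assuming histogram intersection as a stand-in for one of the nine PDF similarity metrics; the binning choices are assumptions.

```python
import numpy as np

def pattern_similarity(points_a, points_b, bins=36):
    """Similarity of two point patterns via centroid-based geometric features.

    Distances and angles from each pattern's centroid are summarized as
    normalized histograms (empirical PDFs); PDFs are compared by histogram
    intersection (0 = very contrasting, 1 = exactly the same), and the
    overall score is the mean of the distance and angle similarities.
    """
    def features(pts):
        pts = np.asarray(pts, dtype=float)
        d = pts - pts.mean(axis=0)                 # offsets from centroid
        return np.hypot(d[:, 0], d[:, 1]), np.arctan2(d[:, 1], d[:, 0])

    dist_a, ang_a = features(points_a)
    dist_b, ang_b = features(points_b)
    d_max = max(dist_a.max(), dist_b.max()) + 1e-9  # shared binning range

    def pdf(x, rng_):
        h, _ = np.histogram(x, bins=bins, range=rng_)
        return h / h.sum()

    sim_dist = np.minimum(pdf(dist_a, (0, d_max)), pdf(dist_b, (0, d_max))).sum()
    sim_ang = np.minimum(pdf(ang_a, (-np.pi, np.pi)), pdf(ang_b, (-np.pi, np.pi))).sum()
    return 0.5 * (sim_dist + sim_ang)

rng = np.random.default_rng(1)
a = rng.normal(size=(500, 2))
b = a + np.array([10.0, -5.0])   # same shape, displaced
print(f"similarity = {pattern_similarity(a, b):.2f}")  # 1.00: centroid features are translation-invariant
```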


2020 ◽  
Author(s):  
Mieke Kuschnerus ◽  
Roderik Lindenbergh ◽  
Sander Vos

Abstract. Sandy coasts are constantly changing environments governed by complex interacting processes. Permanent laser scanning is a promising technique to monitor such coastal areas and support analysis of geomorphological deformation processes. This novel technique delivers 3D representations of a part of the coast at hourly temporal and centimetre spatial resolution and makes it possible to observe small-scale changes in elevation over extended periods of time. These observations have the potential to improve understanding and modelling of coastal deformation processes. However, to be of use to coastal researchers and coastal management, an efficient way to find and extract deformation processes from the large spatio-temporal data set is needed. In order to allow data mining in an automated way, we extract time series in elevation or range and use unsupervised learning algorithms to derive a partitioning of the observed area according to change patterns. We compare three well-known clustering algorithms, k-means, agglomerative clustering and DBSCAN, identify areas that undergo similar evolution during one month, and test whether the algorithms fulfil our criteria for a suitable clustering method on our exemplary data set. The three clustering methods are applied to time series of 30 epochs (during one month) extracted from a data set of daily scans covering a part of the coast at Kijkduin, the Netherlands. A small section of the beach, where a pile of sand was accumulated by a bulldozer, is used to evaluate the performance of the algorithms against a ground truth. The k-means algorithm and agglomerative clustering deliver similar clusters, and both allow identification of a fixed number of dominant deformation processes in sandy coastal areas, such as sand accumulation by a bulldozer or erosion in the intertidal area. The DBSCAN algorithm finds clusters for only about 44 % of the area and turns out to be more suitable for the detection of outliers, caused, for example, by temporary objects on the beach. Our study provides a methodology to efficiently mine a spatio-temporal data set for predominant deformation patterns and the associated regions where they occur.
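An illustrative comparison of the three clustering algorithms on synthetic elevation time series, using the scikit-learn API; the data and parameter values are invented, not those of the Kijkduin data set.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN

# Each row: one spatial cell's elevation change time series over 30 epochs
rng = np.random.default_rng(2)
stable = rng.normal(0.0, 0.02, size=(200, 30))               # noise-level change
accretion = np.cumsum(rng.normal(0.03, 0.02, (50, 30)), 1)   # steady sand build-up
erosion = -np.cumsum(rng.normal(0.05, 0.03, (50, 30)), 1)    # intertidal erosion
X = np.vstack([stable, accretion, erosion])

labels_km = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
labels_ac = AgglomerativeClustering(n_clusters=3).fit_predict(X)
labels_db = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)    # -1 marks outliers

for name, lab in [("k-means", labels_km), ("agglomerative", labels_ac),
                  ("DBSCAN", labels_db)]:
    n_clusters = len(set(lab) - {-1})
    print(f"{name}: {n_clusters} clusters, {np.sum(lab == -1)} outlier points")
```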


2021 ◽  
Author(s):  
Beatrix Izsák ◽  
Mónika Lakatos ◽  
Rita Pongrácz ◽  
Tamás Szentimrey ◽  
Olivér Szentes

Climate studies, in particular those related to climate change, require long, high-quality, controlled data sets that are representative both spatially and temporally. Changes in the conditions under which the measurements were taken, for example relocating the station, or a change in the frequency and time of measurements or in the instruments used, may result in a fractured time series. To avoid these problems, data errors and inhomogeneities are eliminated for Hungary and data gaps are filled in by using the MASH (Multiple Analysis of Series for Homogenization; Szentimrey) homogenization procedure. Homogenization of the data series raises the problem of how to homogenize long and short data series together within the same process, since the meteorological observation network was upgraded significantly in the last decades. These problems can be solved with MASH thanks to its mathematical principles, which are adequate for such purposes. The solution includes the synchronization of the common parts’ inhomogeneities within three (or more) different MASH processings of the three (or more) data sets of different lengths. Then, the homogenized station data series are interpolated to the whole area of Hungary, on a 0.1 degree regular grid. For this purpose, the MISH (Meteorological Interpolation based on Surface Homogenized Data Basis; Szentimrey and Bihari) program system is used. The MISH procedure was developed specifically for the interpolation of various meteorological elements. Hungarian time series of daily average temperature and precipitation sums for the period 1870-2020 were used in this study, thus providing the longest homogenized, gridded daily data sets in the region with up-to-date information already included.

Supported by the ÚNKP-20-3 New National Excellence Program of the Ministry for Innovation and Technology from the source of the National Research, Development and Innovation Fund.
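MISH itself is a dedicated program system; purely to illustrate the final gridding step, the sketch below uses generic inverse-distance weighting as a stand-in (explicitly not MISH, which also exploits modelled spatial statistics of the homogenized data basis). Station coordinates and values are hypothetical.

```python
import numpy as np

def idw_to_grid(st_lon, st_lat, st_val, grid_lon, grid_lat, power=2.0):
    """Inverse-distance-weighted interpolation of station values to a grid.

    NOTE: a generic stand-in for illustration only; MISH additionally uses
    modelled climate statistics of the homogenized data basis, which a
    simple IDW scheme ignores.
    """
    glon, glat = np.meshgrid(grid_lon, grid_lat)
    out = np.empty(glon.shape)
    for idx in np.ndindex(glon.shape):
        d2 = (st_lon - glon[idx]) ** 2 + (st_lat - glat[idx]) ** 2
        w = 1.0 / np.maximum(d2, 1e-12) ** (power / 2)
        out[idx] = np.sum(w * st_val) / np.sum(w)
    return out

# Hypothetical daily mean temperatures at three stations, to a 0.1-degree grid
st_lon, st_lat = np.array([17.1, 19.0, 21.6]), np.array([46.4, 47.5, 48.1])
st_val = np.array([11.2, 10.4, 9.8])
grid = idw_to_grid(st_lon, st_lat, st_val,
                   np.arange(16.0, 23.0, 0.1), np.arange(45.7, 48.6, 0.1))
print(grid.shape)  # (29, 70)
```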


1998 ◽  
Vol 185 ◽  
pp. 167-168
Author(s):  
T. Appourchaux ◽  
M.C. Rabello-Soares ◽  
L. Gizon

Two different data sets have been used to derive low-degree rotational splittings. One data set comes from the Luminosity Oscillations Imager of VIRGO on board SOHO; the observations start on 27 March 1996 and end on 26 March 1997, and consist of intensity time series of 12 pixels (Appourchaux et al., 1997, Sol. Phys., 170, 27). The other data set was kindly made available by the GONG project; the observations start on 26 August 1995 and end on 21 August 1996, and consist of complex Fourier spectra of velocity time series for l = 0–9. For the GONG data, the contamination of l = 1 by the spatial aliases of l = 6 and l = 9 required some cleaning. To achieve this, we applied the inverse of the leakage matrix of l = 1, 6 and 9 to the original Fourier spectra of the same degrees; the cleaning of all 3 degrees was achieved simultaneously (Appourchaux and Gizon, 1997, these proceedings).
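A toy numpy sketch of the cleaning step: if the observed Fourier spectra are modelled as the leakage matrix applied to the true single-degree spectra, applying the inverse of that matrix recovers all three degrees simultaneously. The 3x3 matrix values are hypothetical.

```python
import numpy as np

# Model: y_obs = L @ y_true, so y_true = L^{-1} @ y_obs.
# Toy 3x3 block standing in for degrees l = 1, 6, 9 (values invented).
L = np.array([[1.00, 0.15, 0.08],   # l=1 row: aliases of l=6 and l=9 leak in
              [0.10, 1.00, 0.05],
              [0.06, 0.04, 1.00]])

rng = np.random.default_rng(3)
y_true = rng.normal(size=3) + 1j * rng.normal(size=3)  # complex Fourier spectra
y_obs = L @ y_true                                     # contaminated observation

y_clean = np.linalg.solve(L, y_obs)                    # all degrees cleaned at once
print(np.allclose(y_clean, y_true))                    # True
```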


2008 ◽  
Vol 15 (6) ◽  
pp. 1013-1022 ◽  
Author(s):  
J. Son ◽  
D. Hou ◽  
Z. Toth

Abstract. Various statistical methods are used to process operational Numerical Weather Prediction (NWP) products with the aim of reducing forecast errors, and they often require sufficiently large training data sets. Generating such a hindcast data set for this purpose can be costly, and a well-designed algorithm should be able to reduce the required size of these data sets. This issue is investigated with the relatively simple case of bias correction, by comparing a Bayesian algorithm of bias estimation with the conventionally used empirical method. As available forecast data sets are not large enough for a comprehensive test, synthetically generated time series representing the analysis (truth) and forecast are used to increase the sample size. Since these synthetic time series retain the statistical characteristics of the observations and operational NWP model output, the results of this study can be extended to real observations and forecasts; this is confirmed by a preliminary test with real data. By using the climatological mean and standard deviation of the meteorological variable in consideration and the statistical relationship between the forecast and the analysis, the Bayesian bias estimator outperforms the empirical approach in terms of the accuracy of the estimated bias, and it can reduce the required size of the training sample by a factor of 3. This advantage of the Bayesian approach is due to the fact that it is less susceptible to sampling error in consecutive sampling. These results suggest that a carefully designed statistical procedure may reduce the need for the costly generation of large hindcast data sets.
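One plausible form of such a Bayesian bias estimator is a conjugate Gaussian shrinkage of the sample mean toward a climatological prior; the sketch below is an assumption-laden illustration, not necessarily the authors' exact formulation.

```python
import numpy as np

def empirical_bias(errors):
    """Conventional estimator: mean of forecast-minus-analysis differences."""
    return np.mean(errors)

def bayesian_bias(errors, prior_var, error_var):
    """Gaussian shrinkage estimate of the bias.

    Prior: bias ~ N(0, prior_var), anchored to climatology (the zero-mean
    prior is an assumption here); likelihood: each error ~ N(bias, error_var).
    The posterior mean shrinks the sample mean toward zero, which damps
    sampling noise when the training sample is short.
    """
    n = len(errors)
    w = (n / error_var) / (n / error_var + 1.0 / prior_var)
    return w * np.mean(errors)

rng = np.random.default_rng(4)
true_bias = 0.5
errors = true_bias + rng.normal(0.0, 2.0, size=10)   # short training sample
print(f"empirical: {empirical_bias(errors):+.2f}, "
      f"Bayesian: {bayesian_bias(errors, prior_var=1.0, error_var=4.0):+.2f}")
```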


2018 ◽  
Vol 18 (3) ◽  
pp. 1573-1592 ◽  
Author(s):  
Gerrit de Leeuw ◽  
Larisa Sogacheva ◽  
Edith Rodriguez ◽  
Konstantinos Kourtidis ◽  
Aristeidis K. Georgoulias ◽  
...  

Abstract. The retrieval of aerosol properties from satellite observations provides their spatial distribution over a wide area in cloud-free conditions. As such, they complement ground-based measurements by providing information over sparsely instrumented areas, albeit that significant differences may exist in both the type of information obtained and the temporal information from satellite and ground-based observations. In this paper, information from different types of satellite-based instruments is used to provide a 3-D climatology of aerosol properties over mainland China, i.e., vertical profiles of extinction coefficients from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP), a lidar flying aboard the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) satellite, and the column-integrated extinction (aerosol optical depth – AOD) available from three radiometers: the European Space Agency (ESA)'s Along-Track Scanning Radiometer version 2 (ATSR-2), Advanced Along-Track Scanning Radiometer (AATSR) (together referred to as ATSR) and NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) aboard the Terra satellite, together spanning the period 1995–2015. AOD data are retrieved from ATSR using the ATSR dual view (ADV) v2.31 algorithm, while for MODIS the Collection 6 (C6) AOD data set is used, obtained by merging the AODs from the dark target (DT) and deep blue (DB) algorithms and further referred to as the DTDB merged AOD product. These data sets are validated and differences are compared using Aerosol Robotic Network (AERONET) version 2 L2.0 AOD data as reference. The results show that, over China, ATSR slightly underestimates the AOD and MODIS slightly overestimates the AOD. Consequently, ATSR AOD is overall lower than that from MODIS, and the difference increases with increasing AOD. The comparison also shows that neither the ATSR nor the MODIS AOD data set is better than the other everywhere. However, ATSR ADV has limitations over the bright surfaces for which the MODIS DB was designed. To allow for comparison of MODIS C6 results with previous analyses where MODIS Collection 5.1 (C5.1) data were used, the difference between the C6 and C5.1 merged DTDB data sets from MODIS/Terra over China is also briefly discussed. The AOD data sets show strong seasonal differences and the seasonal features vary with latitude and longitude across China. Two-decadal AOD time series, averaged over all of mainland China, are presented and briefly discussed. Using the 17 years of ATSR data as the basis and MODIS/Terra to follow the temporal evolution in recent years when the environmental satellite Envisat was lost requires a comparison of the data sets for the overlapping period to show their complementarity. ATSR precedes the MODIS time series between 1995 and 2000 and shows a distinct increase in the AOD over this period. The two data series show similar variations during the overlapping period between 2000 and 2011, with minima and maxima in the same years. MODIS extends this time series beyond the end of the Envisat period in 2012, showing decreasing AOD.


2020 ◽  
Author(s):  
Oleg Skrynyk ◽  
Enric Aguilar ◽  
José A. Guijarro ◽  
Sergiy Bubin

Before using climatological time series in research studies, it is necessary to perform quality control and homogenization in order to remove possible artefacts (inhomogeneities) usually present in raw data sets. In the vast majority of cases, the homogenization procedure improves the consistency of the data, which can then be verified by means of a statistical comparison of the raw and homogenized time series. However, a new question then arises: how far are the homogenized data from the true climate signal or, in other words, what errors could still be present in the homogenized data?

The main objective of our work is to estimate the uncertainty produced by the adjustment algorithm of the widely used Climatol homogenization software when homogenizing daily time series of additive climate variables. We focused our efforts on minimum and maximum air temperature. In order to achieve our goal we used a benchmark data set created by the INDECIS* project. The benchmark contains clean data, extracted from an output of the Royal Netherlands Meteorological Institute Regional Atmospheric Climate Model (version 2) driven by the Hadley Global Environment Model 2 - Earth System, and inhomogeneous data, created by introducing realistic breaks and errors.

The statistical evaluation of discrepancies between the homogenized (by means of Climatol with predefined break points) and clean data sets was performed using both a set of standard parameters and metrics introduced in our work. All the metrics used clearly identify the main features of the errors (systematic and random) present in the homogenized time series. We calculated the metrics for every time series (only over adjusted segments) as well as their averaged values as measures of the uncertainty in the whole data set.

In order to determine how two key parameters of the raw data collection, namely the length of the time series and the station density, influence the calculated measures of the adjustment error, we gradually decreased the length of the period and the number of stations in the area under study. The total number of cases considered was 56, including 7 time periods (1950-2005, 1954-2005, …, 1974-2005) and 8 different quantities of stations (100, 90, …, 30). Additionally, in order to find out how stable the calculated metrics are for each of the 56 cases and to determine their confidence intervals, we performed 100 random permutations of the introduced inhomogeneities and repeated our calculations. With that, the total number of homogenization exercises performed was 5600 for each of the two climate variables.

Lastly, the calculated metrics were compared with the corresponding values obtained for the raw time series. The comparison showed a substantial improvement of the metric values after homogenization in each of the 56 cases considered (for both variables).

-------------------

*INDECIS is a part of ERA4CS, an ERA-NET initiated by JPI Climate, and funded by FORMAS (SE), DLR (DE), BMWFW (AT), IFD (DK), MINECO (ES), ANR (FR) with co-funding by the European Union (Grant 690462). The work has been partially supported by the Ministry of Education and Science of Kazakhstan (Grant BR05236454) and Nazarbayev University (Grant 090118FD5345).
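As an illustration of evaluating adjusted segments against the clean benchmark, the sketch below computes generic systematic and random error measures; the metric choices are not necessarily those defined in the study, and all data are synthetic.

```python
import numpy as np

def adjustment_error_metrics(homogenized, clean, adjusted_mask):
    """Systematic and random error of homogenized vs. clean benchmark data.

    Computed only over the adjusted segments of the series, mirroring the
    evaluation described above (metric names are generic choices).
    """
    d = homogenized[adjusted_mask] - clean[adjusted_mask]
    return {
        "bias": d.mean(),                 # systematic error left after adjustment
        "rmse": np.sqrt((d ** 2).mean()), # total error
        "std": d.std(ddof=1),             # random component
    }

rng = np.random.default_rng(5)
clean = rng.normal(10.0, 3.0, size=365)
homog = clean + rng.normal(0.1, 0.4, size=365)   # small residual adjustment error
mask = np.zeros(365, dtype=bool)
mask[120:240] = True                             # hypothetical adjusted segment
print(adjustment_error_metrics(homog, clean, mask))
```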

