Efficient modeling of correlated noise

2020 ◽  
Vol 638 ◽  
pp. A95
Author(s):  
J.-B. Delisle ◽  
N. Hara ◽  
D. Ségransan

Correlated noise affects most astronomical datasets, and neglecting it can lead to spurious signal detections, especially in low signal-to-noise conditions, which is often the context in which new discoveries are pursued. For instance, in the realm of exoplanet detection with radial velocity time series, stellar variability can induce false detections. However, a white noise approximation is often used because accounting for correlated noise implies a more complex analysis. Moreover, the computational cost can be prohibitive, as it typically scales as the cube of the dataset size. For some restricted classes of correlated noise models, dedicated algorithms exist that bring down the computational cost. This improvement in speed is particularly useful in the context of Gaussian process regression; however, it comes at the expense of the generality of the noise model. In this article, we present the S + LEAF noise model, which allows us to account for a large class of correlated noise with a computational cost that scales linearly with the size of the dataset. The S + LEAF model includes, in particular, mixtures of quasiperiodic kernels and calibration noise. This efficient modeling is made possible by a sparse representation of the noise covariance matrix and by dedicated algorithms for matrix inversion, linear-system solving, determinant computation, etc. We applied the S + LEAF model to reanalyze the HARPS radial velocity time series of the recently published planetary system HD 136352. We illustrate the flexibility of the S + LEAF model in handling various sources of noise, and we demonstrate the importance of taking correlated noise into account, and especially calibration noise, to correctly assess the significance of detected signals.
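For context, here is a minimal sketch of what S + LEAF avoids: the direct O(n³) evaluation of a Gaussian-process log-likelihood under a quasiperiodic kernel. The kernel parameters and data below are illustrative assumptions, not values from the paper; S + LEAF reaches the same quantity in O(n) through its sparse representation, which this naive baseline does not implement.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def quasiperiodic_kernel(t1, t2, amp, decay, period, gamma):
    """Quasiperiodic covariance: squared-exp decay times a periodic term."""
    dt = t1[:, None] - t2[None, :]
    return amp**2 * np.exp(-0.5 * (dt / decay)**2
                           - gamma * np.sin(np.pi * dt / period)**2)

def gp_loglike(t, y, yerr, params):
    """Naive O(n^3) GP log-likelihood via dense Cholesky factorization."""
    K = quasiperiodic_kernel(t, t, *params) + np.diag(yerr**2)
    L, lower = cho_factor(K, lower=True)
    alpha = cho_solve((L, lower), y)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (y @ alpha + logdet + len(y) * np.log(2 * np.pi))

# toy usage with invented observing epochs and hyperparameters
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 100, 200))
y = rng.normal(0, 1, t.size)
print(gp_loglike(t, y, np.full(t.size, 0.5), (1.0, 30.0, 25.0, 2.0)))
```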

2019 ◽  
Vol 489 (2) ◽  
pp. 2555-2571 ◽  
Author(s):  
M Damasso ◽  
M Pinamonti ◽  
G Scandariato ◽  
A Sozzetti

Abstract Gaussian process regression is a widespread tool used to mitigate stellar correlated noise in radial velocity (RV) time series. It is particularly useful when searching for and determining the properties of signals induced by small-sized low-mass planets (Rp < 4 R⊕, mp < 10 M⊕). By using extensive simulations based on a quasi-periodic representation of the stellar activity component, we investigate the ability to retrieve the planetary parameters in 16 different realistic scenarios. We analyse systems composed of one planet and a host star with different levels of activity, focusing on the challenging case represented by low-mass planets, with Doppler semi-amplitudes in the range 1–3 $\rm{\,m\,s^{-1}}$. We consider many different configurations for the quasi-periodic stellar activity component, as well as different combinations of the observing epochs. We use commonly employed analysis tools to search for and characterize the planetary signals in the data sets. The goal of our injection-recovery statistical analysis is twofold. First, we focus on the problem of planet mass determination. Then, we analyse in a statistical way periodograms obtained with three different algorithms, in order to explore some of their general properties, such as their completeness and reliability in retrieving the injected planetary and stellar activity signals with low false alarm probabilities. This work is intended to provide some understanding of the biases introduced in the planet parameters inferred from the analysis of RV time series that contain correlated signals due to stellar activity. It also aims to motivate the use and encourage the improvement of extensive simulations for planning spectroscopic follow-up observations.
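A minimal injection-recovery sketch in the spirit of these simulations, with all parameters (period, semi-amplitude, kernel settings) invented for illustration rather than taken from the paper's 16 scenarios: a low-amplitude circular-orbit signal is injected into quasi-periodic activity noise and searched for with astropy's Lomb-Scargle periodogram.

```python
import numpy as np
from astropy.timeseries import LombScargle

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 200, 120))           # irregular epochs (days)
k_inj, p_inj = 2.0, 17.3                        # injected K (m/s) and period (d)
planet = k_inj * np.sin(2 * np.pi * t / p_inj)

# quasi-periodic activity noise drawn from its covariance matrix
dt = t[:, None] - t[None, :]
K = 3.0**2 * np.exp(-0.5 * (dt / 50.0)**2
                    - 2.0 * np.sin(np.pi * dt / 25.0)**2)
activity = rng.multivariate_normal(np.zeros(t.size),
                                   K + 1e-10 * np.eye(t.size))

rv = planet + activity + rng.normal(0, 1.0, t.size)   # plus white noise

ls = LombScargle(t, rv, dy=np.full(t.size, 1.0))
freq, power = ls.autopower(maximum_frequency=1.0)
best = 1 / freq[np.argmax(power)]
print(f"injected {p_inj} d, recovered {best:.1f} d, "
      f"FAP = {ls.false_alarm_probability(power.max()):.2e}")
```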


Author(s):  
Nikita K. Zvonarev ◽  

The problem of weighted finite-rank time-series approximation is considered for signal estimation in the “signal plus noise” model, where the inverse covariance matrix of the noise is (2p+1)-diagonal. The search for weights that improve the estimation accuracy is examined. An effective method for the numerical search of such weights is constructed and justified. Numerical simulations are performed to study the improvement in estimation accuracy for several noise models.
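As a hedged illustration of the setting (not the author's weighted method): the sketch below builds the tridiagonal, i.e., (2p+1)-diagonal with p = 1, inverse covariance of AR(1) noise, and performs a plain unweighted finite-rank (SSA-style) approximation via Hankel embedding and truncated SVD. The paper's actual contribution, the numerical search for accuracy-improving weights, is not reproduced here.

```python
import numpy as np

def ar1_precision(n, phi, s2=1.0):
    """Tridiagonal (p = 1) inverse covariance of an AR(1) noise process."""
    P = np.zeros((n, n))
    i = np.arange(n)
    P[i, i] = 1 + phi**2
    P[0, 0] = P[-1, -1] = 1.0
    P[i[:-1], i[:-1] + 1] = P[i[:-1] + 1, i[:-1]] = -phi
    return P / s2

def finite_rank_approx(x, L, r):
    """Unweighted rank-r approximation: Hankel embedding, truncated SVD,
    then diagonal (Hankel) averaging back to a series."""
    n = len(x)
    X = np.array([x[i:i + L] for i in range(n - L + 1)])   # trajectory matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Xr = (U[:, :r] * s[:r]) @ Vt[:r]
    rec, cnt = np.zeros(n), np.zeros(n)
    for i in range(Xr.shape[0]):
        rec[i:i + L] += Xr[i]
        cnt[i:i + L] += 1
    return rec / cnt

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 200)
noise, phi = np.zeros(200), 0.6
for k in range(1, 200):                # AR(1) noise; precision built above
    noise[k] = phi * noise[k - 1] + rng.normal(0, 0.3)
x = np.sin(2 * np.pi * 5 * t) + noise
print(np.std(finite_rank_approx(x, L=50, r=2) - np.sin(2 * np.pi * 5 * t)))
```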


2021 ◽  
Author(s):  
Santiago Belda ◽  
Matías Salinero ◽  
Eatidal Amin ◽  
Luca Pipia ◽  
Pablo Morcillo-Pallarés ◽  
...  

In general, modeling phenological evolution represents a challenging task, mainly because of time-series gaps and noisy data arising from different viewing and illumination geometries, cloud cover, seasonal snow, and the revisit interval needed to acquire data for the exact same location. For that reason, reliable gap-filling fitting functions and smoothing filters are frequently required for retrievals at the highest feasible accuracy. Of specific interest for filling gaps in time series is the emergence of machine learning regression algorithms (MLRAs), which can serve as fitting functions. Among the multiple MLRA approaches currently available, kernel-based methods developed in a Bayesian framework deserve special attention because they are adaptive and provide associated uncertainty estimates, such as Gaussian Process Regression (GPR).

Recent studies demonstrated the effectiveness of GPR for gap-filling of biophysical parameter time series because the hyperparameters can be optimally set for each time series (one per pixel in the area) with a single optimization procedure. The entire procedure of learning a GPR model relies only on an appropriate selection of the kernel type and the hyperparameters involved in estimating the input data covariance. Despite this clear strategic advantage, the most important shortcomings of the technique are (1) the high computational cost and (2) the memory requirements of training, which grow cubically and quadratically, respectively, with the number of training samples. This can become problematic when processing large amounts of data, such as Sentinel-2 (S2) time-series tiles. Hence, optimization strategies are needed to speed up GPR processing while maintaining its superior accuracy.

To mitigate this computational burden and avoid the repetitive per-pixel procedure, we evaluated whether the GPR hyperparameters can be pre-optimized over a reduced set of representative pixels and kept fixed over a more extended crop area. We used S2 LAI time series over an agricultural region in Castile and León (north-west Spain) and tested different covariance functions: the exponential, squared exponential, and Matérn (ν = 3/2 and 5/2) kernels. The performance of the image reconstructions was compared against the standard per-pixel GPR training process. Results showed that accuracies were of the same order (12% RMSE degradation) whereas processing was accelerated up to 90 times. Crop phenology indicators were also calculated and compared, revealing similar temporal patterns with differences in the start and end of the growing season of no more than five days. To the benefit of crop monitoring applications, all of the gap-filling and phenology indicator retrieval techniques have been implemented in the freely downloadable GUI toolbox DATimeS (Decomposition and Analysis of Time Series Software, https://artmotoolbox.com/).
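A minimal sketch of the pre-optimization strategy described above, using scikit-learn rather than DATimeS and synthetic LAI-like data (all values are illustrative assumptions): hyperparameters are fitted once on a representative pixel, then the fitted kernel is reused with the optimizer switched off for the remaining pixels.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

rng = np.random.default_rng(3)
doy = np.sort(rng.choice(np.arange(1, 365), 40, replace=False)).astype(float)
lai = 3 * np.exp(-0.5 * ((doy - 180) / 60) ** 2) + rng.normal(0, 0.2, doy.size)

# Step 1: optimize hyperparameters once, on a representative pixel
kernel = 1.0 * Matern(length_scale=60.0, nu=1.5) + WhiteKernel(0.05)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(
    doy[:, None], lai)

# Step 2: reuse the fitted kernel, optimization switched off, per pixel
fast = GaussianProcessRegressor(kernel=gpr.kernel_, optimizer=None,
                                normalize_y=True)
other_lai = lai + rng.normal(0, 0.2, doy.size)      # a neighbouring pixel
fast.fit(doy[:, None], other_lai)
grid = np.arange(1, 366, dtype=float)[:, None]
mean, std = fast.predict(grid, return_std=True)     # gap-filled series + sigma
```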


2020 ◽  
Vol 635 ◽  
pp. A83 ◽  
Author(s):  
J.-B. Delisle ◽  
N. Hara ◽  
D. Ségransan

Periodograms are common tools used to search for periodic signals in unevenly spaced time series. The significance of periodogram peaks is often assessed using the false alarm probability (FAP), which in most studies assumes uncorrelated noise and is computed with numerical methods such as bootstrapping or Monte Carlo simulations. These methods have a high computational cost, especially at the low FAP levels that are of most interest. We present an analytical estimate of the FAP of the periodogram in the presence of correlated noise, which is fundamental for correctly analyzing astronomical time series. The analytical estimate that we derive provides a very good approximation of the FAP at a much lower cost than numerical methods. We validate our analytical approach by comparing it with Monte Carlo simulations. Finally, we discuss the sensitivity of the method to different assumptions in the modeling of the noise.
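For the white-noise flavour of the same idea, astropy's Lomb-Scargle implementation already exposes an analytic (Baluev) FAP next to a bootstrap estimate; the sketch below contrasts their values and costs on pure-noise data. Note this is not the correlated-noise FAP derived in the paper, which astropy does not implement.

```python
import numpy as np
from astropy.timeseries import LombScargle

rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0, 100, 80))
y = rng.normal(0, 1, t.size)        # pure noise: any peak is a false alarm

ls = LombScargle(t, y)
freq, power = ls.autopower()
zmax = power.max()

# Analytic (Baluev) estimate vs brute-force bootstrap; the analytic value
# is near-instant, while the bootstrap cost grows as lower FAP levels are probed.
fap_analytic = ls.false_alarm_probability(zmax, method='baluev')
fap_bootstrap = ls.false_alarm_probability(zmax, method='bootstrap',
                                           method_kwds={'n_bootstraps': 1000})
print(fap_analytic, fap_bootstrap)
```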


2021 ◽  
Author(s):  
Claire Birnie ◽  
Matteo Ravasi

As a result of the world-wide interest in carbon storage and geothermal energy production, increased emphasis is nowadays placed on the development of reliable microseismic monitoring techniques for hazard monitoring related to fluid movement and reactivation of faults. In the process of developing and benchmarking these techniques, the incorporation of realistic noise into synthetic datasets is of vital importance for predicting their effectiveness once deployed in the real world. Similarly, the recent widespread use of Machine Learning in seismological applications calls for the creation of synthetic seismic datasets that are indistinguishable from the field data to which they will be applied.

Noise generation procedures can be split into two categories: model-based and data-driven. The distributed-surface-sources approach is the most common method in the first category; however, it is well known that it fails to capture the complexity of recorded noise (Dean et al., 2015). Pearce and Barley (1977)'s convolutional approach offers a data-driven procedure that can accurately capture the frequency content of noise, but it imposes the requirement that the noise be stationary. Birnie et al. (2016)'s covariance-based approach removes the stationarity requirement, accurately capturing the spatio-temporal characteristics of the noise; however, like all other data-driven approaches, it is constrained to the survey geometry in which the noise data were collected.

In this work, we propose an extension of the covariance-based noise modelling workflow that aims to generate a noise model over a user-defined geometry. The extended workflow comprises two steps: the first step is responsible for the characterisation of the recorded noise field and the generation of multiple realisations with the same statistical properties, constrained to the original acquisition geometry. Gaussian Process Regression (GPR) is subsequently applied over each time slice of the noise model, transforming the model into the desired geometry.

The workflow is initially validated on synthetically generated noise with a user-defined input covariance matrix. This allows us to show that the noise statistics (i.e., covariance and variogram) remain almost identical between the noise extracted from the synthetic dataset and the various steps of the noise-modelling procedure. The workflow is further applied to the openly available ToC2ME passive dataset from Alberta, Canada, consisting of 69 geophones arranged in a pseudo-random pattern. The noise is modelled and transformed onto a 56-sensor gridded array, and is shown to bear a very close resemblance to the recorded noise field.

Whilst the importance of using realistic noise in synthetic datasets for benchmarking algorithms or training ML solutions cannot be overstated, the ability to transform such noise models into arbitrary receiver geometries opens up a host of new opportunities in the area of survey design. We argue that by coupling the noise generation and monitoring algorithms, the placement of sensors can be optimized based on the expected microseismic signatures as well as the surrounding noise behaviour. This could be of particular interest for geothermal and CO₂ storage sites, where processing plants are likely to be in close proximity to the permanent monitoring stations.
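A toy sketch of the two-step workflow (all geometries, covariances, and kernel settings are invented for illustration; this is not the ToC2ME processing): new noise realisations are drawn from the spatial covariance estimated on the original sensors, then a time slice is regressed onto a user-defined grid with GPR.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(5)
nrec, nt = 69, 500
# toy "recorded" noise with equicorrelated spatial structure
L0 = np.linalg.cholesky(0.5 * np.eye(nrec) + 0.5)
recorded = rng.normal(0, 1, (nt, nrec)) @ L0.T

# Step 1: characterise the field and draw a new realisation with the same
# spatial covariance, still on the original 69 sensors
C = np.cov(recorded.T)
Lc = np.linalg.cholesky(C + 1e-8 * np.eye(nrec))
realisation = (Lc @ rng.normal(0, 1, (nrec, nt))).T

# Step 2: GPR a time slice onto a user-defined (here gridded) geometry
xy_old = rng.uniform(0, 1, (nrec, 2))               # pseudo-random sensors
gx, gy = np.meshgrid(np.linspace(0, 1, 8), np.linspace(0, 1, 7))
xy_new = np.column_stack([gx.ravel(), gy.ravel()])  # 56-sensor grid
gpr = GaussianProcessRegressor(RBF(0.3) + WhiteKernel(1e-3))
slice0_on_grid = gpr.fit(xy_old, realisation[0]).predict(xy_new)
```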


2015 ◽  
Vol 5 (1) ◽  
Author(s):  
M. A. Goudarzi ◽  
M. Cocard ◽  
R. Santerre

Abstract We analyzed the noise characteristics of 112 continuously operating GPS stations in eastern North America using the Spectral Analysis and the Maximum Likelihood Estimation (MLE) methods. Results of both methods show that the combination of white plus flicker noise is the best model for describing the stochastic part of the position time series. We explored this further using the MLE in the time domain by testing noise models of (a) power-law, (b) white, (c) white plus flicker, (d) white plus random-walk, and (e) white plus flicker plus random-walk noise. The results show that the amplitudes of all noise models are smallest in the north direction and largest in the vertical direction. While the amplitudes of the white noise model in (c–e) are almost equal across the study area, they are dominated by the flicker and random-walk noise in all directions. Assuming a flicker noise model increases the uncertainties of the estimated velocities by a factor of 5–38 compared to the white noise model.
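A compact sketch of time-domain MLE for a white-plus-flicker model, using the fractional-integration construction of power-law covariance (Williams 2003); the data and amplitudes are synthetic assumptions, and the trend and seasonal terms fitted in the actual study are omitted.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.linalg import toeplitz

def powerlaw_cov(n, kappa):
    """Unit-amplitude power-law noise covariance via fractional
    integration; kappa = -1 gives flicker noise."""
    d = -kappa / 2.0
    h = np.ones(n)
    for i in range(1, n):
        h[i] = h[i - 1] * (i - 1 + d) / i
    T = toeplitz(h, np.zeros(n))        # lower-triangular Toeplitz
    return T @ T.T

def negloglike(theta, y, Cf):
    sw, sf = np.exp(theta)              # white / flicker amplitudes
    C = sw**2 * np.eye(len(y)) + sf**2 * Cf
    sign, logdet = np.linalg.slogdet(C)
    return 0.5 * (logdet + y @ np.linalg.solve(C, y))

rng = np.random.default_rng(6)
n = 300
Cf = powerlaw_cov(n, kappa=-1.0)
y = rng.multivariate_normal(np.zeros(n),
                            1.0 * np.eye(n) + 4.0 * Cf)   # white + flicker
res = minimize(negloglike, np.log([0.5, 0.5]), args=(y, Cf),
               method='Nelder-Mead')
print('MLE amplitudes (white, flicker):', np.exp(res.x))
```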


2019 ◽  
Vol 11 (4) ◽  
pp. 386 ◽  
Author(s):  
Wenhao Li ◽  
Fei Li ◽  
Shengkai Zhang ◽  
Jintao Lei ◽  
Qingchuan Zhang ◽  
...  

The common mode error (CME) and the optimal noise model are the two most important factors affecting the accuracy of time series in regional Global Navigation Satellite System (GNSS) networks. Removing the CME and selecting the optimal noise model can effectively improve the accuracy of GNSS coordinate time series. The CME, a major source of error, is related to the spatiotemporal distribution of the network; hence, its detrimental effects on time series can be effectively reduced through spatial filtering. Independent component analysis (ICA) is used to filter the time series recorded by 79 GPS stations in Antarctica from 2010 to 2018. After removing stations exhibiting strong local effects on the basis of their spatial responses, the filtering results of residual time series derived from principal component analysis (PCA) and ICA are compared and analyzed. The Akaike information criterion (AIC) is then used to determine the optimal noise model of the GPS time series before and after ICA/PCA filtering. The results show that ICA is superior to PCA regarding both the filtering results and the consistency of the optimal noise model. In terms of the filtering results, ICA can extract multisource error signals. After ICA filtering, the root mean square (RMS) values of the residual time series are reduced by 14.45%, 8.97%, and 13.27% in the east (E), north (N), and vertical (U) components, respectively, and the associated velocity uncertainties are reduced by 13.50%, 8.06%, and 11.82%, respectively. Furthermore, different GNSS time series in Antarctica have different optimal noise models, with different noise characteristics in different components. The main noise models are white noise plus flicker noise (WN+FN) and white noise plus power-law noise (WN+PN). Additionally, the spectral index of most PN is close to that of FN. Finally, more stations have consistent optimal noise models after ICA filtering than after PCA filtering.
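A toy sketch of ICA-based spatial filtering with scikit-learn's FastICA (synthetic network data and a deliberately simple heuristic for flagging the common-mode component; the study's component selection is more careful):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(7)
nsta, nep = 79, 1000
cme = np.sin(2 * np.pi * np.arange(nep) / 365.25)     # toy common-mode signal
resid = (np.outer(cme, rng.uniform(0.5, 1.5, nsta))   # station responses
         + rng.normal(0, 1.0, (nep, nsta)))           # local noise

ica = FastICA(n_components=5, random_state=0)
S = ica.fit_transform(resid)             # sources, shape (epochs, 5)
spatial = ica.mixing_                    # station responses, shape (79, 5)

# Flag the component with the broadest spatial response as the CME
k = np.argmax(np.abs(spatial).mean(axis=0))
filtered = resid - np.outer(S[:, k], spatial[:, k])

rms = lambda a: np.sqrt((a**2).mean())
print(f"RMS before {rms(resid):.2f}, after {rms(filtered):.2f}")
```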


2021 ◽  
Vol 13 (22) ◽  
pp. 4534
Author(s):  
Xiaoxing He ◽  
Machiel Simon Bos ◽  
Jean-Philippe Montillet ◽  
Rui Fernandes ◽  
Tim Melbourne ◽  
...  

The noise in the position time series of 568 GPS (Global Positioning System) stations across North America with an observation span of ten years has been investigated using solutions from two processing centers, namely, the Pacific Northwest Geodetic Array (PANGA) and New Mexico Tech (NMT). It is well known that in the frequency domain, the noise exhibits a power-law behavior with a spectral index of around −1. By fitting various noise models to the observations and selecting the most likely one, we demonstrate that the spectral index in some regions flattens to zero at long periods while in other regions it is closer to −2. This has a significant impact on the estimated linear rate, since flattening of the power spectral density roughly halves the uncertainty of the estimated tectonic rate while random walk doubles it. Our noise model selection is based on the highest log-likelihood value and on the Akaike and Bayesian Information Criteria, to reduce the probability of over-selecting noise models with many parameters. Finally, the noise in position time series also depends on the stability of the monument on which the GPS antenna is installed. We corroborate previous results that deep-drilled braced monuments produce smaller uncertainties than concrete piers. However, if the optimal noise model is used at each site, the differences become smaller, because many concrete piers are located in tectonically/seismically quiet areas. Thus, for the predicted performance of a new GPS network, not only the type of monument but also the noise properties of the region need to be taken into account.
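For reference, the two selection criteria reduce to one-liners once the maximized log-likelihood of each candidate noise model is in hand; the log-likelihood values below are hypothetical, purely to show the comparison (a Gauss-Markov-type model is the kind that captures the flattening at long periods, a free power law the index closer to −2).

```python
import numpy as np

def aic(loglike, k):
    """Akaike information criterion; k = number of fitted parameters."""
    return 2 * k - 2 * loglike

def bic(loglike, k, n):
    """Bayesian information criterion; n = number of observations."""
    return k * np.log(n) - 2 * loglike

# hypothetical maximized log-likelihoods for one station (n = 3650 daily epochs)
models = {'FN+WN': (-4101.2, 4), 'PL+WN': (-4099.8, 5), 'GGM+WN': (-4095.1, 5)}
for name, (ll, k) in models.items():
    print(f"{name}: AIC = {aic(ll, k):.1f}, BIC = {bic(ll, k, 3650):.1f}")
```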


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5913
Author(s):  
Andrew Martin ◽  
Matthew Parry ◽  
Andy W. R. Soundy ◽  
Bradley J. Panckhurst ◽  
Phillip Brown ◽  
...  

We provide algorithms for inferring GPS (Global Positioning System) location and for quantifying the uncertainty of this estimate in real time. The algorithms are tested on GPS data from locations in the Southern Hemisphere at four significantly different latitudes. In order to rank the algorithms, we use the so-called log-score rule. The best algorithm uses an Ornstein–Uhlenbeck (OU) noise model and is built on an enhanced Kalman Filter (KF). The noise model is capable of capturing the observed autocorrelated process noise in the altitude, latitude, and longitude recordings. This model outperforms a KF that assumes a Gaussian white noise model, which under-reports the position uncertainties. We also found that the dilution-of-precision parameters, automatically reported by the GPS receiver at no additional cost, do not help significantly in the uncertainty quantification of the GPS positioning. A non-learning method that uses the raw position measurements and assumes a constant uncertainty does not even converge to the correct position. Inference with the enhanced noise model is suitable for embedded computing, achieves real-time position inference, quantifies uncertainty, and can be extended to incorporate complementary sensor recordings, e.g., from an accelerometer or a magnetometer, in order to improve accuracy. The algorithm corresponding to the augmented-state unscented KF method has a computational cost of $O(d_x^2 d_t)$, where $d_x$ is the dimension of the augmented state vector and $d_t$ is an adjustable, design-dependent parameter corresponding to the length of "past values" one wishes to keep for re-evaluation of the model from time to time. The provided algorithm assumes $d_t = 1$. Hence, the algorithm is likely to be suitable for sensor-fusion applications.
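A minimal sketch of a Kalman filter with OU-correlated measurement noise handled by state augmentation, for a static position (a simplification: the paper's enhanced and unscented filters and real GPS data are not reproduced, and all parameters below are illustrative assumptions):

```python
import numpy as np

def ou_kf(z, dt, tau, q, r, sigma0=10.0):
    """KF for a static position x observed with OU-correlated noise:
    z_k = x + b_k + v_k, with b_{k+1} = a b_k + w_k, a = exp(-dt/tau).
    The augmented state is s = [x, b]."""
    a = np.exp(-dt / tau)
    F = np.array([[1.0, 0.0], [0.0, a]])
    Q = np.diag([0.0, q * (1 - a**2)])      # keeps OU variance stationary at q
    H = np.array([[1.0, 1.0]])
    s, P, out = np.zeros(2), np.diag([sigma0**2, q]), []
    for zk in z:
        s, P = F @ s, F @ P @ F.T + Q       # predict
        y = zk - H @ s                      # innovation
        S = H @ P @ H.T + r
        K = (P @ H.T) / S                   # gain (S is 1x1 here)
        s = s + (K * y).ravel()
        P = (np.eye(2) - K @ H) @ P         # update
        out.append((s[0], np.sqrt(P[0, 0])))
    return np.array(out)

rng = np.random.default_rng(8)
n, dt, tau = 2000, 1.0, 120.0
a, b = np.exp(-dt / tau), np.zeros(n)
for k in range(1, n):                       # simulate the OU noise track
    b[k] = a * b[k - 1] + rng.normal(0, np.sqrt(4.0 * (1 - a**2)))
z = 100.0 + b + rng.normal(0, 1.0, n)       # true position = 100
est, sig = ou_kf(z, dt, tau, q=4.0, r=1.0).T
print(f"final estimate {est[-1]:.2f} ± {sig[-1]:.2f}")
```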

