Exploiting a novel algorithm and GPUs to break the ten quadrillion pairwise comparisons barrier for time series motifs and joins

2017 ◽  
Vol 54 (1) ◽  
pp. 203-236 ◽  
Author(s):  
Yan Zhu ◽  
Zachary Zimmerman ◽  
Nader Shakibay Senobari ◽  
Chin-Chia Michael Yeh ◽  
Gareth Funning ◽  
...  
2019 ◽  
Author(s):  
Bruce Wang ◽  
Timothy Sudijono ◽  
Henry Kirveslahti ◽  
Tingran Gao ◽  
Douglas M. Boyer ◽  
...  

Abstract The recent curation of large-scale databases with 3D surface scans of shapes has motivated the development of tools that better detect global patterns in morphological variation. Studies that focus on identifying differences between shapes have been limited to simple pairwise comparisons and rely on pre-specified landmarks (that are often known). We present SINATRA: the first statistical pipeline for analyzing collections of shapes without requiring any correspondences. Our novel algorithm takes in two classes of shapes and highlights the physical features that best describe the variation between them. We use a rigorous simulation framework to assess our approach. Lastly, as a case study, we use SINATRA to analyze mandibular molars from four different suborders of primates and demonstrate its ability to recover known morphometric variation across phylogenies.


Author(s):  
Peter Domonkos

The removal of non-climatic biases, so-called inhomogeneities, from long climatic records requires sophisticated statistical methods. One principle is that the differences between a candidate series and its neighbour series are usually analysed instead of the candidate series itself, in order to neutralize the possible impact of regionally common natural climate variation on the detection of inhomogeneities. Most homogenization methods apply one of two main kinds of time series comparison: composite reference series or pairwise comparisons. With composite reference series, the inhomogeneities of neighbour series are attenuated by averaging the individual series, and the accuracy of homogenization can be improved by iteratively improving the composite reference series. By contrast, pairwise comparisons have the advantage that coincidental inhomogeneities affecting several station series in a similar way can be identified with higher certainty than with composite reference series. In addition, homogenization with pairwise comparisons tends to facilitate the most accurate regional trend estimations. A new time series comparison method is presented here, which combines pairwise comparisons and composite reference series in a way that unifies their advantages. This time series comparison method is embedded into the ACMANT homogenization method and tested on large, commonly available monthly temperature test datasets.
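The two kinds of comparison described in this abstract can be illustrated with a minimal sketch. It is not the ACMANT implementation: the function names, the synthetic data, and the unweighted averaging are assumptions made for illustration only. It builds one difference series per neighbour (pairwise comparison) and one difference series against the averaged neighbours (composite reference), then measures an artificial break in the candidate series.

```python
import numpy as np


def pairwise_differences(candidate, neighbours):
    """Return one difference series per neighbour: candidate minus neighbour."""
    return candidate[None, :] - neighbours


def composite_reference_difference(candidate, neighbours, weights=None):
    """Return candidate minus the (optionally weighted) mean of the neighbours."""
    reference = np.average(neighbours, axis=0, weights=weights)
    return candidate - reference


# Synthetic data (hypothetical): a shared regional signal plus station noise.
rng = np.random.default_rng(0)
n_months = 120
regional_signal = rng.normal(0.0, 1.0, n_months)
neighbours = regional_signal + rng.normal(0.0, 0.1, (3, n_months))
candidate = regional_signal + rng.normal(0.0, 0.1, n_months)
candidate[60:] += 1.0  # artificial inhomogeneity: a 1.0 shift from month 60 on

pair_diffs = pairwise_differences(candidate, neighbours)
comp_diff = composite_reference_difference(candidate, neighbours)

# The regional signal cancels in the differences, so the shift stands out.
shift = comp_diff[60:].mean() - comp_diff[:60].mean()
print(f"estimated shift from composite reference: {shift:.2f}")
```

With equal weights the composite difference is simply the mean of the pairwise difference series; the two approaches diverge once the neighbours carry their own inhomogeneities, which is the trade-off the abstract discusses.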


2019 ◽  
Vol 6 (3) ◽  
pp. 32-42 ◽  
Author(s):  
A. A. Surkov

Combining forecasts has proven itself in practice as a reliable and effective way to improve the accuracy of economic forecasting, but the technique has several disadvantages. One way to improve it is to draw on expert information as a tool for correcting the forecast results obtained. This article is devoted to the use of the expert method of pairwise comparisons for constructing the weights of the combined forecast, as one option for using expert information when combining forecasts. The proposed methodology has been applied to the economic time series of some industrial products in Russia. The effectiveness of using the method of pairwise comparisons for combining forecasts was assessed and, based on the results obtained, a forecast of the development of the economic indicators under consideration was proposed.
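As a rough illustration of how expert pairwise comparisons could yield weights for a combined forecast, the sketch below uses the row geometric mean of a reciprocal judgement matrix, a common choice in the analytic hierarchy process. The abstract does not state which weighting rule the article uses, so this rule, the function name, the judgement matrix, and the forecast values are all assumptions.

```python
import numpy as np


def weights_from_pairwise(matrix):
    """Derive normalized weights from a reciprocal pairwise comparison matrix.

    Uses the row geometric mean (an assumed rule, not taken from the article).
    """
    matrix = np.asarray(matrix, dtype=float)
    geo_means = np.exp(np.log(matrix).mean(axis=1))
    return geo_means / geo_means.sum()


# Hypothetical expert judgements: entry [i][j] says how strongly forecast
# method i is preferred over method j (reciprocal matrix, ones on the diagonal).
judgements = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = weights_from_pairwise(judgements)

# Hypothetical individual forecasts for the same period.
forecasts = np.array([100.0, 110.0, 90.0])
combined = float(weights @ forecasts)

print("weights:", np.round(weights, 4))
print("combined forecast:", round(combined, 2))
```

Because this example matrix is perfectly consistent, the weights reduce to the ratio 4 : 2 : 1; with inconsistent expert judgements the geometric mean gives a compromise between the conflicting ratios.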


Atmosphere ◽  
2021 ◽  
Vol 12 (9) ◽  
pp. 1134
Author(s):  
Peter Domonkos

The removal of non-climatic biases, so-called inhomogeneities, from long climatic records requires sophisticated statistical methods. One principle is that the differences between a candidate series and its neighbor series are usually analyzed instead of the candidate series directly, in order to neutralize the possible impact of regionally common natural climate variation on the detection of inhomogeneities. Most homogenization methods apply one of two main kinds of time series comparison: composite reference series or pairwise comparisons. With composite reference series, the inhomogeneities of neighbor series are attenuated by averaging the individual series, and the accuracy of homogenization can be improved by iteratively improving the composite reference series. By contrast, pairwise comparisons have the advantage that coincidental inhomogeneities affecting several station series in a similar way can be identified with higher certainty than with composite reference series. In addition, homogenization with pairwise comparisons tends to facilitate the most accurate regional trend estimations. A new time series comparison method is presented here, which combines pairwise comparisons and composite reference series in a way that unifies their advantages. This time series comparison method is embedded into the Applied Caussinus-Mestre Algorithm for homogenizing Networks of climatic Time series (ACMANT) homogenization method and tested on large, commonly available monthly temperature test datasets. Further favorable characteristics of ACMANT are also discussed.


2015 ◽  
Vol 204 (2) ◽  
pp. 1159-1163 ◽  
Author(s):  
I. Gaudot ◽  
É. Beucler ◽  
A. Mocquet ◽  
M. Schimmel ◽  
M. Le Feuvre

Abstract In order to detect possible signal redundancies in the ambient seismic wavefield, we develop a new method based on pairwise comparisons among a set of synchronous time-series. This approach is based on instantaneous phase coherence statistics. The first and second moments of the pairwise phase coherence distribution are used to characterize the phase randomness. For perfect phase randomness, the theoretical values of the mean and variance are equal to 0 and $\sqrt{1-2/\pi}$, respectively. As a consequence, any deviation from these values indicates the presence of a redundant phase in the raw continuous signal. A previously detected microseismic source in the Gulf of Guinea is used to illustrate one of the possible ways of handling phase coherence statistics. The proposed approach allows us to properly localize this persistent source and to quantify its contribution to the overall seismic ambient wavefield. The strength of the phase coherence statistics lies in their ability to quantify the redundancy of a given phase among a set of time-series, with various useful applications in seismic noise-based studies (tomography and/or source characterization).


2021 ◽  
Author(s):  
Peter Domonkos

The development of the ACMANT homogenization software started during the European COST HOME project, around 2010. Owing to its excellent results in method comparison tests, the development of ACMANT has continued ever since. While its first version was applicable only to the homogenization of monthly temperature series, the later versions are applicable to a wide range of climatic variables and to either monthly or daily time series.

The operation of ACMANT is fast and automatic, and it is easy to use even for large datasets. The method can homogenize time series of varied lengths together, tolerates data gaps well, and includes outlier filtering and (optional) infilling of data gaps. ACMANT includes modern and effective statistical tools for the detection and removal of inhomogeneities, such as step function fitting, bivariate detection for breaks of annual means and seasonal amplitudes (where applicable), the ANOVA correction method, and ensemble homogenization with varied pre-homogenization of neighbour series. Owing to these properties, ACMANTv4 was the most accurate homogenization method in most method comparison tests of the Spanish MULTITEST project (https://doi.org/10.1175/JCLI-D-20-0611.1). In these tests, one important exception occurred: network mean trend errors were removed with significantly higher certainty by the Pairwise Homogenization Algorithm when approximately half of the time series were affected by quasi-synchronous breaks imitating concerted technical changes in the performance of climate observations. The most recent developments, aimed at the release of ACMANTv5, include the elimination of this drawback of ACMANT.

For ACMANTv5, a new break detection method has been developed, in which a combination of two time series comparison methods is applied. The new method uses both composite reference series and pairwise comparisons, and in the detection with composite reference series the step function fitting is forced to include the breaks detected by pairwise comparisons. Another novelty of ACMANTv5 is that it gives options to use metadata in the homogenization procedure. The default operation mode of ACMANTv5 is still fully automatic, with or without the automatic use of a prepared metadata table. ACMANTv5 uses every date in the metadata list as a break indicator, and these are evaluated together with other indicators obtained by pairwise comparisons. Optionally, ACMANTv5 lets users edit the list of detected breaks based on the pairwise detections of the first homogenization round. In the later steps of ACMANTv5, user intervention is not possible, but metadata may also be considered by the automatic procedure in the final estimation of break positions.


2016 ◽  
Author(s):  
Guy Karlebach

Abstract Gene regulatory networks (GRNs) are increasingly used for explaining biological processes with complex transcriptional regulation. A GRN links the expression levels of a set of genes via regulatory controls that gene products exert on one another. Boolean networks are a common modeling choice since they balance detail and ease of analysis. However, even for Boolean networks, the problem of fitting a given network model to an expression dataset is NP-complete. Previous methods have addressed this issue heuristically or by focusing on acyclic networks and specific classes of regulation functions. In this paper we introduce a novel algorithm for this problem that makes use of sampling in order to handle large datasets. Our algorithm can handle time series data for any network type and steady state data for acyclic networks. Using in-silico time series data, we demonstrate good performance on large datasets with a significant level of noise.

