Computationally efficient sparsity-inducing coherence spectrum estimation of complete and non-complete data sets

2013 ◽  
Vol 93 (5) ◽  
pp. 1221-1234 ◽  
Author(s):  
K. Angelopoulos ◽  
G.O. Glentis ◽  
A. Jakobsson

2012 ◽  
Vol 60 (12) ◽  
pp. 6674-6681 ◽  
Author(s):  
K. Angelopoulos ◽  
G. O. Glentis ◽  
A. Jakobsson

2020 ◽  
Vol 70 (1) ◽  
pp. 145-161 ◽  
Author(s):  
Marnus Stoltz ◽  
Boris Baeumer ◽  
Remco Bouckaert ◽  
Colin Fox ◽  
Gordon Hiscott ◽  
...  

Abstract We describe a new and computationally efficient Bayesian methodology for inferring species trees and demographics from unlinked binary markers. Likelihood calculations are carried out using diffusion models of allele frequency dynamics combined with novel numerical algorithms. The diffusion approach allows for analysis of data sets containing hundreds or thousands of individuals. The method, which we call Snapper, has been implemented as part of the BEAST2 package. We conducted simulation experiments to assess numerical error, computational requirements, and accuracy in recovering known model parameters. A reanalysis of soybean SNP data demonstrates that the models implemented in Snapp and Snapper can be difficult to distinguish in practice, a characteristic we tested with further simulations. We demonstrate the scale of analysis possible using a SNP data set sampled from 399 freshwater turtles in 41 populations. [Bayesian inference; diffusion models; multi-species coalescent; SNP data; species trees; spectral methods.]
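A minimal sketch of the diffusion idea (not Snapper's algorithm; Snapper's spectral numerics are its own, inside BEAST2): the neutral Wright-Fisher allele-frequency diffusion obeys the backward equation ∂u/∂t = ½x(1−x)∂²u/∂x², which a Crank-Nicolson step can propagate on a grid. Mutation and selection are omitted here by assumption.

```python
import numpy as np

def wf_diffusion_step(u, dx, dt):
    """One Crank-Nicolson step of the neutral Wright-Fisher
    backward equation du/dt = 0.5 * x(1-x) * d2u/dx2."""
    n = len(u)
    x = np.linspace(0.0, 1.0, n)
    a = 0.5 * x * (1.0 - x) / dx**2          # diffusion coefficient per grid point
    # Tridiagonal operator: (L u)_i = a_i * (u[i-1] - 2 u[i] + u[i+1])
    L = np.zeros((n, n))
    for i in range(1, n - 1):
        L[i, i - 1] = a[i]
        L[i, i] = -2.0 * a[i]
        L[i, i + 1] = a[i]
    I = np.eye(n)
    # Crank-Nicolson: (I - dt/2 L) u_new = (I + dt/2 L) u_old
    return np.linalg.solve(I - 0.5 * dt * L, (I + 0.5 * dt * L) @ u)

# Sanity check: u(x) = x (the neutral fixation probability) is invariant
# under this diffusion, so repeated steps should leave it unchanged.
n = 101
u = np.linspace(0.0, 1.0, n)
for _ in range(100):
    u = wf_diffusion_step(u, dx=1.0 / (n - 1), dt=1e-3)
print(np.max(np.abs(u - np.linspace(0.0, 1.0, n))))  # ~0
```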


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Julian T. C. Wennmacher ◽  
Christian Zaubitzer ◽  
Teng Li ◽  
Yeon Kyoung Bahk ◽  
Jing Wang ◽  
...  

Author(s):  
Reza Alizadeh ◽  
Liangyue Jia ◽  
Anand Balu Nellippallil ◽  
Guoxin Wang ◽  
Jia Hao ◽  
...  

Abstract In engineering design, surrogate models are often used instead of costly computer simulations. Typically, a single surrogate model is selected based on previous experience. We observe, based on an analysis of the published literature, that fitting an ensemble of surrogates (EoS) based on cross-validation errors is more accurate but requires more computational time. In this paper, we propose a method to build an EoS that is both accurate and less computationally expensive. In the proposed method, the EoS is a weighted-average surrogate of response surface models, kriging, and radial basis functions, with weights based on overall cross-validation error. We demonstrate that the created EoS is more accurate than individual surrogates even when fewer data points are used, and is therefore computationally efficient while giving relatively insensitive predictions. We demonstrate the use of an EoS using hot rod rolling as an example. Finally, we include a rule-based template that can be used for other problems with similar requirements for computational time, accuracy, and data size.
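A minimal sketch of the weighting scheme just described, assuming scikit-learn stand-ins for the three surrogate families (polynomial ridge regression for the response surface, a Gaussian process for kriging, kernel ridge for the radial basis functions) and synthetic data; this is not the authors' exact setup:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_score

# Toy data: a noisy 1-D test function standing in for an expensive simulation.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (40, 1))
y = np.sin(6 * X[:, 0]) + 0.1 * rng.standard_normal(40)

surrogates = {
    "response_surface": make_pipeline(PolynomialFeatures(3), Ridge(alpha=1e-3)),
    "kriging": GaussianProcessRegressor(),
    "rbf": KernelRidge(kernel="rbf", gamma=10.0),
}

# Weight each surrogate by the inverse of its cross-validation MSE.
cv_mse = {
    name: -cross_val_score(m, X, y, cv=5,
                           scoring="neg_mean_squared_error").mean()
    for name, m in surrogates.items()
}
weights = {name: 1.0 / mse for name, mse in cv_mse.items()}
total = sum(weights.values())
weights = {name: w / total for name, w in weights.items()}

# Fit all members on the full data; predict as a weighted average.
for m in surrogates.values():
    m.fit(X, y)

def eos_predict(X_new):
    return sum(w * surrogates[name].predict(X_new)
               for name, w in weights.items())

print(weights)
print(eos_predict(np.array([[0.5]])))
```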


2020 ◽  
Author(s):  
Mark Naylor ◽  
Kirsty Bayliss ◽  
Finn Lindgren ◽  
Francesco Serafini ◽  
Ian Main

Many earthquake forecasting approaches have developed bespoke codes to model and forecast the spatio-temporal evolution of seismicity. At the same time, the statistics community has been working on a range of point-process modelling codes. For example, motivated by ecological applications, inlabru models spatio-temporal point processes as a log-Gaussian Cox process and is implemented in R. Here we present an initial implementation of inlabru to model seismicity. This fully Bayesian approach is computationally efficient because it uses a nested Laplace approximation such that posteriors are assumed to be Gaussian, so their means and standard deviations can be estimated deterministically rather than constructed through sampling. Further, building on existing packages in R for handling spatial data, it can construct covariate maps from diverse data types, such as fault maps, in an intuitive and simple manner.

Here we present an initial application to the California earthquake catalogue to determine the relative performance of different data sets for describing the spatio-temporal evolution of seismicity.
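A minimal sketch of the inner computation such methods rely on (illustrative only, not inlabru's R interface): Newton iterations to the posterior mode of a discretised log-Gaussian Cox process, which give the Gaussian (Laplace) approximation from which means and standard deviations are read off deterministically. The grid size, random-walk precision model, and fixed hyperparameter below are assumptions.

```python
import numpy as np

# Counts y_i ~ Poisson(exp(f_i)); latent field f ~ N(0, (tau * Q)^{-1}).
rng = np.random.default_rng(1)
n = 50
y = rng.poisson(2.0, n)                      # stand-in for binned event counts

# Random-walk-1 precision matrix (tridiagonal), plus a small jitter for propriety.
Q = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
Q[0, 0] = Q[-1, -1] = 1.0
Q += 1e-2 * np.eye(n)
tau = 10.0                                   # assumed fixed precision hyperparameter

# Newton iterations for the posterior mode of f (the Laplace approximation's core).
f = np.zeros(n)
for _ in range(25):
    mu = np.exp(f)
    grad = y - mu - tau * Q @ f              # gradient of the log posterior
    H = np.diag(mu) + tau * Q                # negative Hessian (positive definite)
    f = f + np.linalg.solve(H, grad)

# Posterior approximated as N(f_mode, H^{-1}); std devs come deterministically.
post_sd = np.sqrt(np.diag(np.linalg.inv(np.diag(np.exp(f)) + tau * Q)))
print(f[:5], post_sd[:5])
```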


2012 ◽  
Vol 2 (1) ◽  
pp. 31-37 ◽  
Author(s):  
L. Sjöberg

Solutions to Linear Inverse Problems on the Sphere by Tikhonov Regularization, Wiener Filtering and Spectral Smoothing and Combination — A Comparison

Solutions to linear inverse problems on the sphere, common in geodesy and geophysics, are compared for Tikhonov's method of regularization, Wiener filtering, and spectral smoothing and combination, as well as harmonic analysis. It is concluded that Wiener filtering and spectral smoothing, although based on different assumptions and target functions, yield the same estimator. Also, provided that the extra information on the signal and error degree variances is available, the standard Tikhonov method is inferior to the other methods, which, in contrast to Tikhonov's approach, match the spectral errors and signals in an optimum way. We show that the corresponding Tikhonov matrix for optimum regularization can only be determined approximately. Moreover, as Tikhonov's method solves an integral equation, it is less computationally efficient than the other methods, which use forward integration. Harmonic analysis also uses direct integration and is not hampered, as the previous methods are, by spectral leakage. Spectral combination, in addition to filtering, has the advantage of combining different data sets by least-squares spectral weighting.
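The paper's central contrast can be sketched degreewise (the spectra below are assumed, Kaula-like toy values, not the paper's): the Wiener/spectral-smoothing factor c_n/(c_n + d_n) adapts to the signal and error degree variances, while a standard Tikhonov factor with a single regularisation parameter is the same at every degree, so it can only approximate the optimal weighting.

```python
import numpy as np

degrees = np.arange(2, 181)
c_n = 1e-10 / degrees**4          # assumed power-law signal degree variances
d_n = np.full_like(c_n, 1e-18)    # assumed white observation error spectrum

wiener = c_n / (c_n + d_n)        # optimal per-degree weighting
alpha = np.median(d_n / c_n)      # one compromise choice of the Tikhonov parameter
tikhonov = 1.0 / (1.0 + alpha)    # the same factor at every degree

# Wiener adapts to the spectra; a single Tikhonov parameter cannot.
for n in (2, 30, 90, 180):
    i = n - 2
    print(f"degree {n:3d}: wiener={wiener[i]:.3f}  tikhonov={tikhonov:.3f}")
```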


2014 ◽  
Vol 39 (2) ◽  
pp. 107-127 ◽  
Author(s):  
Artur Matyja ◽  
Krzysztof Siminski

Abstract Missing values are not uncommon in real data sets. The algorithms and methods used for the analysis of complete data sets cannot always be applied to data with missing values. In order to use the existing methods for complete data, missing-value data sets are preprocessed. The other solution to this problem is the creation of new algorithms dedicated to missing-value data sets. The objective of our research is to compare the preprocessing techniques and specialised algorithms and to find their most advantageous usage.
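The two strategies can be contrasted in a short sketch (illustrative only; the specific imputation and distance choices are assumptions, not the paper's methods): preprocessing by mean imputation so that complete-data methods apply, versus adapting the method itself, here a partial Euclidean distance over jointly observed features.

```python
import numpy as np

X = np.array([[1.0, 2.0, np.nan],
              [0.5, np.nan, 3.0],
              [1.2, 1.8, 2.9]])

# Strategy 1: preprocess, then use a complete-data method.
col_means = np.nanmean(X, axis=0)
X_imputed = np.where(np.isnan(X), col_means, X)   # simple mean imputation

# Strategy 2: adapt the method itself, e.g. a partial Euclidean distance
# computed over jointly observed features and rescaled to full dimension.
def partial_distance(a, b):
    mask = ~np.isnan(a) & ~np.isnan(b)
    if not mask.any():
        return np.nan                              # no shared features
    d2 = np.sum((a[mask] - b[mask]) ** 2)
    return np.sqrt(d2 * a.size / mask.sum())       # rescale for missing dims

print(X_imputed)
print(partial_distance(X[0], X[1]))
```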


2014 ◽  
Vol 13s1 ◽  
pp. CIN.S13890 ◽  
Author(s):  
Changjin Hong ◽  
Solaiappan Manimaran ◽  
William Evan Johnson

Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQC, a streamlined toolkit that seamlessly combines the benefits of several popular quality control software approaches for preprocessing next-generation sequencing data. PathoQC provides a variety of quality control options appropriate for most high-throughput sequencing applications. PathoQC is primarily developed as a module in the PathoScope software suite for metagenomic analysis. However, PathoQC is also available as an open-source Python module that can run as a stand-alone application or can be easily integrated into any bioinformatics workflow. PathoQC achieves high performance by supporting parallel computation and is an effective tool that removes technical sequencing artifacts and facilitates robust downstream analysis. The PathoQC software package is available at http://sourceforge.net/projects/PathoScope/.
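As a hedged illustration of one preprocessing step toolkits of this kind perform (generic code, not PathoQC's API): trimming the 3' end of a read once base quality falls below a Phred threshold.

```python
# Illustrative only: trim the 3' end of a read at a Phred-quality threshold.
def quality_trim(seq, quals, threshold=20):
    """Return the longest prefix whose trailing base meets the threshold."""
    end = len(seq)
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    return seq[:end], quals[:end]

read = "ACGTACGTAC"
phred = [38, 37, 36, 35, 30, 28, 25, 15, 8, 2]   # qualities decaying at the 3' end
print(quality_trim(read, phred))                  # keeps the first 7 bases
```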


2017 ◽  
Author(s):  
Darrell O. Ricke ◽  
Steven Schwartz

Abstract High-throughput sequencing (HTS) of DNA forensic samples is expanding from the sizing of short tandem repeats (STRs) to massively parallel sequencing (MPS). HTS panels are expanding from the FBI's 20 core Combined DNA Index System (CODIS) loci to include SNPs. The calculation of random man not excluded, P(RMNE), is used in DNA mixture analysis to estimate the probability that a person is present in a DNA mixture. This calculation encounters numerical artifacts as panel sizes expand. Increasing the floating-point precision of the calculations allows for larger panel sizes, but with a corresponding increase in computation time. The Taylor-series higher-precision libraries used fail on some input data sets, leading to algorithm unreliability. Herein, a new formula is introduced for calculating P(RMNE) that scales to larger SNP panel sizes while remaining computationally efficient (patent pending) [1].
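The precision issue can be made concrete with a short sketch (the per-locus values are hypothetical, and this is not the paper's patent-pending formula): multiplying thousands of per-locus non-exclusion probabilities underflows double precision, whereas accumulating in log space does not.

```python
import math

# Hypothetical per-SNP inclusion probabilities for a large panel.
per_locus_p = [0.9] * 20000

naive = 1.0
for p in per_locus_p:
    naive *= p                               # underflows to 0.0 in double precision

log_p = sum(math.log(p) for p in per_locus_p)
print(naive)                                 # 0.0 (underflow)
print(log_p)                                 # ~ -2107.2, i.e. P(RMNE) = exp(log_p)
```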

