Mixtures and products in two graphical models

2018 ◽  
Vol 9 (1) ◽  
pp. 1-20 ◽  
Author(s):  
Anna Seigal ◽  
Guido Montufar

We compare two statistical models of three binary random variables. One is a mixture model and the other is a product of mixtures model called a restricted Boltzmann machine. Although the two models we study look different from their parametrizations, we show that they represent the same set of distributions on the interior of the probability simplex, and are equal up to closure. We give a semi-algebraic description of the model in terms of six binomial inequalities and obtain closed form expressions for the maximum likelihood estimates. We briefly discuss extensions to larger models.
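As a rough illustration of the two parametrizations, the sketch below enumerates the 8 joint states of three binary variables and evaluates (a) a mixture of product distributions and (b) the visible marginal of a restricted Boltzmann machine. The function names, the number of mixture components, the number of hidden units, and all parameter values are assumptions chosen for illustration; they are not taken from the paper.

```python
import itertools
import math

# All 8 joint states of three binary variables.
STATES = list(itertools.product([0, 1], repeat=3))

def mixture_of_products(weights, probs):
    """Mixture model: weights[k] is the mixing weight of component k and
    probs[k][i] is P(X_i = 1) under component k."""
    dist = []
    for x in STATES:
        p = 0.0
        for w, q in zip(weights, probs):
            term = w
            for xi, qi in zip(x, q):
                term *= qi if xi == 1 else 1.0 - qi
            p += term
        dist.append(p)
    return dist

def rbm_visible_marginal(W, b, c):
    """RBM: p(v) is proportional to sum over h of exp(v.W.h + b.v + c.h),
    with W[i][j] coupling visible unit i to hidden unit j."""
    n_hidden = len(c)
    unnormalized = []
    for v in STATES:
        total = 0.0
        for h in itertools.product([0, 1], repeat=n_hidden):
            score = sum(b[i] * v[i] for i in range(3))
            score += sum(c[j] * h[j] for j in range(n_hidden))
            score += sum(W[i][j] * v[i] * h[j]
                         for i in range(3) for j in range(n_hidden))
            total += math.exp(score)
        unnormalized.append(total)
    z = sum(unnormalized)  # partition function
    return [u / z for u in unnormalized]

# Hypothetical parameter values, for illustration only.
mix = mixture_of_products(
    [0.5, 0.3, 0.2],
    [[0.9, 0.8, 0.7], [0.2, 0.3, 0.4], [0.5, 0.1, 0.6]],
)
rbm = rbm_visible_marginal(
    [[1.0, -0.5], [0.3, 0.8], [-1.2, 0.4]],
    [0.1, -0.2, 0.0],
    [0.0, 0.5],
)
```

Both functions return a probability vector indexed by `STATES`, so the two models can be compared as points in the same probability simplex.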

2016 ◽  
Author(s):  
Atif Rahman ◽  
Lior Pachter

Abstract Scaffolding, i.e. ordering and orienting contigs, is an important step in genome assembly. We present a method for scaffolding based on likelihoods of genome assemblies. Generative models for sequencing are used to obtain maximum likelihood estimates of gaps between contigs and to estimate whether linking contigs into scaffolds would lead to an increase in the likelihood of the assembly. We then link contigs if they can be unambiguously joined or if the corresponding increase in likelihood is substantially greater than that of other possible joins of those contigs. The method is implemented in a tool called Swalo, with approximations to make it efficient and applicable to large datasets. Analysis on real and simulated datasets reveals that it consistently makes as many or more correct joins than other scaffolders while linking very few contigs incorrectly, thus outperforming other scaffolders and demonstrating that substantial improvement in genome assembly may be achieved through the use of statistical models. Swalo is freely available for download at https://atifrahman.github.io/SWALO/.


2019 ◽  
Vol 58 (4) ◽  
pp. 787-796 ◽  
Author(s):  
Paul L. Smith ◽  
Roger W. Johnson ◽  
Donna V. Kliche

Abstract Use of the standard deviation σm of the drop mass distribution as one of the three parameters of raindrop size distribution (DSD) functions was introduced for application to disdrometer data supporting the Global Precipitation Measurement dual-frequency radar system. The other two parameters are a normalized drop number concentration Nw and the mass-weighted mean diameter Dm. This paper presents an evaluation of that formulation of the DSD functions, in two parts. First is a mathematical analysis showing that the procedure for estimating σm, along with the other DSD parameters, from disdrometer data is in essence another moment method. As such, it is subject to the biases and errors inherent in all moment methods. When the form of the DSD function is specified, it is inferior (like all moment methods) to the maximum likelihood technique for fitting parameters to sampled data. The second part is a series of sampling simulations illustrating the biases and errors involved in applying the formulation to the specific case of gamma DSDs. It leads to underestimates of σm and consequently to overestimates of the gamma shape parameter, with large root-mean-square errors. Comparison with maximum likelihood estimates shows the degree of improvement that could be obtained in the estimates of the shape parameter. The propensity to underestimate σm will be pervasive, and users of this DSD formulation should be cognizant of the biases and errors that can occur.
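The contrast between a moment method and maximum likelihood can be seen on a generic example. The sketch below draws small samples from a gamma distribution and estimates the shape parameter two ways: by matching the sample mean and variance, and by a standard closed-form approximation to the maximum likelihood estimate. This is not the σm-based DSD formulation evaluated in the paper; the true shape, sample size, number of trials, and function names are assumptions for illustration.

```python
import math
import random

def shape_by_moments(xs):
    """Moment estimate of the gamma shape: mean^2 / sample variance."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean * mean / var

def shape_by_approx_mle(xs):
    """Closed-form approximation to the gamma shape MLE, based on
    s = ln(mean x) - mean(ln x); it is not an exact likelihood maximizer."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.log(mean) - sum(math.log(x) for x in xs) / n
    return (3.0 - s + math.sqrt((s - 3.0) ** 2 + 24.0 * s)) / (12.0 * s)

def simulate(true_shape=3.0, n=50, trials=2000, seed=0):
    """Return the root-mean-square error of each shape estimator
    over repeated samples of size n (all values are illustrative)."""
    rng = random.Random(seed)
    mom, mle = [], []
    for _ in range(trials):
        xs = [rng.gammavariate(true_shape, 1.0) for _ in range(n)]
        mom.append(shape_by_moments(xs))
        mle.append(shape_by_approx_mle(xs))

    def rmse(estimates):
        return math.sqrt(
            sum((e - true_shape) ** 2 for e in estimates) / len(estimates)
        )

    return rmse(mom), rmse(mle)

if __name__ == "__main__":
    rmse_moments, rmse_mle = simulate()
    print("RMSE, moment estimate:", rmse_moments)
    print("RMSE, approximate MLE:", rmse_mle)
```

Comparing the two root-mean-square errors over repeated samples is the same kind of sampling experiment the abstract describes, reduced here to a single parameter of a plain gamma distribution.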


Genetics ◽  
2020 ◽  
Vol 215 (2) ◽  
pp. 343-357 ◽  
Author(s):  
David Steinsaltz ◽  
Andy Dahl ◽  
Kenneth W. Wachter

We consider the problem of interpreting negative maximum likelihood estimates of heritability that sometimes arise from popular statistical models of additive genetic variation. These may result from random noise acting on estimates of genuinely positive heritability, but we argue that they may also arise from misspecification of the standard additive mechanism that is supposed to justify the statistical procedure. Researchers should be open to the possibility that negative heritability estimates could reflect a real physical feature of the biological process from which the data were sampled.


1983 ◽  
Vol 15 (11) ◽  
pp. 1545-1550 ◽  
Author(s):  
A Sen ◽  
R K Pruthi

Some very fast least-squares based calibration procedures for the gravity model have been presented in earlier papers. Since these procedures are inconvenient or impossible to use when diagonal elements of the base period origin-destination (O-D) matrix are unavailable, one of the procedures has been modified to handle such situations. This modified procedure is described in this paper and then applied to two O-D tables, one for food grains and the other for coal. For these two data sets the procedure yields estimates which are virtually identical to corresponding maximum likelihood estimates.


2017 ◽  
Author(s):  
David Steinsaltz ◽  
Andy Dahl ◽  
Kenneth W. Wachter

Abstract We consider the problem of interpreting negative maximum likelihood estimates of heritability that sometimes arise from popular statistical models of additive genetic variation. These may result from random noise acting on estimates of genuinely positive heritability, but we argue that they may also arise from misspecification of the standard additive mechanism that is supposed to justify the statistical procedure. Researchers should be open to the possibility that negative heritability estimates could reflect a real physical feature of the biological process from which the data were sampled.


Genetics ◽  
2001 ◽  
Vol 159 (4) ◽  
pp. 1779-1788 ◽  
Author(s):  
Carlos D Bustamante ◽  
John Wakeley ◽  
Stanley Sawyer ◽  
Daniel L Hartl

Abstract In this article we explore statistical properties of the maximum-likelihood estimates (MLEs) of the selection and mutation parameters in a Poisson random field population genetics model of directional selection at DNA sites. We derive the asymptotic variances and covariance of the MLEs and explore the power of the likelihood ratio tests (LRT) of neutrality for varying levels of mutation and selection as well as the robustness of the LRT to deviations from the assumption of free recombination among sites. We also discuss the coverage of confidence intervals on the basis of two standard-likelihood methods. We find that the LRT has high power to detect deviations from neutrality and that the maximum-likelihood estimation performs very well when the ancestral states of all mutations in the sample are known. When the ancestral states are not known, the test has high power to detect deviations from neutrality for negative selection but not for positive selection. We also find that the LRT is not robust to deviations from the assumption of independence among sites.


Genetics ◽  
2000 ◽  
Vol 155 (3) ◽  
pp. 1429-1437
Author(s):  
Oliver G Pybus ◽  
Andrew Rambaut ◽  
Paul H Harvey

Abstract We describe a unified set of methods for the inference of demographic history using genealogies reconstructed from gene sequence data. We introduce the skyline plot, a graphical, nonparametric estimate of demographic history. We discuss both maximum-likelihood parameter estimation and demographic hypothesis testing. Simulations are carried out to investigate the statistical properties of maximum-likelihood estimates of demographic parameters. The simulations reveal that (i) the performance of exponential growth model estimates is determined by a simple function of the true parameter values and (ii) under some conditions, estimates from reconstructed trees perform as well as estimates from perfect trees. We apply our methods to HIV-1 sequence data and find strong evidence that subtypes A and B have different demographic histories. We also provide the first (albeit tentative) genetic evidence for a recent decrease in the growth rate of subtype B.
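For orientation, a classic skyline plot can be sketched as a piecewise-constant estimate computed from the internode intervals of a genealogy: while i lineages are present for a duration g_i, the estimate for that interval is g_i * i(i-1)/2, in the tree's own time units. The code below assumes that form of the estimator and a genealogy whose tips are all sampled at the same time; the function name and the example intervals are hypothetical.

```python
def classic_skyline(intervals):
    """Piecewise-constant demographic estimates from internode intervals.

    intervals[k] is the duration during which (n - k) lineages exist,
    ordered from the tips toward the root, where n = len(intervals) + 1
    is the number of sampled sequences.
    """
    n = len(intervals) + 1
    estimates = []
    for k, g in enumerate(intervals):
        i = n - k  # number of lineages during this interval
        estimates.append(g * i * (i - 1) / 2.0)
    return estimates

# Hypothetical genealogy with 4 tips: intervals with 4, 3, then 2 lineages.
example = classic_skyline([0.1, 0.2, 0.5])
```

Plotting each estimate as a horizontal step over its interval gives the graphical, nonparametric picture of demographic history that the abstract refers to.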

