The method of weighted likelihood functions

Author(s):  
Hans Rudolf Lerche


Geosciences ◽
2021 ◽  
Vol 11 (4) ◽  
pp. 150
Author(s):  
Nilgün Güdük ◽  
Miguel de la Varga ◽  
Janne Kaukolinna ◽  
Florian Wellmann

Structural geological models are widely used to represent relevant geological interfaces and property distributions in the subsurface. Considering the inherent uncertainty of these models, the non-uniqueness of geophysical inverse problems, and the growing availability of data, there is a need for methods that integrate different types of data consistently and consider the uncertainties quantitatively. Probabilistic inference provides a suitable tool for this purpose. Using a Bayesian framework, geological modeling can be considered an integral part of the inversion and thereby naturally constrain geophysical inversion procedures. This integration prevents geologically unrealistic results and provides the opportunity to include geological and geophysical information in the inversion. This information can come from different sources and is added to the framework through likelihood functions. We applied this methodology to the structurally complex Kevitsa deposit in Finland. We started with an interpretation-based 3D geological model and defined the uncertainties in our geological model through probability density functions. Airborne magnetic data and geological interpretations of borehole data were used to define geophysical and geological likelihoods, respectively. The geophysical data were linked to the uncertain structural parameters through the rock properties. The result of the inverse problem was an ensemble of realized models. These structural models and their uncertainties are visualized using information entropy, which allows for quantitative analysis. Our results show that, with our methodology, well-defined likelihood functions can add meaningful information to the initial model without requiring a computationally heavy full-grid inversion, that discrepancies between model and data are spotted more easily, and that the complementary strengths of different types of data can be integrated into one framework.
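The core mechanics of the approach, constraining an uncertain structural parameter with several likelihoods at once, can be sketched in a few lines. The toy Metropolis sampler below is purely illustrative: the single interface depth `z`, the linear "magnetic" forward model, and all numeric values are hypothetical stand-ins, not taken from the Kevitsa study.

```python
import numpy as np

# Hypothetical setup: one uncertain interface depth z with a Gaussian prior,
# a "magnetic" datum that depends linearly on z, and a "borehole" pick of z.
rng = np.random.default_rng(0)
z_true = 120.0
obs_mag = 0.05 * z_true + rng.normal(0.0, 0.2)   # synthetic magnetic datum
obs_bore = z_true + rng.normal(0.0, 5.0)         # synthetic borehole pick

def log_post(z):
    lp = -0.5 * ((z - 100.0) / 30.0) ** 2              # prior: N(100, 30^2)
    lp += -0.5 * ((obs_mag - 0.05 * z) / 0.2) ** 2     # geophysical likelihood
    lp += -0.5 * ((obs_bore - z) / 5.0) ** 2           # geological likelihood
    return lp

# Plain Metropolis sampling over the structural parameter yields an ensemble
# of realized models, as described in the abstract.
samples, z = [], 100.0
for _ in range(20000):
    zp = z + rng.normal(0.0, 5.0)                      # random-walk proposal
    if np.log(rng.uniform()) < log_post(zp) - log_post(z):
        z = zp
    samples.append(z)
samples = np.array(samples[5000:])                     # drop burn-in
```

The posterior ensemble concentrates near the value jointly supported by prior, magnetic, and borehole terms, which is the sense in which the likelihoods "add information" without a full grid inversion.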


2021 ◽  
pp. 1-25
Author(s):  
Yu-Chin Hsu ◽  
Ji-Liang Shiu

Under a Mundlak-type correlated random effect (CRE) specification, we first show that the average likelihood of a parametric nonlinear panel data model is the convolution of the conditional distribution of the model and the distribution of the unobserved heterogeneity. Hence, the distribution of the unobserved heterogeneity can be recovered by means of a Fourier transformation without imposing a distributional assumption on the CRE specification. We subsequently construct a semiparametric family of average likelihood functions of observables by combining the conditional distribution of the model and the recovered distribution of the unobserved heterogeneity, and show that the parameters in the nonlinear panel data model and in the CRE specification are identifiable. Based on the identification result, we propose a sieve maximum likelihood estimator. Compared with conventional parametric CRE approaches, the advantage of our method is that it is not subject to misspecification of the distribution of the CRE. Furthermore, we show that the average partial effects are identifiable and extend our results to dynamic nonlinear panel data models.
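The Fourier step, recovering the heterogeneity distribution by deconvolution, can be illustrated with a toy Gaussian example (the distributions here are hypothetical stand-ins, not the paper's panel model): the empirical characteristic function of the observable is divided by the known characteristic function of the conditional part.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(2.0, 0.5, size=100000)        # unobserved heterogeneity
y = a + rng.normal(0.0, 1.0, size=a.size)    # observable: a convolution of the two

t = np.linspace(-2.0, 2.0, 41)
cf_y = np.array([np.exp(1j * ti * y).mean() for ti in t])  # empirical cf of y
cf_e = np.exp(-0.5 * t ** 2)                 # known cf of the N(0, 1) part
cf_a = cf_y / cf_e                           # recovered cf of the heterogeneity

# For comparison: the true N(2, 0.5^2) characteristic function.
cf_true = np.exp(1j * 2.0 * t - 0.125 * t ** 2)
```

Dividing characteristic functions inverts the convolution; in practice the division is only stable where the denominator is well away from zero, which is why `t` is restricted to a bounded window here.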


Genetics ◽  
1997 ◽  
Vol 147 (4) ◽  
pp. 1855-1861 ◽  
Author(s):  
Montgomery Slatkin ◽  
Bruce Rannala

Abstract A theory is developed that provides the sampling distribution of low-frequency alleles at a single locus under the assumption that each allele is the result of a unique mutation. The number of copies of each allele is assumed to follow a linear birth-death process with sampling. If the population is of constant size, standard results from the theory of birth-death processes show that the distribution of the number of copies of each allele is logarithmic and that the joint distribution of the numbers of copies of k alleles found in a sample of size n follows the Ewens sampling distribution. If the population from which the sample was obtained was increasing in size, if there are different selective classes of alleles, or if there are differences in penetrance among alleles, the Ewens distribution no longer applies. Likelihood functions for a given set of observations are obtained under different alternative hypotheses. These results are applied to published data from the BRCA1 locus (associated with early-onset breast cancer) and the factor VIII locus (associated with hemophilia A) in humans. In both cases, the sampling distribution of alleles allows rejection of the null hypothesis, but relatively small deviations from the null model can account for the data. In particular, roughly the same population growth rate appears consistent with both data sets.
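The Ewens sampling distribution invoked above is easy to write down and check numerically. This standalone sketch (not the authors' likelihood code) computes the probability of an allelic configuration and verifies that the probabilities sum to one over all partitions of a small sample.

```python
from math import factorial, prod
from collections import Counter

def partitions(n, max_part=None):
    """Yield all integer partitions of n as lists of parts."""
    if n == 0:
        yield []
        return
    if max_part is None:
        max_part = n
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield [k] + rest

def ewens_prob(part, theta):
    """Ewens sampling formula for a configuration of allele copy numbers."""
    n = sum(part)
    rising = prod(theta + i for i in range(n))   # theta (theta+1) ... (theta+n-1)
    p = factorial(n) / rising
    for j, aj in Counter(part).items():          # aj alleles with j copies each
        p *= (theta / j) ** aj / factorial(aj)
    return p

# The probabilities over all allelic configurations of a sample of 6 sum to 1.
total = sum(ewens_prob(p, theta=2.0) for p in partitions(6))
```

For instance, the all-singletons configuration of six alleles has probability theta^6 divided by the rising factorial, 64/5040 for theta = 2, which the function reproduces.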


METRON ◽  
2021 ◽  
Author(s):  
Giovanni Saraceno ◽  
Claudio Agostinelli ◽  
Luca Greco

Abstract A weighted likelihood technique for robust estimation of multivariate wrapped distributions of data points scattered on a p-dimensional torus is proposed. The occurrence of outliers in the sample at hand can badly compromise inference under standard techniques such as the maximum likelihood method. There is therefore a need to handle such model inadequacies in the fitting process through a robust technique that effectively downweights observations not following the assumed model. The use of a robust method can also help in situations of hidden and unexpected substructures in the data. Here, it is suggested to build a set of data-dependent weights based on the Pearson residuals and to solve the corresponding weighted likelihood estimating equations. In particular, robust estimation is carried out by using a Classification EM algorithm whose M-step is enhanced by the computation of weights based on the current parameter values. The finite-sample behavior of the proposed method has been investigated through a Monte Carlo numerical study and real data examples.
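A minimal one-dimensional sketch of the weighting idea follows, using ordinary normal data rather than wrapped distributions and a Hellinger residual adjustment function as one common choice: Pearson residuals compare a kernel density estimate with the assumed model, and observations with large residuals are downweighted in the estimating equation.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0.0, 1.0, 180), np.full(20, 10.0)])  # 10% outliers

def normal_pdf(u, mu, sd=1.0):
    return np.exp(-0.5 * ((u - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

# Kernel density estimate of the data, which enters the Pearson residuals.
h = 1.06 * np.std(x) * x.size ** (-0.2)          # Silverman-type bandwidth
fhat = normal_pdf(x[:, None], x[None, :], h).mean(axis=1)

mu = x.mean()                                    # start from the non-robust estimate
for _ in range(50):
    delta = fhat / normal_pdf(x, mu) - 1.0       # Pearson residuals
    A = 2.0 * (np.sqrt(delta + 1.0) - 1.0)       # Hellinger residual adjustment
    w = np.clip((A + 1.0) / (delta + 1.0), 0.0, 1.0)
    mu = np.sum(w * x) / np.sum(w)               # weighted likelihood update of the mean
```

Observations consistent with the model keep weights near one, while the outliers at 10 receive essentially zero weight, so the fixed point of the weighted estimating equation sits near the bulk of the data.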


2013 ◽  
Vol 45 (1) ◽  
pp. 164-185 ◽  
Author(s):  
Pavel V. Gapeev ◽  
Albert N. Shiryaev

We study the Bayesian problems of detecting a change in the drift rate of an observable diffusion process with linear and exponential penalty costs for a detection delay. The optimal times of alarms are found as the first times at which the weighted likelihood ratios hit stochastic boundaries depending on the current observations. The proof is based on the reduction of the initial problems to appropriate three-dimensional optimal stopping problems and the analysis of the associated parabolic-type free-boundary problems. We provide closed-form estimates for the value functions and the boundaries, under certain nontrivial relations between the coefficients of the observable diffusion.
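The flavor of such likelihood-ratio detection rules can be conveyed with a discretized toy example. The sketch below is a CUSUM-type (minimax, not Bayesian) rule for a drift change in a discretized diffusion, with an arbitrary constant threshold standing in for the stochastic boundaries derived in the paper; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
dt, mu1, tau = 0.01, 2.0, 2000            # drift switches from 0 to mu1 at step tau
inc = rng.normal(0.0, np.sqrt(dt), 4000)  # Brownian increments of the observation
inc[tau:] += mu1 * dt                     # add the post-change drift

# CUSUM statistic: running log-likelihood ratio of "drift mu1" vs "drift 0",
# floored at zero; an alarm is raised when it crosses the threshold.
s, alarm = 0.0, None
for k, dx in enumerate(inc):
    s = max(0.0, s + mu1 * dx - 0.5 * mu1 ** 2 * dt)
    if s > 8.0:
        alarm = k
        break
```

Before the change the statistic drifts back toward zero; after the change it grows roughly linearly until it hits the threshold, which is the hitting-time structure the optimal rules share.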


2018 ◽  
Vol 30 (11) ◽  
pp. 3072-3094 ◽  
Author(s):  
Hongqiao Wang ◽  
Jinglai Li

We consider Bayesian inference problems with computationally intensive likelihood functions. We propose a Gaussian process (GP)–based method to approximate the joint distribution of the unknown parameters and the data, built on recent work (Kandasamy, Schneider, & Póczos, 2015). In particular, we write the joint density approximately as a product of an approximate posterior density and an exponentiated GP surrogate. We then provide an adaptive algorithm to construct such an approximation, where an active learning method is used to choose the design points. With numerical examples, we illustrate that the proposed method has competitive performance against existing approaches for Bayesian computation.
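The two ingredients, a GP surrogate for an expensive log-likelihood and variance-based active learning to choose design points, can be sketched with a toy one-dimensional target. The kernel, length-scale, and target below are illustrative choices, not the paper's.

```python
import numpy as np

def loglik(theta):
    """Stand-in for an expensive log-likelihood (hypothetical toy target)."""
    return -0.5 * ((theta - 1.0) / 0.5) ** 2

def kern(a, b, ls=0.8):
    """Squared-exponential covariance."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

grid = np.linspace(-1.0, 3.0, 201)
X = np.array([-1.0, 1.0, 3.0])                  # initial design points
for _ in range(12):                             # active-learning loop
    y = loglik(X)                               # "expensive" evaluations
    K = kern(X, X) + 1e-6 * np.eye(X.size)      # jitter for numerical stability
    Ks = kern(grid, X)
    mean = Ks @ np.linalg.solve(K, y)           # GP posterior mean on the grid
    var = 1.0 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
    X = np.append(X, grid[np.argmax(var)])      # query where the GP is least sure
```

Each round spends one expensive evaluation where the surrogate is most uncertain, so the GP mean becomes an increasingly faithful proxy for the log-likelihood that downstream Bayesian computation can use.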


2014 ◽  
Author(s):  
Sean Ruddy ◽  
Marla Johnson ◽  
Elizabeth Purdom

The prevalence of sequencing experiments in genomics has led to an increased use of count-data methods in analyzing high-throughput genomic data. Shrinkage methods remain important for improving the performance of the statistical methods involved. A common example is gene expression data, where the counts per gene are often modeled as some form of an over-dispersed Poisson. In this case, shrinkage estimates of the per-gene dispersion parameter have led to improved estimation of dispersion when the number of samples is small. We address a different count setting introduced by the use of sequencing data: comparing differential proportional usage via an over-dispersed binomial model. This is motivated by our interest in testing for differential exon skipping in mRNA-Seq experiments. We introduce a novel method developed by modeling the dispersion based on the double binomial distribution proposed by Efron (1986). Our method (WEB-Seq) is an empirical Bayes strategy that produces a shrunken estimate of the dispersion, effectively detects differential proportional usage, and has close ties to the weighted-likelihood strategy of edgeR developed for gene expression data (Robinson and Smyth, 2007; Robinson et al., 2010). We analyze its behavior on simulated and real data sets and show that our method is fast and powerful and gives accurate control of the FDR compared to alternative approaches. We provide an implementation of our methods in the R package DoubleExpSeq, available on CRAN.
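The dispersion-shrinkage idea can be sketched in the edgeR weighted-likelihood spirit; this is a moment-based toy, not WEB-Seq's double-binomial machinery. Noisy per-gene dispersion estimates are pulled toward a common value with a chosen prior weight, reducing mean squared error when replicates are few.

```python
import numpy as np

rng = np.random.default_rng(4)
G, n = 500, 4                   # genes and samples; few replicates, as is typical
mu, phi = 100.0, 0.1            # NB mean and true dispersion (var = mu + phi * mu^2)
r = 1.0 / phi                   # NB size parameter
counts = rng.negative_binomial(r, r / (r + mu), size=(G, n))

m = counts.mean(axis=1)
v = counts.var(axis=1, ddof=1)
phi_hat = np.clip((v - m) / m ** 2, 0.0, None)    # noisy per-gene moment estimate

prior = np.median(phi_hat)                        # common dispersion across genes
w_prior = 10.0                                    # prior weight (pseudo-observations)
phi_shrunk = (w_prior * prior + n * phi_hat) / (w_prior + n)

mse_raw = np.mean((phi_hat - phi) ** 2)
mse_shrunk = np.mean((phi_shrunk - phi) ** 2)
```

The weighted average trades a little bias toward the common dispersion for a large reduction in variance, which is the shrinkage effect the abstract describes.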

