On specification tests for composite likelihood inference

Biometrika ◽  
2020 ◽  
Vol 107 (4) ◽  
pp. 907-917
Author(s):  
Jing Huang ◽  
Yang Ning ◽  
Nancy Reid ◽  
Yong Chen

Summary: Composite likelihood functions are often used for inference in applications where the data have a complex structure. While inference based on the composite likelihood can be more robust than inference based on the full likelihood, the inference is not valid if the associated conditional or marginal models are misspecified. In this paper, we propose a general class of specification tests for composite likelihood inference. The test statistics are motivated by the fact that the second Bartlett identity holds for each component of the composite likelihood function when these components are correctly specified. We construct the test statistics based on the discrepancy between the so-called composite information matrix and the sensitivity matrix. As an illustration, we study three important cases of the proposed tests and establish their limiting distributions under both null and local alternative hypotheses. Finally, we evaluate the finite-sample performance of the proposed tests in several examples.
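The second Bartlett identity behind these tests can be illustrated in miniature: when a component density is correctly specified, the sensitivity matrix H = E[-d² log f] and the variability matrix J = Var[d log f] coincide, so their sample discrepancy should be near zero. The sketch below is a hypothetical one-parameter illustration of that idea (the function name `bartlett_discrepancy` and the normal-mean component are assumptions for the example, not the authors' actual statistics, which compare the composite information and sensitivity matrices across all components).

```python
import numpy as np

def bartlett_discrepancy(scores, neg_hessians):
    """Discrepancy J - H between the variability matrix J = Var[d log f]
    and the sensitivity matrix H = E[-d2 log f]; near zero in large samples
    when the component density is correctly specified (second Bartlett
    identity). Scalar-parameter case for simplicity."""
    H = np.mean(neg_hessians)
    J = np.mean(scores ** 2)   # scores have mean ~0 at the MLE
    return J - H

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=5000)

# Component: N(mu, 1) density for one margin; mu estimated by its MLE.
mu_hat = x.mean()
scores = x - mu_hat            # d/dmu log f(x; mu, 1)
neg_hess = np.ones_like(x)     # -d2/dmu2 log f(x; mu, 1) = 1

d = bartlett_discrepancy(scores, neg_hess)
# Correctly specified component: J ~ Var(x) ~ 1 = H, discrepancy near zero.
print(abs(d) < 0.1)
```

A formal test would scale such a discrepancy by the sample size and compare it with its limiting distribution, which is what the paper develops in generality.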

2011 ◽  
Vol 28 (2) ◽  
pp. 363-386 ◽  
Author(s):  
Frederic Ferraty ◽  
Alejandro Quintela-del-Río ◽  
Philippe Vieu

In this paper we construct a statistic to test a specific form of the conditional density function. The main point of this work is to consider a functional explanatory variable, and the statistic is constructed following recent advances in nonparametric functional data analysis. The asymptotic behavior of the test statistic is studied under both the null hypothesis and some local alternative hypothesis. Then, the finite sample behavior of the method is studied through simulated examples. This paper is one of the first in the setting of nonparametric specification tests when functional data are involved.


2017 ◽  
Author(s):  
Amy Ko ◽  
Rasmus Nielsen

Abstract: Pedigrees contain information about the genealogical relationships among individuals and are of fundamental importance in many areas of genetic studies. However, pedigrees are often unknown and must be inferred from genetic data. Despite the importance of pedigree inference, existing methods are limited to inferring only close relationships or analyzing a small number of individuals or loci. We present a simulated annealing method for estimating pedigrees in large samples of otherwise seemingly unrelated individuals using genome-wide SNP data. The method supports complex pedigree structures such as polygamous families, multi-generational families, and pedigrees in which many of the member individuals are missing. Computational speed is greatly enhanced by the use of a composite likelihood function which approximates the full likelihood. We validate our method on simulated data and show that it can infer distant relatives more accurately than existing methods. Furthermore, we illustrate the utility of the method on a sample of Greenlandic Inuit.

Author Summary: Pedigrees contain information about the genealogical relationships among individuals. This information can be used in many areas of genetic studies such as disease association studies, conservation efforts, and learning about the demographic history and social structure of a population. Despite their importance, pedigrees are often unknown and must be estimated from genetic information. However, pedigree inference remains a difficult problem due to the high cost of likelihood computation and the enormous number of possible pedigrees we must consider. These difficulties limit existing methods in their ability to infer pedigrees when the sample size or the number of markers is large, or when the sample contains only distant relatives. In this report, we present a method that circumvents these computational barriers in order to infer pedigrees of complex structure for a large number of individuals. From our simulation studies, we found that our method can infer distant relatives much more accurately than existing methods. Our ability to infer pedigrees with greater accuracy opens up possibilities for developing or improving pedigree-based methods in many areas of research such as linkage analysis, demographic inference, association studies, and conservation.
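The optimization engine named in the abstract is standard simulated annealing: propose a perturbed state, always accept improvements, and accept worse states with probability exp(-ΔE/T) under a decreasing temperature schedule. The toy sketch below shows only that generic loop on a one-dimensional surrogate "energy" (the pedigree state space, proposal moves, and composite likelihood in the paper are far more elaborate; the names `anneal`, `energy`, and `propose` are assumptions for the example).

```python
import numpy as np

def anneal(energy, propose, state, temps, rng):
    """Textbook simulated-annealing loop: at temperature T, accept a worse
    candidate with probability exp(-(e_cand - e) / T)."""
    e = energy(state)
    for T in temps:
        cand = propose(state, rng)
        e_cand = energy(cand)
        if e_cand < e or rng.random() < np.exp(-(e_cand - e) / T):
            state, e = cand, e_cand
    return state, e

rng = np.random.default_rng(5)
energy = lambda s: (s - 3.0) ** 2           # toy 'negative log-likelihood'
propose = lambda s, r: s + r.normal(scale=0.5)
temps = np.geomspace(5.0, 1e-3, 4000)       # slow geometric cooling
s_hat, e_hat = anneal(energy, propose, 0.0, temps, rng)
print(abs(s_hat - 3.0) < 0.3)
```

In the pedigree setting, `propose` would rewire parent-offspring links and `energy` would be the negative composite log-likelihood of the proposed pedigree.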


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Hanji He ◽  
Guangming Deng

We extend mean empirical likelihood inference for the response mean to data missing at random. Empirical likelihood ratio confidence regions perform poorly when the response is missing at random, especially when the covariate is high-dimensional and the sample size is small. Hence, we develop three bias-corrected mean empirical likelihood approaches to obtain efficient inference for the response mean. From the three bias-corrected estimating equations, we obtain a new set by constructing a pairwise-mean dataset. The method increases the effective sample size for estimation and reduces the impact of the curse of dimensionality. Consistency and asymptotic normality of the maximum mean empirical likelihood estimators are established. The finite-sample performance of the proposed estimators is presented through simulation, and an application to the Boston Housing dataset is shown.
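The pairwise-mean idea can be sketched in its simplest form: from n observations, form all n(n-1)/2 averages (y_i + y_j)/2, which enlarges the estimation sample while preserving the population mean. This is a hypothetical minimal illustration of the construction (the function name is assumed; the paper combines such a dataset with bias-corrected estimating equations, which the sketch omits).

```python
import numpy as np
from itertools import combinations

def pairwise_mean_dataset(y):
    """All pairwise means (y_i + y_j) / 2 for i < j: n points become
    n(n-1)/2 points with the same sample mean."""
    return np.array([(a + b) / 2.0 for a, b in combinations(y, 2)])

y = np.array([1.0, 3.0, 5.0, 7.0])
z = pairwise_mean_dataset(y)
print(len(z))                   # 6 pairs from 4 observations
print(z.mean() == y.mean())     # pairwise means preserve the sample mean
```

The enlarged dataset is also less dispersed than the original (each pairwise mean has half the variance), which is one intuition for why it can stabilize small-sample inference.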


Econometrics ◽  
2021 ◽  
Vol 9 (1) ◽  
pp. 10
Author(s):  
Šárka Hudecová ◽  
Marie Hušková ◽  
Simos G. Meintanis

This article considers goodness-of-fit tests for bivariate INAR and bivariate Poisson autoregression models. The test statistics are based on an L2-type distance between two estimators of the probability generating function of the observations: one entirely nonparametric and the other semiparametric, computed under the corresponding null hypothesis. The asymptotic distribution of the proposed test statistics under both the null hypotheses and alternatives is derived, and consistency is proved. The case of testing bivariate generalized Poisson autoregression and extensions of the methods to dimensions higher than two are also discussed. The finite-sample performance of a parametric bootstrap version of the tests is illustrated via a series of Monte Carlo experiments. The article concludes with applications to real data sets and discussion.
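The L2-distance construction can be shown in a stripped-down univariate form: compare the empirical probability generating function of count data with the PGF implied by a fitted null model, integrating the squared difference over u in [0, 1]. The sketch below uses a plain Poisson null (the article's tests use bivariate PGFs and semiparametric INAR/Poisson-autoregression nulls; the function name and the simple Riemann-sum integration are assumptions for illustration).

```python
import numpy as np

def l2_pgf_distance(x, lam, grid_size=200):
    """n times the integrated squared distance between the empirical PGF
    g_hat(u) = mean(u^x_i) and the Poisson(lam) PGF exp(lam * (u - 1)),
    approximated by a Riemann sum on a grid over [0, 1]."""
    u = np.linspace(0.0, 1.0, grid_size)
    g_emp = np.mean(u[:, None] ** x[None, :], axis=1)
    g_mod = np.exp(lam * (u - 1.0))
    return len(x) * np.sum((g_emp - g_mod) ** 2) * (u[1] - u[0])

rng = np.random.default_rng(1)
x = rng.poisson(lam=2.0, size=2000)
t_null = l2_pgf_distance(x, x.mean())   # correctly specified rate: small
t_alt = l2_pgf_distance(x, 5.0)         # badly misspecified rate: large
print(t_null < t_alt)
```

In the article the null distribution of such a statistic is nonstandard, which is why a parametric bootstrap version is used in practice.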


Author(s):  
Eduardo de Freitas Costa ◽  
Silvana Schneider ◽  
Giulia Bagatini Carlotto ◽  
Tainá Cabalheiro ◽  
Mauro Ribeiro de Oliveira Júnior

Abstract: The dynamics of the wild boar population has become a pressing issue not only for ecological purposes, but also for agricultural and livestock production. Data on wild boar dispersal distance can have a complex structure, including an excess of zeros and right-censored observations, making them challenging to model. In this sense, we propose two different zero-inflated right-censored regression models, assuming Weibull and gamma distributions. First, we present the construction of the likelihood function; then, we apply both models to simulated datasets, demonstrating that both regression models behave well. The simulation results point to the consistency and asymptotic unbiasedness of the developed methods. Afterwards, we fitted both models to a simulated dataset of wild boar dispersal, including an excess of zeros, right-censored observations, and two covariates: age and sex. We showed that the models were useful for drawing inferences about wild boar dispersal, correctly describing data mimicking a situation in which males disperse more than females and age has a positive effect on dispersal. These results help overcome some limitations regarding inference in zero-inflated right-censored datasets, especially concerning wild boar populations. Users will be provided with an R function to run the proposed models.
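The likelihood construction described here has three kinds of contributions: a zero contributes the inflation probability p, an uncensored positive distance contributes (1-p) times the density, and a censored one contributes (1-p) times the survival function. A minimal log-likelihood sketch for the Weibull case follows (the parameterization, function name, and toy data are assumptions for illustration, not the authors' R implementation, which also includes covariates through regression links).

```python
import numpy as np

def zi_weibull_loglik(p, shape, scale, t, delta):
    """Zero-inflated right-censored Weibull log-likelihood sketch:
    t == 0 contributes log p; uncensored t > 0 (delta == 1) contributes
    log(1 - p) + log f(t); censored t > 0 (delta == 0) contributes
    log(1 - p) + log S(t)."""
    zero = t == 0
    pos, d = t[~zero], delta[~zero]
    log_f = (np.log(shape / scale)
             + (shape - 1) * np.log(pos / scale)
             - (pos / scale) ** shape)          # Weibull log-density
    log_s = -(pos / scale) ** shape             # Weibull log-survival
    return (np.sum(zero) * np.log(p)
            + np.sum(np.log(1 - p) + d * log_f + (1 - d) * log_s))

# Toy data: two structural zeros, two observed dispersals, one censored.
t = np.array([0.0, 0.0, 1.2, 0.8, 2.0])
delta = np.array([0, 0, 1, 1, 0])
ll = zi_weibull_loglik(p=0.4, shape=1.5, scale=1.0, t=t, delta=delta)
print(np.isfinite(ll))
```

Maximizing this function over (p, shape, scale), e.g. with a numerical optimizer, gives the estimates whose consistency the simulations assess.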


Complexity ◽  
2017 ◽  
Vol 2017 ◽  
pp. 1-14
Author(s):  
Ahmed A. Mahmoud ◽  
Sarat C. Dass ◽  
Mohana S. Muthuvalu ◽  
Vijanth S. Asirvadam

This article presents statistical inference methodology based on maximum likelihood for delay differential equation models in the univariate setting. Maximum likelihood inference is obtained for single and multiple unknown delay parameters, as well as for other parameters of interest that govern the trajectories of the delay differential equation models. The maximum likelihood estimator is obtained using adaptive grid and Newton-Raphson algorithms. Our methodology correctly estimates the delay parameters as well as the other unknown parameters (such as the initial starting values) of the dynamical system from simulated data. We also develop methodology to compute the information matrix and confidence intervals for all unknown parameters within the likelihood inferential framework. We present three illustrative examples related to biological systems. The computations have been carried out with the help of MATLAB® 8.0 R2014b.
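The grid-search component of such a procedure can be illustrated with a discrete-time stand-in: for each candidate delay on a grid, profile out the remaining parameters by maximum likelihood and keep the delay with the best profiled likelihood. The sketch below does this for a toy autoregression with an unknown integer lag (a loose analogue of the article's adaptive grid over continuous delays in a DDE; all names and the model are assumptions for the example).

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate a discrete-time delay model: y_t = a * y_{t - tau} + e_t.
a_true, tau_true, n = 0.8, 3, 400
y = np.zeros(n)
y[:tau_true] = rng.normal(size=tau_true)
for t in range(tau_true, n):
    y[t] = a_true * y[t - tau_true] + rng.normal(scale=0.5)

def neg_loglik(tau, y):
    """Gaussian negative log-likelihood (up to constants) with the
    coefficient a profiled out by least squares for a given delay tau."""
    x, z = y[:-tau], y[tau:]
    a_hat = (x @ z) / (x @ x)          # conditional MLE of a given tau
    s2 = np.mean((z - a_hat * x) ** 2)
    return 0.5 * len(z) * np.log(s2)

taus = range(1, 11)
tau_hat = min(taus, key=lambda tau: neg_loglik(tau, y))
print(tau_hat)
```

For a continuous delay, the same profiled objective would be minimized over a refined grid, with Newton-Raphson steps polishing the remaining parameters, which mirrors the strategy the abstract describes.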


2018 ◽  
Vol 22 ◽  
pp. 19-34 ◽  
Author(s):  
Nigel J. Newton

We develop a family of infinite-dimensional (non-parametric) manifolds of probability measures. The latter are defined on underlying Banach spaces, and have densities of class C_b^k with respect to appropriate reference measures. The case k = ∞, in which the manifolds are modelled on Fréchet spaces, is included. The manifolds admit the Fisher-Rao metric and, unusually for the non-parametric setting, Amari’s α-covariant derivatives for all α ∈ ℝ. By construction, they are C^∞-embedded submanifolds of particular manifolds of finite measures. The statistical manifolds are dually (α = ±1) flat, and admit mixture and exponential representations as charts. Their curvatures with respect to the α-covariant derivatives are derived. The likelihood function associated with a finite sample is a continuous function on each of the manifolds, and the α-divergences are of class C^∞.


1987 ◽  
Vol 3 (3) ◽  
pp. 387-408 ◽  
Author(s):  
J.C. Nankervis ◽  
N.E. Savin

The distributions of the test statistics are investigated in the context of an AR(1) model where the root is unity or near unity and where the exogenous process is a stable process, a random walk or a time trend. The finite sample distributions are estimated by Monte Carlo methods assuming normal disturbances. The sensitivity of the distributions to both the values of the parameters of the AR(1) model and the process generating the exogenous time series is examined. The Monte Carlo results motivate several theorems which describe the exact sampling behavior of the test statistics. The analytical and empirical results present a mixed picture with respect to the accuracy of the relevant asymptotic approximations.


2016 ◽  
Vol 5 (4) ◽  
pp. 9 ◽  
Author(s):  
Hérica P. A. Carneiro ◽  
Dione M. Valença

In some survival studies, part of the population may no longer be subject to the event of interest. The so-called cure rate models take this fact into account. They have been extensively studied by several authors, who have proposed extensions and applications to real lifetime data. Classic large-sample tests are usually considered in these applications, especially the likelihood ratio test. Recently a new test, called the gradient test, has been proposed. The gradient statistic shares the same asymptotic properties as the classic likelihood ratio statistic and does not require knowledge of the information matrix, which can be an advantage in survival models. Some simulation studies have been carried out to explore the behavior of the gradient test in finite samples and compare it with the classic tests in different models. However, little is known about the properties of these large-sample tests in finite samples for cure rate models. In this work we performed a simulation study based on the promotion time model with Weibull distribution to assess the performance of the likelihood ratio and gradient tests in finite samples. An application is presented to illustrate the results.
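The gradient statistic mentioned here is S = U(θ₀)ᵀ(θ̂ - θ₀): the score at the null value times the displacement of the unrestricted MLE. Unlike the Wald and score statistics, it needs no information matrix. A minimal sketch for a one-parameter exponential model follows (a standard textbook case chosen for transparency, not the promotion time model of the paper; function names are assumptions).

```python
import numpy as np

def exp_loglik(lam, x):
    """Exponential(rate lam) log-likelihood."""
    return len(x) * np.log(lam) - lam * x.sum()

def gradient_stat(lam0, x):
    """Gradient statistic S = U(lam0) * (lam_hat - lam0): score at the
    null times MLE displacement; no information matrix required."""
    lam_hat = 1.0 / x.mean()
    score0 = len(x) / lam0 - x.sum()
    return score0 * (lam_hat - lam0)

def lr_stat(lam0, x):
    """Classic likelihood ratio statistic 2 * (l(lam_hat) - l(lam0))."""
    lam_hat = 1.0 / x.mean()
    return 2.0 * (exp_loglik(lam_hat, x) - exp_loglik(lam0, x))

rng = np.random.default_rng(3)
x = rng.exponential(scale=1.0, size=500)   # true rate = 1, so H0 holds

s, w = gradient_stat(1.0, x), lr_stat(1.0, x)
# Under H0 both are asymptotically chi-square(1) and close to each other.
print(abs(s - w) < 0.5)
```

The finite-sample question the paper studies is precisely how fast this first-order agreement kicks in when the model is a Weibull promotion time cure model rather than a textbook case.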


2002 ◽  
Vol 34 (4) ◽  
pp. 733-754 ◽  
Author(s):  
Antonio Páez ◽  
Takashi Uchida ◽  
Kazuaki Miyamoto

Geographically weighted regression (GWR) has been proposed as a technique to explore spatial parametric nonstationarity. The method has been developed mainly along the lines of local regression and smoothing techniques, a strategy that has led to a number of difficult questions about the regularity conditions of the likelihood function, the effective number of degrees of freedom, and in general the relevance of extending the method to derive inference and model specification tests. In this paper we argue that placing GWR within a different statistical context, as a spatial model of error variance heterogeneity, or what might be termed locational heterogeneity, solves these difficulties. A maximum-likelihood-based framework for estimation and inference of a general geographically weighted regression model is presented that leads to a method to estimate location-specific kernel bandwidths. Moreover, a test for locational heterogeneity is derived and its use exemplified with a case study.
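The basic GWR building block is a weighted least-squares fit at each location, with kernel weights that decay in distance. The sketch below shows that single step with a Gaussian kernel and a fixed bandwidth (the article's contribution, ML estimation of location-specific bandwidths and a test for locational heterogeneity, is not reproduced; the function name and toy data are assumptions).

```python
import numpy as np

def gwr_fit_at(u, coords, X, y, bandwidth):
    """Local coefficients at location u: weighted least squares with
    Gaussian kernel weights w_i = exp(-d_i^2 / (2 h^2))."""
    d2 = np.sum((coords - u) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

rng = np.random.default_rng(4)
n = 300
coords = rng.uniform(0, 10, size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
# Spatially varying slope: it increases with the first coordinate.
beta1 = 0.5 + 0.2 * coords[:, 0]
y = X[:, 0] * 1.0 + X[:, 1] * beta1 + rng.normal(scale=0.1, size=n)

b_west = gwr_fit_at(np.array([1.0, 5.0]), coords, X, y, bandwidth=1.5)
b_east = gwr_fit_at(np.array([9.0, 5.0]), coords, X, y, bandwidth=1.5)
print(b_west[1] < b_east[1])   # local slope grows from west to east
```

Reading this fit as a model of error variance heterogeneity, as the paper proposes, is what turns the bandwidth into a likelihood parameter that can itself be estimated and tested.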

