Inference for copula modeling of discrete data: a cautionary tale and some facts

2017
Vol 5 (1)
pp. 121-132
Author(s):  
Olivier P. Faugeras

In this note, we elucidate some of the mathematical, statistical and epistemological issues involved in using copulas to model discrete data. We contrast the possible use of (nonparametric) copula methods versus the problematic use of parametric copula models. For the latter, we stress, among other issues, the possibility of obtaining impossible models, arising from model misspecification or unidentifiability of the copula parameter.
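
The identifiability issue can be illustrated with a minimal sketch, which is not taken from the paper. It assumes two Bernoulli margins and two hypothetical candidate families (Farlie-Gumbel-Morgenstern and Ali-Mikhail-Haq), chosen only because their parameters can be solved in closed form. With Bernoulli margins, the joint law depends on the copula only through the single value C(a, b), where a = P(X = 0) and b = P(Y = 0), so two different copulas can reproduce exactly the same discrete model.

```python
# Illustrative sketch: two distinct copulas that induce the same joint law
# for a pair of Bernoulli variables. Families and numbers are assumptions.

def fgm(u, v, theta):
    """Farlie-Gumbel-Morgenstern copula, valid for -1 <= theta <= 1."""
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def amh(u, v, theta):
    """Ali-Mikhail-Haq copula, valid for -1 <= theta < 1."""
    return u * v / (1.0 - theta * (1.0 - u) * (1.0 - v))


def bernoulli_joint(copula, theta, a, b):
    """Joint pmf of (X, Y) with P(X=0)=a, P(Y=0)=b under the given copula."""
    p00 = copula(a, b, theta)
    return {
        (0, 0): p00,
        (0, 1): a - p00,
        (1, 0): b - p00,
        (1, 1): 1.0 - a - b + p00,
    }


a, b = 0.5, 0.5        # hypothetical margins
p00_target = 0.28      # hypothetical value of P(X=0, Y=0)

# Solve C(a, b) = p00_target for the parameter of each family.
theta_fgm = (p00_target / (a * b) - 1.0) / ((1.0 - a) * (1.0 - b))
theta_amh = (1.0 - a * b / p00_target) / ((1.0 - a) * (1.0 - b))

joint_fgm = bernoulli_joint(fgm, theta_fgm, a, b)
joint_amh = bernoulli_joint(amh, theta_amh, a, b)

print("theta (FGM):", theta_fgm, " theta (AMH):", theta_amh)
print("joint pmf under FGM:", joint_fgm)
print("joint pmf under AMH:", joint_amh)
# The copulas agree at (a, b) but differ elsewhere on the unit square.
print("C(0.2, 0.9):", fgm(0.2, 0.9, theta_fgm), amh(0.2, 0.9, theta_amh))
```

Both parameter values fall inside their admissible ranges, and the two joint probability tables coincide, so data from this discrete model cannot distinguish the two copulas.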

2007
Author(s):  
Αριστείδης Νικολουλόπουλος

Studying associations among multivariate outcomes is an interesting problem in statistical science. The dependence between random variables is completely described by their multivariate distribution. When the multivariate distribution has a simple form, standard methods can be used to make inference. On the other hand, one may create multivariate distributions based on particular assumptions, thus limiting their use. Unfortunately, these limitations occur very often when working with multivariate discrete distributions. Some multivariate discrete distributions used in practice have only certain properties; for example, they allow only for positive dependence, or they can have marginal distributions only of a given form. Copulas seem to be a promising solution to this problem. Copulas are a currently fashionable way to model multivariate data, as they account for the dependence structure and provide a flexible representation of the multivariate distribution. Furthermore, for copulas the dependence properties can be separated from the marginal properties, and multivariate models with marginal densities of arbitrary form can be constructed, allowing a wide range of possible association structures. In fact, they allow for flexible dependence modelling, different from assuming simple linear correlation structures. However, when copulas are applied to discrete data, the marginal parameters also affect the dependence structure, and hence the dependence properties are not fully separated from the marginal properties. Introducing covariates to describe the dependence by modelling the copula parameters is of special interest in this thesis. Thus, covariate information can describe the dependence either indirectly through the marginal parameters or directly through the parameters of the copula. We examine the case where covariates are used in the marginal parameters, the copula parameters, or both, aiming to create a highly flexible model that produces very elegant dependence structures.
Furthermore, the literature contains many theoretical results and families of copulas with various properties, but few papers compare the copula families and discuss model selection among candidate copula models. This leaves open the question of which copulas are appropriate and whether we are able, from real data, to select the true copula that generated the data from among a series of candidates with, perhaps, very similar dependence properties. We examined a large set of candidate copula families, taking into account properties like concordance and tail dependence. The comparison is made theoretically using Kullback-Leibler distances between them. We selected this distance because it has a nice relationship with the log-likelihood and can thus provide interesting insight into the likelihood-based procedures used in practice. Furthermore, a goodness-of-fit test based on the Mahalanobis distance, computed through the parametric bootstrap, is provided. Moreover, we adopt a model averaging approach to copula modelling, based on the non-parametric bootstrap. Our intention is not to underestimate variability but to add the additional variability induced by model selection, making the precision of the estimate unconditional on the selected model. Moreover, our estimates are synthesized from several different candidate copula models, and thus they can have a flexible dependence structure. Taking into consideration the extensive literature on copulas for multivariate continuous data, we concentrated on fitting copulas to multivariate discrete data. The applications of multivariate copula models for discrete data are limited. Usually we have to trade off between models with limited dependence (e.g. only positive association) and models with flexible dependence but computational intractabilities. For example, the elliptical copulas provide a wide range of flexible dependence, but do not have closed-form cumulative distribution functions.
Thus one needs to evaluate the multivariate copula and, hence, a multivariate integral a large number of times. This can be time consuming, and, because of the numerical approach used to evaluate a multivariate integral, it may also produce roundoff errors. On the other hand, multivariate Archimedean copulas, partially-symmetric m-variate copulas with m − 1 dependence parameters, and copulas that are mixtures of max-infinitely divisible bivariate copulas have closed-form cumulative distribution functions, so computations are easy, but they allow only positive dependence among the random variables. A bridge between the two above-mentioned problems might be the definition of a copula family that has a simple form for its distribution function while allowing for negative dependence among the variables. We define such a multivariate copula family by exploiting finite mixtures of simple uncorrelated normal distributions. Since the correlation vanishes within each component, the cumulative distribution of a component is simply the product of univariate normal cumulative distribution functions. The mixing operation introduces dependence. Hence we obtain a kind of flexible dependence that also allows for negative dependence.
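
The closed-form idea in the last sentences can be sketched as follows. This is an illustration under stated assumptions, not the thesis's actual construction: the weights, means and standard deviations are hypothetical, and only the joint distribution function is shown (the associated copula would additionally require inverting the mixture margins numerically). Each component has independent normal coordinates, so its joint CDF is a product of univariate normal CDFs, and the mixture CDF is a weighted sum of such products.

```python
import math


def phi(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mixture_cdf(x, weights, means, sds):
    """Joint CDF of a finite mixture whose components have independent
    (hence uncorrelated) normal coordinates: a weighted sum of products."""
    total = 0.0
    for w, mu, sd in zip(weights, means, sds):
        prod = 1.0
        for xj, mj, sj in zip(x, mu, sd):
            prod *= phi((xj - mj) / sj)
        total += w * prod
    return total


def mixture_cov(weights, means, i, j):
    """Covariance between distinct coordinates i != j. The within-component
    covariance is zero, so only the spread of component means contributes."""
    mean_i = sum(w * mu[i] for w, mu in zip(weights, means))
    mean_j = sum(w * mu[j] for w, mu in zip(weights, means))
    return sum(w * (mu[i] - mean_i) * (mu[j] - mean_j)
               for w, mu in zip(weights, means))


# Hypothetical two-component example with opposite mean vectors.
weights = [0.5, 0.5]
means = [(-1.0, 1.0), (1.0, -1.0)]
sds = [(1.0, 1.0), (1.0, 1.0)]

cov_12 = mixture_cov(weights, means, 0, 1)
joint_at_origin = mixture_cdf((0.0, 0.0), weights, means, sds)

print("covariance:", cov_12)
print("H(0, 0):", joint_at_origin, "vs product of margins: 0.25")
```

In this example the covariance is negative and the joint CDF at the origin lies below the product of the marginal CDFs, which is the kind of negative dependence the mixing operation makes possible while every evaluation stays in closed form.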


2020
Vol 49 (4)
pp. 9-18
Author(s):  
Alessandro Barbiero

The need for building and generating statistically dependent random variables arises in various fields of study where simulation has proven to be a useful tool. In this work, we present an approach for constructing ordinal variables with arbitrarily assigned marginal distributions and value of association or correlation, expressed in terms of either Goodman and Kruskal's gamma or Pearson's linear correlation. The approach first constructs a class of bivariate copula-based distributions matching the assigned margins, and then, within this class, identifies the distribution matching the assigned association or correlation, by calibrating the copula parameter. A numerical example and a possible application are illustrated.
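
The two-step procedure described above can be sketched as follows. The abstract does not name a copula family or a root-finding method, so the Frank copula, the bisection search, the margins, the category scores and the target correlation are all assumptions made for illustration.

```python
import math


def frank_copula(u, v, theta):
    """Frank copula C(u, v); theta = 0 corresponds to independence."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    if u >= 1.0:
        return v
    if v >= 1.0:
        return u
    if abs(theta) < 1e-9:
        return u * v
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def cumulative(probs):
    """Cumulative margins with a leading 0 and a final value pinned to 1."""
    out, total = [0.0], 0.0
    for p in probs:
        total += p
        out.append(total)
    out[-1] = 1.0
    return out


def joint_pmf(px, py, theta):
    """Step 1: joint pmf with margins px, py from copula rectangle masses."""
    fx, fy = cumulative(px), cumulative(py)
    return [[frank_copula(fx[i + 1], fy[j + 1], theta)
             - frank_copula(fx[i], fy[j + 1], theta)
             - frank_copula(fx[i + 1], fy[j], theta)
             + frank_copula(fx[i], fy[j], theta)
             for j in range(len(py))] for i in range(len(px))]


def pearson(pmf, xs, ys):
    """Pearson correlation of the scores xs, ys under the joint pmf."""
    cells = [(pmf[i][j], xs[i], ys[j])
             for i in range(len(xs)) for j in range(len(ys))]
    ex = sum(p * x for p, x, _ in cells)
    ey = sum(p * y for p, _, y in cells)
    cov = sum(p * (x - ex) * (y - ey) for p, x, y in cells)
    vx = sum(p * (x - ex) ** 2 for p, x, _ in cells)
    vy = sum(p * (y - ey) ** 2 for p, _, y in cells)
    return cov / math.sqrt(vx * vy)


def calibrate(px, py, xs, ys, target, lo=-20.0, hi=20.0, tol=1e-10):
    """Step 2: bisection on theta, using that the correlation increases
    with theta for the Frank family. The search range is an assumption."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if pearson(joint_pmf(px, py, mid), xs, ys) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical ordinal margins, integer scores and target correlation.
px, py = [0.2, 0.3, 0.5], [0.4, 0.6]
xs, ys = [1, 2, 3], [1, 2]
target_rho = 0.3

theta_hat = calibrate(px, py, xs, ys, target_rho)
pmf_hat = joint_pmf(px, py, theta_hat)
print("calibrated theta:", theta_hat)
print("achieved correlation:", pearson(pmf_hat, xs, ys))
```

The target must lie inside the range attainable for the given margins; for discrete margins that range is generally narrower than [-1, 1], so a target outside it would drive the search to an endpoint of the interval instead of a solution.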


2018
Vol 45 (1)
pp. 61-70
Author(s):
Farzana Atique
Nii Attoh-Okine

Water main systems are aging and becoming a growing concern for maintenance. The structural deterioration of water mains is affected by different factors, such as pipe age, pipe material, soil condition, and pipe size, among others. Various methods of modeling have been used to predict the failure of water mains. Since pipe networks are underground and obtaining data on pipe conditions is very costly, statistical modeling has been widely used for pipe condition assessment. An emerging statistical method known as copula modeling is used here for pipe data analysis. The copula method is very useful in cases where marginals belong to different families of distributions. It is also useful for generating a large number of data points when it is difficult to obtain a data set, as is the case for pipe condition assessment, and where data sets have random variables belonging to non-Gaussian family distributions. Different copula families are applied here to model the dependency between the pipe age and repair age of pipes. The paper uses a Bayesian framework to estimate the parameter values in the copula model. This approach offers an additional option for estimating copula parameters for pipe data.
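
The point about generating data with marginals from different families can be illustrated with a small sketch. It is not the paper's method: the Clayton copula, the exponential and Weibull margins, and all parameter values are hypothetical choices, and the Bayesian estimation step mentioned in the abstract is not reproduced here.

```python
import math
import random


def sample_clayton(n, theta, rng):
    """Draw n pairs (u, v) from a Clayton copula (theta > 0) by
    conditional inversion of dC/du."""
    pairs = []
    for _ in range(n):
        u = rng.uniform(1e-12, 1.0 - 1e-12)
        w = rng.uniform(1e-12, 1.0 - 1e-12)
        v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
        pairs.append((u, min(v, 1.0 - 1e-12)))  # keep v strictly below 1
    return pairs


def exponential_quantile(u, rate):
    """Inverse CDF of an exponential distribution."""
    return -math.log1p(-u) / rate


def weibull_quantile(v, shape, scale):
    """Inverse CDF of a Weibull distribution."""
    return scale * (-math.log1p(-v)) ** (1.0 / shape)


def kendall_tau(data):
    """Sample Kendall's tau by direct pair counting (O(n^2))."""
    concordant = discordant = 0
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            s = (data[i][0] - data[j][0]) * (data[i][1] - data[j][1])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (concordant + discordant)


rng = random.Random(0)   # fixed seed for reproducibility
theta = 2.0              # hypothetical Clayton parameter
uv = sample_clayton(1000, theta, rng)

# Map the uniform pairs to margins from two different families.
data = [(exponential_quantile(u, rate=0.05),
         weibull_quantile(v, shape=1.5, scale=30.0)) for u, v in uv]

tau_hat = kendall_tau(data)
print("sample Kendall's tau:", tau_hat)
print("theoretical tau for Clayton:", theta / (theta + 2.0))
```

Because Kendall's tau is invariant under increasing transformations, the dependence set by the copula carries over unchanged to the transformed variables, whatever continuous margins are chosen.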


2007
Vol 44 (02)
pp. 393-408
Author(s):  
Allan Sly

Multifractional Brownian motion is a Gaussian process which has changing scaling properties generated by varying the local Hölder exponent. We show that multifractional Brownian motion is very sensitive to changes in the selected Hölder exponent and has extreme changes in magnitude. We suggest an alternative stochastic process, called integrated fractional white noise, which retains the important local properties but avoids the undesirable oscillations in magnitude. We also show how the Hölder exponent can be estimated locally from discrete data in this model.


2003
Author(s):
Gerard J. Solan
Jean M. Casey
