Large-sample confidence intervals of information-theoretic measures in linguistics

This article explores a method for constructing confidence bounds for information-theoretic measures used in linguistics, such as entropy, Kullback-Leibler divergence (KLD), and mutual information. We show that a useful measure of uncertainty can be derived from two simple statistical principles: the asymptotic distribution of the maximum likelihood estimator (MLE) and the delta method. Three case studies from phonology and corpus linguistics demonstrate how to apply the method and examine its robustness to violations of its assumptions that are common in linguistic data, such as insufficient sample size and non-independence of data points.
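
As a rough illustration of the delta-method idea described in this abstract, the Python sketch below computes a plug-in (MLE) entropy estimate from category counts together with a large-sample confidence interval. It uses the standard first-order multinomial result Var(H) ≈ (Σ p_i (log p_i)² − H²) / N. The function name `entropy_ci`, the example counts, and the use of natural logarithms are assumptions made for illustration. This is not the authors' code.

```python
import math


def entropy_ci(counts, z=1.96):
    """Plug-in entropy (in nats) with a delta-method confidence interval.

    counts: observed frequencies of each category (hypothetical input).
    z: normal quantile; 1.96 gives an approximate 95% interval.
    """
    n = sum(counts)
    # MLE of the category probabilities; empty categories contribute nothing.
    probs = [c / n for c in counts if c > 0]
    h = -sum(p * math.log(p) for p in probs)
    # First-order delta-method variance of the plug-in entropy.
    second_moment = sum(p * math.log(p) ** 2 for p in probs)
    variance = max(second_moment - h ** 2, 0.0) / n  # guard against rounding
    se = math.sqrt(variance)
    return h, h - z * se, h + z * se


# Example with made-up counts for three categories.
estimate, lower, upper = entropy_ci([70, 20, 10])
print(f"H = {estimate:.3f} nats, approx. 95% CI = [{lower:.3f}, {upper:.3f}]")
```

One caveat of this approximation: when the estimated distribution is exactly uniform the first-order variance is zero, so the interval collapses to a point and a higher-order treatment would be needed.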

Bounds for the asymptotic normality of the maximum likelihood estimator using the Delta method

Latin American Journal of Probability and Mathematical Statistics ◽

10.30757/alea.v14-09 ◽

2017 ◽

Vol 14 (1) ◽

pp. 153

Author(s):

Andreas Anastasiou ◽

Christophe Ley

Keyword(s):

Maximum Likelihood ◽

Asymptotic Normality ◽

Maximum Likelihood Estimator ◽

Delta Method ◽

Likelihood Estimator

On the maximum likelihood estimator for a discrete multivariate crash frequencies model

Afrika Statistika ◽

10.16929/as/2020.2325.161 ◽

2020 ◽

Vol 15 (2) ◽

pp. 2335-2348

Author(s):

Issa Cherif Geraldo

Keyword(s):

Statistical Analysis ◽

Maximum Likelihood ◽

Maximum Likelihood Estimator ◽

Road Safety ◽

Strong Consistency ◽

Delta Method ◽

Closed Form Expression ◽

Parameter Vector ◽

Form Expression ◽

Likelihood Estimator

In this paper, we study the maximum likelihood estimator (MLE) of the parameter vector of a discrete multivariate crash frequencies model used in the statistical analysis of the effectiveness of a road safety measure. We derive the closed-form expression of the MLE, prove its strong consistency, and obtain the exact variance of every component of the MLE except one, whose variance is approximated via the delta method.

A comparison between robust information theoretic estimator and asymptotic maximum likelihood estimator for misspecified model

Signal Processing, Sensor/Information Fusion, and Target Recognition XXVII ◽

10.1117/12.2304550 ◽

2018 ◽

Cited By ~ 1

Author(s):

Xin Zhou ◽

Steven Kay

Keyword(s):

Maximum Likelihood ◽

Maximum Likelihood Estimator ◽

Likelihood Estimator ◽

Information Theoretic ◽

Misspecified Model

Guaranteed Bounds on Information-Theoretic Measures of Univariate Mixtures Using Piecewise Log-Sum-Exp Inequalities

10.20944/preprints201610.0086.v1 ◽

2016 ◽

Cited By ~ 1

Author(s):

Frank Nielsen ◽

Ke Sun

Keyword(s):

Closed Form ◽

Upper Bounds ◽

Cross Entropy ◽

Gaussian Mixtures ◽

Stochastic Integration ◽

Lower And Upper Bounds ◽

Information Theoretic ◽

Gamma Mixtures ◽

Leibler Divergence ◽

Information Theoretic Measures

Information-theoretic measures such as the entropy, the cross-entropy, and the Kullback-Leibler divergence between two mixture models are core primitives in many signal processing tasks. Since the Kullback-Leibler divergence of mixtures provably does not admit a closed-form formula, it is in practice either estimated using costly Monte Carlo stochastic integration, approximated, or bounded using various techniques. We present a fast and generic method that algorithmically builds closed-form lower and upper bounds on the entropy, the cross-entropy, and the Kullback-Leibler divergence of mixtures. We illustrate the versatility of the method by reporting on our experiments for approximating the Kullback-Leibler divergence between univariate exponential mixtures, Gaussian mixtures, Rayleigh mixtures, and Gamma mixtures.
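
For context, the Monte Carlo baseline that this abstract contrasts with can be sketched in a few lines of Python. The sketch estimates KL(p || q) between two univariate Gaussian mixtures by sampling from p and averaging the log density ratio. It does not implement the paper's closed-form bounds. The helper names (`mixture_pdf`, `sample_mixture`, `mc_kl`) and the example parameters are hypothetical.

```python
import math
import random


def mixture_pdf(x, weights, means, sds):
    """Density of a univariate Gaussian mixture at x."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in zip(weights, means, sds)
    )


def sample_mixture(rng, weights, means, sds):
    """Draw one sample: pick a component by weight, then sample from it."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], sds[k])


def mc_kl(p, q, n_samples=20000, seed=0):
    """Monte Carlo estimate of KL(p || q); p and q are (weights, means, sds).

    Assumes q has non-negligible density wherever p puts mass; otherwise
    the density ratio can underflow.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = sample_mixture(rng, *p)
        total += math.log(mixture_pdf(x, *p) / mixture_pdf(x, *q))
    return total / n_samples


# Example with made-up two-component mixtures.
p = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
q = ([0.3, 0.7], [0.0, 2.0], [1.0, 1.5])
print(f"Estimated KL(p || q) = {mc_kl(p, q):.4f} nats")
```

The estimate is stochastic and its error shrinks only at rate 1/sqrt(n_samples), which is the cost that deterministic bounds aim to avoid.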

Multivariate normal approximation of the maximum likelihood estimator via the delta method

Brazilian Journal of Probability and Statistics ◽

10.1214/18-bjps411 ◽

2020 ◽

Vol 34 (1) ◽

pp. 136-149

Author(s):

Andreas Anastasiou ◽

Robert E. Gaunt

Keyword(s):

Maximum Likelihood ◽

Maximum Likelihood Estimator ◽

Normal Approximation ◽

Delta Method ◽

Multivariate Normal ◽

Likelihood Estimator ◽

Multivariate Normal Approximation

The Comparison Between the Bayes Estimator and the Maximum Likelihood Estimator of the Reliability Function for Negative Exponential Distribution

Ibn Al-Haitham Journal for Pure and Applied Science ◽

10.30526/2017.ihsciconf.1815 ◽

2018 ◽

pp. 439

Author(s):

Hazim Mansour Gorgees ◽

Bushra Abdualrasool Ali ◽

Raghad Ibrahim Kathum

Keyword(s):

Maximum Likelihood ◽

Maximum Likelihood Estimator ◽

Exponential Distribution ◽

Reliability Function ◽

Bayes Estimator ◽

Likelihood Estimator ◽

Monte Carlo Simulation Technique ◽

Negative Exponential Distribution ◽

Negative Exponential ◽

Better Than

In this paper, the maximum likelihood estimator and the Bayes estimator of the reliability function for the negative exponential distribution are derived, and a Monte Carlo simulation technique is then employed to compare the performance of these estimators. The integral mean square error (IMSE) is used as the criterion for this comparison. The simulation results show that the Bayes estimator performs better than the maximum likelihood estimator for different sample sizes.

Transforming variables to central normality

Machine Learning ◽

10.1007/s10994-021-05960-5 ◽

2021 ◽

Author(s):

Jakob Raymaekers ◽

Peter J. Rousseeuw

Keyword(s):

Maximum Likelihood ◽

Maximum Likelihood Estimator ◽

Simulation Study ◽

Real Data ◽

Data Sets ◽

Transformation Parameter ◽

Likelihood Estimator ◽

Extensive Simulation ◽

Highly Sensitive

Many real data sets contain numerical features (variables) whose distribution is far from normal (Gaussian). Instead, their distribution is often skewed. In order to handle such data it is customary to preprocess the variables to make them more normal. The Box–Cox and Yeo–Johnson transformations are well-known tools for this. However, the standard maximum likelihood estimator of their transformation parameter is highly sensitive to outliers, and will often try to move outliers inward at the expense of the normality of the central part of the data. We propose a modification of these transformations as well as an estimator of the transformation parameter that is robust to outliers, so the transformed data can be approximately normal in the center and a few outliers may deviate from it. It compares favorably to existing techniques in an extensive simulation study and on real data.

Export Potential of Climate Smart Goods in India: Evidence from the Poisson Pseudo Maximum Likelihood Estimator

The International Trade Journal ◽

10.1080/08853908.2021.1890652 ◽

2021 ◽

pp. 1-21

Author(s):

Pushp Kumar ◽

Naresh Chandra Sahu ◽

Mohd Arshad Ansari

Keyword(s):

Maximum Likelihood ◽

Maximum Likelihood Estimator ◽

Likelihood Estimator ◽

Pseudo Maximum Likelihood

Optimized permutation testing for information theoretic measures of multi-gene interactions

BMC Bioinformatics ◽

10.1186/s12859-021-04107-6 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

James M. Kunert-Graf ◽

Nikita A. Sakhanenko ◽

David J. Galas

Keyword(s):

Large Scale ◽

Permutation Test ◽

Association Studies ◽

Genome Wide Association Studies ◽

Permutation Testing ◽

Exact Test ◽

Information Theoretic ◽

Information Theoretic Measures ◽

Full Analysis ◽

Computational Bottleneck

Background: Permutation testing is often considered the “gold standard” for multi-test significance analysis, as it is an exact test requiring few assumptions about the distribution being computed. However, it can be computationally very expensive, particularly in its naive form in which the full analysis pipeline is re-run after permuting the phenotype labels. This can become intractable in multi-locus genome-wide association studies (GWAS), in which the number of potential interactions to be tested is combinatorially large. Results: In this paper, we develop an approach for permutation testing in multi-locus GWAS, specifically focusing on SNP–SNP-phenotype interactions using multivariable measures that can be computed from frequency count tables, such as those based in Information Theory. We find that the computational bottleneck in this process is the construction of the count tables themselves, and that this step can be eliminated at each iteration of the permutation testing by transforming the count tables directly. This leads to a speed-up by a factor of over 10^3 for a typical permutation test compared to the naive approach. Additionally, this approach is insensitive to the number of samples making it suitable for datasets with large number of samples. Conclusions: The proliferation of large-scale datasets with genotype data for hundreds of thousands of individuals enables new and more powerful approaches for the detection of multi-locus genotype-phenotype interactions. Our approach significantly improves the computational tractability of permutation testing for these studies. Moreover, our approach is insensitive to the large number of samples in these modern datasets. The code for performing these computations and replicating the figures in this paper is freely available at https://github.com/kunert/permute-counts.
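
To make the naive baseline mentioned in this abstract concrete, the Python sketch below runs a plain permutation test for the mutual information between two categorical variables, rebuilding the count tables after every shuffle of the labels. This is the costly step the paper reports eliminating. The sketch does not reproduce the authors' optimized count-table transformation (see their repository for that). The function names and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in mutual information (nats) from paired categorical labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # count table rebuilt on every call
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def permutation_p_value(xs, ys, n_perm=1000, seed=0):
    """Naive permutation test: shuffle ys, recount, compare to the observed MI."""
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(xs, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


# Toy example: a genotype-like variable that perfectly predicts the label.
genotype = [0, 1] * 50
phenotype = list(genotype)
mi, p_value = permutation_p_value(genotype, phenotype)
print(f"MI = {mi:.3f} nats, permutation p-value = {p_value:.4f}")
```

Each permutation here costs a full pass over the samples to rebuild the tables, so the run time grows with both the number of samples and the number of permutations.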

Insights into codeswitching from online communication: Effects of language preference and conditions arising from vocabulary richness

Bilingualism Language and Cognition ◽

10.1017/s1366728921000122 ◽

2021 ◽

pp. 1-7

Author(s):

Laurie Beth Feldman ◽

Vidhushini Srinivasan ◽

Rachel B. Fernandes ◽

Samira Shaikh

Keyword(s):

Online Communication ◽

Language Preference ◽

Lexical Diversity ◽

Information Theoretic ◽

Vocabulary Richness ◽

Twitter Data ◽

Language Mixing ◽

Communication Effects ◽

Information Theoretic Measures ◽

Spanish Bilinguals

Twitter data from a crisis that impacted many English–Spanish bilinguals show that the direction of codeswitches is associated with the statistically documented tendency of single speakers to prefer one language over another in their tweets, as gleaned from their tweeting history. Further, lexical diversity, a measure of vocabulary richness derived from information-theoretic measures of uncertainty in communication, is greater in proximity to a codeswitch than in productions remote from a switch. The prospects of a role for lexical diversity in characterizing the conditions for a language switch suggest that communicative precision may induce conditions that attenuate constraints against language mixing.
