PEPA test: fast and powerful differential analysis from relative quantitative proteomics data using shared peptides

Summary: We propose a new hypothesis test for the differential abundance of proteins in mass-spectrometry-based relative quantification. An important feature of this type of high-throughput analysis is that it involves an enzymatic digestion of the sample proteins into peptides prior to identification and quantification. Because of numerous sequence homologies, different proteins can yield peptides with identical amino acid chains, so that their parent protein is ambiguous. These so-called shared peptides make protein-level statistical analysis a challenge and are often not accounted for. In this article, we use a linear model describing peptide–protein relationships to build a likelihood ratio test of differential abundance for proteins. We show that the likelihood ratio statistic can be computed in time linear in the number of peptides. We also provide the asymptotic null distribution of a regularized version of our statistic. Experiments on both real and simulated datasets show that our procedures outperform state-of-the-art methods. The procedures are available via the pepa.test function of the DAPAR Bioconductor R package.

10.1101/158212 ◽

2017 ◽

Author(s):

Laurent Jacob ◽

Florence Combes ◽

Thomas Burger

Keyword(s):

Likelihood Ratio ◽

Linear Time ◽

Hypothesis Test ◽

Likelihood Ratio Statistic ◽

Null Distribution ◽

Enzymatic Digestion ◽

Ratio Test ◽

Differential Analysis ◽

Proteomics Data ◽

Differential Abundance

Universal inference

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1922664117 ◽

2020 ◽

Vol 117 (29) ◽

pp. 16880-16890 ◽

Cited By ~ 2

Author(s):

Larry Wasserman ◽

Aaditya Ramdas ◽

Sivaraman Balakrishnan

Keyword(s):

Maximum Likelihood ◽

Likelihood Ratio ◽

Likelihood Ratio Statistic ◽

Model Misspecification ◽

Null Distribution ◽

Regularity Conditions ◽

Ratio Test ◽

Finite Sample ◽

Confidence Sets ◽

Nonparametric Models

We propose a general method for constructing confidence sets and hypothesis tests that have finite-sample guarantees without regularity conditions. We refer to such procedures as “universal.” The method is very simple and is based on a modified version of the usual likelihood-ratio statistic that we call “the split likelihood-ratio test” (split LRT) statistic. The (limiting) null distribution of the classical likelihood-ratio statistic is often intractable when used to test composite null hypotheses in irregular statistical models. Our method is especially appealing for statistical inference in these complex setups. The method we suggest works for any parametric model and also for some nonparametric models, as long as computing a maximum-likelihood estimator (MLE) is feasible under the null. Canonical examples arise in mixture modeling and shape-constrained inference, for which constructing tests and confidence sets has been notoriously difficult. We also develop various extensions of our basic methods. We show that in settings when computing the MLE is hard, for the purpose of constructing valid tests and intervals, it is sufficient to upper bound the maximum likelihood. We investigate some conditions under which our methods yield valid inferences under model misspecification. Further, the split LRT can be used with profile likelihoods to deal with nuisance parameters, and it can also be run sequentially to yield anytime-valid P values and confidence sequences. Finally, when combined with the method of sieves, it can be used to perform model selection with nested model classes.
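
The split LRT described above can be illustrated with a minimal sketch, assuming i.i.d. observations from a unit-variance Gaussian and a point null on the mean. The function names (`gaussian_loglik`, `split_lrt_reject`) and the Gaussian setting are illustrative assumptions, not taken from the paper. The data are split into two halves; an estimate fitted on one half is plugged into the likelihood of the other half, and the null is rejected when the resulting ratio exceeds 1/α.

```python
import math


def gaussian_loglik(data, mean):
    """Log-likelihood of data under N(mean, 1), dropping a constant
    that does not depend on the mean (it cancels in the ratio)."""
    return -0.5 * sum((x - mean) ** 2 for x in data)


def split_lrt_reject(data, null_mean=0.0, alpha=0.05):
    """Split likelihood-ratio test of H0: mean == null_mean.

    Assumes i.i.d. N(mean, 1) observations (an illustrative choice),
    so a fixed split by position stands in for a random split.
    """
    if len(data) < 2:
        raise ValueError("need at least two observations to split")
    half = len(data) // 2
    d0, d1 = data[:half], data[half:]
    # Any estimator computed on D1 is allowed; here, the sample mean.
    theta1 = sum(d1) / len(d1)
    # Under a point null, the null MLE on D0 is the null value itself.
    theta0 = null_mean
    # Log of the split statistic: likelihood on D0 at theta1 over theta0.
    log_ratio = gaussian_loglik(d0, theta1) - gaussian_loglik(d0, theta0)
    # Reject when the ratio exceeds 1/alpha.
    return log_ratio > math.log(1.0 / alpha)
```

For a composite null, `theta0` would instead be the maximizer of the D0 likelihood over the null set; the rejection threshold 1/α stays the same.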

On sequential versions of the generalized likelihood ratio test

Mathematical Proceedings of the Cambridge Philosophical Society ◽

10.1017/s0305004100000657 ◽

1979 ◽

Vol 86 (1) ◽

pp. 85-90 ◽

Cited By ~ 9

Author(s):

Andrew D. Barbour

Keyword(s):

Likelihood Ratio ◽

Null Hypothesis ◽

Exponential Family ◽

Likelihood Ratio Statistic ◽

Generalized Likelihood Ratio Test ◽

Ratio Test ◽

Uhlenbeck Process ◽

Large Sample ◽

Sequential Tests ◽

Composite Hypotheses

It is shown that the Wilks large sample likelihood ratio statistic λn, for testing between composite hypotheses Θ0 ⊂ Θ1 on the basis of a sample of size n, behaves, as n varies, like a diffusion process related to an equilibrium Ornstein-Uhlenbeck process, whenever the null hypothesis is true. This fact is used to construct large sample sequential tests based on λn, which are the same whatever the underlying distributions. In particular, the underlying distributions need not belong to an exponential family.

Super-Resolved Multiple Scatterers Detection in SAR Tomography Based on Compressive Sensing Generalized Likelihood Ratio Test (CS-GLRT)

Remote Sensing ◽

10.3390/rs11161930 ◽

2019 ◽

Vol 11 (16) ◽

pp. 1930 ◽

Cited By ~ 4

Author(s):

Hui Luo ◽

Zhenhong Li ◽

Zhen Dong ◽

Anxi Yu ◽

Yongsheng Zhang ◽

...

Keyword(s):

Compressive Sensing ◽

Likelihood Ratio ◽

Likelihood Ratio Test ◽

Urban Areas ◽

Hypothesis Test ◽

Least Square ◽

Estimation Accuracy ◽

Generalized Likelihood Ratio Test ◽

Ratio Test ◽

Generalized Likelihood Ratio

The application of SAR tomography (TomoSAR) to urban infrastructure and other man-made buildings has gained increasing popularity with the development of modern high-resolution spaceborne satellites. Urban tomography focuses on the separation of the overlaid targets within one azimuth-range resolution cell, and on the reconstruction of their reflectivity profiles. In this work, we build on the existing methods of compressive sensing (CS) and the generalized likelihood ratio test (GLRT), and develop a multiple-scatterer detection method named CS-GLRT to automatically recognize the number of scatterers superimposed within a single pixel as well as to reconstruct the backscattered reflectivity profiles of the detected scatterers. The proposed CS-GLRT adopts a two-step strategy. In the first step, an L1-norm minimization is carried out to give a robust, super-resolved estimate of the candidate positions pixel by pixel. In the second step, a multiple hypothesis test is implemented in the GLRT to achieve model order selection, where the mapping matrix is constrained to the previously selected columns, namely the candidate positions, and the parameters are estimated by the least-squares (LS) method. Numerical experiments on simulated data show that the method can separate closely located scatterers with a quasi-constant false alarm rate (QCFAR) and obtain an estimation accuracy approaching the Cramér–Rao lower bound (CRLB). Experiments on real Spotlight TerraSAR-X data show that CS-GLRT allows detecting single scatterers with high density, distinguishing a considerable number of double scatterers, and even detecting triple scatterers. The estimated results agree well with the ground truth and help interpret the true structure of the complex or buildings studied in the SAR images. It should be noted that this method is especially suitable for urban areas with very dense infrastructure and man-made buildings, and for datasets with a tightly controlled baseline distribution.

On the Exact Null Distribution of the Generalised Likelihood Ratio Test for Analysing Unreplicated Factorial Designs

Biometrical Journal ◽

10.1002/bimj.200310149 ◽

2005 ◽

Vol 47 (5) ◽

pp. 755-762

Author(s):

Ying Chen

Keyword(s):

Likelihood Ratio ◽

Likelihood Ratio Test ◽

Null Distribution ◽

Ratio Test ◽

Factorial Designs

Asymptotic expansion of the null distribution of the likelihood ratio statistic for testing the equality of variances in a nonnormal one-way ANOVA model

Hiroshima Mathematical Journal ◽

10.32917/hmj/1150997870 ◽

2003 ◽

Vol 33 (1) ◽

pp. 113-126 ◽

Cited By ~ 6

Author(s):

Tetsuji Tonda ◽

Hirofumi Wakaki

Keyword(s):

Asymptotic Expansion ◽

Likelihood Ratio ◽

Likelihood Ratio Statistic ◽

Null Distribution ◽

Anova Model ◽

One Way Anova

On the Asymptotic Non-Null Distribution of Randomly Stopped Log-Likelihood Ratio Statistic

Calcutta Statistical Association Bulletin ◽

10.1177/0008068319920310 ◽

1992 ◽

Vol 42 (3-4) ◽

pp. 255-260 ◽

Cited By ~ 3

Author(s):

A.K. Basu ◽

Debasis Bhattacharya

Keyword(s):

Likelihood Ratio ◽

Alternative Hypothesis ◽

Likelihood Ratio Statistic ◽

Null Distribution ◽

Regularity Conditions ◽

Normal Distributions ◽

Log Likelihood ◽

Log Likelihood Ratio ◽

Mixture Of Normal Distributions ◽

Loglikelihood Ratio

The non-null asymptotic distribution of the randomly stopped log-likelihood ratio statistic for a general dependent process is obtained. We observe that, under certain regularity conditions, the limiting distribution of the randomly stopped log-likelihood ratio statistic under the alternative hypothesis is again a mixture of normal distributions. Some possible applications of the result are also pointed out in this note.

Statistical inference for Markov processes when the model is incorrect

Advances in Applied Probability ◽

10.1017/s0001867800033012 ◽

1979 ◽

Vol 11 (04) ◽

pp. 737-749

Author(s):

Robert V. Foutz ◽

R. C. Srivastava

Keyword(s):

Maximum Likelihood ◽

Statistical Inference ◽

Likelihood Ratio ◽

Markov Processes ◽

Transition Probability ◽

Likelihood Ratio Statistic ◽

Null Distribution ◽

Transition Density ◽

Likelihood Method ◽

Incorrect Model

Statistical inference for Markov processes is commonly based on the maximum likelihood method of estimation and the likelihood ratio criterion for testing hypotheses. Construction of estimators and test statistics by these methods requires that a model be chosen in the form of a family of transition density functions. In this paper, asymptotic properties of the maximum likelihood estimator and of the likelihood ratio statistic λn are examined when the model chosen for their construction is incorrect, that is, when no density in the model is a density for the transition probability distribution of the Markov process. It is shown that if the maximum likelihood estimator and λn are constructed from a ‘regular’ incorrect model, then the estimator is consistent and asymptotically normally distributed, and the asymptotic null distribution of −2 log λn is that of a linear combination of independent chi-squared random variables. These results are applied to propose measures of the performance of the test based on λn when the statistic is constructed from an incorrect model.
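
A null distribution of this form, a linear combination of independent chi-squared variables, generally has no simple closed-form tail, but its p-values can be approximated by simulation. The sketch below is a generic Monte Carlo illustration and not the paper's procedure: the function name `weighted_chi2_pvalue` is hypothetical, each chi-squared term is assumed to have one degree of freedom, and the weights are placeholders that would in practice have to be derived from the true and assumed models.

```python
import random


def weighted_chi2_pvalue(observed, weights, draws=100_000, seed=0):
    """Monte Carlo estimate of P(sum_i w_i * Z_i**2 >= observed),
    where the Z_i are independent standard normal variables.

    The weights are hypothetical inputs; each term is a chi-squared
    variable with one degree of freedom scaled by its weight.
    """
    rng = random.Random(seed)  # local generator for reproducibility
    exceed = 0
    for _ in range(draws):
        sample = sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if sample >= observed:
            exceed += 1
    return exceed / draws
```

With a single weight equal to 1 the simulated distribution reduces to an ordinary chi-squared with one degree of freedom, which gives a quick sanity check against standard tables.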