An Inference Methodology for Selecting and Clustering Genes Based on Likelihood Ratio Test

Author(s): Ruiyin Liu, Jian Tao, Dehui Wang

Peddada et al. (Gene selection and clustering for time-course and dose-response microarray experiments using order-restricted inference, Bioinformatics 19 (2003) 834–841) proposed a new method for selecting and clustering genes according to their time-course or dose-response profiles. Their method requires the assumption of a constant variance across time points or dosages. This homoscedasticity assumption is, however, seldom satisfied in practice. In this paper, by applying Shi's algorithms (N. Z. Shi, Maximum likelihood estimation of means and variances from normal populations under simultaneous order restrictions, J. Multivariate Anal. 50 (1994) 282–293) together with a modified bootstrap procedure, we propose a generalized order-restricted inference method that relaxes the homoscedasticity restriction. Simulation results show that the procedures considered in this paper and those of Peddada et al. are generally comparable in terms of Type I error rate, while our proposed algorithms are usually more powerful.
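As a concrete illustration of this kind of procedure, the sketch below tests a flat mean profile against a non-decreasing one when each time point has its own variance, calibrated by a bootstrap under the null. It is a minimal stand-in, not Shi's algorithm or the authors' exact bootstrap: `groups` is a hypothetical list of per-time-point replicate arrays, the monotone alternative is fit by weighted pool-adjacent-violators, and the variances are plain sample variances.

```python
import numpy as np

def weighted_pava(y, w):
    # Pool-adjacent-violators: non-decreasing weighted least-squares fit.
    vals, wts, cnts = [], [], []
    for yi, wi in zip(y, w):
        vals.append(yi); wts.append(wi); cnts.append(1)
        while len(vals) > 1 and vals[-2] > vals[-1]:
            wt = wts[-2] + wts[-1]
            vals[-2] = (wts[-2] * vals[-2] + wts[-1] * vals[-1]) / wt
            wts[-2], cnts[-2] = wt, cnts[-2] + cnts[-1]
            vals.pop(); wts.pop(); cnts.pop()
    return np.repeat(vals, cnts)

def order_restricted_stat(groups):
    # Weighted reduction in residual sum of squares: flat vs monotone fit.
    ybar = np.array([g.mean() for g in groups])
    w = np.array([len(g) / g.var(ddof=1) for g in groups])  # n_t / s_t^2
    mu0 = np.average(ybar, weights=w)      # fit under H0: flat profile
    mu1 = weighted_pava(ybar, w)           # fit under H1: non-decreasing profile
    return np.sum(w * ((ybar - mu0) ** 2 - (ybar - mu1) ** 2))

def bootstrap_pvalue(groups, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    t_obs = order_restricted_stat(groups)
    centered = [g - g.mean() for g in groups]  # impose the null per time point
    t_null = [order_restricted_stat([rng.choice(c, size=len(c), replace=True)
                                     for c in centered])
              for _ in range(n_boot)]
    return float(np.mean(np.array(t_null) >= t_obs))
```

A non-decreasing profile is used here as the example alternative; the same machinery applies to other candidate orderings by changing the restricted fit.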

2003
Vol 19 (7)
pp. 834–841
Author(s): S. D. Peddada, E. K. Lobenhofer, L. Li, C. A. Afshari, C. R. Weinberg, ...

2021
Author(s): Dapeng Hu, Chong Wang, Annette O'Connor

Abstract
Background: Network meta-analysis (NMA) is a statistical method used to combine results from several clinical trials and simultaneously compare multiple treatments using direct and indirect evidence. Statistical heterogeneity describes the variability in the intervention effects evaluated across the different studies in a network meta-analysis. One approach to dealing with statistical heterogeneity is to perform a random effects network meta-analysis that incorporates a between-study variance into the statistical model. A common assumption in the random effects model for network meta-analysis is that the between-study variance is homogeneous across all interventions. However, there are applications of NMA where this single between-study variance assumption is potentially incorrect and the model should instead incorporate more than one between-study variance.
Methods: In this paper, we develop an approach to testing the homogeneity of the between-study variance assumption based on a likelihood ratio test. A simulation study was conducted to assess the type I error and power of the proposed test. The method is then applied to a network meta-analysis of antibiotic treatments for bovine respiratory disease (BRD).
Results: The type I error rate was well controlled in the Monte Carlo simulation. The homogeneous between-study variance assumption is unrealistic, both statistically and practically, in the BRD network meta-analysis. The point estimates and confidence intervals of relative effect sizes are strongly influenced by this assumption.
Conclusions: Since the homogeneous between-study variance assumption is a strong one, it is crucial to test its validity before conducting a network meta-analysis. Here we propose and validate a method for testing this single between-study variance assumption, which is widely used in many NMAs.
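The test described above can be sketched in a simplified grouped random-effects setting (not the full NMA likelihood): compare the maximized log-likelihood under one common between-study variance against group-specific variances, referring twice the difference to a chi-square with G-1 degrees of freedom as a first approximation. The inputs `y`, `v`, and `grp` are hypothetical numpy arrays of effect estimates, within-study variances, and comparison-group labels.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

def ll_given_tau2(y, v, grp, tau2):
    # Gaussian random-effects log-likelihood with group-specific means
    # profiled out, for a single shared value of tau^2.
    ll = 0.0
    for g in np.unique(grp):
        m = grp == g
        w = 1.0 / (v[m] + tau2)
        mu = np.sum(w * y[m]) / np.sum(w)
        ll += -0.5 * np.sum(np.log(2 * np.pi * (v[m] + tau2)) + w * (y[m] - mu) ** 2)
    return ll

def lrt_homogeneous_tau2(y, v, grp):
    hi = 10.0 * np.var(y) + 1e-8   # crude upper bound for the tau^2 search
    # H0: one common between-study variance across all comparison groups.
    ll0 = -minimize_scalar(lambda t: -ll_given_tau2(y, v, grp, t),
                           bounds=(0.0, hi), method="bounded").fun
    # H1: a separate between-study variance for each comparison group.
    ll1 = 0.0
    for g in np.unique(grp):
        m = grp == g
        ll1 += -minimize_scalar(
            lambda t, m=m: -ll_given_tau2(y[m], v[m], grp[m], t),
            bounds=(0.0, hi), method="bounded").fun
    stat = 2.0 * (ll1 - ll0)
    df = len(np.unique(grp)) - 1
    return stat, chi2.sf(stat, df)   # first-order chi-square reference
```

Boundary effects on the variance parameters can make the plain chi-square reference conservative; a simulation-based calibration, as in the paper's Monte Carlo study, is the safer check.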


Author(s): Georg Krammer

The Andersen LRT uses sample characteristics as split criteria to evaluate Rasch model fit, or for theory-driven hypothesis testing about a test. The power and Type I error of a random split criterion were evaluated with a simulation study. Results consistently show that a random split criterion lacks power.
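For readers unfamiliar with the mechanics, below is a compact, illustrative sketch (not a validated Rasch package) of Andersen's LRT with a random split: item difficulties are estimated by conditional ML in the full sample and in two randomly split subsamples, and twice the log-likelihood gain is referred to a chi-square distribution. A complete 0/1 response matrix `X` is assumed, with the first item anchored at zero for identification.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def esf(eps):
    # Elementary symmetric functions gamma_0..gamma_I of eps, by recursion.
    g = np.zeros(len(eps) + 1)
    g[0] = 1.0
    for e in eps:
        g[1:] += e * g[:-1]
    return g

def neg_cloglik(beta_free, X):
    beta = np.concatenate(([0.0], beta_free))    # first item anchored at 0
    g = esf(np.exp(-beta))
    r = X.sum(axis=1).astype(int)                # raw scores
    keep = (r > 0) & (r < X.shape[1])            # extreme scores carry no information
    ll = -(X[keep] @ beta).sum() - np.log(g[r[keep]]).sum()
    return -ll

def fit_cml(X):
    res = minimize(neg_cloglik, np.zeros(X.shape[1] - 1), args=(X,), method="BFGS")
    return -res.fun                              # maximized conditional log-likelihood

def andersen_lrt_random_split(X, seed=0):
    rng = np.random.default_rng(seed)
    grp = rng.integers(0, 2, size=X.shape[0])    # the random split criterion
    stat = 2.0 * (fit_cml(X[grp == 0]) + fit_cml(X[grp == 1]) - fit_cml(X))
    df = X.shape[1] - 1                          # (I - 1) free items x (2 - 1) splits
    return stat, chi2.sf(stat, df)
```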


2016
Vol 12 (2)
Author(s): Mahdis Azadbakhsh, Xin Gao, Hanna Jankowski

Abstract: We study the problem of multiple hypothesis testing for correlated clustered data. As existing multiple comparison procedures based on maximum likelihood estimation can be computationally intensive, we propose to construct multiple comparison procedures based on the composite likelihood method. The new test statistics account for the correlation structure within the clusters and are computationally convenient. Simulation studies show that the composite likelihood based procedures maintain good control of the familywise Type I error rate in the presence of intra-cluster correlation, whereas ignoring the correlation leads to erratic performance.
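One simple instance of the idea, under stated assumptions: estimate group means under an independence working (composite) likelihood, attach a cluster-robust Godambe/sandwich covariance, and run all pairwise z tests with a familywise correction. The Bonferroni step below is a placeholder for the paper's simultaneous procedures, and `y`, `group`, `cluster` are hypothetical flat numpy arrays of responses, group labels, and cluster ids.

```python
import numpy as np
from itertools import combinations
from scipy.stats import norm

def composite_means_and_cov(y, group, cluster):
    groups = np.unique(group)
    mu = np.array([y[group == g].mean() for g in groups])
    n = np.array([(group == g).sum() for g in groups], dtype=float)
    # Godambe/sandwich covariance: outer products of per-cluster scores
    # of the independence (unit-variance working) composite likelihood.
    B = np.zeros((len(groups), len(groups)))
    for c in np.unique(cluster):
        m = cluster == c
        s = np.array([(y[m & (group == g)] - mu[i]).sum()
                      for i, g in enumerate(groups)])
        B += np.outer(s, s)
    Ainv = np.diag(1.0 / n)               # inverse sensitivity matrix
    return groups, mu, Ainv @ B @ Ainv

def all_pairwise_tests(y, group, cluster):
    groups, mu, cov = composite_means_and_cov(y, group, cluster)
    pairs = list(combinations(range(len(groups)), 2))
    results = []
    for i, j in pairs:
        se = np.sqrt(cov[i, i] + cov[j, j] - 2.0 * cov[i, j])
        z = (mu[i] - mu[j]) / se
        p_adj = min(1.0, 2.0 * norm.sf(abs(z)) * len(pairs))  # Bonferroni
        results.append((groups[i], groups[j], z, p_adj))
    return results
```

The sandwich step is what keeps the familywise error in check when clusters are correlated; dropping it recovers the naive procedure whose erratic behavior the abstract describes.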


2019
Author(s): Melissa Angelina Rodgers, James E Pustejovsky

Selective reporting of results based on their statistical significance threatens the validity of meta-analytic findings. A variety of techniques for detecting selective reporting, publication bias, or small-study effects are available and are routinely used in research syntheses. Most such techniques are univariate, in that they assume that each study contributes a single, independent effect size estimate to the meta-analysis. In practice, however, studies often contribute multiple, statistically dependent effect size estimates, such as multiple measures of a common outcome construct. Many methods are available for meta-analyzing dependent effect sizes, but methods for investigating selective reporting while also handling effect size dependencies require further investigation. Using Monte Carlo simulations, we evaluate three available univariate tests for small-study effects or selective reporting, namely the Trim & Fill test, Egger's regression test, and a likelihood ratio test from a three-parameter selection model (3PSM), when dependence is ignored or handled using ad hoc techniques. We also examine two variants of Egger's regression test that incorporate robust variance estimation (RVE) or multilevel meta-analysis (MLMA) to handle dependence. Simulation results demonstrate that ignoring dependence inflates Type I error rates for all univariate tests. Variants of Egger's regression maintain Type I error rates when one effect size per study is sampled or when dependence is handled using RVE or MLMA. The 3PSM likelihood ratio test does not fully control Type I error rates. With the exception of the 3PSM, all methods have limited power to detect selection bias except under strong selection for statistically significant effects.
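The RVE variant of Egger's regression can be sketched by hand: a weighted regression of effect estimates on their standard errors (whose slope plays the role of Egger's asymmetry intercept), with a cluster-robust variance taken over studies. This is an illustrative version using a simple CR1-style scaling and a normal reference, not the small-sample CR2/t corrections usually paired with RVE; `y`, `v`, `study` are hypothetical arrays of effect estimates, sampling variances, and study ids, with several effects allowed per study.

```python
import numpy as np
from scipy.stats import norm

def egger_rve(y, v, study):
    se = np.sqrt(v)
    X = np.column_stack([np.ones_like(se), se])   # slope on se = asymmetry term
    w = 1.0 / v
    XtWX = X.T @ (w[:, None] * X)
    beta = np.linalg.solve(XtWX, X.T @ (w * y))   # weighted least squares
    resid = y - X @ beta
    # Cluster-robust "meat": per-study sums of weighted score contributions.
    meat = np.zeros((2, 2))
    for s in np.unique(study):
        m = study == s
        u = X[m].T @ (w[m] * resid[m])
        meat += np.outer(u, u)
    n_clus = len(np.unique(study))
    bread = np.linalg.inv(XtWX)
    vcov = bread @ meat @ bread * n_clus / (n_clus - 1.0)   # CR1-style scaling
    z = beta[1] / np.sqrt(vcov[1, 1])
    return z, 2.0 * norm.sf(abs(z))
```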


2016
Vol 42 (1)
pp. 46–68
Author(s): Sandip Sinharay

An increasing concern of producers of educational assessments is fraudulent behavior during the assessment (van der Linden, 2009). Benefiting from item preknowledge (e.g., Eckerly, 2017; McLeod, Lewis, & Thissen, 2003) is one type of fraudulent behavior. This article suggests two new test statistics for detecting individuals who may have benefited from item preknowledge; the statistics can be used for both nonadaptive and adaptive assessments that may include dichotomous items, polytomous items, or both. Each new statistic has an asymptotic standard normal null distribution. Detailed simulation studies demonstrate that the Type I error rates of the new statistics are close to the nominal level and that the new statistics are more powerful than an existing statistic for addressing the same problem.
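The article's statistics are not reproduced here, but their generic shape can be sketched: compare an examinee's observed score on a set of suspected-compromised items with the IRT model's expectation at the estimated ability, standardized to an approximate z score. The 2PL parameters `a`, `b`, the ability estimate `theta_hat`, and the `compromised` index set are all assumed to come from upstream calibration and a security investigation; the paper's statistics additionally account for estimation error in the ability.

```python
import numpy as np
from scipy.stats import norm

def preknowledge_z(responses, theta_hat, a, b, compromised):
    # 2PL success probabilities at the examinee's estimated ability.
    p = 1.0 / (1.0 + np.exp(-a * (theta_hat - b)))
    c = np.asarray(compromised)              # index array of flagged items
    obs = responses[c].sum()                 # observed score on flagged items
    exp_score = p[c].sum()                   # model-expected score
    var = (p[c] * (1.0 - p[c])).sum()        # binomial-type variance
    z = (obs - exp_score) / np.sqrt(var)
    return z, norm.sf(z)                     # one-sided: preknowledge inflates scores
```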


2017
Vol 41 (5)
pp. 353–371
Author(s): Sun-Joo Cho, Paul De Boeck, Woo-Yeol Lee

The linear logistic test model (LLTM) has been widely applied to investigate the effects of item covariates on item difficulty. The LLTM has been extended with random item residuals to account for item differences not explained by the item covariates; this extended model is called the LLTM-R. In this article, statistical inference methods are investigated for these two models, and Type I error rates and power are compared via Monte Carlo studies. Based on the simulation results, the likelihood ratio test (LRT) is recommended over the paired-sample t test based on sum scores, the Wald z test, and information criteria, and the LRT is recommended over the profile likelihood confidence interval because of its simplicity. In addition, it is concluded that the LLTM-R is the better general modeling approach. Inferences based on the LLTM when the LLTM-R is the true model are largely biased in a liberal direction, while inferences based on the LLTM-R when the LLTM is the true model are biased in only a minor and conservative way. Furthermore, in the absence of residual variance, the Type I error rate and power were acceptable, except for power when the number of items is small (10 items) and the number of persons is also small (200 persons). In the presence of residual variance, however, the number of items needs to be large (80 items) to avoid an inflated Type I error rate and to reach a power level of .90 for a moderate effect.
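The recommended LRT compares two nested models whose fitting is left to mixed-model software; given the two maximized log-likelihoods, the test itself is a few lines. Because the item residual variance sits on the boundary of the parameter space under the LLTM, a 50:50 mixture of chi-square(0) and chi-square(1) is used below as the reference distribution, a common convention assumed here rather than a detail taken from the article.

```python
from scipy.stats import chi2

def lltm_vs_lltmr_lrt(ll_lltm, ll_lltmr):
    stat = 2.0 * (ll_lltmr - ll_lltm)        # LLTM is nested within the LLTM-R
    if stat <= 0.0:
        return stat, 1.0                     # variance estimate on the boundary
    return stat, 0.5 * chi2.sf(stat, 1)      # 50:50 chi2(0):chi2(1) mixture
```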


2011
Vol 72 (3)
pp. 469–492
Author(s): Eun Sook Kim, Myeongsun Yoon, Taehun Lee

Multiple-indicators multiple-causes (MIMIC) modeling is often used to test a latent group mean difference while assuming the equivalence of factor loadings and intercepts over groups. However, this study demonstrated that MIMIC was insensitive to the presence of factor loading noninvariance, which implies that factor loading invariance should be tested through other measurement invariance testing techniques. MIMIC modeling is also used for measurement invariance testing by allowing a direct path from a grouping covariate to each observed variable. This simulation study with both continuous and categorical variables investigated the performance of MIMIC in detecting noninvariant variables under various study conditions and showed that the likelihood ratio test of MIMIC with Oort adjustment not only controlled Type I error rates below the nominal level but also maintained high power across study conditions.


2004
Vol 3 (1)
pp. 1–32
Author(s): Derek Gordon, Yaning Yang, Chad Haynes, Stephen J Finch, Nancy R Mendell, ...

Phenotype and/or genotype misclassification can significantly increase type II error probabilities for genetic case/control association tests, causing a decrease in statistical power, and can produce inaccurate estimates of population frequency parameters. We present a method, the likelihood ratio test allowing for errors (LRTae), that incorporates double-sample information for phenotypes and/or genotypes on a sub-sample of cases/controls. Population frequency parameters and misclassification probabilities are determined using a double-sample procedure as implemented in the Expectation-Maximization (EM) method. We perform null simulations assuming a SNP marker or a 4-allele (multi-allele) marker locus. To compare our method with the standard method that makes no adjustment for errors (LRTstd), we perform power simulations using a 2^k factorial design with high and low settings of: case/control sample sizes, phenotype/genotype costs, double-sampled phenotype/genotype costs, phenotype/genotype error rates, and proportions of double-sampled individuals. All power simulations fix equal costs for the LRTstd and LRTae methods. We also consider case/control ApoE genotype data from an actual Alzheimer's study.

The LRTae method maintains correct type I error proportions for all null simulations and all significance level thresholds (10%, 5%, 1%). LRTae average estimates of population frequencies and misclassification probabilities equal the true values, with variances on the order of 10^-7 to 10^-8. For the power simulations, the median power difference LRTae-LRTstd at the 5% significance level is 0.06 for multi-allele data and 0.01 for SNP data. For the ApoE data example, the LRTae and LRTstd p-values are 5.8 × 10^-5 and 1.6 × 10^-3, respectively. The increase in significance is due to the LRTae's adjustment for misclassification of the most commonly reported risk allele. We have developed freely available software that computes our LRTae statistic.
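The double-sample EM at the heart of the frequency estimation can be sketched in a stripped-down form: one sample and one categorical marker, rather than the full case/control likelihood feeding the LRTae. Here `n_double[g, o]` is a hypothetical count matrix of double-sampled individuals with true category g and error-prone call o, and `m_single[o]` counts individuals for whom only the error-prone call is available.

```python
import numpy as np

def em_double_sample(n_double, m_single, n_iter=200):
    # Assumes every true category appears in the double sample, so the
    # initial row normalizations below are well defined.
    p = n_double.sum(axis=1) / n_double.sum()                 # category frequencies
    theta = n_double / n_double.sum(axis=1, keepdims=True)    # P(observed | true)
    for _ in range(n_iter):
        # E-step: split each observed-only count across true categories.
        post = p[:, None] * theta                             # joint P(G = g, O = o)
        post = post / post.sum(axis=0, keepdims=True)         # P(G = g | O = o)
        alloc = post * m_single[None, :]                      # expected counts
        # M-step: complete-data maximum likelihood updates.
        full = n_double + alloc
        p = full.sum(axis=1) / full.sum()
        theta = full / full.sum(axis=1, keepdims=True)
    return p, theta
```

Running this EM separately for cases and controls, and once for the pooled sample, yields the constrained and unconstrained likelihoods whose ratio gives an LRTae-style test; that case/control layer is omitted here for brevity.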

