Anchor Selection Using the Wald Test Anchor-All-Test-All Procedure

2016 ◽  
Vol 41 (1) ◽  
pp. 17-29 ◽  
Author(s):  
Mian Wang ◽  
Carol M. Woods

Methods for testing differential item functioning (DIF) require that the reference and focal groups are linked on a common scale using group-invariant anchor items. Several anchor-selection strategies have been introduced in an item response theory framework. However, popular strategies often rely on likelihood ratio testing with the all-others-as-anchors approach, which requires multiple model fittings. The current study explored alternative anchor-selection strategies based on a modified version of the Wald χ2 test implemented in flexMIRT and IRTPRO, and compared them with methods based on the popular likelihood ratio test. Anchor-identification accuracy for four strategies (two testing methods crossed with two selection criteria) is presented, along with the power and Type I error of the respective follow-up DIF tests. Implications for applied researchers and suggestions for future research are discussed.
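
As an illustration of the selection step only (not the authors' implementation): given per-item Wald χ2 DIF statistics of the kind exported by flexMIRT or IRTPRO, a rank-based criterion simply keeps the items least suspected of DIF as anchors. A minimal Python sketch with hypothetical statistic values:

```python
# Illustrative sketch (not the authors' implementation): rank-based anchor
# selection from per-item Wald chi-square DIF statistics. The values below
# are hypothetical placeholders for statistics exported from IRT software.

wald_chisq = {
    "item01": 0.42, "item02": 7.91, "item03": 1.10, "item04": 0.18,
    "item05": 12.35, "item06": 0.87, "item07": 3.44, "item08": 0.29,
}

def select_anchors(stats, k=4):
    """Return the k items with the smallest DIF statistics as anchors."""
    ranked = sorted(stats, key=stats.get)
    return ranked[:k]

anchors = select_anchors(wald_chisq)
print("Selected anchors:", anchors)  # items least suspected of DIF
```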

Methodology ◽  
2012 ◽  
Vol 8 (4) ◽  
pp. 134-145 ◽  
Author(s):  
Fabiola González-Betanzos ◽  
Francisco J. Abad

The current research compares the effects of several strategies for establishing the anchor subtest when testing for differential item functioning (DIF) using the IRT likelihood ratio test in one- and two-stage procedures. Two one-stage strategies were examined: (1) “One item” and (2) “All other items” used as anchor. Additionally, two two-stage strategies were tested: (3) “One anchor item with posterior anchor test augmentation” and (4) “All other items with purification.” The strategies were compared in a simulation study in which sample size, DIF size, type of DIF, and software implementation (MULTILOG vs. IRTLRDIF) were manipulated. Results indicated that Procedure (1) was more efficient than (2). Purification substantially improved Type I error rates with the “all other items” strategy, whereas “posterior anchor test augmentation” did not yield a significant improvement. Regarding software, MULTILOG generally offered better results than IRTLRDIF.
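
A hedged sketch of the purification idea behind strategy (4): each item is tested with all currently unflagged items as anchors, flagged items are dropped from the anchor set, and the loop repeats until the flagged set stabilizes. The dif_test callback is a placeholder for an IRT likelihood ratio fit (e.g., via MULTILOG or IRTLRDIF), which is not reproduced here; the demonstration p-values are fabricated.

```python
# Sketch of two-stage "all others as anchors with purification".
# dif_test(item, anchors) is a placeholder callback returning a p-value
# for one studied item given an anchor set.

def purify(items, dif_test, alpha=0.05, max_iter=10):
    flagged = set()
    for _ in range(max_iter):
        anchors = [i for i in items if i not in flagged]
        newly_flagged = {
            i for i in items
            if dif_test(i, [a for a in anchors if a != i]) < alpha
        }
        if newly_flagged == flagged:   # anchor set has stabilized
            return anchors, flagged
        flagged = newly_flagged
    return [i for i in items if i not in flagged], flagged

# Toy demonstration with fabricated p-values (item03 shows DIF):
fake_p = {"item01": 0.60, "item02": 0.45, "item03": 0.001, "item04": 0.70}
anchors, flagged = purify(list(fake_p), lambda i, a: fake_p[i])
print(anchors, flagged)  # ['item01', 'item02', 'item04'] {'item03'}
```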


Entropy ◽  
2018 ◽  
Vol 20 (12) ◽  
pp. 966 ◽  
Author(s):  
Michel Broniatowski ◽  
Jana Jurečková ◽  
Jan Kalina

We consider the likelihood ratio test of a simple null hypothesis (with density f_0) against a simple alternative hypothesis (with density g_0) in the situation that the observations X_i are mismeasured due to the presence of measurement errors. Thus, instead of X_i for i = 1, …, n, we observe Z_i = X_i + δV_i with unobservable parameter δ and unobservable random variable V_i. When we ignore the presence of measurement errors and perform the original test, the probability of type I error becomes different from the nominal value, but the test is still the most powerful among all tests at the modified level. Further, we derive the minimax test for some families of misspecified hypotheses and alternatives. The test exploits the concept of pseudo-capacities elaborated by Huber and Strassen (1973) and Buja (1986). A numerical experiment illustrates the principles and performance of the novel test.
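
A worked Monte Carlo sketch of this level distortion, under assumed Gaussian hypotheses not taken from the paper's numerical study: with f_0 = N(0, 1), g_0 = N(1, 1), and n i.i.d. observations, the Neyman-Pearson test rejects when the sum of the observations exceeds c = sqrt(n) * z_{1-α}; feeding it Z_i = X_i + δV_i instead of X_i inflates the variance of the test statistic and pushes the realized level above the nominal α.

```python
# Monte Carlo sketch (assumed Gaussian example): LRT of f0 = N(0,1) vs
# g0 = N(1,1). With n iid observations the Neyman-Pearson test rejects
# when sum(x) exceeds c = sqrt(n) * z_{1-alpha}. Measurement errors
# Z = X + delta*V inflate the variance of the test statistic, so the
# realized level drifts above the nominal alpha.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, alpha, delta, reps = 25, 0.05, 0.5, 100_000
c = np.sqrt(n) * norm.ppf(1 - alpha)        # nominal critical value

X = rng.normal(0.0, 1.0, size=(reps, n))    # H0 data
V = rng.normal(0.0, 1.0, size=(reps, n))    # unobservable error terms
Z = X + delta * V                           # what we actually observe

print("nominal level:", alpha)
print("realized level:", np.mean(Z.sum(axis=1) > c))
# Theory: P(N(0,1) > z_{1-alpha} / sqrt(1 + delta^2)) ~ 0.071 here
```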


2021 ◽  
Author(s):  
Dapeng Hu ◽  
Chong Wang ◽  
Annette O'Connor

Abstract Background: Network meta-analysis (NMA) is a statistical method used to combine results from several clinical trials and simultaneously compare multiple treatments using direct and indirect evidence. Statistical heterogeneity is a characteristic describing the variability in the intervention effects being evaluated in the different studies in a network meta-analysis. One approach to dealing with statistical heterogeneity is to perform a random effects network meta-analysis that incorporates a between-study variance into the statistical model. A common assumption in the random effects model for network meta-analysis is the homogeneity of between-study variance across all interventions. However, there are applications of NMA where the single between-study variance assumption is potentially incorrect and the model should instead incorporate more than one between-study variance. Methods: In this paper, we develop an approach to testing the homogeneity of the between-study variance assumption based on a likelihood ratio test. A simulation study was conducted to assess the type I error and power of the proposed test. This method is then applied to a network meta-analysis of antibiotic treatments for bovine respiratory disease (BRD). Results: The type I error rate was well controlled in the Monte Carlo simulation. The homogeneous between-study variance assumption is unrealistic, both statistically and practically, in the BRD network meta-analysis. The point estimates and confidence intervals of the relative effect sizes are strongly influenced by this assumption. Conclusions: Since the homogeneous between-study variance assumption is a strong one, it is crucial to test its validity before conducting a network meta-analysis. Here we propose and validate a method for testing this single between-study variance assumption, which is widely used in many NMAs.


2020 ◽  
Vol 29 (12) ◽  
pp. 3666-3683
Author(s):  
Dominic Edelmann ◽  
Maral Saadati ◽  
Hein Putter ◽  
Jelle Goeman

Standard tests for the Cox model, such as the likelihood ratio test or the Wald test, do not perform well in situations where the number of covariates is substantially higher than the number of observed events. This issue is exacerbated in competing risks settings, where the number of observed occurrences of each event type is usually rather small. Yet, appropriate testing methodology for competing risks survival analysis with few events per variable is missing. In this article, we show how to extend the global test for survival by Goeman et al. to competing risks and multistate models. Conducting detailed simulation studies, we show that both for type I error control and for power, the novel test outperforms the likelihood ratio test and the Wald test based on the cause-specific hazards model in settings where the number of events is small compared to the number of covariates. The benefit of the global tests for competing risks survival analysis and multistate models is further demonstrated in real data examples of cancer patients from the European Society for Blood and Marrow Transplantation.
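
For flavor only, and explicitly not the authors' implementation: global tests in the Goeman et al. style contrast a quadratic score statistic of the form Q = ||X^T r||^2, where X holds many covariates and r are residuals from a null model, against a reference distribution. The sketch below calibrates Q by permutation and uses simulated placeholder residuals standing in for martingale residuals of a cause-specific null model.

```python
# Illustrative sketch only: a global score statistic of the
# Goeman et al. flavor, Q = || X^T r ||^2, calibrated by permutation.
# r is a simulated placeholder for null-model (martingale) residuals;
# all data are fabricated.

import numpy as np

rng = np.random.default_rng(1)
n, p = 60, 200                       # few events, many covariates
X = rng.normal(size=(n, p))
r = rng.normal(size=n)               # placeholder null-model residuals
r = r - r.mean()

def global_stat(X, r):
    s = X.T @ r
    return float(s @ s)

Q = global_stat(X, r)
perm = np.array([global_stat(X, rng.permutation(r)) for _ in range(2000)])
print("permutation p-value:", (1 + np.sum(perm >= Q)) / (1 + len(perm)))
```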


2016 ◽  
Vol 38 (4) ◽  
Author(s):  
Rainer W. Alexandrowicz

One important tool for assessing whether a data set can be described equally well with a Rasch Model (RM) or a Linear Logistic Test Model (LLTM) is the Likelihood Ratio Test (LRT). In practical applications this test seems to reject the null hypothesis too often, even when the null hypothesis is true. Aside from obvious reasons, such as inadequate restrictiveness of the linear restrictions formulated in the LLTM or the RM not being true, doubts have arisen whether the test holds the nominal type-I error risk, that is, whether its theoretically derived sampling distribution applies. Therefore, the present contribution explores the sampling distribution of the likelihood ratio test comparing a Rasch model with a Linear Logistic Test Model. Particular attention is paid to the issue of similar columns in the weight matrix W of the LLTM: although full column rank of this matrix is a technical requirement, columns may differ in only a few entries, which in turn might have an impact on the sampling distribution of the test statistic. Therefore, a systematic way of generating weight matrices with similar columns was established and tested in a simulation study. The results were twofold: in general, the matrices considered in the study yielded LRT results in which the empirical alpha deviated from the nominal alpha only within random variation, so the nominal alpha appears to be maintained. Yet, one specific matrix clearly indicated a highly increased type-I error risk: with this weight matrix, the empirical alpha was at least twice the nominal alpha. This shows that we indeed have to consider the internal structure of the weight matrix when applying the LRT for testing the LLTM. Best practice would be to perform a simulation or bootstrap/resampling study for the weight matrix under consideration in order to rule out a misleadingly significant result due to reasons other than true model misfit.
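
A small sketch of the structural issue, using a hypothetical weight matrix: the last two columns of W below differ in a single entry, so W still has full column rank (the technical requirement is met), yet the columns are highly correlated, which is exactly the kind of internal structure the simulation flags as risky for the LRT.

```python
# Sketch (hypothetical matrix): an LLTM weight matrix W whose last two
# columns differ in only one entry. W has full column rank, yet the
# near-collinearity of the columns is the structure at issue.

import numpy as np

W = np.array([
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 1],
    [0, 1, 1],
    [1, 0, 0],
    [0, 1, 0],   # the single entry where columns 2 and 3 differ
], dtype=float)

print("full column rank:", np.linalg.matrix_rank(W) == W.shape[1])
corr = np.corrcoef(W[:, 1], W[:, 2])[0, 1]
print(f"correlation of the two similar columns: {corr:.2f}")  # ~0.71
```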


2018 ◽  
Vol 35 (15) ◽  
pp. 2545-2554 ◽  
Author(s):  
Joseph Mingrone ◽  
Edward Susko ◽  
Joseph P Bielawski

Abstract Motivation Likelihood ratio tests are commonly used to test for positive selection acting on proteins. They are usually applied with thresholds for declaring a protein under positive selection determined from a chi-square distribution or a mixture of chi-square distributions. Although it is known that such distributions are not strictly justified, owing to the statistical irregularity of the problem, the hope has been that the resulting tests are conservative and do not lose much power in comparison with the same test using the unknown, correct threshold. We show that commonly used thresholds need not yield conservative tests, but instead give larger than expected Type I error rates. Statistical regularity can be restored by using a modified likelihood ratio test. Results We give theoretical results to prove that, if the number of sites is not too small, the modified likelihood ratio test gives approximately correct Type I error probabilities regardless of the parameter settings of the underlying null hypothesis. Simulations show that the modification gives Type I error rates closer to those stated without a loss of power. The simulations also show that parameter estimation for mixture models of codon evolution can be challenging in certain data-generation settings, with very different mixing distributions giving nearly identical site pattern distributions unless the number of taxa and the tree length are large. Because mixture models are widely used for a variety of problems in molecular evolution, the challenges and general approaches to solving them presented here are applicable in a broader context. Availability and implementation https://github.com/jehops/codeml_modl Supplementary information Supplementary data are available at Bioinformatics online.
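
To make the threshold question concrete, here is a sketch of the standard computation the paper scrutinizes, with a hypothetical statistic value: when the null pins a parameter to the boundary of the parameter space, the LRT statistic is commonly referred to a 50:50 mixture of chi-square(0) and chi-square(1) rather than to chi-square(1).

```python
# Sketch of the commonly used threshold computation the paper scrutinizes:
# for a boundary null, the LRT statistic is often referred to a 50:50
# mixture of chi-square(0) and chi-square(1). The statistic value below
# is a hypothetical example.

from scipy.stats import chi2

def mixture_pvalue(lrt_stat):
    """P-value under the 0.5*chi2(0) + 0.5*chi2(1) mixture (stat > 0)."""
    return 0.5 * chi2.sf(lrt_stat, df=1)

stat = 3.2
print("naive chi2(1) p-value:", chi2.sf(stat, df=1))   # ~0.074
print("50:50 mixture p-value:", mixture_pvalue(stat))  # ~0.037
```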


2017 ◽  
Vol 41 (6) ◽  
pp. 403-421 ◽  
Author(s):  
Sandip Sinharay

Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for the detection of item preknowledge and showed its performance to be better, on average, than that of seven other statistics when the set of compromised items is known. Sinharay suggested a statistic based on the likelihood ratio test for the detection of item preknowledge; the advantage of this statistic is that its null distribution is known. Results from simulated and real data, and from adaptive and nonadaptive tests, are used to demonstrate that the Type I error rate and power of the statistic based on the likelihood ratio test are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising for detecting item preknowledge when the set of compromised items is known.
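
A simplified sketch of the likelihood-ratio idea for a known compromised set under a Rasch model (not necessarily the exact statistic in the paper): the null model fits one ability for all of an examinee's responses, the alternative fits separate abilities on the compromised and uncompromised subsets, and twice the log-likelihood gap is referred to chi-square(1). A large statistic suggests the examinee performed unexpectedly well on the compromised items.

```python
# Simplified sketch of the likelihood-ratio idea for item preknowledge
# under a Rasch model. Responses and difficulties below are fabricated.

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

def neg_loglik(theta, resp, b):
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    return -np.sum(resp * np.log(p) + (1 - resp) * np.log(1 - p))

def max_loglik(resp, b):
    res = minimize_scalar(neg_loglik, args=(resp, b), bounds=(-5, 5),
                          method="bounded")
    return -res.fun

b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5])
resp = np.array([1, 1, 0, 1, 0, 1, 1, 1])      # fabricated responses
comp = np.array([False, False, False, False, False, True, True, True])

ll0 = max_loglik(resp, b)                               # one ability
ll1 = max_loglik(resp[comp], b[comp]) + max_loglik(resp[~comp], b[~comp])
lrt = 2.0 * (ll1 - ll0)
print("LRT statistic:", round(lrt, 3), "p-value:", chi2.sf(lrt, df=1))
```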


2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Dapeng Hu ◽  
Chong Wang ◽  
Annette M. O’Connor

Abstract Background Network meta-analysis (NMA) is a statistical method used to combine results from several clinical trials and simultaneously compare multiple treatments using direct and indirect evidence. Statistical heterogeneity is a characteristic describing the variability in the intervention effects being evaluated in the different studies in a network meta-analysis. One approach to dealing with statistical heterogeneity is to perform a random effects network meta-analysis that incorporates a between-study variance into the statistical model. A common assumption in the random effects model for network meta-analysis is the homogeneity of between-study variance across all interventions. However, there are applications of NMA where the single between-study variance assumption is potentially incorrect and the model should instead incorporate more than one between-study variance. Methods In this paper, we develop an approach to testing the homogeneity of the between-study variance assumption based on a likelihood ratio test. A simulation study was conducted to assess the type I error and power of the proposed test. This method is then applied to a network meta-analysis of antibiotic treatments for bovine respiratory disease (BRD). Results The type I error rate was well controlled in the Monte Carlo simulation. We found statistical evidence (p value = 0.052) against the homogeneous between-study variance assumption in the BRD network meta-analysis. The point estimates and confidence intervals of the relative effect sizes are strongly influenced by this assumption. Conclusions Since the homogeneous between-study variance assumption is a strong one, it is crucial to test its validity before conducting a network meta-analysis. Here we propose and validate a method for testing this single between-study variance assumption, which is widely used in many NMAs.
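
A hedged sketch of the core idea in a deliberately simplified, non-network setting (not the authors' NMA implementation): effect estimates y with known within-study variances v fall into two comparison groups; the null model shares one between-study variance tau2 across groups, the alternative gives each group its own, and twice the log-likelihood gap is referred to chi-square(1). All data below are fabricated.

```python
# Hedged sketch of the LRT for homogeneity of between-study variance in
# a simplified two-group random-effects setting. Data are fabricated.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm, chi2

y = np.array([0.2, 0.4, 0.1, 0.3, 1.1, -0.6, 1.8, 0.2])   # effect estimates
v = np.array([0.04, 0.06, 0.05, 0.04, 0.05, 0.06, 0.04, 0.05])  # within-study var
g = np.array([0, 0, 0, 0, 1, 1, 1, 1])                    # comparison groups

def neg_ll(params, tau2_by_group):
    mu = np.array(params[:2])[g]               # group-specific means
    tau2 = np.exp(np.array(params[2:]))        # log-parameterized, >= 0
    tau2 = tau2[g] if tau2_by_group else tau2[0]
    return -np.sum(norm.logpdf(y, loc=mu, scale=np.sqrt(v + tau2)))

fit0 = minimize(neg_ll, x0=[0, 0, -2], args=(False,), method="Nelder-Mead")
fit1 = minimize(neg_ll, x0=[0, 0, -2, -2], args=(True,), method="Nelder-Mead")
lrt = 2.0 * (fit0.fun - fit1.fun)
print("LRT:", round(lrt, 3), "p-value:", chi2.sf(lrt, df=1))
```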

