Multivariate Hypothesis Testing Methods for Evaluating Significant Individual Change

2017, Vol 42(3), pp. 221-239
Author(s): Chun Wang, David J. Weiss

The measurement of individual change has been an important topic in both education and psychology. For instance, teachers are interested in whether students have significantly improved (i.e., learned) from instruction, and counselors are interested in whether particular behaviors have changed significantly after an intervention. Because classical test methods have been unable to adequately resolve the problems in measuring change, recent approaches have turned to item response theory (IRT). However, prior IRT-based methods mainly focus on testing whether growth is significant at the group level. The present research targets a key question: Is the change in latent trait estimates for each individual significant across occasions? Earlier work addressed this question under the assumption that the latent trait is unidimensional. This research generalizes that work and proposes four hypothesis testing methods to evaluate individual change on multiple latent traits: a multivariate Z-test, a multivariate likelihood ratio test, a multivariate score test, and a Kullback–Leibler test. Simulation results show that these tests hold promise for detecting individual change with low Type I error and high power. A real-data example from an educational assessment illustrates the application of the proposed methods.
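The multivariate Z-test in this setting amounts to a Wald-type quadratic form in the change of the trait estimates across occasions. Below is a minimal sketch, assuming the occasion-specific estimates and their error covariance matrices (inverse test information) are already available from an IRT scoring routine; the function name and interface are illustrative, not the authors' code.

```python
import numpy as np
from scipy import stats

def multivariate_change_test(theta1, theta2, cov1, cov2):
    """Wald-type (multivariate Z) test of H0: no change in the latent traits.

    theta1, theta2 : d-dimensional latent trait estimates at occasions 1 and 2.
    cov1, cov2     : their error covariance matrices (inverse test information).
    Under H0, and assuming independent estimation errors across occasions,
    the quadratic form is approximately chi-square with d degrees of freedom.
    """
    diff = np.asarray(theta2, float) - np.asarray(theta1, float)
    pooled = np.asarray(cov1, float) + np.asarray(cov2, float)
    stat = float(diff @ np.linalg.solve(pooled, diff))
    return stat, stats.chi2.sf(stat, df=diff.size)
```

With d = 1 this reduces to the squared univariate Z-test used in the earlier unidimensional work.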

2016, Vol 42(1), pp. 46-68
Author(s): Sandip Sinharay

An increasing concern of producers of educational assessments is fraudulent behavior during the assessment (van der Linden, 2009). Benefiting from item preknowledge (e.g., Eckerly, 2017; McLeod, Lewis, & Thissen, 2003) is one type of fraudulent behavior. This article suggests two new test statistics for detecting individuals who may have benefited from item preknowledge; the statistics can be used for both nonadaptive and adaptive assessments that include dichotomous items, polytomous items, or both. Each new statistic has an asymptotic standard normal distribution. Detailed simulation studies demonstrate that the Type I error rates of the new statistics are close to the nominal level and that their power is larger than that of an existing statistic for addressing the same problem.
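The article's exact statistics are not reproduced here, but the general idea of an asymptotically standard normal preknowledge statistic can be illustrated with a standardized subset score: compare an examinee's observed score on a suspected compromised subset with its model-implied expectation. A hedged sketch, with all names hypothetical:

```python
import numpy as np
from scipy import stats

def standardized_subset_score(responses, probs):
    """Standardized score on a suspected compromised item subset.

    responses : 0/1 responses on the suspect items.
    probs     : model-implied success probabilities for those items,
                computed from the examinee's ability estimate.
    The subset score is approximately normal for a long enough subset,
    so large positive z-values flag unexpectedly strong performance.
    """
    responses = np.asarray(responses, float)
    probs = np.asarray(probs, float)
    z = (responses.sum() - probs.sum()) / np.sqrt((probs * (1 - probs)).sum())
    return z, stats.norm.sf(z)   # one-sided p-value
```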


2018, Vol 28(4), pp. 1188-1202
Author(s): Tsung-Shan Tsou

We construct a legitimate likelihood function for the kappa agreement coefficient for correlated data without specifically modelling all levels of correlation. This makes the likelihood ratio test, the score test, and other likelihood-based tools available without knowledge of the underlying distributions. This parametric robust likelihood approach applies to general clustered-data scenarios. We provide simulations and a real-data analysis to demonstrate the advantages of the robust procedure.
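For reference, the agreement parameter being modelled is Cohen's kappa; the robust likelihood construction itself is not reproduced here. A minimal sketch of the sample kappa from a square agreement table:

```python
import numpy as np

def cohens_kappa(table):
    """Cohen's kappa from a square agreement table (raters in rows/columns)."""
    table = np.asarray(table, float)
    n = table.sum()
    p_obs = np.trace(table) / n                     # observed agreement
    p_exp = (table.sum(0) @ table.sum(1)) / n**2    # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)
```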


2018, Vol 28(10-11), pp. 3123-3141
Author(s): Yi Tang, Wan Tang

Excessive zeros are common in practice and may cause overdispersion and invalidate inference when Poisson regression models are fit. Zero-inflated Poisson regression models may be applied when there are inflated zeros; however, it is desirable to test for inflated zeros before such models are applied. Assuming a constant probability of being a structural zero in a zero-inflated Poisson regression model, the existence of inflated zeros may be tested by testing whether that constant probability is zero. In this situation, the Wald, score, and likelihood ratio tests can be applied. Without specifying a zero-inflated Poisson model, He et al. recently developed a test that compares the number of observed zeros with the number expected under the Poisson model. In this paper, we develop a closed form for that test and compare it with the Wald, score, and likelihood ratio tests through simulation studies. The simulations show that the test of He et al. is best at controlling Type I error, while the score test generally has the least power among the tests. The tests are illustrated with two real-data examples.
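The flavor of a test that compares observed with Poisson-expected zeros can be sketched as follows. This illustrates the general idea rather than the closed form derived in the paper, and it ignores the estimation error in the fitted means:

```python
import numpy as np
from scipy import stats

def zero_inflation_ztest(y, mu_hat):
    """Crude test for excess zeros against a fitted Poisson regression.

    y      : observed counts.
    mu_hat : fitted Poisson means for each observation.
    Compares the observed number of zeros with the number expected under
    the Poisson fit, standardized by a first-order variance approximation
    that treats mu_hat as known (a simplification for illustration).
    """
    y = np.asarray(y)
    p0 = np.exp(-np.asarray(mu_hat, float))   # Poisson zero probabilities
    n0 = float((y == 0).sum())
    z = (n0 - p0.sum()) / np.sqrt((p0 * (1 - p0)).sum())
    return z, stats.norm.sf(z)                # one-sided: inflated zeros
```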


2019
Author(s): Georg Krammer

The Andersen likelihood ratio test (LRT) uses sample characteristics as split criteria to evaluate Rasch model fit, or to conduct theory-driven hypothesis testing for a test. This simulation study is the first to evaluate the power and Type I error of a random split criterion. Results consistently show that a random split criterion lacks power.
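Given conditional log-likelihoods from the full sample and from the split groups, the Andersen LRT statistic itself is straightforward. A minimal sketch; the Rasch fitting step (e.g., by conditional maximum likelihood) is assumed to be done elsewhere:

```python
from scipy import stats

def andersen_lrt(loglik_full, logliks_split, n_items):
    """Andersen likelihood ratio test statistic for Rasch model fit.

    loglik_full   : conditional log-likelihood with one common set of
                    item parameters for the whole sample.
    logliks_split : log-likelihoods from fitting the model separately in
                    each group defined by the split criterion.
    Degrees of freedom follow the usual (groups - 1) * (items - 1) rule.
    """
    lr = 2.0 * (sum(logliks_split) - loglik_full)
    df = (len(logliks_split) - 1) * (n_items - 1)
    return lr, stats.chi2.sf(lr, df)
```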


2017, Vol 41(6), pp. 403-421
Author(s): Sandip Sinharay

Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for detecting item preknowledge and showed that, for a known set of compromised items, its performance was better on average than that of seven other detection statistics. Sinharay suggested a statistic based on the likelihood ratio test for detecting item preknowledge; its advantage is that its null distribution is known. Results from simulated and real data, for both adaptive and nonadaptive tests, demonstrate that the Type I error rate and power of the likelihood ratio statistic are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising for detecting item preknowledge when the set of compromised items is known.
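The likelihood ratio statistic can be sketched as a signed root of the usual LR quantity, comparing a single-ability fit with a fit that allows a separate ability on the compromised subset. This illustrates the idea rather than the article's exact formulation; the maximized log-likelihoods are assumed to come from an IRT estimation routine:

```python
import numpy as np
from scipy import stats

def signed_preknowledge_lrt(loglik_null, loglik_alt, theta_comp, theta_uncomp):
    """Signed likelihood ratio statistic for item preknowledge (sketch).

    loglik_null : maximized log-likelihood with one common ability.
    loglik_alt  : maximized log-likelihood with separate abilities on the
                  compromised and uncompromised item subsets.
    The signed square root of the LR statistic is approximately standard
    normal under the null of no preknowledge; large positive values mean
    better-than-expected performance on the compromised items.
    """
    lr = max(2.0 * (loglik_alt - loglik_null), 0.0)
    z = np.sign(theta_comp - theta_uncomp) * np.sqrt(lr)
    return z, stats.norm.sf(z)
```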


Author(s): Guanghao Qi, Nilanjan Chatterjee

Background: Previous studies have often evaluated methods for Mendelian randomization (MR) analysis using simulations that do not adequately reflect the data-generating mechanisms of genome-wide association studies (GWAS), and there are often discrepancies between the performance of MR methods in simulations and in real data sets.

Methods: We use a simulation framework that generates data on full GWAS for two traits under a realistic model for effect-size distribution, coherent with the heritability, co-heritability, and polygenicity typically observed for complex traits. We further use recent data from GWAS of 38 biomarkers in the UK Biobank and perform down-sampling to investigate trends in estimates of the causal effects of these biomarkers on the risk of type 2 diabetes (T2D).

Results: Simulation studies show that the weighted mode and MRMix are the only two methods that maintain the correct Type I error rate in a diverse set of scenarios. Between the two, MRMix tends to be more powerful for larger GWAS, whereas the opposite is true for smaller sample sizes. Among the other methods, random-effects IVW (inverse-variance weighted), MR-Robust, and MR-RAPS (robust adjusted profile score) tend to perform best in maintaining a low mean-squared error when the InSIDE assumption is satisfied, but they can produce large bias when InSIDE is violated. In the real-data analysis, some biomarkers showed major heterogeneity across methods in the estimates of their causal effects on T2D risk, and estimates from many methods trended in one direction with increasing sample size, with patterns similar to those observed in the simulation studies.

Conclusion: The relative performance of different MR methods depends heavily on the sample sizes of the underlying GWAS, the proportion of valid instruments, and the validity of the InSIDE assumption. Down-sampling analysis can be used in large GWAS to detect possible bias in MR methods.
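As a point of reference for the methods compared above, the random-effects IVW estimator can be sketched from GWAS summary statistics as follows. This is a generic textbook version, not the specific implementation evaluated in the paper:

```python
import numpy as np

def ivw_mr(beta_exp, beta_out, se_out):
    """Multiplicative random-effects IVW estimate of a causal effect.

    beta_exp : per-SNP effects on the exposure (from the exposure GWAS).
    beta_out, se_out : per-SNP effects on the outcome and their standard
                       errors (from the outcome GWAS).
    """
    beta_exp, beta_out, se_out = map(np.asarray, (beta_exp, beta_out, se_out))
    w = 1.0 / se_out**2                       # inverse-variance weights
    est = np.sum(w * beta_exp * beta_out) / np.sum(w * beta_exp**2)
    resid = beta_out - est * beta_exp
    # overdispersion factor, floored at 1 (the "random-effects" part)
    phi = max(np.sum(w * resid**2) / (beta_exp.size - 1), 1.0)
    se = np.sqrt(phi / np.sum(w * beta_exp**2))
    return est, se
```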


2021, pp. 001316442199489
Author(s): Luyao Peng, Sandip Sinharay

Wollack et al. (2015) suggested the erasure detection index (EDI) for detecting fraudulent erasures for individual examinees. Wollack and Eckerly (2017) and Sinharay (2018) extended the index to detect fraudulent erasures at the aggregate or group level. This article follows up on that research and suggests a new aggregate-level EDI that incorporates the empirical best linear unbiased predictor (EBLUP) from the literature on linear mixed-effects models (e.g., McCulloch et al., 2008). A simulation study shows that the new EDI has greater power than the indices of Wollack and Eckerly (2017) and Sinharay (2018), along with satisfactory Type I error rates. A real-data example is also included.
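The EBLUP ingredient can be illustrated with the simplest case, a one-way random-effects model in which a group's predicted effect is its mean residual shrunken toward zero. This sketch only shows the shrinkage idea, not the proposed EDI itself:

```python
import numpy as np

def eblup_group_effect(group_residuals, var_between, var_within):
    """EBLUP of a group effect in a one-way random-effects model (sketch).

    The group's mean residual is shrunken toward zero by the ratio of the
    between-group variance to the total variance of the group mean; the
    variance components are assumed to have been estimated elsewhere.
    """
    r = np.asarray(group_residuals, float)
    shrinkage = var_between / (var_between + var_within / r.size)
    return shrinkage * r.mean()
```

The shrinkage stabilizes estimates for small groups, which is what makes the predictor attractive for aggregate-level detection.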


Genetics, 2000, Vol 154(1), pp. 381-395
Author(s): Pavel Morozov, Tatyana Sitnikova, Gary Churchill, Francisco José Ayala, Andrey Rzhetsky

We propose models for describing replacement-rate variation in genes and proteins, in which the profile of relative replacement rates along the length of a given sequence is defined as a function of the site number. We consider two types of functions, one derived from the cosine Fourier series and the other from discrete wavelet transforms. The number of parameters used to characterize the substitution rates along the sequences can be changed flexibly, and in their most parameter-rich versions both the Fourier and wavelet models become equivalent to the unrestricted-rates model, in which each site of a sequence alignment evolves at a unique rate. When applied to several real data sets, the new models fit the data better than the discrete gamma model as judged by the Akaike information criterion and the likelihood ratio test, although the parametric bootstrap version of the Cox test, performed for one of the data sets, indicated that the difference in likelihoods between the two models is not significant. The new models are applicable to testing biological hypotheses such as the statistical identity of rate-variation profiles among homologous protein families. They are also useful for identifying regions in genes and proteins that evolve significantly faster or slower than the sequence average. We illustrate the application of the new method by analyzing human immunoglobulin and Drosophilid alcohol dehydrogenase sequences.
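A truncated cosine Fourier series for a site-rate profile can be sketched as below; the exact parameterization and normalization used in the paper may differ, so treat the details as assumptions:

```python
import numpy as np

def fourier_rate_profile(coeffs, n_sites):
    """Relative replacement-rate profile from a truncated cosine Fourier series.

    coeffs : Fourier coefficients a_1..a_K. Rates are exponentiated to keep
    them positive and rescaled so the mean relative rate across sites is 1.
    More coefficients allow a wigglier profile; with enough of them the
    model approaches one unique rate per site.
    """
    coeffs = np.asarray(coeffs, float)
    pos = (np.arange(n_sites) + 0.5) / n_sites    # site positions in (0, 1)
    basis = np.cos(np.pi * np.outer(np.arange(1, coeffs.size + 1), pos))
    rates = np.exp(coeffs @ basis)
    return rates / rates.mean()
```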


Entropy, 2021, Vol 23(8), p. 934
Author(s): Yuxuan Zhang, Kaiwei Liu, Wenhao Gui

To improve the statistical efficiency of estimators in life-testing experiments, generalized Type-I hybrid censoring has recently been implemented by guaranteeing that experiments terminate only after a certain number of failures have occurred. Given the wide application of bathtub-shaped distributions in engineering and the recent introduction of the generalized Type-I hybrid censoring scheme, and because no existing work combines this censoring model with a bathtub-shaped distribution, we consider parameter inference under generalized Type-I hybrid censoring. First, estimates of the unknown scale parameter and the reliability function are obtained with the Bayesian method, based on LINEX and squared error loss functions with a conjugate gamma prior. Estimates under the E-Bayesian method are compared across different prior distributions and loss functions. Additionally, Bayesian and E-Bayesian estimation with two unknown parameters is introduced. Furthermore, to verify the robustness of these estimates, a Monte Carlo simulation study is conducted. Finally, the practical application of the discussed inference is illustrated by analyzing a real data set.
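Given posterior draws (e.g., from an MCMC sampler), the LINEX-loss Bayes estimate has a simple Monte Carlo form. A minimal sketch, with the function name ours:

```python
import numpy as np

def linex_estimate(posterior_draws, a):
    """Bayes estimate under LINEX loss from posterior samples.

    For loss L(d, t) = exp(a*(d - t)) - a*(d - t) - 1, the Bayes estimator
    is -(1/a) * log E[exp(-a*t) | data], approximated here by a Monte Carlo
    average; the posterior mean (squared error loss) is the a -> 0 limit.
    """
    t = np.asarray(posterior_draws, float)
    return -np.log(np.mean(np.exp(-a * t))) / a
```

The sign of a controls the asymmetry: a > 0 penalizes overestimation more heavily, which is often appropriate for reliability quantities.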

