New Insights in Power and Power Analysis in Mediation Models

2021 ◽  
Author(s):  
Amanda Kay Montoya ◽  
Chris Aberson ◽  
Jessica Fossum ◽  
Donna Chen ◽  
Oscar Gonzalez

Mediation analysis is commonly used in social-personality psychology to evaluate potential mechanisms of effects. With the recent replicability crisis, researchers are turning to power analysis to help plan studies; however, power analysis for mediation is not implemented in popular software (e.g., G*Power). Our symposium includes two presentations focusing on the implementation of power analysis for mediation: (1) describing easy-to-use tools for implementing power analysis (e.g., the pwr2ppl R package), and (2) evaluating whether different inferential methods result in similar recommended sample sizes and the role of assumption violations in these differences. Two presenters focus on study characteristics that can affect power: (1) use of the bias-corrected confidence interval and alternatives that better balance power and Type I error, and (2) how measurement error in the mediator can impact power and how to correct this issue with latent variable models. Presentations will include applied examples, aimed at a social-personality audience, and provide concrete steps for increasing the validity and replicability of mediation analyses conducted in social-personality research. (Symposium presented at SPSP 2021)
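As a concrete illustration of the simulation-based power analysis these presentations build on, below is a minimal base-R sketch that estimates power for the indirect effect a*b using a percentile bootstrap confidence interval (one of the alternatives to the bias-corrected interval mentioned above). All path values, sample sizes, and replication counts are illustrative assumptions, not recommendations from the symposium.

```r
# Minimal sketch: Monte Carlo power for the indirect effect a*b, tested
# with a percentile bootstrap CI. Replication counts are kept small so
# the sketch runs quickly; increase them for real planning.
set.seed(1)
power_med <- function(n, a = 0.3, b = 0.3, reps = 200, boots = 500) {
  mean(replicate(reps, {
    x <- rnorm(n)                     # exposure
    m <- a * x + rnorm(n)             # mediator with assumed a-path
    y <- b * m + rnorm(n)             # outcome with assumed b-path
    ab <- replicate(boots, {
      i <- sample(n, replace = TRUE)  # bootstrap resample
      coef(lm(m[i] ~ x[i]))[2] * coef(lm(y[i] ~ m[i] + x[i]))[2]
    })
    ci <- quantile(ab, c(0.025, 0.975))
    ci[1] > 0 || ci[2] < 0            # CI excludes zero -> effect detected
  }))
}
power_med(n = 100)  # estimated power at n = 100 under these assumed paths
```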

2021 ◽  
Author(s):  
Ye Yue ◽  
Yi-Juan Hu

Background: Understanding whether and which microbes play a mediating role between an exposure and a disease outcome is essential for researchers to develop clinical interventions that treat the disease by modulating the microbes. Existing methods for mediation analysis of the microbiome are often limited to a global test of community-level mediation or to selection of mediating microbes without control of the false discovery rate (FDR). Further, while the null hypothesis of no mediation at each microbe is a composite null consisting of three types (no exposure-microbe association, no microbe-outcome association given the exposure, or neither association), most existing methods for the global test, such as MedTest and MODIMA, treat the microbes as if they were all under the same type of null. Methods: We propose a new approach based on inverse regression that regresses the (possibly transformed) relative abundance of each taxon on the exposure and the exposure-adjusted outcome, to assess the exposure-taxon and taxon-outcome associations simultaneously. The association p-values are then used to test mediation at both the community and individual-taxon levels. This approach fits nicely into our Linear Decomposition Model (LDM) framework, so the new method is implemented in the LDM and enjoys all the features of the LDM: it allows an arbitrary number of taxa to be tested; supports continuous, discrete, or multivariate exposures and outcomes as well as adjustment for confounding covariates; accommodates clustered data; and offers analysis at the relative-abundance or presence-absence scale. We refer to this new method as LDM-med. Results: Using extensive simulations, we showed that LDM-med always controlled the Type I error of the global test and had compelling power over existing methods; LDM-med always preserved the FDR when testing individual taxa and had much better sensitivity than alternative approaches. In contrast, MedTest and MODIMA had severely inflated Type I error when different taxa were under different types of null. The flexibility of LDM-med for a variety of mediation analyses is illustrated by an application to a murine microbiome dataset. Availability and Implementation: Our new method has been added to our R package LDM, which is available on GitHub at https://github.com/yijuanhu/LDM.
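To make the inverse-regression idea concrete, here is a hedged base-R sketch (not the LDM-med implementation): regress each taxon's transformed abundance on the exposure and the exposure-adjusted outcome, and extract the two association p-values that feed the composite-null mediation test. The function name and toy data are assumptions for illustration.

```r
# Conceptual sketch of the inverse-regression step, not LDM-med itself:
# one regression per taxon yields both association p-values at once.
inverse_reg_pvals <- function(abund, exposure, outcome) {
  out_adj <- resid(lm(outcome ~ exposure))  # exposure-adjusted outcome
  t(apply(abund, 2, function(taxon) {
    fit <- lm(taxon ~ exposure + out_adj)
    summary(fit)$coefficients[c("exposure", "out_adj"), "Pr(>|t|)"]
  }))
}

set.seed(1)
abund <- matrix(rnorm(100 * 5), 100, 5)  # toy transformed abundances
inverse_reg_pvals(abund, rnorm(100), rnorm(100))  # 5 taxa x 2 p-values
```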


2019 ◽  
Author(s):  
Alvin Vista

Cheating detection is an important issue in standardized testing, especially in large-scale settings. Statistical approaches are often computationally intensive and require specialised software. We present a two-stage approach that quickly filters suspected groups using statistical testing on an IRT-based answer-copying index. We also present an approach to mitigate data contamination and improve the performance of the index. The computation of the index was implemented through a modified version of an open-source R package, thus enabling wider access to the method. Using data from PIRLS 2011 (N=64,232), we conducted a simulation to demonstrate our approach. Type I error was well-controlled and no control group was falsely flagged for cheating, while 16 (combined n=12,569) of the 18 (combined n=14,149) simulated groups were detected. Implications for system-level cheating detection and further improvements of the approach are discussed.
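The two-stage filtering logic can be sketched generically as follows. This hedged example substitutes a generic per-examinee index for the IRT-based answer-copying index and uses a permutation null for the stage-one screen; all names and thresholds are illustrative.

```r
# Schematic of stage-one screening with a generic per-examinee index
# 'idx' (the paper uses an IRT-based answer-copying index). Groups whose
# mean index exceeds a permutation-null quantile of the maximum group
# mean are flagged; only flagged groups would get the full analysis.
screen_groups <- function(idx, group, sims = 2000, alpha = 0.01) {
  obs <- tapply(idx, group, mean)
  null_max <- replicate(sims, {
    g <- sample(group)               # shuffle labels to break grouping
    max(tapply(idx, g, mean))
  })
  names(obs)[obs > quantile(null_max, 1 - alpha)]
}

set.seed(1)
idx <- rnorm(3000); grp <- rep(1:30, each = 100)
idx[grp == 7] <- idx[grp == 7] + 0.5  # one artificially elevated group
screen_groups(idx, grp)               # should flag group "7"
```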


2019 ◽  
Author(s):  
Rob Cribbie ◽  
Nataly Beribisky ◽  
Udi Alter

Many bodies recommend that a sample planning procedure, such as a traditional NHST a priori power analysis, be conducted during the planning stages of a study. Power analysis allows the researcher to estimate how many participants are required to detect a minimally meaningful effect size at a specific level of power and Type I error rate. However, there are several drawbacks to the procedure that render it “a mess.” Specifically, the identification of the minimally meaningful effect size is often difficult but is unavoidable if the procedure is to be conducted properly; the procedure is not precision oriented; and it does not guide the researcher to collect as many participants as feasibly possible. In this study, we explore how these three theoretical issues are reflected in applied psychological research in order to better understand whether they are concerns in practice. To investigate how power analysis is currently used, we reviewed the reporting of 443 power analyses in high-impact psychology journals in 2016 and 2017. We found that researchers rarely use the minimally meaningful effect size as the rationale for the effect chosen in a power analysis. Further, precision-based approaches and collecting the maximum feasible sample size are almost never used in tandem with power analyses. In light of these findings, we suggest that researchers focus on tools beyond traditional power analysis when planning a sample, such as collecting the maximum sample size feasible.
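For reference, the traditional a priori power analysis criticized here takes one line in base R, and it makes plain why the minimally meaningful effect size is unavoidable: the calculation cannot run without it. The 0.4-SD effect below is purely illustrative.

```r
# A priori power analysis for a two-sample t test in base R. The effect
# size (delta, in SD units) must be supplied, which is exactly the step
# the review finds is rarely justified in practice.
power.t.test(delta = 0.4, sd = 1, sig.level = 0.05, power = 0.80)
# Returns n per group (about 99 here); the 0.4-SD effect is illustrative.
```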


2019 ◽  
Vol 35 (24) ◽  
pp. 5155-5162 ◽  
Author(s):  
Chengzhong Ye ◽  
Terence P Speed ◽  
Agus Salim

Motivation: Dropout is a common phenomenon in single-cell RNA-seq (scRNA-seq) data and, when left unaddressed, it affects the validity of the statistical analyses. Despite this, few current methods for differential expression (DE) analysis of scRNA-seq data explicitly model the process that gives rise to the dropout events. We develop DECENT, a method for DE analysis of scRNA-seq data that explicitly and accurately models the molecule-capture process in scRNA-seq experiments. Results: We show that DECENT demonstrates improved DE performance over existing DE methods that do not explicitly model dropout. This improvement is consistently observed across several public scRNA-seq datasets generated using different technological platforms, and it is especially large when the capture process is overdispersed. DECENT controls the Type I error rate well while achieving better sensitivity, and its performance without spike-ins is almost as good as when spike-ins are used to calibrate the capture model. Availability and implementation: The method is implemented as a publicly available R package, available from https://github.com/cz-ye/DECENT. Supplementary information: Supplementary data are available at Bioinformatics online.
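The dropout problem DECENT addresses can be illustrated generically, by thinning true counts through a low-rate capture process. This is not DECENT's actual model; all distributions and rates below are illustrative assumptions.

```r
# Generic illustration (not DECENT's capture model): observed scRNA-seq
# counts as a binomial capture process applied to true expression, so a
# low capture rate manufactures dropout (excess zeros) in observed data.
set.seed(1)
true_counts  <- rnbinom(5000, mu = 5, size = 2)  # true expression levels
capture_rate <- rbeta(5000, 2, 18)               # ~10% capture on average
observed     <- rbinom(5000, size = true_counts, prob = capture_rate)
c(zeros_true = mean(true_counts == 0),           # zeros before capture
  zeros_obs  = mean(observed == 0))              # dropout-inflated zeros
```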


Biometrika ◽  
2019 ◽  
Vol 106 (2) ◽  
pp. 353-367 ◽  
Author(s):  
B Karmakar ◽  
B French ◽  
D S Small

Summary: A sensitivity analysis for an observational study assesses how much bias, due to nonrandom assignment of treatment, would be necessary to change the conclusions of an analysis that assumes treatment assignment was effectively random. The evidence for a treatment effect can be strengthened if two different analyses, which could be affected by different types of biases, are both somewhat insensitive to bias. The finding from the observational study is then said to be replicated. Evidence factors allow for two independent analyses to be constructed from the same dataset. When combining the evidence factors, the Type I error rate must be controlled to obtain valid inference. A powerful method is developed for controlling the familywise error rate for sensitivity analyses with evidence factors. It is shown that the Bahadur efficiency of sensitivity analysis for the combined evidence is greater than for either evidence factor alone. The proposed methods are illustrated through a study of the effect of radiation exposure on the risk of cancer. An R package, evidenceFactors, is available from CRAN to implement the methods of the paper.
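For intuition, the canonical way to combine two independent analyses is Fisher's method, sketched below in base R. The paper's contribution is a more powerful combination tailored to sensitivity analyses, implemented in the evidenceFactors package; the p-values below are illustrative.

```r
# Fisher's method for combining k independent p-values: under the null,
# -2 * sum(log(p)) is chi-squared with 2k degrees of freedom.
fisher_combine <- function(p) {
  pchisq(-2 * sum(log(p)), df = 2 * length(p), lower.tail = FALSE)
}
fisher_combine(c(0.03, 0.04))  # combined p-value from two evidence factors
```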


1982 ◽  
Vol 7 (3) ◽  
pp. 207-214 ◽  
Author(s):  
Jennifer J. Clinch ◽  
H. J. Keselman

The ANOVA, Welch, and Brown-Forsythe tests for mean equality were compared using Monte Carlo methods. The tests’ rates of Type I error and power were examined when populations were non-normal, variances were heterogeneous, and group sizes were unequal. The ANOVA F test was the most affected by the assumption violations. The test proposed by Brown and Forsythe appeared, on average, to be the “best” test statistic for testing an omnibus hypothesis of mean equality.
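A Monte Carlo check in the spirit of this study is easy to reproduce in base R for the ANOVA F and Welch tests (the Brown-Forsythe test is not in base R and is omitted here); the group sizes and variances below are illustrative.

```r
# Empirical Type I error of the ANOVA F test vs the Welch test under
# heterogeneous variances with unequal group sizes (the small group gets
# the large variance, a configuration known to make F liberal).
set.seed(1)
n <- c(10, 20, 40); sds <- c(4, 2, 1)
g <- factor(rep(1:3, n))
res <- replicate(5000, {
  y <- rnorm(sum(n), mean = 0, sd = rep(sds, n))
  c(F     = oneway.test(y ~ g, var.equal = TRUE)$p.value,
    Welch = oneway.test(y ~ g, var.equal = FALSE)$p.value)
})
rowMeans(res < 0.05)  # F is typically inflated here; Welch stays near .05
```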


1992 ◽  
Vol 17 (4) ◽  
pp. 297-313 ◽  
Author(s):  
Michael R. Harwell

Monte Carlo studies provide information that can assist researchers in selecting a statistical test when underlying assumptions of the test are violated. Effective use of this literature is hampered by the lack of an overarching theory to guide the interpretation of Monte Carlo studies. The problem is exacerbated by the impressionistic nature of the studies, which can lead different readers to different conclusions. These shortcomings can be addressed using meta-analytic methods to integrate the results of Monte Carlo studies. Quantitative summaries of the effects of assumption violations on the Type I error rate and power of a test can assist researchers in selecting the best test for their data. Such summaries can also be used to evaluate the validity of previously published statistical results. This article provides a methodological framework for quantitatively integrating Type I error rates and power values from Monte Carlo studies. An example is provided using Monte Carlo studies of Bartlett’s (1937) test of equality of variances. The importance of relating meta-analytic results to exact statistical theory is emphasized.
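One hedged sketch of the proposed integration: treat each Monte Carlo cell's empirical Type I error rate as an outcome and model it against the cell's design factors, weighting by the number of replications. The data frame below is fabricated purely to illustrate the mechanics, not taken from any study.

```r
# Toy meta-analytic integration of Monte Carlo results: rejection rates
# from simulation cells regressed on a design factor (variance ratio),
# weighted by replications. All numbers are fabricated for illustration.
mc <- data.frame(
  rate      = c(0.051, 0.063, 0.089, 0.052, 0.071, 0.102),
  reps      = c(5000, 5000, 5000, 10000, 10000, 10000),
  var_ratio = c(1, 2, 4, 1, 2, 4)
)
summary(lm(rate ~ var_ratio, data = mc, weights = reps))
# The slope estimates how the empirical error rate drifts from nominal
# alpha as the assumption violation grows more severe.
```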


2011 ◽  
Vol 71 (6) ◽  
pp. 986-1005 ◽  
Author(s):  
Ying Li ◽  
André A. Rupp

This study investigated the Type I error rate and power of the multivariate extension of the S-χ2 statistic using unidimensional and multidimensional item response theory (UIRT and MIRT, respectively) models as well as full-information bifactor (FI-bifactor) models through simulation. Manipulated factors included test length, sample size, latent trait characteristics such as discrimination pattern and intertrait correlations, and model-type misspecification. Nominal Type I error rates were observed under all conditions. The power of the S-χ2 statistic for UIRT models was high for MIRT and FI-bifactor models that were structurally most distinct from the UIRT models, but was low otherwise. The power of the S-χ2 statistic to detect misfit between MIRT and FI-bifactor models was low across all conditions because of the structural similarity of these two models. Finally, information-based indices of relative model-data fit and latent variable correlations were obtained, and these showed the expected patterns across conditions.
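For readers who want to try the unidimensional statistic, S-χ2 item fit is available in the mirt R package (to our knowledge, the multivariate extension studied here is not part of that package). A minimal hedged example using a dataset shipped with mirt:

```r
# Hedged illustration: unidimensional S-X2 item fit via the mirt package.
library(mirt)
dat <- expand.table(LSAT7)             # example dataset shipped with mirt
mod <- mirt(dat, 1, itemtype = "2PL")  # unidimensional 2PL model
itemfit(mod, fit_stats = "S_X2")       # per-item S-X2, df, and p-values
```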


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
William R. P. Denault ◽  
Astanand Jugessur

Background: We present a computational shortcut to improve a powerful wavelet-based method by Shim and Stephens (Ann Appl Stat 9(2):665–686, 2015; 10.1214/14-AOAS776) called WaveQTL, which was originally designed to identify DNase I hypersensitivity quantitative trait loci (dsQTL). Results: WaveQTL relies on permutations to evaluate the significance of an association. We applied a recent method by Zhou and Guan (J Am Stat Assoc 113(523):1362–1371, 2017; 10.1080/01621459.2017.1328361) to boost computational speed; it involves calculating the distribution of Bayes factors and estimating the significance of an association by simulation rather than permutation. We call this simulation-based approach “fast functional wavelet” (FFW), and we tested it on a publicly available DNA methylation (DNAm) dataset on colorectal cancer. The simulations confirmed a substantial gain in computational speed compared to the permutation-based approach in WaveQTL. Furthermore, we show that FFW controls the Type I error satisfactorily and has good power for detecting differentially methylated regions. Conclusions: Our approach has broad utility and can be applied to detect associations between different types of functions and phenotypes. As more and more DNAm datasets are made available through public repositories, an attractive application of FFW would be to re-analyze these data and identify associations that might have been missed by previous efforts. The full R package for FFW is freely available on GitHub at https://github.com/william-denault/ffw.
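The permutation-versus-simulation distinction can be illustrated generically (this is not FFW's internals): a permutation null re-computes the statistic on shuffled data, while a simulation null draws the statistic directly from its distribution under the null, which is far cheaper per draw. The statistic and sizes below are illustrative.

```r
# Two routes to the same null distribution for a toy statistic |cor(x,y)|:
# permutation (shuffle y, recompute) vs simulation (draw fresh null data).
set.seed(1)
x <- rnorm(100); y <- rnorm(100)     # independent, so the null is true
stat <- function(x, y) abs(cor(x, y))
obs <- stat(x, y)
p_perm <- mean(replicate(2000, stat(x, sample(y))) >= obs)
p_sim  <- mean(replicate(2000, stat(rnorm(100), rnorm(100))) >= obs)
c(p_perm = p_perm, p_sim = p_sim)    # both estimate the same p-value
```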


2021 ◽  
Author(s):  
Moritz Heene ◽  
Michael Maraun ◽  
Nadine J. Glushko ◽  
Sunthud Pornprasertmanit

To provide researchers with a means of assessing the fit of the structural component of structural equation models, structural fit indices (modifications of the composite fit indices RMSEA, SRMR, and CFI) have recently been developed. We investigated the performance of four of these structural fit indices (RMSEA-P, RMSEA-S, SRMR-S, and CFI-S), when paired with widely accepted cutoff values, in the service of detecting structural misspecification. In particular, by way of a simulation study, for each of seven fit indices (three composite and four structural) and the traditional chi-square test of perfect composite fit, we estimated the following rates: (a) the Type I error rate (i.e., the probability of incorrectly rejecting a correctly specified structural component) under each of four degrees of misspecification in the measurement component; and (b) power (i.e., the probability of correctly rejecting an incorrectly specified structural model) under each condition formed by pairing one of three degrees of structural misspecification with one of four degrees of measurement-component misspecification. In addition to sample size, the impacts of two model features incidental to model misspecification (number of manifest variables per latent variable and magnitude of factor loadings) were investigated. The results suggested that, although the structural fit indices performed relatively better than the composite fit indices, none of the index/cutoff-value pairings was capable of delivering an entirely satisfactory Type I error rate/power balance, with the pairing [RMSEA-S, .05] failing entirely in this regard. Of the remaining pairings: (a) RMSEA-P and CFI-S suffered from a severely inflated Type I error rate; (b) despite being designed to pick up on structural features of candidate models, all pairings, and especially RMSEA-P and CFI-S, were sensitive to model features incidental to structural misspecification; and (c) although it behaved, in the main, in a sensible fashion, SRMR-S was sensitive to structural misspecification only when it occurred to a relatively high degree.
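For orientation, the composite indices studied here are standard lavaan output; the structural variants (e.g., RMSEA-P) require additional computation and are not, to our knowledge, built into lavaan itself. A minimal sketch with a one-regression structural component, using a dataset shipped with lavaan:

```r
# Composite fit indices for an SEM whose structural component is a
# single regression between latent variables; the structural indices
# studied in this paper would be computed separately on top of this.
library(lavaan)
model <- 'visual  =~ x1 + x2 + x3
          textual =~ x4 + x5 + x6
          visual ~ textual'          # structural component: one regression
fit <- sem(model, data = HolzingerSwineford1939)
fitMeasures(fit, c("chisq", "rmsea", "srmr", "cfi"))
```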

