Sample Size/Power Calculation for Case-Cohort Studies

Biometrics, 2004, Vol 60 (4), pp. 1015-1024
Author(s): Jianwen Cai, Donglin Zeng

2020, Vol 26 (Supplement_1), pp. S9-S9
Author(s): Svetlana Lakunina, Zipporah Iheozor-Ejiofor, Morris Gordon, Daniel Akintelure, Vassiliki Sinopoulou

Abstract: Inflammatory bowel disease is a collection of disorders of the gastrointestinal tract, characterised by relapsing and remitting inflammation. Studies have reported that several pharmacological and non-pharmacological interventions are effective in the management of the disease. Sample size estimation with a power calculation is necessary for a trial to detect the effect of an intervention. This project critically evaluates the sample size estimation and power calculation reported by randomised controlled studies of inflammatory bowel disease management, in order to judge how well the studies' results support their conclusions. We conducted a literature search in the Cochrane database to identify systematic literature reviews. Their reference lists were screened, and studies were selected if they met the inclusion criteria. Data were extracted on power calculation parameters and outcomes, and the results were analysed and summarised as percentages, means and graphs. We screened almost all trials about the management of inflammatory bowel disease published in the past 25 years. 232 studies were analysed, of which 167 reported a power calculation. Fewer than half (48%) of these studies achieved the target sample size needed to conclude reliably that the interventions were effective. Moreover, the average minimal difference these studies aimed to detect was 30%, which may not be sufficient to demonstrate the effect of an intervention. To conclude, inaccurate power calculations and failure to achieve target sample sizes can lead to errors in the results on how effective an intervention is in the management of inflammatory bowel disease.
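To make the link between the minimal detectable difference and the target sample size concrete, the following is a minimal sketch of a standard normal-approximation sample size calculation for comparing two proportions. It is not the method used in the abstract above; the function name `n_per_group` and the example response rates (30% vs 60%, and 30% vs 45%) are assumptions chosen purely for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-sided test of two proportions.

    Uses the unpooled normal approximation:
        n = (z_{1-alpha/2} + z_{power})^2 * (p1*q1 + p2*q2) / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance_sum = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    delta = p_treatment - p_control                # minimal difference to detect
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / delta ** 2)


# Hypothetical response rates, chosen only for illustration.
large_difference = n_per_group(0.30, 0.60)  # 30 percentage point difference
small_difference = n_per_group(0.30, 0.45)  # 15 percentage point difference
print(large_difference, small_difference)
```

Because the required sample size scales with the inverse square of the difference, a trial powered for a large difference needs far fewer participants than one powered for a smaller difference, and it is underpowered if the true effect is smaller than assumed.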


Epidemiology, 2011, Vol 22 (2), pp. 279
Author(s): Kiyoshi Kubota, Akira Wakana

2020
Author(s): Xianjun Dong, Xiaoqi Li, Tzuu-Wang Chang, Scott T Weiss, Weiliang Qiu

Genome-wide association studies (GWAS) have revealed thousands of genetic loci for common diseases. One of the main challenges in the post-GWAS era is to understand the causality of the genetic variants. Expression quantitative trait locus (eQTL) analysis has proven to be an effective way to address this question by examining the relationship between gene expression and genetic variation in a sufficiently powered cohort. However, it is often difficult to determine the sample size at which a variant with a specific allele frequency will be detected as associated with gene expression with sufficient power. This is particularly demanding in single-cell RNAseq studies. Therefore, a user-friendly tool to perform power analysis for eQTL at both the bulk tissue and single-cell level is critical. Here, we present an R package called powerEQTL with flexible functions to calculate power, minimal sample size, or detectable minor allele frequency in both bulk tissue and single-cell eQTL analysis. A user-friendly, program-free web application is also provided, allowing users to calculate and visualize the parameters interactively.
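The dependence of eQTL power on sample size and minor allele frequency can be illustrated with a generic sketch. This is not the powerEQTL implementation or its API; it assumes a simple linear regression of expression on additive genotype (0, 1, 2) under Hardy-Weinberg equilibrium, so that the genotype variance is 2 × MAF × (1 − MAF), and uses a large-sample normal approximation to the two-sided test of the slope. The function names `eqtl_power` and `min_sample_size`, and all parameter values, are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def eqtl_power(n, maf, slope, sigma=1.0, alpha=0.05):
    """Approximate power to detect a nonzero slope in expression ~ genotype.

    Assumes additive genotype coding under Hardy-Weinberg equilibrium and a
    normal approximation to the test statistic (an illustrative simplification).
    """
    z = NormalDist()
    genotype_var = 2 * maf * (1 - maf)                  # variance of 0/1/2 genotype
    ncp = sqrt(n * genotype_var) * abs(slope) / sigma   # mean of the test statistic
    z_crit = z.inv_cdf(1 - alpha / 2)                   # two-sided critical value
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


def min_sample_size(maf, slope, sigma=1.0, alpha=0.05, target=0.80, n_max=100_000):
    """Smallest n (up to n_max) whose approximate power reaches the target, else None."""
    for n in range(3, n_max + 1):
        if eqtl_power(n, maf, slope, sigma, alpha) >= target:
            return n
    return None


# Hypothetical effect size and significance threshold, for illustration only.
for maf in (0.05, 0.10, 0.30):
    print(maf, min_sample_size(maf, slope=0.5, alpha=1e-6))
```

Under these assumptions, rarer variants have smaller genotype variance and therefore need larger samples to reach the same power, which is the trade-off the abstract describes.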


Author(s): Kamala Adhikari Dahal, Scott Patten, Tyler Williamson, Alka Patel, Shahirose Premji, ...

Introduction: Pooling data from cohort studies can be used to increase sample size. However, individual datasets may contain variables that measure the same construct differently, which limits the usefulness of combined datasets. Variable harmonization (an effort that provides a comparable view of data from different studies) may address this issue.

Objectives and Approach: This study harmonized existing datasets from two prospective pregnancy cohort studies in Alberta, Canada (All Our Families (n=3,351) and Alberta Pregnancy Outcome and Nutrition (n=2,187)). Given the comparability of the characteristics of the two cohorts and the similarities of the core data elements of interest, data harmonization was justifiable. Data harmonization was performed considering multiple factors, such as complete or partial variable matching regarding the question asked/responded, the response coded (value level, value definition, data type), the frequency of measurement, the pregnancy time-period of measurement, and missing values. Multiple imputation was used to address missing data resulting from the data harmonization process.

Results: Several variables, such as ethnicity, income, parity, gestational age, anxiety, and depression, were harmonized using different procedures. If the question asked/answered and the response recorded were the same in both datasets, no variable manipulation was done. If the response recorded was different, the response was re-categorized/re-organized to optimize comparability of data from both datasets. If the same construct was measured in both datasets but using different ways/scales, missing values were created for each resulting unmatched variable and were replaced using multiple imputation. A scale that was used in both datasets was identified as a reference standard. If the variables were measured multiple times and/or in different time-periods, variables were synchronized using pregnancy trimester data. Finally, the harmonized datasets were combined/pooled into a single dataset (n=5,588).

Conclusion/Implications: Variable harmonization is an important aspect of conducting research using multiple datasets. It provides an opportunity to increase study power by maximizing sample size, to permit more sophisticated statistical analyses, and to answer novel research questions that could not be addressed using a single study.
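The harmonization rules described above (keep identical variables unchanged, re-categorize differently coded responses, and mark unmatched values as missing before pooling) can be sketched roughly as follows. This is an illustration only, not the study's procedure: the variable names (`parity`, `parity_cat`), the codings, the cohort labels, and the records are all hypothetical, and multiple imputation is not implemented here; missing values are simply flagged as `None`.

```python
def harmonize_cohort_a(record):
    """Hypothetical cohort A stores parity as a count; recode it to a shared category."""
    count = record.get("parity")
    if count is None:
        parity = None                      # left missing, to be imputed later
    else:
        parity = "nulliparous" if count == 0 else "parous"
    return {"cohort": "A", "age": record.get("age"), "parity": parity}


def harmonize_cohort_b(record):
    """Hypothetical cohort B already stores a category; keep only shared values."""
    parity = record.get("parity_cat")
    if parity not in ("nulliparous", "parous"):
        parity = None                      # unmatched response becomes missing
    return {"cohort": "B", "age": record.get("age"), "parity": parity}


def pool(cohort_a, cohort_b):
    """Apply each cohort's recoding rule and stack the results into one dataset."""
    return ([harmonize_cohort_a(r) for r in cohort_a]
            + [harmonize_cohort_b(r) for r in cohort_b])


# Made-up records, for illustration only.
cohort_a = [{"age": 29, "parity": 0}, {"age": 34, "parity": 2}]
cohort_b = [{"age": 31, "parity_cat": "parous"}, {"age": 27, "parity_cat": "unknown"}]

pooled = pool(cohort_a, cohort_b)
for row in pooled:
    print(row)
```

Keeping a `cohort` label on every pooled record is a design choice that lets later analyses adjust for, or stratify by, the source study.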

