Sample Size Estimation in Veterinary Epidemiologic Research

2021 ◽  
Vol 7 ◽  
Author(s):  
Mark A. Stevenson

In the design of intervention and observational epidemiological studies, sample size calculations are used to estimate the minimum number of observations needed to ensure that the stated objectives of a study are met. Justification of the number of subjects enrolled in a study, and details of the assumptions and methods used to derive sample size estimates, are now a mandatory component of grant applications to funding agencies. Studies with too few subjects run the risk of failing to identify differences among treatment or exposure groups when differences do, in fact, exist. Enrolling more subjects than actually required wastes time and resources. In contrast to human epidemiological research, individual study subjects in a veterinary setting are almost always aggregated into hierarchical groups, so sample size estimates calculated using formulae that assume data independence are not appropriate. This paper provides an overview of the reasons researchers might need to calculate an appropriate sample size in veterinary epidemiology and a summary of sample size calculation methods. Two approaches are presented for dealing with lack of data independence when calculating sample sizes: (1) inflation of crude sample size estimates using a design effect; and (2) simulation-based methods. The advantage of simulation methods is that appropriate sample sizes can be estimated for complex study designs for which formula-based methods are not available. The methodological approach for simulation is described and a worked example provided.
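Approach (1) can be sketched in a few lines. The following is a minimal illustration, assuming a two-arm comparison of proportions with the standard normal-approximation formula and the usual design effect DE = 1 + (m − 1)ρ for clusters of average size m and intra-cluster correlation ρ; the numbers (30% vs. 15% outcome risk, herds of 20 animals, ICC 0.05) are hypothetical, not taken from the paper:

```python
import math
from statistics import NormalDist

def crude_two_proportion_n(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two proportions (normal approximation)."""
    z = NormalDist().inv_cdf
    z_a, z_b = z(1 - alpha / 2), z(power)
    pbar = (p1 + p2) / 2
    num = (z_a * (2 * pbar * (1 - pbar)) ** 0.5
           + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return num / (p1 - p2) ** 2

def design_effect(m, icc):
    """DE = 1 + (m - 1) * ICC for average cluster size m and intra-cluster correlation ICC."""
    return 1 + (m - 1) * icc

# Hypothetical example: 30% vs. 15% outcome risk, herds of 20 animals, ICC 0.05.
n_crude = math.ceil(crude_two_proportion_n(0.30, 0.15))          # 121 per group
n_adjusted = math.ceil(n_crude * design_effect(m=20, icc=0.05))  # 236 per group
```

Even a modest ICC of 0.05 nearly doubles the crude estimate here, which is why formulae assuming independence are inappropriate for clustered veterinary data.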

2021 ◽  
Vol 8 (3) ◽  
pp. 184
Author(s):  
Xiaoping Zhu

Background: Precise sample size estimation plays a vital role in the planning of a study, particularly for studies in which medical treatments are expensive or carry high risk.

Methods: Among the variety of sample size calculation methods for the nonparametric Mann-Whitney U test, five potential methods were selected for evaluation in this article. Method performance was evaluated against results obtained from high-precision Monte Carlo simulations.

Results: The sample size deviations from the simulated values serve as performance indicators, and the sum of the squared deviations over all scenarios is used as the criterion for ranking the five methods. For power comparisons, percentage errors relative to the simulated powers are used. Both effect size and target power have a large impact on the minimum required sample sizes.

Conclusions: Based on the ranking criterion, Shieh's method has the best performance. Noether's method always overestimates the minimum required sample size, though not severely.
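Of the five methods, Noether's is simple enough to state compactly. Below is a sketch of Noether's approximation as it is commonly given (the article's exact implementation may differ), where p = P(X > Y) is the effect size and t the fraction of the total sample allocated to group 1:

```python
from math import ceil
from statistics import NormalDist

def noether_total_n(p, alpha=0.05, power=0.80, t=0.5):
    """Total sample size for the Mann-Whitney U test via Noether's approximation.
    p = P(X > Y), the probability that an observation from group 1 exceeds
    one from group 2; t = fraction of the total sample in group 1."""
    z = NormalDist().inv_cdf
    z_a, z_b = z(1 - alpha / 2), z(power)
    return ceil((z_a + z_b) ** 2 / (12 * t * (1 - t) * (p - 0.5) ** 2))
```

With p = 0.65, equal allocation, a 5% two-sided alpha and 80% power, this gives a total of 117 subjects; the overestimation noted in the conclusions means the true requirement is typically somewhat lower.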


2005 ◽  
Vol 35 (1) ◽  
pp. 1-20 ◽  
Author(s):  
G. K. Huysamen

Criticisms of traditional null hypothesis significance testing (NHST) became more pronounced during the 1960s and reached a climax during the past decade. Among other shortcomings, NHST says nothing about the size of the population parameter of interest, and its result is influenced by sample size. Estimation of confidence intervals around point estimates of the relevant parameters, model fitting and Bayesian statistics represent some major departures from conventional NHST. Testing non-nil null hypotheses, determining the optimal sample size to uncover only substantively meaningful effect sizes, and reporting effect-size estimates may be regarded as minor extensions of NHST. Although there seems to be growing support for the estimation of confidence intervals around point estimates, it is unlikely that NHST-based procedures will disappear in the near future. In the meantime, it is widely accepted that effect-size estimates should be reported as a mandatory adjunct to conventional NHST results.
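The recommended adjuncts are easy to compute. As a concrete illustration (not tied to this article), a minimal sketch of a pooled-SD Cohen's d and a normal-approximation confidence interval for a mean difference:

```python
from statistics import mean, stdev, NormalDist

def cohens_d(x, y):
    """Cohen's d for two independent samples, using a pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / pooled_var ** 0.5

def mean_diff_ci(x, y, level=0.95):
    """Normal-approximation confidence interval for the difference in means."""
    z = NormalDist().inv_cdf((1 + level) / 2)
    diff = mean(x) - mean(y)
    se = (stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y)) ** 0.5
    return diff - z * se, diff + z * se
```

Unlike a bare p-value, the effect size and interval convey both the magnitude of the difference and the precision with which it is estimated.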


2020 ◽  
Vol 26 (Supplement_1) ◽  
pp. S9-S9
Author(s):  
Svetlana Lakunina ◽  
Zipporah Iheozor-Ejiofor ◽  
Morris Gordon ◽  
Daniel Akintelure ◽  
Vassiliki Sinopoulou

Abstract Inflammatory bowel disease is a collection of disorders of the gastrointestinal tract characterised by relapsing and remitting inflammation. Studies have reported several pharmacological and non-pharmacological interventions as effective in the management of the disease. Sample size estimation with a power calculation is necessary for a trial to detect the effect of an intervention. This project critically evaluates the sample size estimations and power calculations reported by randomised controlled studies of inflammatory bowel disease management, to judge how appropriately their results can be interpreted. We conducted a literature search in the Cochrane database to identify systematic literature reviews. Their reference lists were screened, and studies were selected if they met the inclusion criteria. Data were extracted on power calculation parameters and outcomes, and results were analysed and summarised as percentages, means and graphs. We screened almost all trials on the management of inflammatory bowel disease published in the past 25 years. 232 studies were analysed, of which 167 reported a power calculation. Less than half (48%) of these studies achieved the target sample size needed for them to conclude accurately that the interventions were effective. Moreover, the average minimal difference those studies aimed to detect was 30%, which could be insufficient to demonstrate the effect of an intervention. In conclusion, inaccurate power calculations and failure to achieve target sample sizes can lead to errors in the results on how effective an intervention is in the management of inflammatory bowel disease.
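To see why missing the target sample size matters, consider the approximate achieved power of a two-proportion z-test; the remission rates and arm sizes below are hypothetical illustrations, not figures from the review:

```python
from statistics import NormalDist

def achieved_power(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test with n subjects per arm."""
    nd = NormalDist()
    pbar = (p1 + p2) / 2
    numerator = (abs(p1 - p2) * n ** 0.5
                 - nd.inv_cdf(1 - alpha / 2) * (2 * pbar * (1 - pbar)) ** 0.5)
    return nd.cdf(numerator / (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5)

# Hypothetical remission rates of 60% vs. 30% (a 30-point difference):
# 40 patients per arm gives roughly 78% power, 25 per arm only about 57%.
```

Falling short of the planned enrolment by this margin turns a conventionally powered trial into one that will miss a real effect almost half the time.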


Author(s):  
Patrick Royston

The changes made to Royston (2018) and to power_ct are i) in section 2.4 (Sample-size calculation for the combined test), to replace ordinary least-squares (OLS) regression using regress with grouped probit regression using glm; ii) in section 4 (Examples), to revisit the worked examples of sample-size estimation in light of the revised estimation procedure; and iii) to update the help-file entry for the option n(numlist). The updated software is version 1.2.0.


1968 ◽  
Vol 27 (2) ◽  
pp. 363-367 ◽  
Author(s):  
John E. Overall ◽  
Sudhir N. Dalal

Simple empirical formulae are presented for estimating appropriate sample size for simple randomized analysis of variance designs involving 2, 3, 4 or 5 treatments. In order to use these formulae one must specify the magnitude of a meaningful treatment difference and must have an estimate of the error variance. Sample size estimates derived from the simple formulae have been found to differ from values obtained using constant power curves by no more than one sampling unit on the low side and no more than two sampling units on the high side.
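The paper's empirical formulae are not reproduced here, but the ingredients it names (a meaningful treatment difference and an estimate of the error variance) drive the standard normal-approximation analogue for the two-treatment case, sketched below as a point of comparison:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Per-group sample size to detect a mean difference delta between two
    treatments, given an error standard deviation sigma (normal approximation,
    not the paper's empirical formulae)."""
    z = NormalDist().inv_cdf
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 * (sigma / delta) ** 2)
```

Detecting a difference of 5 units against an error SD of 10 requires about 63 units per treatment; when the difference equals the SD, this falls to 16, which shows how sensitive the estimate is to the ratio σ/δ.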


2020 ◽  
Author(s):  
Evangelia Christodoulou ◽  
Maarten van Smeden ◽  
Michael Edlinger ◽  
Dirk Timmerman ◽  
Maria Wanitschek ◽  
...  

Abstract Background: We suggest an adaptive sample size calculation method for developing clinical prediction models, in which model performance is monitored sequentially as new data come in.

Methods: We illustrate the approach using data for the diagnosis of ovarian cancer (n=5914, 33% event fraction) and obstructive coronary artery disease (CAD; n=4888, 44% event fraction). We used logistic regression to develop a prediction model consisting only of a priori selected predictors and assumed linear relations for continuous predictors. We mimicked prospective patient recruitment by developing the model on 100 randomly selected patients, and we used bootstrapping to internally validate the model. We sequentially added 50 random new patients until we reached a sample size of 3000, and re-estimated model performance at each step. We examined the sample size required to satisfy the following stopping rule: a calibration slope ≥0.9 and optimism in the c-statistic (ΔAUC) ≤0.02 at two consecutive sample sizes. This procedure was repeated 500 times. We also investigated the impact of alternative modeling strategies: modeling nonlinear relations for continuous predictors, and applying Firth’s bias correction.

Results: Better discrimination was achieved in the ovarian cancer data (c-statistic 0.9 with 7 predictors) than in the CAD data (c-statistic 0.7 with 11 predictors). Adequate calibration and limited optimism in discrimination were achieved after a median of 450 patients (interquartile range 450-500) for the ovarian cancer data (22 events per parameter (EPP), 20-24), and 750 patients (700-800) for the CAD data (30 EPP, 28-33). A stricter criterion, requiring ΔAUC ≤0.01, was met with a median of 500 (23 EPP) and 1350 (54 EPP) patients, respectively. These sample sizes were much higher than the well-known rule of thumb of 10 EPP, and slightly higher than those from a recently published fixed sample size calculation method by Riley et al. Higher sample sizes were required when nonlinear relationships were modeled, and lower sample sizes when Firth’s correction was used.

Conclusions: Adaptive sample size determination can be a useful supplement to a priori sample size calculations, because it allows the sample size to be further tailored to the specific prediction modeling context in a dynamic fashion.
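The stopping rule itself is easy to express in code. A minimal sketch of the monitoring logic only (function and variable names are mine; the calibration slope and ΔAUC values at each step would come from the bootstrap internal validation, which is not shown):

```python
def stopping_n(history, min_slope=0.9, max_dauc=0.02, runs_needed=2):
    """Return the first sample size at which the performance criterion
    (calibration slope >= min_slope and c-statistic optimism <= max_dauc)
    has held at `runs_needed` consecutive monitored sample sizes, or None.
    `history` is a list of (n, calibration_slope, delta_auc) tuples in
    recruitment order."""
    run = 0
    for n, slope, dauc in history:
        run = run + 1 if (slope >= min_slope and dauc <= max_dauc) else 0
        if run >= runs_needed:
            return n
    return None
```

For example, a model whose calibration slope first clears 0.9 at n = 150 and stays adequate at n = 200 would stop recruiting at 200 under the default rule.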


2019 ◽  
Author(s):  
Joseph F. Mudge ◽  
Jeffrey E. Houlahan

Abstract Traditional study design tools for estimating appropriate sample sizes are not consistently used in ecology and can lead to low statistical power to detect biologically relevant effects. We have developed a new approach to estimating optimal sample sizes that requires only three parameters: a maximum acceptable average of α and β, a critical effect size of minimum biological relevance, and an estimate of the relative costs of Type I vs. Type II errors. This approach can be used to show the general circumstances under which different combinations of critical effect sizes and maximum acceptable combinations of α and β are attainable for different statistical tests. The optimal α sample size estimation approach can require fewer samples than traditional sample size estimation methods when the costs of Type I and II errors are assumed to be equal, but recommends comparatively more samples as Type I vs. Type II error costs become increasingly unequal. When sampling costs and absolute costs of Type I and II errors are known, optimal sample size estimation can be used to determine the smallest sample size at which the cost of an additional sample outweighs its associated reduction in errors. Optimal sample size estimation constitutes a more flexible and intuitive tool than traditional approaches, given the constraints and unknowns commonly faced by ecologists during study design.
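Under a normal-approximation two-sample test, the optimal-α idea can be sketched as a grid search: for each candidate n, find the α that minimizes the weighted average of α and β, then take the smallest n whose minimized average meets the target. This is an illustrative reading of the approach, not the authors' implementation:

```python
from statistics import NormalDist

_nd = NormalDist()

def min_avg_error(n, d, cost_ratio=1.0, grid=1000):
    """Smallest achievable weighted average of alpha and beta for a two-sided
    two-sample z-test with n per group and standardized effect size d, found
    by grid search over alpha. cost_ratio = cost(Type I) / cost(Type II)."""
    best = float("inf")
    for i in range(1, grid):
        alpha = i / grid
        # Power approximation ignoring the far rejection tail.
        beta = _nd.cdf(_nd.inv_cdf(1 - alpha / 2) - d * (n / 2) ** 0.5)
        best = min(best, (cost_ratio * alpha + beta) / (cost_ratio + 1))
    return best

def optimal_n(d, target, cost_ratio=1.0):
    """Smallest per-group n whose minimized average error does not exceed target."""
    n = 2
    while min_avg_error(n, d, cost_ratio) > target:
        n += 1
    return n
```

Because β shrinks as n grows for any fixed α, the minimized average falls monotonically in n, so the first n to meet the target is the optimum.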


2015 ◽  
Author(s):  
Michael V. Lombardo ◽  
Bonnie Auyeung ◽  
Rosemary J. Holt ◽  
Jack Waldman ◽  
Amber N. V. Ruigrok ◽  
...  

Abstract Functional magnetic resonance imaging (fMRI) research is routinely criticized for being statistically underpowered due to characteristically small sample sizes, and much larger sample sizes are increasingly being recommended. Additionally, various sources of artifact inherent in fMRI data can have a detrimental impact on effect size estimates and statistical power. Here we show how targeted removal of non-BOLD artifacts can improve effect size estimation and statistical power in task-fMRI contexts, with particular application to the social-cognitive domain of mentalizing/theory of mind. Non-BOLD variability is identified and removed in a biophysically and statistically principled manner by combining multi-echo fMRI acquisition and independent components analysis (ME-ICA). Group-level effect size estimates on two different mentalizing tasks were enhanced by ME-ICA at a median rate of 24% in regions canonically associated with mentalizing, while much more substantial boosts (40-149%) were observed in non-canonical cerebellar areas. This effect size boosting is primarily a consequence of the reduction of non-BOLD noise at the subject level, which translates into reductions in between-subject variance at the group level. Power simulations demonstrate that the enhanced effect sizes enable highly powered studies at traditional sample sizes. The cerebellar effects observed after applying ME-ICA may be unobservable with conventional imaging at traditional sample sizes. Thus, ME-ICA allows for principled, design-agnostic non-BOLD artifact removal that can substantially improve effect size estimates and statistical power in task-fMRI contexts, could help address issues of statistical power and non-BOLD noise, and could enable discovery of aspects of brain organization that are currently under-appreciated and not well understood.
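The link between effect size boosting and sample size can be made concrete with a normal-approximation one-sample formula (illustrative only; the paper's power simulations are more elaborate, and the baseline d = 0.5 below is a hypothetical value). A 24% boost to the effect size cuts the required n by roughly a third:

```python
from math import ceil
from statistics import NormalDist

def one_sample_n(d, alpha=0.05, power=0.80):
    """Sample size for a one-sample (group-level) test of standardized effect
    size d, using the normal approximation."""
    z = NormalDist().inv_cdf
    return ceil(((z(1 - alpha / 2) + z(power)) / d) ** 2)

n_before = one_sample_n(0.5)         # 32 subjects
n_after = one_sample_n(0.5 * 1.24)   # 21 subjects after a 24% effect size boost
```

Since required n scales with 1/d², even modest denoising gains compound into substantial savings in scan time.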


2021 ◽  
Vol 1 (2) ◽  
pp. 47-63
Author(s):  
Xiaohong Li ◽  
Shesh N. Rai ◽  
Eric C. Rouchka ◽  
Timothy E. O’Toole ◽  
Nigel G. F. Cooper

Sample size calculation for adequate power is critical in optimizing RNA-seq experimental design. However, the complexity of directly estimating sample size increases when confounding covariates must be taken into consideration. Although a number of approaches to sample size calculation have been proposed for RNA-seq data, most ignore any potential heterogeneity. In this study, we implemented a simulation-based, confounder-adjusted method to provide sample size recommendations for RNA-seq differential expression analysis. The data were generated by Monte Carlo simulation, given an underlying distribution of confounding covariates and parameters for a negative binomial distribution. The relationships between sample size and power, and parameters such as dispersion, fold change and mean read counts, can be visualized. We demonstrate that the adjusted sample size for a desired power and type I error rate α is usually larger when confounding covariates are taken into account. More importantly, our simulation study reveals that sample size may be underestimated by existing methods if a confounding covariate exists in RNA-seq data, and this underestimate could affect the detection power of the differential expression analysis. Therefore, we introduce confounding covariates into sample size estimation for heterogeneous RNA-seq data.
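A stripped-down version of such a simulation is sketched below, assuming a gamma-Poisson (negative binomial) count generator and a two-level batch covariate that multiplies the mean; all parameter values are illustrative, and a simple z-test on log2 counts stands in for a real differential-expression test:

```python
import math
import random
from statistics import mean, stdev, NormalDist

def rnbinom(mu, dispersion, rng):
    """Negative binomial draw as a gamma-Poisson mixture (mean mu, size 1/dispersion)."""
    lam = rng.gammavariate(1.0 / dispersion, mu * dispersion)
    if lam > 50:  # normal approximation for large means (avoids exp underflow)
        return max(0, round(rng.gauss(lam, lam ** 0.5)))
    # Knuth's Poisson sampler for small means
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulated_power(n, mu=100.0, fold_change=2.0, dispersion=0.1,
                    batch_effect=1.5, alpha=0.05, sims=400, seed=1):
    """Monte Carlo power to detect a fold change between two groups of n samples,
    where alternating samples come from a 'batch' that multiplies the mean
    (a simple confounding covariate). A z-test on log2 counts stands in for a
    full differential-expression test."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(sims):
        def group(m):
            return [math.log2(1 + rnbinom(m * (batch_effect if i % 2 else 1.0),
                                          dispersion, rng))
                    for i in range(n)]
        a, b = group(mu), group(mu * fold_change)
        se = (stdev(a) ** 2 / n + stdev(b) ** 2 / n) ** 0.5
        if abs(mean(a) - mean(b)) / se > z_crit:
            hits += 1
    return hits / sims
```

Sweeping n and reading off the smallest value whose estimated power exceeds the target gives a confounder-adjusted sample size; removing the batch effect from the simulation shows how much smaller (and over-optimistic) the unadjusted recommendation would be.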


Nutrients ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 2309
Author(s):  
Luisa Finkeldey ◽  
Elena Schmitz ◽  
Sabine Ellinger

Epidemiological studies suggest that a high intake of soy isoflavones may protect against breast cancer, but causal relationships can only be established by experimental trials. Thus, we aimed to provide a systematic review of randomized controlled trials (RCTs) on the effect of isoflavone intake on risk factors for breast cancer in healthy subjects. After a systematic literature search in PubMed, 18 different RCTs with pre- and/or postmenopausal women were included and examined in detail according to the PRISMA guideline. In these studies, isoflavones were provided by soy food or supplements in amounts of 36.5–235 mg/d for periods of 1–36 months. Breast density, estrogens including precursors and metabolites, estrogen response such as length of menstrual cycle, and markers of proliferation and inflammation were considered. In most studies, however, no differences were detectable between isoflavone and control/placebo treatment despite good adherence to the isoflavone treatment, irrespective of the kind of intervention, the dose of isoflavones used, and the duration of treatment. The lack of significant changes in most studies does not, however, prove a lack of effect, as a sample size calculation was often missing. Taking into account the risk of bias and methodological limitations, there is little evidence that isoflavone treatment modulates risk factors of breast cancer in pre- and postmenopausal women. Future studies should calculate the sample size needed to detect possible effects and should attend to methodological details to improve study quality.

