MOPower: an R-shiny application for the simulation and power calculation of multi-omics studies.

2021 ◽  
Author(s):  
Hamzah Syed ◽  
Georg W Otto ◽  
Daniel Kelberman ◽  
Chiara Bacchelli ◽  
Philip L Beales

Background: Multi-omics studies are increasingly used to help understand the underlying mechanisms of clinical phenotypes, integrating information from the genome, transcriptome, epigenome, metabolome, proteome and microbiome. This integration of data is of particular use in rare disease studies, where sample sizes are often relatively small. Methods development for multi-omics studies is in its early stages due to the complexity of the different individual data types. There is a need for software that performs data simulation and power calculation for multi-omics studies, both to test these different methodologies and to help calculate sample size before the initiation of a study. Such software, in turn, will optimise the success of a study. Results: The interactive R shiny application MOPower described below simulates data for three different omics using statistical distributions. It calculates the power to detect an association with the phenotype by analysing a user-specified number of simulation replicates with a variety of the latest multi-omics analysis models and packages. The simulation study confirms the efficiency of the software when handling thousands of simulations over ten different sample sizes. The average time elapsed for a power calculation run was approximately 500 seconds across integration models. Additionally, for a given study design, power varied as the number of features increased, affecting each method differently. For example, MOFA showed an increase in power to detect an association when the study sample size matched the number of features. Conclusions: MOPower addresses the need for flexible and user-friendly software that undertakes power calculations for multi-omics studies. MOPower offers users a wide variety of integration methods to test and full customisation of omics features to cover a range of study designs.
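The abstract does not include code, but the simulate-then-replicate logic it describes can be illustrated with a minimal R sketch: simulate a feature, test its association with the phenotype, and take the proportion of significant replicates as the estimated power. This is not MOPower's actual code; the single normally distributed feature, the logistic-regression association test, and all parameter values are illustrative assumptions.

# Minimal sketch of simulation-based power estimation (not MOPower's code).
# One normally distributed omics feature, a binary phenotype, and a logistic
# association test; effect size and alpha are illustrative assumptions.
simulate_power <- function(n, effect = 0.5, n_rep = 1000, alpha = 0.05) {
  p_values <- replicate(n_rep, {
    x <- rnorm(n)                                               # simulated omics feature
    y <- rbinom(n, size = 1, prob = plogis(-0.5 + effect * x))  # binary phenotype
    fit <- glm(y ~ x, family = binomial)
    summary(fit)$coefficients["x", "Pr(>|z|)"]                  # p-value for the feature
  })
  mean(p_values < alpha)                                        # proportion significant = power
}

sapply(c(50, 100, 200), simulate_power)                         # estimated power by sample size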

Methodology ◽  
2019 ◽  
Vol 15 (3) ◽  
pp. 128-136
Author(s):  
Jiin-Huarng Guo ◽  
Hubert J. Chen ◽  
Wei-Ming Luh

Abstract. Equivalence tests (also known as similarity or parity tests) have become increasingly popular alongside equality tests. However, in testing the equivalence of two population means, approximate sample sizes developed using the conventional techniques found in the literature on this topic have usually been underestimated, yielding less statistical power than required. In this paper, the authors first address the reason for this problem and then provide a solution using an exhaustive local search algorithm to find the optimal sample size. The proposed method is not only accurate but also flexible, so that unequal variances or sampling unit costs for different groups can be accommodated using different sample size allocations. Figures and a numerical example are presented to demonstrate various configurations. An R Shiny App is also available for easy use (https://optimal-sample-size.shinyapps.io/equivalence-of-means/).
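To make the idea of searching over sample sizes concrete, the sketch below estimates the power of a two one-sided tests (TOST) equivalence procedure by simulation and increments the per-group n until a target power is reached. It is a simplified brute-force illustration, not the authors' exhaustive local search algorithm; the equivalence margin, variances, and target power are assumed values.

# Simulation-based power of a TOST equivalence test of two means, and a
# brute-force search for the smallest per-group n reaching 80% power.
# Margin, standard deviations, and true difference are placeholder values.
tost_power <- function(n, delta = 0, margin = 0.5, sd1 = 1, sd2 = 1,
                       alpha = 0.05, n_rep = 1000) {
  rejections <- replicate(n_rep, {
    x <- rnorm(n, mean = 0,     sd = sd1)
    y <- rnorm(n, mean = delta, sd = sd2)
    p_lower <- t.test(x, y, mu = -margin, alternative = "greater")$p.value
    p_upper <- t.test(x, y, mu =  margin, alternative = "less")$p.value
    max(p_lower, p_upper) < alpha      # both one-sided tests must reject
  })
  mean(rejections)
}

n <- 10
while (tost_power(n) < 0.80) n <- n + 1   # smallest per-group n (approximate)
n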


2020 ◽  
Vol 112 (4) ◽  
pp. 920-925
Author(s):  
Wei Zhang ◽  
Aiyi Liu ◽  
Zhiwei Zhang ◽  
Tonja Nansel ◽  
Susan Halabi

ABSTRACT Dietary interventions often target foods that are underconsumed relative to dietary guidelines, such as vegetables, fruits, and whole grains. Because these foods are consumed only episodically by some participants, data from such a study often contain a disproportionately large number of zeros, due to study participants who do not consume any of the target foods on the days that dietary intake is assessed, thus generating semicontinuous data. These zeros need to be properly accounted for when calculating sample sizes to ensure that the study is adequately powered to detect a meaningful intervention effect size. Nonetheless, this issue has not been well addressed in the literature. Instead, methods that are common for continuous outcomes are typically used to compute the sample sizes, resulting in a substantially under- or overpowered study. We propose proper approaches to calculating the sample size needed for dietary intervention studies that target episodically consumed foods. Sample size formulae are derived for detecting the mean difference in the amount of intake of an episodically consumed food between an intervention and a control group. Numerical studies are conducted to investigate the accuracy of the sample size formulae compared with the ad hoc methods. The simulation results show that the proposed formulae are appropriate for estimating the sample sizes needed to achieve the desired power for the study. The proposed sample size method is recommended for designing dietary intervention studies targeting episodically consumed foods.
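The derived formulae are not given in the abstract, but the design problem can be illustrated by simulating a semicontinuous outcome, a mixture of structural zeros and log-normal positive intakes, and estimating power for a chosen per-arm sample size. The consumption probability, intake distribution, and effect size below are assumptions, and a plain two-sample t-test on the raw intakes stands in for the authors' proposed approach.

# Simulation sketch for a zero-inflated (semicontinuous) intake outcome:
# a participant consumes the target food with probability p_consume, and
# positive intakes are log-normal. All parameter values are illustrative.
power_semicontinuous <- function(n_per_arm, p_consume = 0.4, log_mean = 0,
                                 log_sd = 1, effect = 0.3, alpha = 0.05,
                                 n_rep = 2000) {
  sim_arm <- function(n, shift) {
    consumed <- rbinom(n, size = 1, prob = p_consume)
    consumed * rlnorm(n, meanlog = log_mean + shift, sdlog = log_sd)
  }
  rejections <- replicate(n_rep, {
    control <- sim_arm(n_per_arm, shift = 0)
    treated <- sim_arm(n_per_arm, shift = effect)
    t.test(control, treated)$p.value < alpha   # mean difference in intake
  })
  mean(rejections)
}

power_semicontinuous(150)   # estimated power for 150 participants per arm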


2021 ◽  
Vol 13 (3) ◽  
pp. 368
Author(s):  
Christopher A. Ramezan ◽  
Timothy A. Warner ◽  
Aaron E. Maxwell ◽  
Bradley S. Price

The size of the training data set is a major determinant of classification accuracy. Nevertheless, the collection of a large training data set for supervised classifiers can be a challenge, especially for studies covering a large area, which may be typical of many real-world applied projects. This work investigates how variations in training set size, ranging from a large sample size (n = 10,000) to a very small sample size (n = 40), affect the performance of six supervised machine-learning algorithms applied to classify large-area high-spatial-resolution (HR) (1–5 m) remotely sensed data within the context of a geographic object-based image analysis (GEOBIA) approach. GEOBIA, in which adjacent similar pixels are grouped into image-objects that form the unit of the classification, offers the potential benefit of allowing multiple additional variables, such as measures of object geometry and texture, to be included, thus increasing the dimensionality of the classification input data. The six supervised machine-learning algorithms are support vector machines (SVM), random forests (RF), k-nearest neighbors (k-NN), single-layer perceptron neural networks (NEU), learning vector quantization (LVQ), and gradient-boosted trees (GBM). RF, the algorithm with the highest overall accuracy, was notable for its negligible decrease in overall accuracy, 1.0%, when training sample size decreased from 10,000 to 315 samples. GBM provided similar overall accuracy to RF; however, the algorithm was very expensive in terms of training time and computational resources, especially with large training sets. In contrast to RF and GBM, NEU and SVM were particularly sensitive to decreasing sample size, with NEU classifications generally producing overall accuracies that were on average slightly higher than SVM classifications for larger sample sizes, but lower than SVM for the smallest sample sizes. NEU, however, required a longer processing time. The k-NN classifier saw less of a drop in overall accuracy than NEU and SVM as training set size decreased; however, the overall accuracies of k-NN were typically lower than those of the RF, NEU, and SVM classifiers. LVQ generally had the lowest overall accuracy of all six methods, but was relatively insensitive to sample size, down to the smallest sample sizes. Overall, due to its relatively high accuracy with small training sample sets, minimal variation in overall accuracy between very large and small sample sets, and relatively short processing time, RF was a good classifier for large-area land-cover classifications of HR remotely sensed data, especially when training data are scarce. However, as the performance of different supervised classifiers varies in response to training set size, investigating multiple classification algorithms is recommended to achieve optimal accuracy for a project.
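The experimental design, training a classifier on progressively smaller training sets and scoring it on a fixed test set, can be sketched as follows. The randomForest package and the built-in iris data are used purely as stand-ins for the GEOBIA image-object features and the classifiers evaluated in the study.

# Sketch: overall accuracy of a random forest as the training set shrinks.
# iris is a stand-in data set; the study used GEOBIA image-object features.
library(randomForest)

set.seed(1)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]

accuracy_at <- function(n_train) {
  sub <- train[sample(nrow(train), n_train), ]   # random subset of training data
  fit <- randomForest(Species ~ ., data = sub)
  mean(predict(fit, test) == test$Species)       # overall accuracy on test set
}

sapply(c(100, 60, 40, 20), accuracy_at)          # accuracy by training set size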


Biometrics ◽  
2004 ◽  
Vol 60 (4) ◽  
pp. 1015-1024 ◽  
Author(s):  
Jianwen Cai ◽  
Donglin Zeng

2013 ◽  
Vol 113 (1) ◽  
pp. 221-224 ◽  
Author(s):  
David R. Johnson ◽  
Lauren K. Bachan

In a recent article, Regan, Lakhanpal, and Anguiano (2012) highlighted the lack of evidence for different relationship outcomes between arranged and love-based marriages. Yet the sample size (n = 58) used in the study is insufficient for making such inferences. This reply discusses and demonstrates how small sample sizes reduce the utility of this research.
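As a rough illustration of the point, assuming the 58 participants fall into two roughly equal groups compared with a two-sample t-test (an assumption; the original analyses may differ), base R's power.t.test shows the power available for a medium standardized effect and the sample size that would be needed for 80% power.

# Power with ~29 per group for a medium effect (d = 0.5), and the per-group n
# required for 80% power under the same assumptions.
power.t.test(n = 29, delta = 0.5, sd = 1, sig.level = 0.05)
power.t.test(power = 0.80, delta = 0.5, sd = 1, sig.level = 0.05)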


2012 ◽  
Vol 2012 ◽  
pp. 1-8 ◽  
Author(s):  
Louis M. Houston

We derive a general equation for the probability that a measurement falls within a range of n standard deviations from an estimate of the mean. In doing so, we provide a format compatible with a confidence interval centered about the mean that is naturally independent of the sample size. The equation is derived by interpolating theoretical results for extreme sample sizes. The intermediate value of the equation is confirmed with a computational test.
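For orientation, the familiar large-sample (normal) limit of this probability is easy to compute in R; the article's equation, which additionally interpolates over sample size, is not reproduced here.

# Probability that a measurement lies within n standard deviations of the mean,
# in the normal limit.
within_n_sd <- function(n) pnorm(n) - pnorm(-n)
sapply(1:3, within_n_sd)   # approximately 0.683, 0.954, 0.997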


2019 ◽  
Author(s):  
Peter E Clayson ◽  
Kaylie Amanda Carbine ◽  
Scott Baldwin ◽  
Michael J. Larson

Methodological reporting guidelines for studies of event-related potentials (ERPs) were updated in Psychophysiology in 2014. These guidelines facilitate the communication of key methodological parameters (e.g., preprocessing steps). Failing to report key parameters represents a barrier to replication efforts, and difficulty with replicability increases in the presence of small sample sizes and low statistical power. We assessed whether guidelines are followed and estimated the average sample size and power in recent research. Reporting behavior, sample sizes, and statistical designs were coded for 150 randomly sampled articles published from 2011 to 2017 in five high-impact journals that frequently publish ERP research. An average of 63% of guidelines were reported, and reporting behavior was similar across journals, suggesting that gaps in reporting are a shortcoming of the field rather than of any specific journal. Publication of the guidelines paper had no impact on reporting behavior, suggesting that editors and peer reviewers are not enforcing these recommendations. The average sample size per group was 21. Statistical power was conservatively estimated as .72-.98 for a large effect size, .35-.73 for a medium effect, and .10-.18 for a small effect. These findings indicate that failure to follow key reporting guidelines is ubiquitous and that ERP studies are primarily powered to detect large effects. Such low power and inconsistent adherence to reporting guidelines represent substantial barriers to replication efforts. The methodological transparency and replicability of studies can be improved by the open sharing of processing code and experimental tasks and by a priori sample size calculations to ensure adequately powered studies.
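The lower ends of the reported power ranges can be roughly reproduced with a standard two-sample power calculation for 21 participants per group and Cohen's d of 0.8, 0.5, and 0.2; the upper ends of the ranges, which depend on other design assumptions, are not modelled in this sketch.

# Approximate power for a between-group comparison with 21 participants per group.
sapply(c(large = 0.8, medium = 0.5, small = 0.2), function(d)
  power.t.test(n = 21, delta = d, sd = 1, sig.level = 0.05)$power)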


2019 ◽  
Author(s):  
Patrick Bergman ◽  
Maria Hagströmer

Abstract BACKGROUND Measuring physical activity and sedentary behavior accurately remains a challenge. When describing the uncertainty of mean values or when making group comparisons, minimising the Standard Error of the Mean (SEM) is important. The sample size and the number of repeated observations within each subject influence the size of the SEM. In this study we investigated how different combinations of sample sizes and repeated observations influence the magnitude of the SEM. METHODS A convenience sample was asked to wear an accelerometer for 28 consecutive days. Based on the within- and between-subject variances, the SEM was calculated for different combinations of sample sizes and numbers of monitored days. RESULTS Fifty subjects (67% women, mean±SD age 41±19 years) were included. The analyses showed, independent of physical activity intensity level and measurement protocol design, that the largest reductions in SEM were seen as the sample size increased. Reductions of the same magnitude were not seen when increasing the number of repeated measurement days within each subject. CONCLUSION The most effective way of reducing the SEM is to have a large sample size rather than a long observation period within each individual. Even though the importance of reducing the SEM to increase the power to detect differences between groups is well known, it is seldom considered when developing protocols for accelerometer-based research. The results presented herein therefore serve to highlight this fact, and have the potential to stimulate debate and challenge current best-practice recommendations in accelerometer-based physical activity research.
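The trade-off described here follows from the standard variance decomposition: with n subjects each measured on k days, the squared standard error of the group mean is var_between/n + var_within/(n*k), so adding subjects shrinks both terms while adding days shrinks only the second. The sketch below uses illustrative variance values, not those estimated in the study.

# SEM of a group mean for n subjects each monitored on k days.
# Between- and within-subject variances are illustrative placeholders.
sem <- function(n, k, var_between = 100, var_within = 50) {
  sqrt(var_between / n + var_within / (n * k))
}

outer(c(25, 50, 100), c(1, 4, 7, 28), sem)   # rows: sample size; columns: days worn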


2020 ◽  
Author(s):  
Miles D. Witham ◽  
James Wason ◽  
Richard M Dodds ◽  
Avan A Sayer

Abstract Introduction Frailty is the loss of ability to withstand a physiological stressor and is associated with multiple adverse outcomes in older people. Trials to prevent or ameliorate frailty are in their infancy. A range of different outcome measures has been proposed, but current measures either require large sample sizes or long follow-up, or do not directly measure the construct of frailty. Methods We propose a composite outcome for frailty prevention trials, comprising progression to the frail state, death, or being too unwell to continue in a trial. To determine likely event rates, we used data from the English Longitudinal Study of Ageing, collected 4 years apart. We calculated transition rates between the non-frail, prefrail and frail states, or loss to follow-up due to death or illness. We used Markov state transition models to interpolate one- and two-year transition rates, and performed sample size calculations for a range of differences in transition rates using simple and composite outcomes. Results The frailty category was calculable for 4650 individuals at baseline (2226 non-frail, 1907 prefrail, 517 frail); at follow-up, 1282 were non-frail, 1108 were prefrail, 318 were frail and 1936 had dropped out or were unable to complete all tests for frailty. Transition probabilities for those prefrail at baseline, measured at wave 4, were 0.176, 0.286, 0.096 and 0.442 to non-frail, prefrail, frail and dead/dropped out, respectively. Interpolated transition probabilities were 0.159, 0.494, 0.113 and 0.234 at two years, and 0.108, 0.688, 0.087 and 0.117 at one year. Required sample sizes for a two-year outcome were between 1000 and 7200 for transition from prefrailty to frailty alone, 250 to 1600 for transition to the composite measure, and 75 to 350 using the composite measure with an ordinal logistic regression approach. Conclusion Use of a composite outcome for frailty trials offers reduced sample sizes and could ameliorate the effect of the high loss to follow-up inherent in such trials due to death and illness.
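The interpolation step can be illustrated with a matrix root of the 4-year transition matrix. In the sketch below only the prefrail row uses the wave-4 probabilities reported above; the other rows are placeholders, so the output will not reproduce the interpolated values in the abstract, and the eigendecomposition shown is just one way to take a matrix root.

# Approximate one- and two-year transition matrices from a 4-year matrix via an
# eigendecomposition-based matrix root. States: non-frail, prefrail, frail,
# dead/dropped out. Only the prefrail row comes from the abstract; the other
# rows are placeholder values for illustration.
P4 <- rbind(
  nonfrail = c(0.60,  0.25,  0.05,  0.10),   # placeholder row
  prefrail = c(0.176, 0.286, 0.096, 0.442),  # reported wave-4 probabilities
  frail    = c(0.05,  0.15,  0.40,  0.40),   # placeholder row
  dead     = c(0,     0,     0,     1))      # absorbing state
colnames(P4) <- rownames(P4)

eig <- eigen(P4)
P1 <- Re(eig$vectors %*% diag(eig$values^(1/4)) %*% solve(eig$vectors))
dimnames(P1) <- dimnames(P4)                 # approximate one-year matrix
P2 <- P1 %*% P1                              # approximate two-year matrix
round(P2["prefrail", ], 3)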


