Comparison of statistical methods for analysis of small sample sizes for detecting the differences in efficacy between treatments for knee osteoarthritis

2020 ◽  
Author(s):  
Chia-Lung Shih ◽  
Te-Yu Hung

Abstract Background A small sample size (n < 30 per treatment group) is usually enrolled to investigate differences in efficacy between treatments for knee osteoarthritis (OA). The objective of this study was to use simulation to compare the power of four statistical methods for analyzing small samples when detecting differences in efficacy between two treatments for knee OA. Methods A total of 10,000 replicates at 5 sample sizes (n = 10, 15, 20, 25, and 30 per group) were generated based on previously reported measures of treatment efficacy. Four statistical methods were used to compare the differences in efficacy between treatments: the two-sample t-test (t-test), the Mann-Whitney U-test (M-W test), the Kolmogorov-Smirnov test (K-S test), and the permutation test (perm-test). Results The bias of the simulated parameter means decreased with sample size, but their CV% varied with sample size for all parameters. At the largest sample size (n = 30), the CV% reached a low level (<20%) for almost all parameters but the bias did not. Among the non-parametric tests for small samples, the perm-test had the highest statistical power, and its false positive rate was not affected by sample size. However, the power of the perm-test did not reach 80% even at the largest sample size (n = 30). Conclusion The perm-test is recommended for comparing the differences in efficacy between two treatments for knee OA when sample sizes are small.
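The power-by-simulation design this abstract describes can be sketched in a few lines. This is a minimal illustration, not the study's code: the effect size, replicate counts, and normal data-generating model are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def perm_test_p(x, y, n_perm=500):
    """Two-sided permutation test p-value for a difference in group means."""
    observed = abs(x.mean() - y.mean())
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel group membership at random
        if abs(pooled[:len(x)].mean() - pooled[len(x):].mean()) >= observed:
            count += 1
    return count / n_perm

def estimated_power(n, effect=1.0, n_rep=100, alpha=0.05):
    """Fraction of simulated trials in which the perm-test rejects H0."""
    rejections = 0
    for _ in range(n_rep):
        x = rng.normal(0.0, 1.0, n)       # control group
        y = rng.normal(effect, 1.0, n)    # treated group, shifted mean
        if perm_test_p(x, y) < alpha:
            rejections += 1
    return rejections / n_rep

print(estimated_power(10))  # power at n = 10 per group
print(estimated_power(30))  # power at n = 30 per group
```

Repeating this over each sample size and test gives power curves of the kind the study compares.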

1990 ◽  
Vol 47 (1) ◽  
pp. 2-15 ◽  
Author(s):  
Randall M. Peterman

Ninety-eight percent of recently surveyed papers in fisheries and aquatic sciences that did not reject some null hypothesis (H0) failed to report β, the probability of making a type II error (not rejecting H0 when it should have been), or statistical power (1 – β). However, 52% of those papers drew conclusions as if H0 were true. A false H0 could have been missed because of a low-power experiment, caused by small sample size or large sampling variability. Costs of type II errors can be large (for example, for cases that fail to detect harmful effects of some industrial effluent or a significant effect of fishing on stock depletion). Past statistical power analyses show that abundance estimation techniques usually have high β and that only large effects are detectable. I review relationships among β, power, detectable effect size, sample size, and sampling variability. I show how statistical power analysis can help interpret past results and improve designs of future experiments, impact assessments, and management regulations. I make recommendations for researchers and decision makers, including routine application of power analysis, more cautious management, and reversal of the burden of proof to put it on industry, not management agencies.
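The relationships Peterman reviews among β, power, effect size, sample size, and variability can be made concrete with the usual normal approximation for a two-sample comparison of means. The numeric values below are illustrative, not taken from the paper.

```python
from math import sqrt
from scipy.stats import norm

def approx_power(effect, sigma, n, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided two-sample z-test,
    with per-group sample size n and common standard deviation sigma."""
    z_crit = norm.ppf(1 - alpha / 2)
    noncentrality = effect / (sigma * sqrt(2.0 / n))
    # Sum of the rejection probabilities in both tails.
    return norm.cdf(noncentrality - z_crit) + norm.cdf(-noncentrality - z_crit)

# For a fixed effect, power rises with n; small, noisy studies have high beta.
for n in (10, 50, 200):
    print(n, round(approx_power(effect=0.5, sigma=1.0, n=n), 3))
```

Inverting this relation (solving for n at a target power) is the routine power analysis the author recommends before concluding that a non-significant result supports H0.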


2020 ◽  
Vol 57 (2) ◽  
pp. 237-251
Author(s):  
Achilleas Anastasiou ◽  
Alex Karagrigoriou ◽  
Anastasios Katsileros

Summary The normal distribution is considered one of the most important distributions, with numerous applications in various fields, including the agricultural sciences. The purpose of this study is to evaluate the most popular normality tests, comparing their performance in terms of size (type I error) and power against a large spectrum of alternative distributions, using simulations for various sample sizes and significance levels as well as empirical data from agricultural experiments. The simulation results show that the power of all normality tests is low for small sample sizes, but as the sample size increases, the power increases as well. The results also show that the Shapiro–Wilk test is powerful over a wide range of alternative distributions and sample sizes, especially for asymmetric distributions. Moreover, the D'Agostino–Pearson omnibus test is powerful for small sample sizes against symmetric alternative distributions, while the same is true of the kurtosis test for moderate and large sample sizes.
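The two leading tests from this comparison are available in scipy.stats. The sketch below applies them to a normal and an asymmetric (log-normal) sample; the sample size and seed are illustrative choices, not the study's full simulation design.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 100

normal_sample = rng.normal(size=n)        # H0 true: data are normal
lognormal_sample = rng.lognormal(size=n)  # asymmetric alternative

for name, sample in [("normal", normal_sample), ("log-normal", lognormal_sample)]:
    sw_stat, sw_p = stats.shapiro(sample)        # Shapiro-Wilk
    dp_stat, dp_p = stats.normaltest(sample)     # D'Agostino-Pearson omnibus
    print(f"{name}: Shapiro-Wilk p={sw_p:.4f}, D'Agostino-Pearson p={dp_p:.4f}")
```

Running this over many replicates and counting rejections at a fixed significance level estimates the size (on the normal sample) and the power (on the alternatives), which is the study's comparison in miniature.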


Methodology ◽  
2016 ◽  
Vol 12 (2) ◽  
pp. 61-71 ◽  
Author(s):  
Antoine Poncet ◽  
Delphine S. Courvoisier ◽  
Christophe Combescure ◽  
Thomas V. Perneger

Abstract. Many applied researchers are taught to use the t-test when distributions appear normal and/or sample sizes are large, and non-parametric tests otherwise, and fear inflated error rates if the “wrong” test is used. In a simulation study (four tests: t-test, Mann-Whitney test, robust t-test, permutation test; seven sample sizes between 2 × 10 and 2 × 500; four distributions: normal, uniform, log-normal, bimodal; under both the null and alternative hypotheses), we show that type I errors are well controlled in all conditions. The t-test is most powerful under the normal and uniform distributions, the Mann-Whitney test under the log-normal distribution, and the robust t-test under the bimodal distribution. Importantly, even the t-test was more powerful under asymmetric distributions than under the normal distribution for the same effect size. It appears that normality and sample size do not matter for the selection of a test to compare two groups of the same size and variance. The researcher can opt for the test that best fits the scientific hypothesis, without fear of poor test performance.
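The null-hypothesis arm of such a simulation (both groups drawn from the same distribution) is easy to reproduce. This is a minimal sketch with an illustrative replicate count, showing that the rejection rate of both tests stays near the nominal level.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def type1_rate(test, n=10, n_rep=1000, alpha=0.05):
    """Estimate the type I error rate of a two-sample test under H0."""
    rejections = 0
    for _ in range(n_rep):
        x = rng.normal(size=n)  # both groups from the same distribution
        y = rng.normal(size=n)
        if test(x, y).pvalue < alpha:
            rejections += 1
    return rejections / n_rep

print(type1_rate(stats.ttest_ind))     # should be close to alpha = 0.05
print(type1_rate(stats.mannwhitneyu))  # likewise well controlled
```

Repeating this with uniform, log-normal, and bimodal generators in place of `rng.normal` covers the study's null conditions.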


2016 ◽  
Vol 74 (6) ◽  
pp. 1723-1734 ◽  
Author(s):  
David M. Kaplan ◽  
Marion Cuif ◽  
Cécile Fauvelot ◽  
Laurent Vigliola ◽  
Tri Nguyen-Huu ◽  
...  

Abstract Despite major advances in our capacity to measure marine larval connectivity (i.e. the pattern of transport of marine larvae from spawning to settlement sites) and the importance of these measurements for ecological and management questions, uncertainty in experimental estimates of marine larval connectivity has received little attention. We review potential sources of uncertainty in empirical larval connectivity studies and develop Bayesian statistical methods for estimating these uncertainties, based on standard techniques in the mark-recapture and genetics literature. These methods are implemented in an existing R package for working with connectivity data, ConnMatTools, and applied to a number of published connectivity estimates. We find that the small sample size of collected settlers at destination sites is a dominant source of uncertainty in many published connectivity estimates. For example, the widths of 95% CIs for relative connectivity, a value necessarily between 0 and 1, exceeded 0.5 for many published connectivity results, making it difficult to use individual results to conclude that marine populations are relatively closed or open. This “small sample size” uncertainty is significant even for studies with near-exhaustive sampling of spawners and settlers. Though largely ignored in the literature, the magnitude of this uncertainty is straightforward to assess. Better accounting for this and other uncertainties is needed so that marine larval connectivity studies can fulfill their promise of providing important ecological insights and informing management questions (e.g. related to marine protected area network design and the stock structure of exploited organisms).
In addition to using the statistical methods developed here, future studies should consistently evaluate and report a small number of critical factors, such as the exhaustiveness of spawner and settler sampling, and the mating structure of target species in genetic studies.
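The "small sample size" effect on interval width can be illustrated with a simple Bayesian model for a proportion: under a uniform prior, observing k marked settlers out of n sampled gives a Beta(k + 1, n - k + 1) posterior. The numbers below are illustrative and are not taken from ConnMatTools or the paper.

```python
from scipy.stats import beta

def ci_width(k, n, level=0.95):
    """Width of the central credible interval for a proportion
    estimated from k successes in n trials (uniform prior)."""
    posterior = beta(k + 1, n - k + 1)
    lo, hi = posterior.interval(level)
    return hi - lo

# The same observed proportion (50%) carries very different uncertainty:
print(ci_width(5, 10))    # few settlers: interval spans roughly half of [0, 1]
print(ci_width(50, 100))  # more settlers: interval is much narrower
```

This is why a handful of sampled settlers cannot, on its own, distinguish a relatively open population from a relatively closed one.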


Author(s):  
Derek Stephens ◽  
Diana J. Schwerha

The purpose of this study was to determine if safety professionals can use an ergonomic intervention costing calculator, which integrates performance and quality data into the costing matrix, to increase communication and improve decision making within the company. The sample comprised 9 participants: four safety managers, four EHS managers, and one HR generalist. Results showed that all participants found the calculator very useful and well integrated, and that it increased communication across the company. The mean System Usability Scale (SUS) score was 82, which rates the software as acceptable for use. Recommendations from this study include adding some additional features to the calculator, increasing awareness and availability of the calculator, and conducting further analysis using larger sample sizes. Limitations of this study include the small sample size and the limited number of interventions tested.


Methodology ◽  
2005 ◽  
Vol 1 (3) ◽  
pp. 86-92 ◽  
Author(s):  
Cora J. M. Maas ◽  
Joop J. Hox

Abstract. An important problem in multilevel modeling is what constitutes a sufficient sample size for accurate estimation. In multilevel analysis, the major restriction is often the higher-level sample size. In this paper, a simulation study is used to determine the influence of different sample sizes at the group level on the accuracy of the estimates (regression coefficients and variances) and their standard errors. In addition, the influence of other factors, such as the lowest-level sample size and different variance distributions between the levels (different intraclass correlations), is examined. The results show that only a small sample size at level two (meaning a sample of 50 groups or fewer) leads to biased estimates of the second-level standard errors. In all of the other simulated conditions, the estimates of the regression coefficients, the variance components, and the standard errors are unbiased and accurate.
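The simulation setting described here can be sketched with a random-intercept model, where the intraclass correlation (ICC) is the share of total variance at the group level. This is a toy version using a one-way ANOVA estimator rather than the full multilevel machinery; the group counts and variances are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate_icc_estimate(n_groups, group_size, icc=0.3):
    """Generate two-level random-intercept data and estimate the ICC
    via the one-way ANOVA (method-of-moments) estimator."""
    tau2, sigma2 = icc, 1.0 - icc  # between- and within-group variance
    u = rng.normal(0.0, np.sqrt(tau2), n_groups)  # group intercepts
    data = u[:, None] + rng.normal(0.0, np.sqrt(sigma2), (n_groups, group_size))
    group_means = data.mean(axis=1)
    msb = group_size * group_means.var(ddof=1)  # between-group mean square
    msw = ((data - group_means[:, None]) ** 2).sum() / (n_groups * (group_size - 1))
    tau2_hat = (msb - msw) / group_size
    return tau2_hat / (tau2_hat + msw)

# With many groups the estimate is close to the true ICC of 0.3:
print(simulate_icc_estimate(n_groups=500, group_size=10))
```

Repeating this while shrinking `n_groups` toward 50 or fewer, and tracking the variability of the group-level estimates, mirrors the design of the paper's simulation.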


2015 ◽  
Vol 13 (04) ◽  
pp. 1550018 ◽  
Author(s):  
Kevin Lim ◽  
Zhenhua Li ◽  
Kwok Pui Choi ◽  
Limsoon Wong

Transcript-level quantification is often measured across two groups of patients to aid the discovery of biomarkers and detection of biological mechanisms involving these biomarkers. Statistical tests lack power and the false discovery rate is high when sample sizes are small. Yet, many experiments have very few samples (≤ 5). This creates the impetus for a method to discover biomarkers and mechanisms under very small sample sizes. We present a powerful method, ESSNet, that is able to identify subnetworks consistently across independent datasets of the same disease phenotypes even under very small sample sizes. The key idea of ESSNet is to fragment large pathways into smaller subnetworks and compute a statistic that discriminates the subnetworks in two phenotypes. We do not greedily select genes to be included based on differential expression but rely on gene-expression-level ranking within a phenotype, which is shown to be stable even under extremely small sample sizes. We test our subnetworks on null distributions obtained by array rotation; this preserves the gene–gene correlation structure and is suitable for datasets with small sample sizes, allowing us to consistently predict relevant subnetworks even when sample size is small. For most other methods, this consistency drops to less than 10% when we test them on datasets with only two samples from each phenotype, whereas ESSNet is able to achieve an average consistency of 58% (72% when we consider genes within the subnetworks) and continues to be superior when sample size is large. We further show that the subnetworks identified by ESSNet are highly correlated to many references in the biological literature. ESSNet and supplementary material are available at: http://compbio.ddns.comp.nus.edu.sg:8080/essnet .
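The stability claim behind the ranking idea can be illustrated with a toy simulation: when genes have distinct underlying expression levels, the within-phenotype ordering of mean expression is largely reproducible across independent cohorts even with only two samples each. This is not ESSNet's actual statistic; the gene count and noise level are hypothetical.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(4)
n_genes = 200
true_levels = rng.normal(0.0, 2.0, n_genes)  # underlying expression per gene

def cohort_ranking(n_samples, noise=1.0):
    """Rank genes by mean expression in a tiny independent cohort."""
    samples = true_levels + rng.normal(0.0, noise, (n_samples, n_genes))
    return samples.mean(axis=0).argsort().argsort()

# Agreement between rankings from two independent 2-sample cohorts:
rho, _ = spearmanr(cohort_ranking(2), cohort_ranking(2))
print(rho)
```

High rank correlation at n = 2, despite the hopeless variance of per-gene mean differences at that size, is the intuition for why a ranking-based statistic stays usable where differential-expression selection does not.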


2020 ◽  
Vol 10 (1) ◽  
pp. 46-55
Author(s):  
Somayeh Soltani Nejad ◽  
Maryam Zeighami ◽  
Ashraf Beirami ◽  
Ahmadali Amirifar ◽  
...  

Objective: Humans have always faced anxiety and have tried various methods to overcome it. The aim of this study was to determine the effect of Echium amoenum on the anxiety of college students. Methods: This is a clinical trial study. Participants were 40 nursing students in Kerman, Iran, who were randomly assigned into two groups of intervention (n=20) and control (n=20). The data collection tools were a demographic form and Cattell's anxiety questionnaire. First, the baseline assessment was conducted in both groups. Then, the intervention group received 1 g of Echium amoenum powder in 250 cc of boiling water daily. After a month, both groups were assessed again. Data analysis was performed in SPSS v.20 software using descriptive and inferential statistics (mean, standard deviation, chi-square test, paired t-test, independent t-test, Mann-Whitney U test). Results: At baseline, there was no significant difference between the two groups. After consumption of Echium amoenum, the overall anxiety score decreased from 40.4±6.31 to 38.65±3.39 in the intervention group and increased from 39.7±9.29 to 41.75±9.91 in the control group; however, these differences were not statistically significant. Conclusion: Echium amoenum reduced anxiety in the students, but the effect was not significant, possibly due to the short duration of use or the small sample size. Hence, further studies with a larger sample size are recommended.


2006 ◽  
Vol 24 (18_suppl) ◽  
pp. 5034-5034
Author(s):  
J. A. Borgia ◽  
C. Frankenberger ◽  
S. E. McCormack ◽  
K. A. Kaiser ◽  
L. Usha ◽  
...  

5034 Background: A practical serum-based screening test for endometrial and ovarian carcinomas could greatly improve their early diagnosis, but developing such a test has proved elusive. Herein, we describe methods offering increased statistical power over conventional ‘batch’ analyses, using paired patient serum samples to identify biomarkers specific for uterine endometrioid carcinomas (UEA). Methods: Paired serum samples (collected pre- and post-surgery) were prepared from patients undergoing UEA resection. A Ciphergen SELDI-TOF mass spectrometer was used to generate the patient serum proteomic profiles, with data acquisition optimized to the 5,000–40,000 m/z (mass) range. Spectral quality was assessed using a specific function within the R-project statistical platform (v2.2.0), whereas data analyses were performed using Bioconductor, an extension of the R statistical platform; all analyses were blinded. Raw spectra were processed as follows: spectrum pair normalization, baseline subtraction, and differential peak detection (with calibrated alignment). Aligned peaks were sorted into groups, based on tumor pathology and pre- vs. post-operative specimen type, and compared using a two-tailed homoscedastic t-test in Microsoft Excel 2003. Results: Pre- and post-operative serum from 10 patients with UEA had their serum proteomic profiles evaluated for peaks lost after surgery. These values were then compared with a set of ‘normal’ samples (i.e. patients undergoing surgery for benign gynecologic disease). Even with our small sample size, we identified 16 serum components that were significantly reduced or absent (p < 0.04) after tumor resection, 10 of them unique to UEA (p < 0.04). From this group, species at m/z values of 4291.9, 4414.2, 5036.1, 9,615.0, and 25,508.8 seem most promising, based on their high level of significance (p < 0.02) for 10/10 patients.
Conclusions: These results validate the power of our pre- and post-operative sample sets for identifying new serum biomarker candidates for UEA. Work is in progress to increase the patient sample size, expand the study to patients with serous uterine and ovarian carcinomas, and identify the reported biomarker candidates via tandem mass spectrometry. No significant financial relationships to disclose.

