Small Sample Asymptotics

1988 ◽  
Vol 13 (1) ◽  
pp. 53-61 ◽  
Author(s):  
Michael A. Fligner

An approach for modifying the results of asymptotic theory to improve the performance of statistical procedures with small to moderate sample sizes is described in the context of hypothesis testing. The method is illustrated by a series of examples.
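The abstract does not reproduce the examples, but the flavor of the problem can be sketched with a simulation. A test that relies on the asymptotic normal critical value tends to over-reject in small samples; a small-sample correction (here simply the Student-t critical value, a purely illustrative stand-in, not Fligner's method) brings the rejection rate closer to nominal:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps, alpha = 10, 5000, 0.05
true_mean = 1.0  # mean of Exponential(1), so H0 is true in every replication

z_rej = t_rej = 0
for _ in range(reps):
    x = rng.exponential(scale=1.0, size=n)
    se = x.std(ddof=1) / np.sqrt(n)
    stat = (x.mean() - true_mean) / se
    # asymptotic normal critical value vs Student-t small-sample correction
    z_rej += abs(stat) > stats.norm.ppf(1 - alpha / 2)
    t_rej += abs(stat) > stats.t.ppf(1 - alpha / 2, df=n - 1)

print(f"normal-approximation rejection rate: {z_rej / reps:.3f}")
print(f"t-corrected rejection rate:          {t_rej / reps:.3f}")
```

At n = 10 the normal approximation rejects noticeably more often than the nominal 5%, while the corrected test is closer to (though, for skewed data, still above) the nominal level.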

2004 ◽  
Vol 286 (4) ◽  
pp. E495-E501 ◽  
Author(s):  
Tyson H. Holmes

A simple framework is introduced that defines ten categories of statistical errors on the basis of type of error (bias or imprecision) and source (sampling, measurement, estimation, hypothesis testing, or reporting). Each of the ten categories is illustrated with examples pertinent to research and publication in the disciplines of endocrinology and metabolism. Some suggested remedies are discussed, where appropriate. A review of recent issues of American Journal of Physiology: Endocrinology and Metabolism and of Endocrinology finds that very small sample sizes may be the most prevalent cause of statistical error in this literature.


2018 ◽  
Author(s):  
Christopher Chabris ◽  
Patrick Ryan Heck ◽  
Jaclyn Mandart ◽  
Daniel Jacob Benjamin ◽  
Daniel J. Simons

Williams and Bargh (2008) reported that holding a hot cup of coffee caused participants to judge a person’s personality as warmer, and that holding a therapeutic heat pad caused participants to choose rewards for other people rather than for themselves. These experiments featured large effects (r = .28 and .31), small sample sizes (41 and 53 participants), and barely statistically significant results. We attempted to replicate both experiments in field settings with more than triple the sample sizes (128 and 177) and double-blind procedures, but found near-zero effects (r = –.03 and .02). In both cases, Bayesian analyses suggest there is substantially more evidence for the null hypothesis of no effect than for the original physical warmth priming hypothesis.
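The abstract does not state which Bayesian analysis the authors ran. As a rough, purely illustrative check (an assumption, not the paper's method), the BIC approximation to the Bayes factor for a correlation already favours the null for effect sizes this close to zero:

```python
import math

def bf01_bic(r: float, n: int) -> float:
    """Approximate Bayes factor for H0 (no correlation) over H1,
    via the BIC approximation: BF01 = sqrt(n) * (1 - r^2)^(n/2)."""
    return math.sqrt(n) * (1.0 - r * r) ** (n / 2.0)

# Replication effect sizes and sample sizes reported in the abstract
for r, n in [(-0.03, 128), (0.02, 177)]:
    print(f"r = {r:+.2f}, n = {n}: BF01 ~ {bf01_bic(r, n):.1f}")
```

Both approximate Bayes factors exceed 10 in favour of the null, consistent with the abstract's conclusion, though the exact values from the authors' analysis will differ.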


2021 ◽  
Vol 15 (1) ◽  
Author(s):  
Weitong Cui ◽  
Huaru Xue ◽  
Lei Wei ◽  
Jinghua Jin ◽  
Xuewen Tian ◽  
...  

Abstract

Background: RNA sequencing (RNA-Seq) has been widely applied in oncology for monitoring transcriptome changes. However, the emerging problem that high variation of gene expression levels caused by tumor heterogeneity may affect the reproducibility of differential expression (DE) results has rarely been studied. Here, we investigated the reproducibility of DE results for any given number of biological replicates between 3 and 24 and explored why a great many differentially expressed genes (DEGs) were not reproducible.

Results: Our findings demonstrate that poor reproducibility of DE results exists not only for small sample sizes, but also for relatively large sample sizes. Quite a few of the DEGs detected are specific to the samples in use, rather than genuinely differentially expressed under different conditions. Poor reproducibility of DE results is mainly caused by high variation of gene expression levels for the same gene in different samples. Even though biological variation may account for much of the high variation of gene expression levels, the effect of outlier count data also needs to be treated seriously, as outlier data severely interfere with DE analysis.

Conclusions: High heterogeneity exists not only in tumor tissue samples of each cancer type studied, but also in normal samples. High heterogeneity leads to poor reproducibility of DEGs, undermining generalization of differential expression results. Therefore, it is necessary to use large sample sizes (at least 10 if possible) in RNA-Seq experimental designs to reduce the impact of biological variability, and DE results should be interpreted cautiously unless soundly validated.
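The point that outlier counts interfere with DE analysis can be illustrated minimally (hypothetical normalized counts for a single gene, not the study's data): one outlier sample inflates the within-group variance enough to mask a genuine two-fold difference in a simple t-test on log-transformed counts.

```python
import numpy as np
from scipy import stats

# Hypothetical normalized counts for one gene, two conditions (n = 6 each).
control = np.array([100, 95, 105, 98, 102, 101], dtype=float)
tumor = np.array([210, 190, 205, 198, 215, 202], dtype=float)  # true ~2x change

# Clean data: the two-fold difference is clearly significant.
_, p_clean = stats.ttest_ind(np.log2(control + 1), np.log2(tumor + 1))

# One outlier sample (e.g. from a heterogeneous tissue) inflates the
# within-group variance and the same genuine difference is no longer detected.
control_outlier = control.copy()
control_outlier[0] = 800.0
_, p_outlier = stats.ttest_ind(np.log2(control_outlier + 1), np.log2(tumor + 1))

print(f"p without outlier: {p_clean:.2e}")
print(f"p with one outlier: {p_outlier:.3f}")
```

Real DE tools model counts with negative binomial distributions rather than t-tests, but the variance-inflation mechanism is the same.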


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Florent Le Borgne ◽  
Arthur Chatton ◽  
Maxime Léger ◽  
Rémi Lenain ◽  
Yohann Foucher

Abstract

In clinical research, there is growing interest in the use of propensity score-based methods to estimate causal effects. G-computation is an alternative because of its high statistical power. Machine learning is also increasingly used because of its possible robustness to model misspecification. In this paper, we aimed to propose an approach that combines machine learning and G-computation when both the outcome and the exposure status are binary, and that is able to deal with small samples. We evaluated the performance of several methods through simulations, including penalized logistic regressions, a neural network, a support vector machine, boosted classification and regression trees, and a super learner. We proposed six different scenarios characterised by various sample sizes, numbers of covariates, and relationships between covariates, exposure statuses, and outcomes. We also illustrated the application of these methods by using them to estimate the efficacy of barbiturates prescribed during the first 24 h of an episode of intracranial hypertension. In the context of G-computation, for estimating the individual outcome probabilities in the two counterfactual worlds, we found that the super learner tended to outperform the other approaches in terms of both bias and variance, especially for small sample sizes. The support vector machine also performed well, but its mean bias was slightly higher than that of the super learner. In the investigated scenarios, G-computation combined with the super learner was a performant method for drawing causal inferences, even from small samples.
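G-computation itself is simple to sketch. The following is a generic illustration on simulated data with a plain logistic outcome model (not the paper's super-learner setup): fit the outcome model, predict each subject's outcome probability under both counterfactual exposure assignments, and average the difference.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 200  # deliberately modest sample size

# Simulated data: one confounder L, binary exposure A, binary outcome Y.
L = rng.normal(size=n)
A = rng.binomial(1, 1 / (1 + np.exp(-L)))             # exposure depends on L
p_y = 1 / (1 + np.exp(-(-1.0 + 1.0 * A + 0.8 * L)))   # true positive effect of A
Y = rng.binomial(1, p_y)

# Step 1: fit the outcome model Q(A, L) = P(Y = 1 | A, L).
X = np.column_stack([A, L])
model = LogisticRegression().fit(X, Y)

# Step 2: predict in the two counterfactual worlds and average the difference.
X1 = np.column_stack([np.ones(n), L])   # everyone exposed
X0 = np.column_stack([np.zeros(n), L])  # no one exposed
ate = (model.predict_proba(X1)[:, 1] - model.predict_proba(X0)[:, 1]).mean()
print(f"G-computation risk difference: {ate:.3f}")
```

The paper's contribution is essentially to replace the single logistic model in step 1 with a super learner, which the simulations suggest is more robust when the true outcome model is unknown and the sample is small.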


Author(s):  
Kathryn Rayson ◽  
Louise Waddington ◽  
Dougal Julian Hare

Abstract

Background: Cognitive behavioural therapy (CBT) is in high demand due to its strong evidence base and cost effectiveness. To ensure CBT is delivered as intended in research, training and practice, fidelity assessment is needed. Fidelity is commonly measured by assessors rating treatment sessions using CBT competence scales (CCSs).

Aims: The current review assessed the quality of the literature examining the measurement properties of CCSs and makes recommendations for future research, training and practice.

Method: Medline, PsycINFO, Scopus and Web of Science databases were systematically searched to identify relevant peer-reviewed, English-language studies from 1980 onwards. Relevant studies were those primarily examining the measurement properties of CCSs used to assess adult 1:1 CBT treatment sessions. The quality of studies was assessed using a novel tool created for this study, following which a narrative synthesis is presented.

Results: Ten studies met inclusion criteria, most of which were assessed as being of 'fair' methodological quality, primarily due to small sample sizes. Construct validity and responsiveness definitions were applied inconsistently across the studies, leading to confusion over what was being measured.

Conclusions: Although CCSs are widely used, careful attention needs to be paid to the quality of research exploring their measurement properties. Consistent definitions of measurement properties, consensus about adequate sample sizes and improved reporting of individual properties are required to ensure the quality of future research.

