Sample size evolution in neuroimaging research: an evaluation of highly-cited studies (1990-2012) and of latest practices (2017-2018) in high-impact journals

2019 ◽  
Author(s):  
Denes Szucs ◽  
John PA Ioannidis

Abstract We evaluated 1038 of the most cited structural and functional (fMRI) magnetic resonance brain imaging papers (1161 studies) published during 1990-2012 and 273 papers (302 studies) published in top neuroimaging journals in 2017 and 2018. 96% of highly cited experimental fMRI studies had a single group of participants, and these studies had a median sample size of 12; highly cited clinical fMRI studies (with patient participants) had a median sample size of 14.5, and clinical structural MRI studies had a median sample size of 50. The sample size of highly cited experimental fMRI studies increased at a rate of 0.74 participants/year, and this rate of increase was commensurate with the median sample sizes of neuroimaging studies published in top neuroimaging journals in 2017 (23 participants) and 2018 (24 participants). Only 4 of 131 papers in 2017 and 5 of 142 papers in 2018 had pre-study power calculations, most for single t-tests and correlations. Only 14% of highly cited papers reported the number of excluded participants, whereas about 45% of papers in 2017 and 2018 reported excluded participants. Targeted interventions from publishers and funders could facilitate increases in sample sizes and adherence to better standards.
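To illustrate the kind of pre-study power calculation the authors found was almost never reported, here is a minimal sketch in Python for the median highly cited experimental fMRI sample (n = 12). It assumes a one-sample t-test and uses statsmodels; the choice of test and tool is illustrative, not taken from the surveyed papers.

```python
# A minimal power-calculation sketch, assuming a one-sample t-test
# at the median sample size (n = 12) reported in the abstract.
from statsmodels.stats.power import TTestPower

analysis = TTestPower()
for d in (0.2, 0.5, 0.8):  # small, medium, large effect (Cohen's d)
    power = analysis.power(effect_size=d, nobs=12, alpha=0.05)
    print(f"one-sample t-test, n=12, d={d}: power = {power:.2f}")

# Conversely, the sample size needed for 80% power at a medium effect:
n_required = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
print(f"n for 80% power at d=0.5: about {n_required:.0f}")
```

At n = 12, only large effects approach conventional power levels, which is consistent with the paper's concern about small samples.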

2021 ◽  
Vol 99 (Supplement_1) ◽  
pp. 218-219
Author(s):  
Andres Fernando T Russi ◽  
Mike D Tokach ◽  
Jason C Woodworth ◽  
Joel M DeRouchey ◽  
Robert D Goodband ◽  
...  

Abstract The swine industry has been constantly evolving to select animals with improved performance traits and to minimize variation in body weight (BW) in order to meet packer specifications. Therefore, understanding variation presents an opportunity for producers to find strategies that could help reduce, manage, or accommodate variation of pigs in a barn. A systematic review and meta-analysis was conducted by collecting data from multiple studies and available data sets in order to develop prediction equations for coefficient of variation (CV) and standard deviation (SD) as a function of BW. Information regarding BW variation from 16 papers was recorded to provide approximately 204 data points. Together, these data included 117,268 individually weighed pigs, with sample sizes that ranged from 104 to 4,108 pigs. A random-effects model with study as a random effect was developed. Observations were weighted using sample size as an estimate of precision in the analysis, where larger data sets accounted for increased accuracy in the model. Regression equations were developed using the nlme package of R to determine the relationship between BW and its variation. Polynomial regression analysis was conducted separately for each variation measurement. When CV was reported in the data set, SD was calculated, and vice versa. The resulting prediction equations were: CV (%) = 20.04 − 0.135 × BW + 0.00043 × BW², R² = 0.79; SD = 0.41 + 0.150 × BW − 0.00041 × BW², R² = 0.95. These equations suggest that there is evidence for a decreasing quadratic relationship between the mean CV of a population and the BW of pigs, whereby the rate of decrease becomes smaller as mean pig BW increases from birth to market. Conversely, the rate of increase of the SD of a population of pigs becomes smaller as mean pig BW increases from birth to market.
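A small sketch evaluating the two reported prediction equations in Python; the coefficients are taken directly from the abstract, and the example body weights (in the units of the source studies) are illustrative.

```python
# Prediction equations from the abstract, coefficients copied verbatim.
def cv_percent(bw):
    """Predicted coefficient of variation (%) as a function of body weight."""
    return 20.04 - 0.135 * bw + 0.00043 * bw**2

def sd_bw(bw):
    """Predicted standard deviation of body weight."""
    return 0.41 + 0.150 * bw - 0.00041 * bw**2

for bw in (10, 50, 100, 130):  # illustrative body weights, birth to market
    print(f"BW={bw:>3}: CV = {cv_percent(bw):5.1f}%, SD = {sd_bw(bw):5.1f}")
```

Evaluating the quadratics shows the pattern described: CV falls with BW at a diminishing rate, while SD rises with BW at a diminishing rate.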


2021 ◽  
Vol 13 (3) ◽  
pp. 368
Author(s):  
Christopher A. Ramezan ◽  
Timothy A. Warner ◽  
Aaron E. Maxwell ◽  
Bradley S. Price

The size of the training data set is a major determinant of classification accuracy. Nevertheless, the collection of a large training data set for supervised classifiers can be a challenge, especially for studies covering a large area, which may be typical of many real-world applied projects. This work investigates how variations in training set size, ranging from a large sample size (n = 10,000) to a very small sample size (n = 40), affect the performance of six supervised machine-learning algorithms applied to classify large-area high-spatial-resolution (HR) (1–5 m) remotely sensed data within the context of a geographic object-based image analysis (GEOBIA) approach. GEOBIA, in which adjacent similar pixels are grouped into image-objects that form the unit of the classification, offers the potential benefit of allowing the use of multiple additional variables, such as measures of object geometry and texture, thus increasing the dimensionality of the classification input data. The six supervised machine-learning algorithms are support vector machines (SVM), random forests (RF), k-nearest neighbors (k-NN), single-layer perceptron neural networks (NEU), learning vector quantization (LVQ), and gradient-boosted trees (GBM). RF, the algorithm with the highest overall accuracy, was notable for its negligible decrease in overall accuracy, 1.0%, when training sample size decreased from 10,000 to 315 samples. GBM provided similar overall accuracy to RF; however, the algorithm was very expensive in terms of training time and computational resources, especially with large training sets. In contrast to RF and GBM, NEU and SVM were particularly sensitive to decreasing sample size, with NEU classifications generally producing overall accuracies that were on average slightly higher than SVM classifications for larger sample sizes, but lower than SVM for the smallest sample sizes. NEU, however, required a longer processing time. The k-NN classifier saw less of a drop in overall accuracy than NEU and SVM as training set size decreased; however, the overall accuracies of k-NN were typically lower than those of the RF, NEU, and SVM classifiers. LVQ generally had the lowest overall accuracy of all six methods, but was relatively insensitive to sample size, down to the smallest sample sizes. Overall, due to its relatively high accuracy with small training sample sets, minimal variation in overall accuracy between very large and small sample sets, and relatively short processing time, RF was a good classifier for large-area land-cover classifications of HR remotely sensed data, especially when training data are scarce. However, as the performance of different supervised classifiers varies in response to training set size, investigating multiple classification algorithms is recommended to achieve optimal accuracy for a project.
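A hedged sketch of the experimental design in Python/scikit-learn: hold the test set fixed, shrink the training set, and compare classifiers. The synthetic features below are a stand-in for the study's GEOBIA image-object variables, and only two of the six algorithms are shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for image-object features (purely illustrative).
X, y = make_classification(n_samples=12000, n_features=20, n_informative=10,
                           n_classes=5, n_clusters_per_class=1, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=2000, random_state=0, stratify=y)

# Shrink the training set while holding the test set fixed, echoing the
# study's range from n = 10,000 down to n = 40.
for n_train in (10000, 1000, 315, 40):
    for name, clf in (("RF", RandomForestClassifier(random_state=0)),
                      ("SVM", SVC())):
        clf.fit(X_pool[:n_train], y_pool[:n_train])
        acc = accuracy_score(y_test, clf.predict(X_test))
        print(f"n_train={n_train:>5}  {name}: overall accuracy = {acc:.3f}")
```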


2013 ◽  
Vol 113 (1) ◽  
pp. 221-224 ◽  
Author(s):  
David R. Johnson ◽  
Lauren K. Bachan

In a recent article, Regan, Lakhanpal, and Anguiano (2012) highlighted the lack of evidence for different relationship outcomes between arranged and love-based marriages. Yet the sample size (n = 58) used in the study is insufficient for making such inferences. This reply discusses and demonstrates how small sample sizes reduce the utility of this research.
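A minimal sketch of the reply's point in Python: with n = 58 split into two marriage-type groups, only quite large effects are detectable at conventional power. The even 29/29 split is an assumption made for illustration.

```python
# Smallest detectable effect at 80% power, assuming a two-sided
# independent-samples t-test with two groups of 29 (an assumed even split).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
d_min = analysis.solve_power(nobs1=29, alpha=0.05, power=0.8, ratio=1.0)
print(f"smallest detectable effect at 80% power: d = {d_min:.2f}")
# d comes out around 0.75, a large effect; medium or small group
# differences would routinely go undetected at this sample size.
```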


2012 ◽  
Vol 2012 ◽  
pp. 1-8 ◽  
Author(s):  
Louis M. Houston

We derive a general equation for the probability that a measurement falls within a range of n standard deviations from an estimate of the mean. In doing so, we provide a form that is compatible with a confidence interval centered about the mean and naturally independent of the sample size. The equation is derived by interpolating between theoretical results for extreme sample sizes. Intermediate values of the equation are confirmed with a computational test.
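For reference, the large-sample (Gaussian) anchor of such an interpolation is the standard result below; this is not the paper's interpolated equation, which is not reproduced here.

```latex
% Probability that a measurement lies within n standard deviations of the
% mean, in the large-sample (Gaussian) limit:
P\bigl(|x-\mu| \le n\sigma\bigr) = \operatorname{erf}\!\left(\frac{n}{\sqrt{2}}\right)
% e.g. n = 1, 2, 3 gives approximately 0.683, 0.954, 0.997.
```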


2019 ◽  
Author(s):  
Peter E Clayson ◽  
Kaylie Amanda Carbine ◽  
Scott Baldwin ◽  
Michael J. Larson

Methodological reporting guidelines for studies of event-related potentials (ERPs) were updated in Psychophysiology in 2014. These guidelines facilitate the communication of key methodological parameters (e.g., preprocessing steps). Failing to report key parameters represents a barrier to replication efforts, and difficulty with replicability increases in the presence of small sample sizes and low statistical power. We assessed whether guidelines are followed and estimated the average sample size and power in recent research. Reporting behavior, sample sizes, and statistical designs were coded for 150 randomly sampled articles published from 2011 to 2017 in five high-impact journals that frequently publish ERP research. An average of 63% of guidelines were reported, and reporting behavior was similar across journals, suggesting that gaps in reporting are a shortcoming of the field rather than of any specific journal. Publication of the guidelines paper had no impact on reporting behavior, suggesting that editors and peer reviewers are not enforcing these recommendations. The average sample size per group was 21. Statistical power was conservatively estimated as .72-.98 for a large effect size, .35-.73 for a medium effect, and .10-.18 for a small effect. These findings indicate that failing to report key guidelines is ubiquitous and that ERP studies are primarily powered to detect large effects. Such low power and inconsistent adherence to reporting guidelines represent substantial barriers to replication efforts. The methodological transparency and replicability of studies can be improved by the open sharing of processing code and experimental tasks and by a priori sample size calculations to ensure adequately powered studies.
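A hedged re-computation of the quoted power figures in Python, assuming an independent-samples t-test with the reported average of n = 21 per group; the actual estimates above span several designs, so only the conservative end is reproduced.

```python
# Power at n = 21 per group, assuming a two-sided independent-samples t-test.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for label, d in (("small", 0.2), ("medium", 0.5), ("large", 0.8)):
    p = analysis.power(effect_size=d, nobs1=21, alpha=0.05, ratio=1.0)
    print(f"{label} effect (d={d}), n=21/group: power = {p:.2f}")
# These land near the conservative (lower) ends of the ranges quoted
# above (~.10, ~.35, ~.72); within-subject designs yield higher power.
```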


2019 ◽  
Author(s):  
Patrick Bergman ◽  
Maria Hagströmer

Abstract BACKGROUND Measuring physical activity and sedentary behavior accurately remains a challenge. When describing the uncertainty of mean values or when making group comparisons, minimising the Standard Error of the Mean (SEM) is important. The sample size and the number of repeated observations within each subject influence the size of the SEM. In this study we investigated how different combinations of sample sizes and repeated observations influence the magnitude of the SEM. METHODS A convenience sample was asked to wear an accelerometer for 28 consecutive days. Based on the within- and between-subject variances, the SEM was calculated for each combination of sample size and number of monitored days. RESULTS Fifty subjects (67% women, mean±SD age 41±19 years) were included. The analyses showed that, regardless of physical activity intensity level or measurement protocol design, the largest reductions in SEM were achieved by increasing the sample size. Increasing the number of repeated measurement days within each subject did not reduce the SEM to the same degree. CONCLUSION The most effective way of reducing the SEM is to have a large sample size rather than a long observation period within each individual. Even though the importance of reducing the SEM to increase the power of detecting differences between groups is well known, it is seldom considered when developing protocols for accelerometer-based research. The results presented herein therefore serve to highlight this fact and have the potential to stimulate debate and challenge current best-practice recommendations of accelerometer-based physical activity research.
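A minimal sketch of the trade-off studied here: under a standard variance-components model, the SEM of a group mean with n subjects each measured on k days is sqrt(σ²_between/n + σ²_within/(n·k)). The variance components below are illustrative assumptions, not the study's estimates.

```python
import math

var_between = 1.0   # between-subject variance (assumed for illustration)
var_within = 1.0    # within-subject, day-to-day variance (assumed)

def sem(n_subjects, n_days):
    """SEM of the group mean under a simple variance-components model."""
    return math.sqrt(var_between / n_subjects
                     + var_within / (n_subjects * n_days))

for n in (10, 25, 50):
    for k in (1, 7, 28):
        print(f"n={n:>2}, days={k:>2}: SEM = {sem(n, k):.3f}")
# Because the between-subject term shrinks only with n, adding subjects
# reduces the SEM faster than adding days, matching the conclusion above.
```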


2020 ◽  
Author(s):  
Miles D. Witham ◽  
James Wason ◽  
Richard M Dodds ◽  
Avan A Sayer

Abstract Introduction Frailty is the loss of ability to withstand a physiological stressor and is associated with multiple adverse outcomes in older people. Trials to prevent or ameliorate frailty are in their infancy. A range of different outcome measures have been proposed, but current measures require large sample sizes or long follow-up, or do not directly measure the construct of frailty. Methods We propose a composite outcome for frailty prevention trials, comprising progression to the frail state, death, or being too unwell to continue in a trial. To determine likely event rates, we used data from the English Longitudinal Study of Ageing, collected 4 years apart. We calculated transition rates between the non-frail, prefrail and frail states and loss to follow-up due to death or illness. We used Markov state transition models to interpolate one- and two-year transition rates, and performed sample size calculations for a range of differences in transition rates using simple and composite outcomes. Results The frailty category was calculable for 4650 individuals at baseline (2226 non-frail, 1907 prefrail, 517 frail); at follow-up, 1282 were non-frail, 1108 were prefrail, 318 were frail and 1936 had dropped out or were unable to complete all tests for frailty. For those prefrail at baseline, transition probabilities measured at wave 4 were 0.176, 0.286, 0.096 and 0.442 to non-frail, prefrail, frail and dead/dropped out, respectively. Interpolated transition probabilities were 0.159, 0.494, 0.113 and 0.234 at two years, and 0.108, 0.688, 0.087 and 0.117 at one year. Required sample sizes for a two-year outcome were between 1000 and 7200 for transition from prefrailty to frailty alone, 250 to 1600 for transition to the composite measure, and 75 to 350 using the composite measure with an ordinal logistic regression approach. Conclusion Use of a composite outcome for frailty trials offers reduced sample sizes and could ameliorate the effect of the high loss to follow-up inherent in such trials due to death and illness.
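A hedged sketch of the kind of sample size calculation described, in Python: comparing a control versus intervention two-year transition proportion. The control rate echoes the interpolated two-year prefrail-to-frail probability above; the intervention rates are illustrative assumptions, so the resulting numbers are not the paper's.

```python
# Two-arm sample size for a difference in transition proportions,
# using a normal approximation (statsmodels).
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

p_control = 0.113  # interpolated two-year prefrail -> frail rate (from above)
analysis = NormalIndPower()
for p_intervention in (0.08, 0.06, 0.04):  # assumed intervention rates
    es = proportion_effectsize(p_control, p_intervention)
    n = analysis.solve_power(effect_size=es, alpha=0.05, power=0.8, ratio=1.0)
    print(f"control {p_control:.3f} vs intervention {p_intervention:.3f}: "
          f"n = {n:.0f} per arm")
```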


Author(s):  
Emilie Laurin ◽  
Julia Bradshaw ◽  
Laura Hawley ◽  
Ian A. Gardner ◽  
Kyle A Garver ◽  
...  

Proper sample size must be considered when designing infectious-agent prevalence studies for mixed-stock fisheries, because bias and uncertainty complicate interpretation of apparent (test-)prevalence estimates. Sample sizes vary among stocks and are often smaller than expected during wild-salmonid surveys. Our case example, using 2010-2016 survey data for Sockeye salmon (Oncorhynchus nerka) from different stocks of origin in British Columbia, Canada, illustrates the effect of sample size on apparent-prevalence interpretation. Molecular testing (viral RNA RT-qPCR) for infectious hematopoietic necrosis virus (IHNv) revealed large differences in apparent prevalence across wild salmon stocks (much higher for the Chilko Lake stock) and sampling locations (freshwater or marine), indicating both stock and host life-stage effects. Ten of the 13 marine non-Chilko stock-years with IHNv-positive results had small sample sizes (< 30 samples per stock-year), which, combined with imperfect diagnostic tests (particularly lower diagnostic sensitivity), could lead to inaccurate apparent-prevalence estimates. When sample size was calculated for an expected apparent prevalence using different approaches, smaller sample sizes often led to decreased confidence in apparent-prevalence results and decreased power to detect a true difference from a reference value.
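A hedged sketch in Python of why imperfect test sensitivity and specificity matter when interpreting apparent prevalence: the Rogan-Gladen correction from apparent to true prevalence. The Se/Sp values and apparent prevalence below are illustrative assumptions, not the study's estimates.

```python
# Rogan-Gladen estimator: true prevalence from apparent prevalence and
# diagnostic test sensitivity (Se) and specificity (Sp).
def rogan_gladen(apparent_prev, sensitivity, specificity):
    """True-prevalence estimate given apparent prevalence and test accuracy."""
    return (apparent_prev + specificity - 1) / (sensitivity + specificity - 1)

ap = 0.10  # assumed apparent prevalence from RT-qPCR results
for se, sp in ((0.95, 0.99), (0.70, 0.99)):  # high vs lower sensitivity
    print(f"Se={se}, Sp={sp}: true prevalence ~ {rogan_gladen(ap, se, sp):.3f}")
# With fewer than 30 samples per stock-year, binomial noise around the
# apparent prevalence compounds the uncertainty of this correction.
```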

