Including Data Analytical Stability in Cluster-based Inference

2019 ◽  
Author(s):  
Sanne P. Roels ◽  
Tom Loeys ◽  
Beatrijs Moerkerke

Abstract: In the statistical analysis of functional Magnetic Resonance Imaging (fMRI) brain data, it remains a challenge to account for simultaneously testing activation in over 100,000 volume units, or voxels. A popular method that reduces the dimensionality of this testing problem is cluster-based inference. We propose a new testing procedure that controls the family-wise error (FWE) rate at the cluster level but improves cluster-based test decisions in two ways: (1) it takes into account a measure of data analytical stability, and (2) it allows a more voxel-based interpretation of the results. For each voxel, we define the re-selection rate conditional on a given FWE-corrected threshold and incorporate this rate, which is a measure of stability, into the selection process. In our procedure, we set a more liberal and a more conservative FWE-controlling threshold. Clusters that survive the liberal but not the conservative threshold are retained if sufficient evidence of voxelwise stability is available. Clusters that survive the conservative threshold are retained regardless, and clusters that do not survive the liberal threshold are not considered further. Using Human Connectome Project data (Van Essen et al., 2012), we demonstrate how, in a group analysis, our method results not only in a higher number of selected voxels but also in a larger overlap between different test images. Additionally, we demonstrate the ability of our procedure to control the FWE rate, even in relatively small samples.
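To make the decision rule concrete, here is a minimal Python sketch of the dual-threshold selection described above, assuming the liberal and conservative FWE-corrected thresholds act on cluster extent and that voxelwise stability is summarized by the mean re-selection rate within a cluster. The function name, the extent-based interpretation, and the stability cutoff are illustrative assumptions, not the authors' implementation.

```python
# A sketch of the liberal/conservative cluster-retention rule, under the
# assumptions stated above; not the authors' code.
import numpy as np
from scipy import ndimage

def select_clusters(stat_map, forming_thr, liberal_extent, conservative_extent,
                    reselection_rate, stability_cutoff=0.5):
    """Apply the dual-threshold rule to a voxelwise statistic map.

    stat_map            : 3D array of test statistics
    forming_thr         : cluster-forming height threshold
    liberal_extent      : liberal FWE-corrected cluster-extent threshold
    conservative_extent : conservative FWE-corrected cluster-extent threshold
    reselection_rate    : 3D array of per-voxel re-selection rates (stability)
    stability_cutoff    : required mean stability for borderline clusters
                          (hypothetical value, for illustration only)
    """
    labels, n_clusters = ndimage.label(stat_map > forming_thr)
    keep = np.zeros(stat_map.shape, dtype=bool)
    for c in range(1, n_clusters + 1):
        voxels = labels == c
        size = voxels.sum()
        if size >= conservative_extent:
            keep |= voxels               # survives conservative: always retain
        elif (size >= liberal_extent
              and reselection_rate[voxels].mean() >= stability_cutoff):
            keep |= voxels               # liberal-only, but stable: retain
        # clusters below the liberal threshold are not considered further
    return keep
```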

2020 ◽  
Author(s):  
Philip A. Kragel ◽  
Xiaochun Han ◽  
Thomas Edward Kraynak ◽  
Peter J. Gianaros ◽  
Tor D. Wager

Elliot and colleagues (2020) systematically evaluated the reliability of individual differences in task-based fMRI activity and found reliability to be poor. Here we demonstrate that task-based fMRI can be quite reliable, and that the small sample sizes, task types, and dated region-of-interest measures used in Elliot et al. lead to an overly negative picture. We show evidence from recent studies using multivariate models in larger samples, which have short-term test-retest reliability in the “excellent” range (ICC > 0.75). These include 8 fMRI studies of pain and a large study of affective images (N > 300). In addition, while some use cases for biomarkers require reliable individual differences, others do not: they require only that fMRI measures serve as reliable indicators of the presence of a mental state or event, which we term ‘task reliability’. In a re-analysis of the Human Connectome Project data reported in Elliot et al., we show excellent task reliability across roughly 4 months. Despite difficulties with some experimental paradigms and measurement models, the future is bright for fMRI research focused on biomarker development.
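For readers unfamiliar with the reliability metric at issue, the following sketch computes a Shrout–Fleiss ICC(2,1), a standard two-way random-effects, absolute-agreement intraclass correlation of the kind cited above; the data layout (rows = subjects, columns = sessions) and the simulated example are assumptions for illustration.

```python
# A minimal ICC(2,1) implementation from the two-way ANOVA decomposition.
import numpy as np

def icc_2_1(scores):
    """ICC(2,1) for an (n_subjects, k_sessions) array of scores."""
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)                         # per subject
    col_means = scores.mean(axis=0)                         # per session
    bms = k * np.sum((row_means - grand) ** 2) / (n - 1)    # between subjects
    jms = n * np.sum((col_means - grand) ** 2) / (k - 1)    # between sessions
    sse = np.sum((scores - row_means[:, None]
                  - col_means[None, :] + grand) ** 2)
    ems = sse / ((n - 1) * (k - 1))                         # residual
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)

# Example: ICC > 0.75 is conventionally labeled "excellent".
rng = np.random.default_rng(0)
true_scores = rng.normal(size=(100, 1))
sessions = true_scores + 0.3 * rng.normal(size=(100, 2))    # two test sessions
print(round(icc_2_1(sessions), 2))
```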


2018 ◽  
Author(s):  
Stephan Geuter ◽  
Guanghao Qi ◽  
Robert C. Welsh ◽  
Tor D. Wager ◽  
Martin A. Lindquist

Abstract: Multi-subject functional magnetic resonance imaging (fMRI) analysis is often concerned with determining whether there exists a significant population-wide ‘activation’ in a comparison between two or more conditions. Typically, this is assessed by testing the average value of a contrast of parameter estimates (COPE) against zero in a general linear model (GLM) analysis. In this work we investigate several aspects of this type of analysis. First, we study the effects of sample size on the sensitivity and reliability of the group analysis, allowing us to evaluate the ability of small-sample studies to effectively capture population-level effects of interest. Second, we assess the difference in sensitivity and reliability when using volumetric or surface-based data. Third, we investigate potential biases in estimating effect sizes as a function of sample size. To perform this analysis we utilize the task-based fMRI data from the 500-subject release of the Human Connectome Project (HCP). We treat the complete collection of subjects (N = 491) as our population of interest and perform a single-subject analysis on each subject in the population. We investigate the ability to recover population-level effects using a subset of the population and standard analytical techniques. Our study shows that sample sizes of 40 are generally able to detect regions with high effect sizes (Cohen’s d > 0.8), while sample sizes closer to 80 are required to reliably recover regions with medium effect sizes (0.5 < d < 0.8). We find little difference in results when using volumetric or surface-based data with respect to standard mass-univariate group analysis. Finally, we conclude that special care is needed when estimating effect sizes, particularly for small sample sizes.
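The core resampling logic can be sketched as follows: draw subsets of a given size from a "population" of subject-level contrast estimates, test each voxel against zero, and compare the apparent effect size at significant voxels with the population value. The array shapes, threshold, and function name below are illustrative assumptions, not the HCP pipeline itself.

```python
# A sketch of subsampling a COPE population and measuring effect-size bias
# at significant voxels; simulated data stand in for real contrasts.
import numpy as np
from scipy import stats

def subsample_effects(copes, n_sub, alpha=0.001, n_draws=100, rng=None):
    """copes: (n_population, n_voxels) array of per-subject contrasts."""
    rng = rng or np.random.default_rng(0)
    pop_d = copes.mean(0) / copes.std(0, ddof=1)    # population Cohen's d
    biases = []
    for _ in range(n_draws):
        idx = rng.choice(len(copes), size=n_sub, replace=False)
        sub = copes[idx]
        t, p = stats.ttest_1samp(sub, 0.0)          # per-voxel one-sample test
        d = sub.mean(0) / sub.std(0, ddof=1)        # sample Cohen's d
        sig = p < alpha
        if sig.any():                               # bias among detected voxels
            biases.append((d[sig] - pop_d[sig]).mean())
    return np.mean(biases)

rng = np.random.default_rng(0)
copes = rng.normal(0.2, 1.0, size=(491, 5000))      # weak true effect everywhere
print(round(subsample_effects(copes, n_sub=40, rng=rng), 3))
```

In small subsamples, only voxels whose sample effect happens to be inflated clear the significance threshold, so the printed bias is positive, consistent with the caution about effect-size estimation in the abstract.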


Econometrics ◽  
2019 ◽  
Vol 7 (1) ◽  
pp. 12
Author(s):  
Karl-Heinz Schild ◽  
Karsten Schweikert

This paper investigates the properties of tests for asymmetric long-run adjustment, which are often applied in empirical studies of asymmetric price transmission. We show that substantial size distortions are caused by preconditioning the test on finding sufficient evidence for cointegration in a first step. The extent to which the test for long-run asymmetry is oversized depends inversely on the power of the primary cointegration test. Hence, tests for long-run asymmetry become invalid in cases of small sample sizes or slow speeds of adjustment. Further, we provide simulation evidence that tests for long-run asymmetry are generally oversized if the threshold parameter is estimated by conditional least squares, and we show that bootstrap techniques can be used to obtain the correct size.
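The preconditioning problem can be illustrated with a small Monte Carlo sketch: under a symmetric data-generating process, an F-test for asymmetric adjustment in a threshold (TAR) error-correction equation is applied only to samples that first pass an Engle-Granger cointegration test, and the conditional rejection rate is recorded. The DGP parameters, sample size, and nominal levels below are illustrative assumptions, not the paper's simulation design.

```python
# A sketch of the two-step testing sequence under a symmetric DGP.
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(1)
T, reps = 100, 500
passed = rejected = 0
for _ in range(reps):
    x = np.cumsum(rng.normal(size=T))              # I(1) regressor
    u = np.zeros(T)
    for t in range(1, T):                          # symmetric AR(1) errors,
        u[t] = 0.9 * u[t - 1] + rng.normal()       # i.e. slow adjustment
    y = x + u                                      # cointegrated, no asymmetry
    if coint(y, x)[1] > 0.05:                      # step 1: cointegration pretest
        continue
    passed += 1
    resid = sm.OLS(y, sm.add_constant(x)).fit().resid
    du, lag = np.diff(resid), resid[:-1]
    pos = (lag > 0).astype(float)                  # TAR split at threshold 0
    X = np.column_stack([pos * lag, (1 - pos) * lag])
    fit = sm.OLS(du, X).fit()
    if fit.f_test("x1 = x2").pvalue < 0.05:        # step 2: asymmetry F-test
        rejected += 1

# Conditional size of the asymmetry test; by the paper's argument this
# exceeds the nominal 5% when the pretest has low power.
print(rejected / max(passed, 1))
```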


2018 ◽  
Author(s):  
Prathiba Natesan ◽  
Smita Mehta

Single case experimental designs (SCEDs) have become an indispensable methodology where randomized controlled trials may be impossible or even inappropriate. However, the nature of SCED data presents challenges for both visual and statistical analyses. Small sample sizes, autocorrelation, data types, and design types render many parametric statistical analyses and maximum likelihood approaches ineffective. The presence of autocorrelation also decreases interrater reliability in visual analysis. The purpose of the present study is to demonstrate a newly developed model, the Bayesian unknown change-point (BUCP) model, which overcomes all of the above-mentioned data analytic challenges. This is the first study to formulate and demonstrate a rate ratio effect size for autocorrelated data, which had remained an open question in SCED research until now. This expository study also compares and contrasts the results from the BUCP model with visual analysis, and the rate ratio effect size with the nonoverlap of all pairs (NAP) effect size. Data from a comprehensive behavioral intervention are used for the demonstration.
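As a gloss on the change-point machinery (not the authors' BUCP model, which additionally handles autocorrelation), here is a minimal conjugate Gamma-Poisson sketch: it places a posterior over the unknown change-point in a count series and reports a plug-in rate ratio effect size. The priors and the example series are assumptions for illustration.

```python
# A simplified Bayesian unknown-change-point sketch for count data.
import numpy as np
from scipy.special import gammaln

def log_marginal(y, a=1.0, b=1.0):
    """Log marginal likelihood of Poisson counts under a Gamma(a, b) prior."""
    n, s = len(y), y.sum()
    return (a * np.log(b) - gammaln(a) + gammaln(a + s)
            - (a + s) * np.log(b + n) - gammaln(y + 1).sum())

def change_point_posterior(y):
    """Posterior over the change-point tau (first index of phase 2)."""
    taus = np.arange(1, len(y))
    logp = np.array([log_marginal(y[:t]) + log_marginal(y[t:]) for t in taus])
    p = np.exp(logp - logp.max())
    return taus, p / p.sum()

# Hypothetical baseline vs. intervention counts from a SCED-like series:
y = np.array([8, 7, 9, 8, 10, 3, 2, 4, 3, 2])
taus, post = change_point_posterior(y)
tau = taus[post.argmax()]
rate_ratio = y[tau:].mean() / y[:tau].mean()   # plug-in rate ratio effect size
print(tau, round(rate_ratio, 2))
```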


2018 ◽  
Author(s):  
Christopher Chabris ◽  
Patrick Ryan Heck ◽  
Jaclyn Mandart ◽  
Daniel Jacob Benjamin ◽  
Daniel J. Simons

Williams and Bargh (2008) reported that holding a hot cup of coffee caused participants to judge a person’s personality as warmer, and that holding a therapeutic heat pad caused participants to choose rewards for other people rather than for themselves. These experiments featured large effects (r = .28 and .31), small sample sizes (41 and 53 participants), and barely statistically significant results. We attempted to replicate both experiments in field settings with more than triple the sample sizes (128 and 177) and double-blind procedures, but found near-zero effects (r = –.03 and .02). In both cases, Bayesian analyses suggest there is substantially more evidence for the null hypothesis of no effect than for the original physical warmth priming hypothesis.
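One generic way to quantify such evidence for the null is the BIC approximation to the Bayes factor for a correlation (Wagenmakers, 2007); the sketch below uses that approximation, not the specific Bayesian analysis reported in the study.

```python
# BIC-approximate Bayes factor favoring the null (rho = 0) over rho != 0.
import numpy as np

def bf01_correlation(r, n):
    # BIC1 - BIC0 = n*ln(1 - r^2) + ln(n); BF01 = exp((BIC1 - BIC0) / 2),
    # which simplifies to sqrt(n) * (1 - r^2)^(n/2).
    return np.sqrt(n) * (1 - r ** 2) ** (n / 2)

# Near-zero replication effects in moderate samples favor the null:
print(round(bf01_correlation(-0.03, 128), 1))   # replication of experiment 1
print(round(bf01_correlation(0.02, 177), 1))    # replication of experiment 2
```

Both values come out around 10 or more in favor of the null, illustrating the direction of the Bayesian result described in the abstract.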


Animals ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 75
Author(s):  
Álvaro Navarro-Castilla ◽  
Mario Garrido ◽  
Hadas Hawlena ◽  
Isabel Barja

The study of endocrine status can be useful for understanding wildlife responses to a changing environment. Here, we validated an enzyme immunoassay (EIA) to non-invasively monitor adrenocortical activity by measuring fecal corticosterone metabolites (FCM) in three sympatric gerbil species (Gerbillus andersoni, G. gerbillus, and G. pyramidum) from the sands of the Northwestern Negev Desert (Israel). Animals in the treatment groups were injected with adrenocorticotropic hormone (ACTH) to stimulate adrenocortical activity, while control groups received a saline solution. Feces were collected at different intervals, and FCM were quantified by EIA. Basal FCM levels were similar in the three species. The ACTH effect was evident, but the timing of peak FCM concentrations differed between the species (6–24 h post-injection). Furthermore, FCM peaks were observed sooner in G. andersoni females than in males (6 h and 18 h post-injection, respectively). G. andersoni and G. gerbillus males in the control groups also showed increased FCM levels (18 h and 48 h post-injection, respectively). Despite the small sample sizes, our results confirmed the suitability of the EIA for analyzing FCM in these species as a reliable indicator of adrenocortical activity. This study also revealed that closely related species, and individuals within a species, can respond differently to the same stressor.


2021 ◽  
Vol 224 ◽  
pp. 108731
Author(s):  
Guangfei Li ◽  
Yu Chen ◽  
Thang M. Le ◽  
Simon Zhornitsky ◽  
Wuyi Wang ◽  
...  

2021 ◽  
Vol 11 (6) ◽  
pp. 497
Author(s):  
Yoonsuk Jung ◽  
Eui Im ◽  
Jinhee Lee ◽  
Hyeah Lee ◽  
Changmo Moon

Previous studies have evaluated the effects of antithrombotic agents on the performance of fecal immunochemical tests (FITs) for the detection of colorectal cancer (CRC), but the results were inconsistent and based on small sample sizes. We studied this topic using a large-scale population-based database. Using the Korean National Cancer Screening Program Database, we compared the performance of FITs for CRC detection between users and non-users of antiplatelet agents and warfarin. Non-users were matched according to age and sex. Among 5,426,469 eligible participants, 768,733 used antiplatelet agents (mono/dual/triple therapy, n = 701,683/63,211/3,839), and 19,569 used warfarin, while 4,638,167 were non-users. Among antiplatelet agents, aspirin, clopidogrel, and cilostazol ranked first, second, and third, respectively, in prescription rates. Users of antiplatelet agents (3.62% vs. 4.45%; relative risk (RR): 0.83; 95% confidence interval (CI): 0.78–0.88), aspirin (3.66% vs. 4.13%; RR: 0.90; 95% CI: 0.83–0.97), and clopidogrel (3.48% vs. 4.88%; RR: 0.72; 95% CI: 0.61–0.86) had lower positive predictive values (PPVs) for CRC detection than non-users. However, there were no significant differences in PPV between cilostazol users and non-users, or between warfarin users and non-users. For PPV, the RR (users vs. non-users) for antiplatelet monotherapy was 0.86, while the RRs for dual and triple antiplatelet therapies (excluding cilostazol) were 0.67 and 0.22, respectively. For all antithrombotic agents, the sensitivity for CRC detection did not differ between users and non-users. Use of antiplatelet agents other than cilostazol may thus increase false positives without improving the sensitivity of FITs for CRC detection.
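The comparison reported above reduces to PPVs in two groups and their ratio with a standard log-normal 95% confidence interval. The sketch below computes these quantities; the counts are hypothetical, chosen only to roughly match the reported PPVs, and are not the study's data.

```python
# PPV of a positive FIT in users vs. non-users and the relative risk (RR).
import numpy as np

def ppv_rr(cancer_u, pos_u, cancer_n, pos_n):
    """PPV among FIT-positive users/non-users and RR (users vs. non-users)."""
    ppv_u, ppv_n = cancer_u / pos_u, cancer_n / pos_n
    rr = ppv_u / ppv_n
    # standard error of log(RR) for a ratio of two proportions
    se = np.sqrt((1 - ppv_u) / cancer_u + (1 - ppv_n) / cancer_n)
    lo, hi = np.exp(np.log(rr) + np.array([-1.96, 1.96]) * se)
    return ppv_u, ppv_n, rr, (lo, hi)

# Hypothetical counts: CRC cases among FIT-positive users vs. non-users
print(ppv_rr(cancer_u=362, pos_u=10_000, cancer_n=445, pos_n=10_000))
```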

