Common Data Elements, Scalable Data Management Infrastructure, and Analytics Workflows for Large-Scale Neuroimaging Studies

2021 ◽  
Vol 12 ◽  
Author(s):  
Rayus Kuplicki ◽  
James Touthang ◽  
Obada Al Zoubi ◽  
Ahmad Mayeli ◽  
Masaya Misaki ◽  
...  

Neuroscience studies require considerable bioinformatic support and expertise. Numerous high-dimensional and multimodal datasets must be preprocessed and integrated to create robust and reproducible analysis pipelines. We describe common data elements and a scalable data management infrastructure that supports multiple analytics workflows, facilitating preprocessing, analysis, and sharing of large-scale multi-level data. The process uses the Brain Imaging Data Structure (BIDS) format and supports MRI, fMRI, EEG, clinical, and laboratory data. The infrastructure also accommodates other datasets, such as Fitbit data, and gives developers the flexibility to customize the integration of new data types. Exemplar results from 200+ participants and 11 different pipelines demonstrate the utility of the infrastructure.
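For readers unfamiliar with BIDS, the convention this infrastructure builds on can be sketched as a naming pattern over a per-subject directory tree. The regex below is a simplified illustration, not a full BIDS validator, and the example paths are hypothetical:

```python
import re

# Simplified BIDS-style pattern:
# sub-<label>/<datatype>/sub-<label>[_task-<label>][_run-<index>]_<suffix>.<ext>
BIDS_PATTERN = re.compile(
    r"^sub-(?P<sub>[A-Za-z0-9]+)/"
    r"(?P<datatype>anat|func|eeg)/"
    r"sub-(?P=sub)(_task-[A-Za-z0-9]+)?(_run-\d+)?"
    r"_(?P<suffix>T1w|bold|eeg)\.(nii\.gz|edf|json|tsv)$"
)

def is_bids_like(path: str) -> bool:
    """Return True if a relative path matches the simplified BIDS pattern."""
    return BIDS_PATTERN.match(path) is not None

paths = [
    "sub-01/anat/sub-01_T1w.nii.gz",             # structural MRI
    "sub-01/func/sub-01_task-rest_bold.nii.gz",  # fMRI
    "sub-01/eeg/sub-01_task-rest_eeg.edf",       # EEG
    "sub-01/fitbit_raw.csv",                     # extra data, not BIDS-named
]
for p in paths:
    print(p, is_bids_like(p))
```

The last path illustrates the point made in the abstract: auxiliary data such as Fitbit recordings fall outside the core specification, so an infrastructure must decide how to integrate them alongside BIDS-organized modalities.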


2021 ◽  
Vol 15 ◽  
Author(s):  
Tinashe M. Tapera ◽  
Matthew Cieslak ◽  
Max Bertolero ◽  
Azeez Adebimpe ◽  
Geoffrey K. Aguirre ◽  
...  

The recent and growing focus on reproducibility in neuroimaging studies has led many major academic centers to use cloud-based imaging databases for storing, analyzing, and sharing complex imaging data. Flywheel is one such database platform that offers easily accessible, large-scale data management, along with a framework for reproducible analyses through containerized pipelines. The Brain Imaging Data Structure (BIDS) is the de facto standard for neuroimaging data, but curating neuroimaging data into BIDS can be a challenging and time-consuming task. In particular, standard solutions for BIDS curation are limited on Flywheel. To address these challenges, we developed “FlywheelTools,” a software toolbox for reproducible data curation and manipulation on Flywheel. FlywheelTools includes two elements: fw-heudiconv, for heuristic-driven curation of data into BIDS, and flaudit, which audits and inventories projects on Flywheel. Together, these tools accelerate reproducible neuroscience research on the widely used Flywheel platform.
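Heuristic-driven curation of the kind fw-heudiconv performs can be sketched as a mapping from scanner series descriptions to BIDS filename templates. The heuristic table and `curate` function below are hypothetical illustrations of the idea, not the fw-heudiconv API:

```python
from typing import Optional

# Hypothetical heuristic table: map scanner series descriptions to
# BIDS-compliant filename templates (illustrative names only).
HEURISTIC = {
    "MPRAGE":     "sub-{sub}/anat/sub-{sub}_T1w",
    "REST_BOLD":  "sub-{sub}/func/sub-{sub}_task-rest_bold",
    "NBACK_BOLD": "sub-{sub}/func/sub-{sub}_task-nback_bold",
}

def curate(series_description: str, subject: str) -> Optional[str]:
    """Return the BIDS path stem for a series, or None if unrecognized."""
    template = HEURISTIC.get(series_description)
    return template.format(sub=subject) if template else None

print(curate("MPRAGE", "01"))     # matched: filed under anat/
print(curate("LOCALIZER", "01"))  # unmatched: flagged for manual review
```

Unrecognized series returning `None` is the step an auditing tool like flaudit would surface, so that nothing silently drops out of the curated project.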


2020 ◽  
Author(s):  
Christopher R Madan

We are now in a time of readily available brain imaging data. Not only are researchers sharing data more than ever before, but large-scale data collection initiatives are also underway with the vision that many future researchers will use the data for secondary analyses. Here I provide an overview of available datasets and some example use cases, including examining individual differences, obtaining more robust findings, reproducibility (both through public input data and through availability as a replication sample), and methods development. I further discuss a variety of considerations associated with using existing data and the opportunities associated with large datasets. Suggestions for further readings on general neuroimaging and topic-specific discussions are also provided.


2021 ◽  
Vol 54 (4) ◽  
pp. 1-36
Author(s):  
Yunbo Tang ◽  
Dan Chen ◽  
Xiaoli Li

The past century has witnessed the grand success of brain imaging technologies, such as electroencephalography and magnetic resonance imaging, in probing cognitive states and pathological brain dynamics for neuroscience research and neurology practice. The human brain is "the most complex object in the universe," and brain imaging data (BID) routinely have multiple/many attributes and are highly non-stationary, since BID record the evolving processes of the brain(s) under examination from various views. Driven by increasingly high demands for precision, efficiency, and reliability in neuroscience and engineering tasks, dimensionality reduction has become a priority issue in BID analysis, needed to handle the notoriously high dimensionality and large scale of big BID sets as well as the enormously complicated interdependencies among data elements; this has become particularly urgent and challenging in the big data era. Dimensionality reduction theories and methods manifest unrivaled potential in revealing key insights into BID by offering low-dimensional representations/features that may preserve critical characterizations of massive neuronal activities and brain functional and/or malfunctional states of interest. This study surveys the most salient work along this direction, organized by a three-dimensional taxonomy with respect to (1) the scale of BID, a design consideration important for potential applications; (2) the order of BID, where a higher order denotes more BID attributes manipulable by the method; and (3) linearity, where the method's degree of linearity largely determines the "fidelity" of BID exploration. This study defines criteria for qualitative evaluation of these works in terms of effectiveness, interpretability, efficiency, and scalability. The classifications and evaluations based on the taxonomy provide comprehensive guides to (1) how existing research and development efforts are distributed and (2) their performance, features, and potential in influential applications, especially those involving big data. Finally, this study crystallizes the open technical issues and proposes research challenges that must be solved to enable further research in this area of great potential.
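As a toy illustration of the linear end of this taxonomy, the sketch below applies PCA (via SVD) to a simulated multi-channel recording. It assumes NumPy; the 64-channel mixture of a single 10 Hz "source" plus noise is invented for illustration and is not drawn from the surveyed work:

```python
import numpy as np

# Simulate a 64-channel recording driven by one shared latent source.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 500)
source = np.sin(2 * np.pi * 10 * t)              # 10 Hz latent source
mixing = rng.normal(size=(64, 1))                # per-channel mixing weights
X = mixing @ source[None, :] + 0.1 * rng.normal(size=(64, 500))

# PCA via SVD of the channel-centered data.
Xc = X - X.mean(axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)                  # variance explained per PC

k = 1                                            # keep the top component
X_reduced = U[:, :k].T @ Xc                      # (k, time) representation
print(f"variance explained by 1 component: {explained[0]:.2f}")
```

A single component captures most of the variance here because the data are genuinely low-rank; real BID rarely are, which is exactly why the higher-order and nonlinear branches of the taxonomy matter.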


GigaScience ◽  
2020 ◽  
Vol 9 (12) ◽  
Author(s):  
Ariel Rokem ◽  
Kendrick Kay

Background: Ridge regression is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using ridge regression is the need to set a hyperparameter (α) that controls the amount of regularization. Cross-validation is typically used to select the best α from a set of candidates, but efficient and appropriate selection of α can be challenging and becomes prohibitive when large amounts of data are analyzed. Because the selected α depends on the scale of the data and correlations across predictors, it is also not straightforwardly interpretable. Results: The present work addresses these challenges through a novel approach to ridge regression. We propose to reparameterize ridge regression in terms of the ratio γ between the L2-norms of the regularized and unregularized coefficients. We provide an algorithm that efficiently implements this approach, called fractional ridge regression, as well as open-source software implementations in Python and MATLAB (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems. In brain imaging data, we demonstrate that this approach delivers results that are straightforward to interpret and compare across models and datasets. Conclusion: Fractional ridge regression has several benefits: the solutions obtained for different γ are guaranteed to vary, guarding against wasted calculations, and automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. These properties make fractional ridge regression particularly suitable for analysis of large complex datasets.
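The γ reparameterization can be sketched as follows: given a target ratio γ, search for the α whose ridge solution has that coefficient-norm ratio relative to the OLS solution. This simplified bisection (assuming NumPy, with invented simulated data) illustrates the idea only; the fracridge package uses a faster interpolation-based algorithm:

```python
import numpy as np

def fractional_ridge(X, y, gamma, lo=0.0, hi=1e8, iters=200):
    """Find ridge coefficients whose L2-norm is gamma times the OLS norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    uy = U.T @ y

    def coefs(alpha):
        # Closed-form ridge solution via SVD; alpha=0 gives OLS.
        return Vt.T @ (s / (s**2 + alpha) * uy)

    norm_ols = np.linalg.norm(coefs(0.0))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # ||beta(alpha)|| decreases monotonically as alpha grows.
        if np.linalg.norm(coefs(mid)) / norm_ols > gamma:
            lo = mid
        else:
            hi = mid
    return coefs(0.5 * (lo + hi))

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10))
y = X @ rng.normal(size=10) + rng.normal(size=100)

beta = fractional_ridge(X, y, gamma=0.5)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.linalg.norm(beta) / np.linalg.norm(beta_ols))  # ≈ 0.5
```

Because γ is a dimensionless ratio in [0, 1], the same grid of γ values is meaningful across models and datasets, which is the interpretability benefit the abstract emphasizes.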


Circulation ◽  
2018 ◽  
Vol 138 (Suppl_1) ◽  
Author(s):  
Helena Sviglin ◽  
Gauri Dandi ◽  
Eileen Navarro Almario ◽  
Tejas Patel ◽  
Colin O Wu ◽  
...  

Introduction: An objective of the Meta-AnalyTical Interagency Group (MATIG) is to conduct patient-level meta-analyses of cardiovascular outcomes using data from publicly available repositories. We describe challenges with data re-use from a seminal trial, provide a systematic approach to identify and curate data elements for hypothesis generation, and establish stackable trials to support these analyses. Methods: We used data from the ACCORD trial to assess risk factors, and their gender-specific differences, for the event of hospitalization or death due to heart failure (hdHF) in patients with type 2 diabetes*. We identified the data elements needed to answer the research questions, reviewed the trial protocol to verify definitions, extracted patient-level data, and performed quality assessment and statistical analysis. The results showed a gender difference in the effect of intensive vs. standard glucose-lowering therapy on hdHF. To validate the findings, we sought additional trials in BioLINCC to develop a compendium for meta-analysis, and repeated these steps for each trial. Results: Challenges for reusing the ACCORD trial included access to complete patient-level data and metadata. The compendium, developed to evaluate the stackability** of data across trials, identified differences in trial designs, patient populations, study interventions, and data elements that may impact the feasibility and interpretation of meta-analysis. An example of compendium components is shown in Table 1. Conclusion: High-quality metadata facilitate re-use of trial repository data. This compendium standardizes common data elements for gender-, race-, and age-group-specific outcome assessment in major clinical trials. It provides the framework to assess the fitness of trials for patient-level meta-analyses. Efforts are underway by MATIG to expand the compendium to include risk factors and major cardiovascular outcomes across multiple large trials for meta-analysis.


Author(s):  
Laura Dipietro ◽  
Seth Elkin-Frankston ◽  
Ciro Ramos-Estebanez ◽  
Timothy Wagner

The history of neuroscience has tracked with the evolution of science and technology. Today, neuroscience's trajectory is heavily dependent on computational systems and the availability of high-performance computing (HPC), which are becoming indispensable for building simulations of the brain, coping with high computational demands of analysis of brain imaging data sets, and developing treatments for neurological diseases. This chapter will briefly review the current and potential future use of supercomputers in neuroscience.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Johanna Wagner ◽  
Ramon Martinez-Cancino ◽  
Arnaud Delorme ◽  
Scott Makeig ◽  
Teodoro Solis-Escalante ◽  
...  

Abstract In this report we present a mobile brain/body imaging (MoBI) dataset that allows study of source-resolved cortical dynamics supporting coordinated gait movements in a rhythmic auditory cueing paradigm. Use of an auditory pacing stimulus stream has been recommended to identify deficits and treat gait impairments in neurologic populations. Here, the rhythmic cueing paradigm required healthy young participants to walk on a treadmill (constant speed) while attempting to maintain step synchrony with an auditory pacing stream and to adapt their step length and rate to unanticipated shifts in tempo of the pacing stimuli (e.g., sudden shifts to a faster or slower tempo). High-density electroencephalography (EEG, 108 channels), surface electromyography (EMG, bilateral tibialis anterior), pressure sensors on the heel (to register timing of heel strikes), and goniometers (knee, hip, and ankle joint angles) were concurrently recorded in 20 participants. The data is provided in the Brain Imaging Data Structure (BIDS) format to promote data sharing and reuse, and allow the inclusion of the data into fully automated data analysis workflows.

