COINSTAC: Decentralizing the future of brain imaging analysis

In the era of Big Data, sharing neuroimaging data across multiple sites has become increasingly important. However, researchers who want to engage in centralized, large-scale data sharing and analysis must often contend with problems such as high database cost, long data transfer time, extensive manual effort, and privacy issues for sensitive data. To remove these barriers to enable easier data sharing and analysis, we introduced a new, decentralized, privacy-enabled infrastructure model for brain imaging data called COINSTAC in 2016. We have continued development of COINSTAC since this model was first introduced. One of the challenges with such a model is adapting the required algorithms to function within a decentralized framework. In this paper, we report on how we are solving this problem, along with our progress on several fronts, including additional decentralized algorithms implementation, user interface enhancement, decentralized regression statistic calculation, and complete pipeline specifications.

Download Full-text

FlywheelTools: Data Curation and Manipulation on the Flywheel Platform

Frontiers in Neuroinformatics ◽

10.3389/fninf.2021.678403 ◽

2021 ◽

Vol 15 ◽

Author(s):

Tinashe M. Tapera ◽

Matthew Cieslak ◽

Max Bertolero ◽

Azeez Adebimpe ◽

Geoffrey K. Aguirre ◽

...

Keyword(s):

Large Scale ◽

Data Curation ◽

Imaging Data ◽

Large Scale Data ◽

Neuroscience Research ◽

Neuroimaging Data ◽

Brain Imaging Data ◽

Database Platform ◽

The Brain ◽

Scale Data

The recent and growing focus on reproducibility in neuroimaging studies has led many major academic centers to use cloud-based imaging databases for storing, analyzing, and sharing complex imaging data. Flywheel is one such database platform that offers easily accessible, large-scale data management, along with a framework for reproducible analyses through containerized pipelines. The Brain Imaging Data Structure (BIDS) is the de facto standard for neuroimaging data, but curating neuroimaging data into BIDS can be a challenging and time-consuming task. In particular, standard solutions for BIDS curation are limited on Flywheel. To address these challenges, we developed “FlywheelTools,” a software toolbox for reproducible data curation and manipulation on Flywheel. FlywheelTools includes two elements: fw-heudiconv, for heuristic-driven curation of data into BIDS, and flaudit, which audits and inventories projects on Flywheel. Together, these tools accelerate reproducible neuroscience research on the widely used Flywheel platform.

Download Full-text

TRANSPARENCY AND REPRODUCIBILITY IN THE NEUROIMAGING OF AGING

Innovation in Aging ◽

10.1093/geroni/igz038.088 ◽

2019 ◽

Vol 3 (Supplement_1) ◽

pp. S23-S24

Author(s):

Kendra L Seaman

Keyword(s):

Large Scale ◽

Meta Analysis ◽

Social Science Research ◽

Science Research ◽

Data Sets ◽

Imaging Data ◽

Large Scale Data ◽

Neuroimaging Data ◽

Brain Imaging Data

Abstract In concert with broader efforts to increase the reliability of social science research, there are several efforts to increase transparency and reproducibility in neuroimaging. The large-scale nature of neuroimaging data and constantly evolving analysis tools can make transparency challenging. I will describe emerging tools used to document, organize, and share behavioral and neuroimaging data. These tools include: (1) the preregistration of neuroimaging data sets which increases openness and protects researchers from suspicions of p-hacking, (2) the conversion of neuroimaging data into a standardized format (Brain Imaging Data Structure: BIDS) that enables standardized scripts to process and share neuroimaging data, and (3) the sharing of final neuroimaging results on Neurovault which allows the community to do rapid meta-analysis. Using these tools improves workflows within labs, improves the overall quality of our science and provides a potential model for other disciplines using large-scale data.

Download Full-text

FlywheelTools: Data Curation and Manipulation on the Flywheel Platform

10.1101/2021.03.12.434998 ◽

2021 ◽

Author(s):

Tinashe M. Tapera ◽

Matthew Cieslak ◽

Max Bertolero ◽

Azeez Adebimpe ◽

Geoffrey K. Aguirre ◽

...

Keyword(s):

Data Storage ◽

Large Scale ◽

Data Curation ◽

Imaging Data ◽

Large Scale Data ◽

Neuroscience Research ◽

Neuroimaging Data ◽

Brain Imaging Data ◽

Database Platform ◽

The Brain

ABSTRACTThe recent and growing focus on reproducibility in neuroimaging studies has led many major academic centers to use cloud-based imaging databases for storing, analyzing, and sharing complex imaging data. Flywheel is one such database platform that offers easily accessible, large-scale data management, along with a framework for reproducible analyses through containerized pipelines. The Brain Imaging Data Structure (BIDS) is a data storage specification for neuroimaging data, but curating neuroimaging data into BIDS can be a challenging and time-consuming task. In particular, standard solutions for BIDS curation are not designed for use on cloud-based systems such as Flywheel. To address these challenges, we developed “FlywheelTools”, a software toolbox for reproducible data curation and manipulation on Flywheel. FlywheelTools includes two elements: fw-heudiconv, for heuristic-driven curation of data into BIDS, and flaudit, which audits and inventories projects on Flywheel. Together, these tools accelerate reproducible neuroscience research on the widely used Flywheel platform.

Download Full-text

BIDSonym - a BIDSapp for the pseudo-anonymization of neuroimaging datasets

10.31234/osf.io/3aknq ◽

2021 ◽

Author(s):

Peer Herholz ◽

Rita M. Ludwig ◽

Jean-Baptiste Poline

Keyword(s):

Data Structure ◽

Brain Imaging ◽

Data Sharing ◽

Sensitive Information ◽

Imaging Data ◽

Manual Intervention ◽

Neuroimaging Data ◽

Brain Imaging Data ◽

The Brain ◽

Infor Mation

The amount of neuroimaging data being shared increased exponentially in recent years. While thisdevelopment introduces prominent advantages concerning open, reproducible and sustainable neu-roimaging, the process of data sharing must ensure the privacy of participant data. A requirement fromboth, Ethics Review Boards and data sharing resources, datasets need to be (pseudo-) anonymized priorto sharing in order to limit participant re-identification. Depending on the dataset at hand, this processcan however become cumbersome and prone to errors. Here we introduce BIDSonym, a tool for auto-mated pseudo-anonymization of neuroimaging datasets. BIDSonym supports multiple de-identificationprocedures and operates on neuroimaging, as well as metadata files. In addition, all metadata infor-mation present in the respective files is gathered and evaluated. Its outputs furthermore allow usersto conduct a more in-depth assessment of potentially sensitive information present in a given dataset.Through its workflow and utilization of the Brain Imaging Data Structure (BIDS), BIDSonym’s appli-cation is reproducible, requires no manual intervention and is agnostic to idiosyncrasies of small andlarge scale datasets.

Download Full-text

Fractional ridge regression: a fast, interpretable reparameterization of ridge regression

GigaScience ◽

10.1093/gigascience/giaa133 ◽

2020 ◽

Vol 9 (12) ◽

Author(s):

Ariel Rokem ◽

Kendrick Kay

Keyword(s):

Ridge Regression ◽

Large Scale ◽

Imaging Data ◽

Regularization Technique ◽

Large Scale Data ◽

Novel Approach ◽

Manual Exploration ◽

L2 Norm ◽

Software Implementations ◽

Brain Imaging Data

Abstract Background Ridge regression is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using ridge regression is the need to set a hyperparameter (α) that controls the amount of regularization. Cross-validation is typically used to select the best α from a set of candidates. However, efficient and appropriate selection of α can be challenging. This becomes prohibitive when large amounts of data are analyzed. Because the selected α depends on the scale of the data and correlations across predictors, it is also not straightforwardly interpretable. Results The present work addresses these challenges through a novel approach to ridge regression. We propose to reparameterize ridge regression in terms of the ratio γ between the L2-norms of the regularized and unregularized coefficients. We provide an algorithm that efficiently implements this approach, called fractional ridge regression, as well as open-source software implementations in Python and matlab (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems. In brain imaging data, we demonstrate that this approach delivers results that are straightforward to interpret and compare across models and datasets. Conclusion Fractional ridge regression has several benefits: the solutions obtained for different γ are guaranteed to vary, guarding against wasted calculations; and automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. These properties make fractional ridge regression particularly suitable for analysis of large complex datasets.

Download Full-text

The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism

Molecular Psychiatry ◽

10.1038/mp.2013.78 ◽

2013 ◽

Vol 19 (6) ◽

pp. 659-667 ◽

Cited By ~ 874

Author(s):

A Di Martino ◽

C-G Yan ◽

Q Li ◽

E Denio ◽

F X Castellanos ◽

...

Keyword(s):

Brain Imaging ◽

Data Exchange ◽

Large Scale ◽

Imaging Data ◽

Scale Evaluation ◽

Brain Architecture ◽

Brain Imaging Data

Download Full-text

Opportunities for Understanding MS Mechanisms and Progression With MRI Using Large-Scale Data Sharing and Artificial Intelligence

Neurology ◽

10.1212/wnl.0000000000012884 ◽

2021 ◽

pp. 10.1212/WNL.0000000000012884

Author(s):

Hugo Vrenken ◽

Mark Jenkinson ◽

Dzung Pham ◽

Charles R.G. Guttmann ◽

Deborah Pareto ◽

...

Keyword(s):

Artificial Intelligence ◽

Image Analysis ◽

Data Sharing ◽

Large Scale ◽

Personal Data ◽

Human Observer ◽

Imaging Data ◽

Large Scale Data ◽

Scale Data

Multiple sclerosis (MS) patients have heterogeneous clinical presentations, symptoms and progression over time, making MS difficult to assess and comprehend in vivo. The combination of large-scale data-sharing and artificial intelligence creates new opportunities for monitoring and understanding MS using magnetic resonance imaging (MRI).First, development of validated MS-specific image analysis methods can be boosted by verified reference, test and benchmark imaging data. Using detailed expert annotations, artificial intelligence algorithms can be trained on such MS-specific data. Second, understanding disease processes could be greatly advanced through shared data of large MS cohorts with clinical, demographic and treatment information. Relevant patterns in such data that may be imperceptible to a human observer could be detected through artificial intelligence techniques. This applies from image analysis (lesions, atrophy or functional network changes) to large multi-domain datasets (imaging, cognition, clinical disability, genetics, etc.).After reviewing data-sharing and artificial intelligence, this paper highlights three areas that offer strong opportunities for making advances in the next few years: crowdsourcing, personal data protection, and organized analysis challenges. Difficulties as well as specific recommendations to overcome them are discussed, in order to best leverage data sharing and artificial intelligence to improve image analysis, imaging and the understanding of MS.

Download Full-text

OpenNeuro: An open resource for sharing of neuroimaging data

10.1101/2021.06.28.450168 ◽

2021 ◽

Author(s):

Christopher J Markiewicz ◽

Krzysztof Jacek Gorgolewski ◽

Franklin Feingold ◽

Ross Blair ◽

Yaroslav O Halchenko ◽

...

Keyword(s):

Brain Imaging ◽

Open Science ◽

Imaging Data ◽

Data Types ◽

Public Investments ◽

Neuroimaging Data ◽

Share Data ◽

Brain Imaging Data ◽

Multiple Species ◽

The Impact

The sharing of research data is essential to ensure reproducibility and maximize the impact of public investments in scientific research. Here we describe OpenNeuro, a BRAIN Initiative data archive that provides the ability to openly share data from a broad range of brain imaging data types following the FAIR principles for data sharing. We highlight the importance of the Brain Imaging Data Structure (BIDS) standard for enabling effective curation, sharing, and reuse of data. The archive presently shares more than 500 datasets including data from more than 18,000 participants, comprising multiple species and measurement modalities and a broad range of phenotypes. The impact of the shared data is evident in a growing number of published reuses, currently totalling more than 150 publications. We conclude by describing plans for future development and integration with other ongoing open science efforts.

Download Full-text

BIDScoin: A User-Friendly Application to Convert Source Data to Brain Imaging Data Structure

Frontiers in Neuroinformatics ◽

10.3389/fninf.2021.770608 ◽

2022 ◽

Vol 15 ◽

Author(s):

Marcel Peter Zwiers ◽

Stefano Moia ◽

Robert Oostenveld

Keyword(s):

Data Structure ◽

Brain Imaging ◽

Imaging Data ◽

Data Set ◽

Source Data ◽

Neuroimaging Data ◽

Programming Skills ◽

Brain Imaging Data ◽

Set Up ◽

User Friendly

Analyses of brain function and anatomy using shared neuroimaging data is an important development, and have acquired the potential to be scaled up with the specification of a new Brain Imaging Data Structure (BIDS) standard. To date, a variety of software tools help researchers in converting their source data to BIDS but often require programming skills or are tailored to specific institutes, data sets, or data formats. In this paper, we introduce BIDScoin, a cross-platform, flexible, and user-friendly converter that provides a graphical user interface (GUI) to help users finding their way in BIDS standard. BIDScoin does not require programming skills to be set up and used and supports plugins to extend their functionality. In this paper, we show its design and demonstrate how it can be applied to a downloadable tutorial data set. BIDScoin is distributed as free and open-source software to foster the community-driven effort to promote and facilitate the use of BIDS standard.

Download Full-text

Deep learning encodes robust discriminative neuroimaging representations to outperform standard machine learning

Nature Communications ◽

10.1038/s41467-020-20655-6 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Anees Abrol ◽

Zening Fu ◽

Mustafa Salman ◽

Rogers Silva ◽

Yuhui Du ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Large Scale ◽

Representation Learning ◽

Structural Mri ◽

Computational Time ◽

Imaging Data ◽

Neuroimaging Data ◽

Asymptotic Complexity ◽

Brain Imaging Data

AbstractRecent critical commentaries unfavorably compare deep learning (DL) with standard machine learning (SML) approaches for brain imaging data analysis. However, their conclusions are often based on pre-engineered features depriving DL of its main advantage — representation learning. We conduct a large-scale systematic comparison profiled in multiple classification and regression tasks on structural MRI images and show the importance of representation learning for DL. Results show that if trained following prevalent DL practices, DL methods have the potential to scale particularly well and substantially improve compared to SML methods, while also presenting a lower asymptotic complexity in relative computational time, despite being more complex. We also demonstrate that DL embeddings span comprehensible task-specific projection spectra and that DL consistently localizes task-discriminative brain biomarkers. Our findings highlight the presence of nonlinearities in neuroimaging data that DL can exploit to generate superior task-discriminative representations for characterizing the human brain.

Download Full-text