Dimensionality Reduction Methods for Brain Imaging Data Analysis

2021 ◽  
Vol 54 (4) ◽  
pp. 1-36
Author(s):  
Yunbo Tang ◽  
Dan Chen ◽  
Xiaoli Li

The past century has witnessed the grand success of brain imaging technologies, such as electroencephalography and magnetic resonance imaging, in probing cognitive states and pathological brain dynamics for neuroscience research and neurology practice. The human brain is “the most complex object in the universe,” and brain imaging data (BID) routinely carry multiple/many attributes and are highly non-stationary. This follows from the nature of BID as recordings, from various views, of the evolving processes of the brain(s) under examination. Driven by the increasingly high demands for precision, efficiency, and reliability in neuroscience and engineering tasks, dimensionality reduction has become a priority issue in BID analysis to handle the notoriously high dimensionality and large scale of big BID sets, as well as the enormously complicated interdependencies among data elements. This has become particularly urgent and challenging in the big data era. Dimensionality reduction theories and methods show unrivaled potential in revealing key insights into BID by offering low-dimensional/tiny representations/features that can preserve critical characterizations of massive neuronal activities and brain functional and/or malfunctional states of interest. This study surveys the most salient work along this direction, conforming to a 3-dimensional taxonomy with respect to (1) the scale of BID, a design consideration important for potential applications; (2) the order of BID, where a higher order denotes more BID attributes manipulatable by the method; and (3) linearity, where the method’s degree of linearity largely determines the “fidelity” of BID exploration. This study defines criteria for qualitative evaluation of these works in terms of effectiveness, interpretability, efficiency, and scalability.
The classifications and evaluations based on the taxonomy provide a comprehensive guide to (1) how existing research and development efforts are distributed and (2) their performance, features, and potential in influential applications, especially those involving big data. Finally, this study crystallizes the open technical issues and proposes research challenges that must be solved to enable further research in this area of great potential.
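As a concrete illustration of the survey's "linear" end of the taxonomy, the sketch below applies principal component analysis to simulated multi-channel, brain-imaging-style data. The simulation setup (64 channels driven by 3 latent sources) is a hypothetical example, not drawn from the survey itself.

```python
# Illustrative sketch: PCA-based dimensionality reduction of simulated
# multi-channel recordings (hypothetical data, not from the survey).
import numpy as np

rng = np.random.default_rng(0)

# Simulate 64-channel recordings driven by 3 latent sources (low-rank + noise).
n_samples, n_channels, n_sources = 1000, 64, 3
sources = rng.standard_normal((n_samples, n_sources))
mixing = rng.standard_normal((n_sources, n_channels))
X = sources @ mixing + 0.1 * rng.standard_normal((n_samples, n_channels))

def pca_reduce(X, k):
    """Project X onto its top-k principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (s**2) / (s**2).sum()
    return Xc @ Vt[:k].T, explained[:k]

Z, var_ratio = pca_reduce(X, k=3)
print(Z.shape)          # (1000, 3): a "tiny" representation of 64 channels
print(var_ratio.sum())  # close to 1: three components capture most variance
```

The 64-dimensional recordings collapse to 3 components with little loss because the signal genuinely lives in a low-dimensional subspace, which is the premise such linear methods rely on.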

F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 1512 ◽  
Author(s):  
Jing Ming ◽  
Eric Verner ◽  
Anand Sarwate ◽  
Ross Kelly ◽  
Cory Reed ◽  
...  

In the era of Big Data, sharing neuroimaging data across multiple sites has become increasingly important. However, researchers who want to engage in centralized, large-scale data sharing and analysis must often contend with problems such as high database cost, long data transfer time, extensive manual effort, and privacy issues for sensitive data. To remove these barriers and enable easier data sharing and analysis, we introduced a new, decentralized, privacy-enabled infrastructure model for brain imaging data called COINSTAC in 2016. We have continued development of COINSTAC since this model was first introduced. One of the challenges with such a model is adapting the required algorithms to function within a decentralized framework. In this paper, we report on how we are solving this problem, along with our progress on several fronts, including the implementation of additional decentralized algorithms, user interface enhancements, decentralized calculation of regression statistics, and complete pipeline specifications.
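One standard way to decentralize linear regression, sketched below, is for each site to share only sufficient statistics (XᵀX and Xᵀy) rather than raw data; the aggregator then recovers exactly the coefficients a pooled analysis would produce. This is a generic illustration of the idea, not the actual COINSTAC implementation, and the site data here are simulated.

```python
# Sketch of decentralized regression via sufficient statistics: sites never
# share raw data, only X'X and X'y (hypothetical setup, not COINSTAC code).
import numpy as np

rng = np.random.default_rng(1)
true_beta = np.array([2.0, -1.0, 0.5])

def make_site_data(n):
    """Simulate one site's covariates and outcomes."""
    X = rng.standard_normal((n, 3))
    y = X @ true_beta + 0.01 * rng.standard_normal(n)
    return X, y

sites = [make_site_data(n) for n in (50, 80, 120)]

# Each site computes local sufficient statistics.
local_stats = [(X.T @ X, X.T @ y) for X, y in sites]

# The aggregator sums the statistics and solves the normal equations.
XtX = sum(s[0] for s in local_stats)
Xty = sum(s[1] for s in local_stats)
beta_decentralized = np.linalg.solve(XtX, Xty)

# Identical (up to numerics) to pooling all data centrally.
X_all = np.vstack([X for X, _ in sites])
y_all = np.concatenate([y for _, y in sites])
beta_pooled = np.linalg.lstsq(X_all, y_all, rcond=None)[0]
print(np.allclose(beta_decentralized, beta_pooled))  # True
```

The privacy benefit is that each site transmits only p×p and p×1 aggregates; the cost is that more complex algorithms (the adaptation work the paper describes) do not always decompose this cleanly.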


2021 ◽  
Vol 12 ◽  
Author(s):  
Rayus Kuplicki ◽  
James Touthang ◽  
Obada Al Zoubi ◽  
Ahmad Mayeli ◽  
Masaya Misaki ◽  
...  

Neuroscience studies require considerable bioinformatic support and expertise. Numerous high-dimensional and multimodal datasets must be preprocessed and integrated to create robust and reproducible analysis pipelines. We describe a common-data-elements and scalable data-management infrastructure that supports multiple analytics workflows to facilitate the preprocessing, analysis, and sharing of large-scale, multi-level data. The process uses the Brain Imaging Data Structure (BIDS) format and supports MRI, fMRI, EEG, clinical, and laboratory data. The infrastructure also supports other datasets, such as Fitbit data, and gives developers the flexibility to customize the integration of new data types. Exemplar results from 200+ participants and 11 different pipelines demonstrate the utility of the infrastructure.
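The BIDS format mentioned above standardizes how files are named and laid out on disk, which is what makes such multi-pipeline infrastructures tractable. The helper below is an illustrative sketch of that naming convention (a hypothetical function, not part of the described pipeline).

```python
# Illustrative helper showing the BIDS naming convention: entities are
# key-value pairs joined by underscores, nested under subject/session dirs.
def bids_path(sub, ses, task, suffix, ext="nii.gz", datatype="func"):
    """Build a BIDS-style relative path, e.g. for an fMRI run."""
    name = f"sub-{sub}_ses-{ses}_task-{task}_{suffix}.{ext}"
    return f"sub-{sub}/ses-{ses}/{datatype}/{name}"

print(bids_path("01", "baseline", "rest", "bold"))
# sub-01/ses-baseline/func/sub-01_ses-baseline_task-rest_bold.nii.gz
```

Because every modality (MRI, fMRI, EEG) follows the same entity scheme, downstream tools can locate inputs by pattern rather than by per-study configuration.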


2013 ◽  
Vol 19 (6) ◽  
pp. 659-667 ◽  
Author(s):  
A Di Martino ◽  
C-G Yan ◽  
Q Li ◽  
E Denio ◽  
F X Castellanos ◽  
...  

Author(s):  
Anees Abrol ◽  
Zening Fu ◽  
Mustafa Salman ◽  
Rogers Silva ◽  
Yuhui Du ◽  
...  

Previous successes of deep learning (DL) approaches on several complex tasks have hugely inflated expectations of their power to learn subtle properties of complex brain imaging data and to scale to large datasets. Perhaps as a reaction to this inflation, recent critical commentaries unfavorably compare DL with standard machine learning (SML) approaches for the analysis of brain imaging data. Yet their conclusions are based on pre-engineered features, which deprives DL of its main advantage: representation learning. Here we evaluate this claim and show the importance of representation learning for DL performance on brain imaging data. We report our findings from a large-scale systematic comparison of SML and DL approaches, profiled in a ten-way age- and gender-based classification task on 12,314 structural MRI images. Results show that DL methods, if implemented and trained following prevalent DL practices, have the potential to substantially outperform SML approaches. We also show that DL approaches scale particularly well, presenting a lower asymptotic complexity in relative computational time despite being more complex. Our analysis reveals that the performance improvement saturates as the training sample size grows, but DL maintains significantly higher performance throughout. We also show evidence that the superior performance of DL is primarily due to its excellent representation learning capabilities, and that SML methods can perform equally well when operating on representations produced by the trained DL models. Finally, we demonstrate that DL embeddings span a comprehensible projection spectrum and that DL consistently localizes discriminative brain biomarkers, providing an example of the robustness of prediction-relevance estimates.
Our findings highlight the presence of non-linearities in brain imaging data that DL frameworks can exploit to generate superior predictive representations for characterizing the human brain, even with currently available data sizes.
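The core claim (a linear SML model fails on raw features but matches DL when given a non-linear representation) can be illustrated on a toy problem. In the sketch below, random ReLU features stand in for a trained DL encoder; the task and all names are hypothetical, and this is not the paper's experiment.

```python
# Toy illustration: a linear classifier is near chance on a non-linear task
# using raw features, but accurate on a non-linear representation. Random
# ReLU features stand in here for a trained DL encoder (hypothetical setup).
import numpy as np

rng = np.random.default_rng(3)

# XOR-like task: label = sign(x1 * x2); drop points near the axes for margin.
X = rng.uniform(-1, 1, size=(4000, 2))
keep = np.abs(X[:, 0] * X[:, 1]) > 0.1
X, y = X[keep], np.sign(X[keep, 0] * X[keep, 1])

def linear_fit_accuracy(F, y):
    """Least-squares linear classifier; returns training accuracy."""
    Fb = np.hstack([F, np.ones((len(F), 1))])  # add bias column
    w = np.linalg.lstsq(Fb, y, rcond=None)[0]
    return np.mean(np.sign(Fb @ w) == y)

# Stand-in "representation": one layer of random ReLU features.
W = rng.standard_normal((2, 500))
b = rng.standard_normal(500)
H = np.maximum(X @ W + b, 0.0)

raw_acc = linear_fit_accuracy(X, y)  # near chance on raw features
rep_acc = linear_fit_accuracy(H, y)  # high accuracy on the representation
print(raw_acc, rep_acc)
```

The same linear model succeeds once the non-linearity has been absorbed into the representation, which mirrors the paper's observation that SML performs well on features produced by trained DL models.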


GigaScience ◽  
2020 ◽  
Vol 9 (12) ◽  
Author(s):  
Ariel Rokem ◽  
Kendrick Kay

Abstract Background Ridge regression is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using ridge regression is the need to set a hyperparameter (α) that controls the amount of regularization. Cross-validation is typically used to select the best α from a set of candidates, but efficient and appropriate selection of α can be challenging and becomes prohibitive when large amounts of data are analyzed. Because the selected α depends on the scale of the data and the correlations across predictors, it is also not straightforwardly interpretable. Results The present work addresses these challenges through a novel approach to ridge regression. We propose to reparameterize ridge regression in terms of the ratio γ between the L2-norms of the regularized and unregularized coefficients. We provide an algorithm that efficiently implements this approach, called fractional ridge regression, as well as open-source software implementations in Python and MATLAB (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems. In brain imaging data, we demonstrate that this approach delivers results that are straightforward to interpret and compare across models and datasets. Conclusion Fractional ridge regression has several benefits: the solutions obtained for different γ are guaranteed to vary, guarding against wasted calculations, and they automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. These properties make fractional ridge regression particularly suitable for the analysis of large, complex datasets.
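A minimal sketch of the fractional-ridge idea follows: choose α so that the ratio γ between the regularized and unregularized coefficient norms hits a target value. This simplified version finds α by bisection on the SVD of X; the fracridge package uses a more efficient interpolation scheme, so treat this as an illustration of the reparameterization, not the package's algorithm.

```python
# Simplified sketch of fractional ridge regression: solve for the alpha
# whose coefficient norm is gamma times the OLS norm (bisection over SVD).
import numpy as np

def frac_ridge(X, y, gamma, tol=1e-8):
    """Return ridge coefficients whose L2-norm is gamma times the OLS norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    uty = U.T @ y
    ols_norm = np.linalg.norm(uty / s)  # alpha = 0 (OLS) solution norm

    def norm_ratio(alpha):
        # Ridge coefficient norm in the rotated basis, relative to OLS.
        return np.linalg.norm(s * uty / (s**2 + alpha)) / ols_norm

    # norm_ratio decreases monotonically from 1 toward 0 as alpha grows.
    lo, hi = 0.0, 1e12 * s.max() ** 2
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if norm_ratio(mid) > gamma:
            lo = mid  # not enough shrinkage yet
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    coef = Vt.T @ (s * uty / (s**2 + alpha))
    return coef, alpha

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 10))
y = rng.standard_normal(100)
coef, alpha = frac_ridge(X, y, gamma=0.5)
ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.linalg.norm(coef) / np.linalg.norm(ols))  # ≈ 0.5, as requested
```

Because γ is bounded in [0, 1] and has the same meaning regardless of data scale or predictor correlations, a grid over γ is directly comparable across models and datasets, which is the interpretability benefit the abstract describes.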


2021 ◽  
Author(s):  
Elise Bannier ◽  
Gareth Barker ◽  
Valentina Borghesani ◽  
Nils Broeckx ◽  
Patricia Clement ◽  
...  

Author(s):  
Tewodros Mulugeta Dagnew ◽  
Letizia Squarcina ◽  
Massimo W. Rivolta ◽  
Paolo Brambilla ◽  
Roberto Sassi
