Dimensionality Reduction Methods for Brain Imaging Data Analysis

2021 ◽  
Vol 54 (4) ◽  
pp. 1-36
Author(s):  
Yunbo Tang ◽  
Dan Chen ◽  
Xiaoli Li

The past century has witnessed the grand success of brain imaging technologies, such as electroencephalography and magnetic resonance imaging, in probing cognitive states and pathological brain dynamics for neuroscience research and neurology practice. The human brain is “the most complex object in the universe,” and brain imaging data (BID) routinely carry multiple/many attributes and are highly non-stationary. This follows from the nature of BID as recordings, from various views, of the evolving processes of the brain(s) under examination. Driven by the increasingly high demands for precision, efficiency, and reliability in neuroscience and engineering tasks, dimensionality reduction has become a priority issue in BID analysis to handle the notoriously high dimensionality and large scale of big BID sets, as well as the enormously complicated interdependencies among data elements. This has become particularly urgent and challenging in the big data era. Dimensionality reduction theories and methods show unrivaled potential in revealing key insights into BID by offering low-dimensional/tiny representations/features that can preserve critical characterizations of massive neuronal activities and brain functional and/or malfunctional states of interest. This study surveys the most salient work along this direction, conforming to a 3-dimensional taxonomy with respect to (1) the scale of BID, a design consideration important for potential applications; (2) the order of BID, where a higher order denotes more BID attributes manipulatable by the method; and (3) linearity, where the method’s degree of linearity largely determines the “fidelity” of BID exploration. This study defines criteria for qualitative evaluation of these works in terms of effectiveness, interpretability, efficiency, and scalability.
The classifications and evaluations based on the taxonomy provide a comprehensive guide to (1) how existing research and development efforts are distributed and (2) their performance, features, and potential in influential applications, especially those involving big data. Finally, this study crystallizes the open technical issues and proposes research challenges that must be solved to enable further research in this area of great potential.
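As a concrete illustration of the survey's "linear" end of the taxonomy, the sketch below applies principal component analysis to simulated multi-channel, brain-imaging-style data. The simulation setup (64 channels driven by 3 latent sources) is a hypothetical example, not drawn from the survey itself.

```python
# Illustrative sketch: PCA-based dimensionality reduction of simulated
# multi-channel recordings (hypothetical data, not from the survey).
import numpy as np

rng = np.random.default_rng(0)

# Simulate 64-channel recordings driven by 3 latent sources (low-rank + noise).
n_samples, n_channels, n_sources = 1000, 64, 3
sources = rng.standard_normal((n_samples, n_sources))
mixing = rng.standard_normal((n_sources, n_channels))
X = sources @ mixing + 0.1 * rng.standard_normal((n_samples, n_channels))

def pca_reduce(X, k):
    """Project X onto its top-k principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (s**2) / (s**2).sum()
    return Xc @ Vt[:k].T, explained[:k]

Z, var_ratio = pca_reduce(X, k=3)
print(Z.shape)          # (1000, 3): a "tiny" representation of 64 channels
print(var_ratio.sum())  # close to 1: three components capture most variance
```

The 64-dimensional recordings collapse to 3 components with little loss because the signal genuinely lives in a low-dimensional subspace, which is the premise such linear methods rely on.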

F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 1512 ◽  
Author(s):  
Jing Ming ◽  
Eric Verner ◽  
Anand Sarwate ◽  
Ross Kelly ◽  
Cory Reed ◽  
...  

In the era of Big Data, sharing neuroimaging data across multiple sites has become increasingly important. However, researchers who want to engage in centralized, large-scale data sharing and analysis must often contend with problems such as high database cost, long data transfer time, extensive manual effort, and privacy issues for sensitive data. To remove these barriers and enable easier data sharing and analysis, we introduced a new, decentralized, privacy-enabled infrastructure model for brain imaging data called COINSTAC in 2016. We have continued development of COINSTAC since this model was first introduced. One of the challenges with such a model is adapting the required algorithms to function within a decentralized framework. In this paper, we report on how we are solving this problem, along with our progress on several fronts, including the implementation of additional decentralized algorithms, user interface enhancements, decentralized calculation of regression statistics, and complete pipeline specifications.
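One standard way to decentralize linear regression, sketched below, is for each site to share only sufficient statistics (XᵀX and Xᵀy) rather than raw data; the aggregator then recovers exactly the coefficients a pooled analysis would produce. This is a generic illustration of the idea, not the actual COINSTAC implementation, and the site data here are simulated.

```python
# Sketch of decentralized regression via sufficient statistics: sites never
# share raw data, only X'X and X'y (hypothetical setup, not COINSTAC code).
import numpy as np

rng = np.random.default_rng(1)
true_beta = np.array([2.0, -1.0, 0.5])

def make_site_data(n):
    """Simulate one site's covariates and outcomes."""
    X = rng.standard_normal((n, 3))
    y = X @ true_beta + 0.01 * rng.standard_normal(n)
    return X, y

sites = [make_site_data(n) for n in (50, 80, 120)]

# Each site computes local sufficient statistics.
local_stats = [(X.T @ X, X.T @ y) for X, y in sites]

# The aggregator sums the statistics and solves the normal equations.
XtX = sum(s[0] for s in local_stats)
Xty = sum(s[1] for s in local_stats)
beta_decentralized = np.linalg.solve(XtX, Xty)

# Identical (up to numerics) to pooling all data centrally.
X_all = np.vstack([X for X, _ in sites])
y_all = np.concatenate([y for _, y in sites])
beta_pooled = np.linalg.lstsq(X_all, y_all, rcond=None)[0]
print(np.allclose(beta_decentralized, beta_pooled))  # True
```

The privacy benefit is that each site transmits only p×p and p×1 aggregates; the cost is that more complex algorithms (the adaptation work the paper describes) do not always decompose this cleanly.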


2021 ◽  
Vol 12 ◽  
Author(s):  
Rayus Kuplicki ◽  
James Touthang ◽  
Obada Al Zoubi ◽  
Ahmad Mayeli ◽  
Masaya Misaki ◽  
...  

Neuroscience studies require considerable bioinformatic support and expertise. Numerous high-dimensional and multimodal datasets must be preprocessed and integrated to create robust and reproducible analysis pipelines. We describe a common-data-elements and scalable data-management infrastructure that supports multiple analytics workflows to facilitate the preprocessing, analysis, and sharing of large-scale, multi-level data. The process uses the Brain Imaging Data Structure (BIDS) format and supports MRI, fMRI, EEG, clinical, and laboratory data. The infrastructure also supports other datasets, such as Fitbit data, and gives developers the flexibility to customize the integration of new data types. Exemplar results from 200+ participants and 11 different pipelines demonstrate the utility of the infrastructure.
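The BIDS format mentioned above standardizes how files are named and laid out on disk, which is what makes such multi-pipeline infrastructures tractable. The helper below is an illustrative sketch of that naming convention (a hypothetical function, not part of the described pipeline).

```python
# Illustrative helper showing the BIDS naming convention: entities are
# key-value pairs joined by underscores, nested under subject/session dirs.
def bids_path(sub, ses, task, suffix, ext="nii.gz", datatype="func"):
    """Build a BIDS-style relative path, e.g. for an fMRI run."""
    name = f"sub-{sub}_ses-{ses}_task-{task}_{suffix}.{ext}"
    return f"sub-{sub}/ses-{ses}/{datatype}/{name}"

print(bids_path("01", "baseline", "rest", "bold"))
# sub-01/ses-baseline/func/sub-01_ses-baseline_task-rest_bold.nii.gz
```

Because every modality (MRI, fMRI, EEG) follows the same entity scheme, downstream tools can locate inputs by pattern rather than by per-study configuration.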


2013 ◽  
Vol 19 (6) ◽  
pp. 659-667 ◽  
Author(s):  
A Di Martino ◽  
C-G Yan ◽  
Q Li ◽  
E Denio ◽  
F X Castellanos ◽  
...  

Author(s):  
Anees Abrol ◽  
Zening Fu ◽  
Mustafa Salman ◽  
Rogers Silva ◽  
Yuhui Du ◽  
...  

Previous successes of deep learning (DL) approaches on several complex tasks have hugely inflated expectations of their power to learn subtle properties of complex brain imaging data and to scale to large datasets. Perhaps as a reaction to this inflation, recent critical commentaries unfavorably compare DL with standard machine learning (SML) approaches for the analysis of brain imaging data. Yet their conclusions are based on pre-engineered features, which deprives DL of its main advantage: representation learning. Here we evaluate this claim and show the importance of representation learning for DL performance on brain imaging data. We report our findings from a large-scale systematic comparison of SML and DL approaches, profiled in a ten-way age- and gender-based classification task on 12,314 structural MRI images. Results show that DL methods, if implemented and trained following prevalent DL practices, have the potential to substantially outperform SML approaches. We also show that DL approaches scale particularly well, presenting a lower asymptotic complexity in relative computational time despite being more complex. Our analysis reveals that the performance improvement saturates as the training sample size grows, but DL maintains significantly higher performance throughout. We also show evidence that the superior performance of DL is primarily due to its excellent representation learning capabilities, and that SML methods can perform equally well when operating on representations produced by the trained DL models. Finally, we demonstrate that DL embeddings span a comprehensible projection spectrum and that DL consistently localizes discriminative brain biomarkers, providing an example of the robustness of prediction-relevance estimates.
Our findings highlight the presence of non-linearities in brain imaging data that DL frameworks can exploit to generate superior predictive representations for characterizing the human brain, even with currently available data sizes.
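The core claim (a linear SML model fails on raw features but matches DL when given a non-linear representation) can be illustrated on a toy problem. In the sketch below, random ReLU features stand in for a trained DL encoder; the task and all names are hypothetical, and this is not the paper's experiment.

```python
# Toy illustration: a linear classifier is near chance on a non-linear task
# using raw features, but accurate on a non-linear representation. Random
# ReLU features stand in here for a trained DL encoder (hypothetical setup).
import numpy as np

rng = np.random.default_rng(3)

# XOR-like task: label = sign(x1 * x2); drop points near the axes for margin.
X = rng.uniform(-1, 1, size=(4000, 2))
keep = np.abs(X[:, 0] * X[:, 1]) > 0.1
X, y = X[keep], np.sign(X[keep, 0] * X[keep, 1])

def linear_fit_accuracy(F, y):
    """Least-squares linear classifier; returns training accuracy."""
    Fb = np.hstack([F, np.ones((len(F), 1))])  # add bias column
    w = np.linalg.lstsq(Fb, y, rcond=None)[0]
    return np.mean(np.sign(Fb @ w) == y)

# Stand-in "representation": one layer of random ReLU features.
W = rng.standard_normal((2, 500))
b = rng.standard_normal(500)
H = np.maximum(X @ W + b, 0.0)

raw_acc = linear_fit_accuracy(X, y)  # near chance on raw features
rep_acc = linear_fit_accuracy(H, y)  # high accuracy on the representation
print(raw_acc, rep_acc)
```

The same linear model succeeds once the non-linearity has been absorbed into the representation, which mirrors the paper's observation that SML performs well on features produced by trained DL models.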


GigaScience ◽  
2020 ◽  
Vol 9 (12) ◽  
Author(s):  
Ariel Rokem ◽  
Kendrick Kay

Abstract Background Ridge regression is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using ridge regression is the need to set a hyperparameter (α) that controls the amount of regularization. Cross-validation is typically used to select the best α from a set of candidates, but efficient and appropriate selection of α can be challenging and becomes prohibitive when large amounts of data are analyzed. Because the selected α depends on the scale of the data and the correlations across predictors, it is also not straightforwardly interpretable. Results The present work addresses these challenges through a novel approach to ridge regression. We propose to reparameterize ridge regression in terms of the ratio γ between the L2-norms of the regularized and unregularized coefficients. We provide an algorithm that efficiently implements this approach, called fractional ridge regression, as well as open-source software implementations in Python and MATLAB (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems. In brain imaging data, we demonstrate that this approach delivers results that are straightforward to interpret and compare across models and datasets. Conclusion Fractional ridge regression has several benefits: the solutions obtained for different γ are guaranteed to vary, guarding against wasted calculations, and they automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. These properties make fractional ridge regression particularly suitable for the analysis of large, complex datasets.
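A minimal sketch of the fractional-ridge idea follows: choose α so that the ratio γ between the regularized and unregularized coefficient norms hits a target value. This simplified version finds α by bisection on the SVD of X; the fracridge package uses a more efficient interpolation scheme, so treat this as an illustration of the reparameterization, not the package's algorithm.

```python
# Simplified sketch of fractional ridge regression: solve for the alpha
# whose coefficient norm is gamma times the OLS norm (bisection over SVD).
import numpy as np

def frac_ridge(X, y, gamma, tol=1e-8):
    """Return ridge coefficients whose L2-norm is gamma times the OLS norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    uty = U.T @ y
    ols_norm = np.linalg.norm(uty / s)  # alpha = 0 (OLS) solution norm

    def norm_ratio(alpha):
        # Ridge coefficient norm in the rotated basis, relative to OLS.
        return np.linalg.norm(s * uty / (s**2 + alpha)) / ols_norm

    # norm_ratio decreases monotonically from 1 toward 0 as alpha grows.
    lo, hi = 0.0, 1e12 * s.max() ** 2
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if norm_ratio(mid) > gamma:
            lo = mid  # not enough shrinkage yet
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    coef = Vt.T @ (s * uty / (s**2 + alpha))
    return coef, alpha

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 10))
y = rng.standard_normal(100)
coef, alpha = frac_ridge(X, y, gamma=0.5)
ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.linalg.norm(coef) / np.linalg.norm(ols))  # ≈ 0.5, as requested
```

Because γ is bounded in [0, 1] and has the same meaning regardless of data scale or predictor correlations, a grid over γ is directly comparable across models and datasets, which is the interpretability benefit the abstract describes.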


2021 ◽  
Author(s):  
Elise Bannier ◽  
Gareth Barker ◽  
Valentina Borghesani ◽  
Nils Broeckx ◽  
Patricia Clement ◽  
...  

Author(s):  
Tewodros Mulugeta Dagnew ◽  
Letizia Squarcina ◽  
Massimo W. Rivolta ◽  
Paolo Brambilla ◽  
Roberto Sassi
