Hype versus hope: Deep learning encodes more predictive and robust brain imaging representations than standard machine learning

Author(s):  
Anees Abrol ◽  
Zening Fu ◽  
Mustafa Salman ◽  
Rogers Silva ◽  
Yuhui Du ◽  
...  

Abstract
Previous successes of deep learning (DL) approaches on several complex tasks have hugely inflated expectations of their power to learn subtle properties of complex brain imaging data and scale to large datasets. Perhaps as a reaction to this inflation, recent critical commentaries unfavorably compare DL with standard machine learning (SML) approaches for the analysis of brain imaging data. Yet their conclusions are based on pre-engineered features, which deprives DL of its main advantage: representation learning. Here we evaluate this claim and show the importance of representation learning for DL performance on brain imaging data. We report our findings from a large-scale systematic comparison of SML and DL approaches profiled in a ten-way age- and gender-based classification task on 12,314 structural MRI images. Results show that DL methods, if implemented and trained following prevalent DL practices, have the potential to substantially outperform SML approaches. We also show that DL approaches scale particularly well, presenting a lower asymptotic complexity in relative computational time despite being more complex. Our analysis reveals that the performance improvement saturates as the training sample size grows, but DL maintains significantly higher performance throughout. We also show evidence that the superior performance of DL is primarily due to its excellent representation-learning capabilities, and that SML methods can perform equally well when operating on representations produced by the trained DL models. Finally, we demonstrate that DL embeddings span a comprehensible projection spectrum and that DL consistently localizes discriminative brain biomarkers, providing an example of the robustness of prediction-relevance estimates. Our findings highlight the presence of nonlinearities in brain imaging data that DL frameworks can exploit to generate superior predictive representations for characterizing the human brain, even with currently available data sizes.
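
To make the representation-learning claim concrete: once a deep network is trained, its penultimate-layer activations can serve as input features for a standard ML model. Below is a minimal sketch of this two-stage recipe on synthetic stand-in embeddings; the shapes, the ten-class setup, and the logistic-regression choice are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Stand-ins for embeddings taken from the penultimate layer of a trained
# 3D-CNN; in practice these come from running the encoder on sMRI volumes.
n_subjects, embed_dim, n_classes = 500, 256, 10
X = rng.normal(size=(n_subjects, embed_dim))     # DL-derived features
y = rng.integers(0, n_classes, size=n_subjects)  # ten-way class labels

# A standard ML model operating on the learned representation.
clf = LogisticRegression(max_iter=1000)
print("5-fold CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```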

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Anees Abrol ◽  
Zening Fu ◽  
Mustafa Salman ◽  
Rogers Silva ◽  
Yuhui Du ◽  
...  

Abstract
Recent critical commentaries unfavorably compare deep learning (DL) with standard machine learning (SML) approaches for brain imaging data analysis. However, their conclusions are often based on pre-engineered features, depriving DL of its main advantage: representation learning. We conduct a large-scale systematic comparison profiled in multiple classification and regression tasks on structural MRI images and show the importance of representation learning for DL. Results show that, if trained following prevalent DL practices, DL methods have the potential to scale particularly well and substantially outperform SML methods, while also presenting a lower asymptotic complexity in relative computational time despite being more complex. We also demonstrate that DL embeddings span comprehensible task-specific projection spectra and that DL consistently localizes task-discriminative brain biomarkers. Our findings highlight the presence of nonlinearities in neuroimaging data that DL can exploit to generate superior task-discriminative representations for characterizing the human brain.
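
One way to inspect whether embeddings span a comprehensible projection spectrum is to project them to two dimensions and color points by task label. A hedged sketch with t-SNE on synthetic stand-in embeddings follows; the projection method and shapes are assumptions for illustration, not the paper's procedure.

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)

# Stand-in embeddings; in practice, penultimate-layer activations of a
# trained DL model evaluated on held-out scans.
X = rng.normal(size=(300, 256))
y = rng.integers(0, 10, size=300)

# Project to 2D to see whether task classes separate in embedding space.
proj = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
plt.scatter(proj[:, 0], proj[:, 1], c=y, cmap="tab10", s=8)
plt.title("2D projection of DL embeddings (illustrative)")
plt.show()
```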


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 1512 ◽  
Author(s):  
Jing Ming ◽  
Eric Verner ◽  
Anand Sarwate ◽  
Ross Kelly ◽  
Cory Reed ◽  
...  

In the era of Big Data, sharing neuroimaging data across multiple sites has become increasingly important. However, researchers who want to engage in centralized, large-scale data sharing and analysis must often contend with problems such as high database costs, long data transfer times, extensive manual effort, and privacy issues for sensitive data. To remove these barriers and enable easier data sharing and analysis, we introduced COINSTAC, a decentralized, privacy-enabled infrastructure model for brain imaging data, in 2016. We have continued developing COINSTAC since the model was first introduced. One challenge with such a model is adapting the required algorithms to function within a decentralized framework. In this paper, we report on how we are solving this problem, along with our progress on several fronts, including the implementation of additional decentralized algorithms, user interface enhancements, decentralized calculation of regression statistics, and complete pipeline specifications.
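
To illustrate the flavor of decentralized regression (a generic sufficient-statistics scheme, not necessarily COINSTAC's exact algorithm): each site can share only the aggregates X'X and X'y, which an aggregator combines to recover the pooled ordinary-least-squares fit without any raw data leaving a site.

```python
import numpy as np

rng = np.random.default_rng(2)

def local_stats(X, y):
    """Each site computes only aggregates (X'X, X'y); raw data never leave."""
    return X.T @ X, X.T @ y

# Three simulated sites sharing the same true coefficients.
beta_true = np.array([1.5, -2.0, 0.5])
sites = []
for _ in range(3):
    X = rng.normal(size=(100, 3))
    y = X @ beta_true + rng.normal(scale=0.1, size=100)
    sites.append(local_stats(X, y))

# The aggregator sums sufficient statistics and solves the normal equations;
# the result is identical to pooling all the raw data.
XtX = sum(s[0] for s in sites)
Xty = sum(s[1] for s in sites)
print(np.linalg.solve(XtX, Xty))  # ≈ beta_true
```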


2013 ◽  
Vol 19 (6) ◽  
pp. 659-667 ◽  
Author(s):  
A Di Martino ◽  
C-G Yan ◽  
Q Li ◽  
E Denio ◽  
F X Castellanos ◽  
...  

2019 ◽  
Author(s):  
Yafeng Zhan ◽  
Jianze Wei ◽  
Jian Liang ◽  
Xiu Xu ◽  
Ran He ◽  
...  

Abstract
Psychiatric disorders often exhibit shared (comorbid) symptoms, raising controversies over accurate diagnosis and the overlap of their neural underpinnings. Because the complexity of data generated by clinical studies poses a formidable challenge, we have pursued a reductionist framework using brain imaging data from a transgenic primate model of autism spectrum disorder (ASD). Here we report an interpretable cross-species machine learning approach that extracts transgene-related core regions in the monkey brain to construct a classifier for diagnostic classification in humans. The cross-species classifier based on core regions identified in the transgenic primate model, mainly distributed in frontal and temporal cortex, achieved an accuracy of 82.14% in one clinical ASD cohort from the Autism Brain Imaging Data Exchange (ABIDE-I), significantly higher than the human-based classifier (61.31%, p < 0.001), a result validated in an independent ASD cohort from ABIDE-II. The monkey-based classifier also generalized to achieve better classification in obsessive-compulsive disorder (OCD) cohorts, and enabled parsing of differential connections to the right ventrolateral prefrontal cortex attributable to distinct traits in patients with ASD and OCD. These findings underscore the importance of investigating biologically homogeneous samples, particularly in the absence of real-world data adequate for deconstructing the heterogeneity inherent in clinical cohorts.
One Sentence Summary
Features learned from transgenic monkeys enable improved diagnosis of autism-related disorders and dissection of their underlying circuits.
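
A heavily simplified schematic of the cross-species idea: restrict human connectivity features to a small set of "core" edges (here chosen at random as a placeholder for the monkey-derived regions) and fit a linear classifier. Every shape, the random feature subset, and the SVM choice are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(3)

# Placeholder for connectivity edges implicating the monkey-derived
# core regions (in the paper, frontal and temporal cortex).
core_idx = rng.choice(6000, size=50, replace=False)

X_human = rng.normal(size=(200, 6000))   # stand-in connectivity vectors
y_human = rng.integers(0, 2, size=200)   # ASD vs. control labels

# Train on one split, test on the held-out split, using core edges only.
clf = LinearSVC(dual=False).fit(X_human[:150, core_idx], y_human[:150])
pred = clf.predict(X_human[150:, core_idx])
print("held-out accuracy:", accuracy_score(y_human[150:], pred))
```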


2021 ◽  
Vol 15 ◽  
Author(s):  
Laura Tomaz Da Silva ◽  
Nathalia Bianchini Esper ◽  
Duncan D. Ruiz ◽  
Felipe Meneguzzi ◽  
Augusto Buchweitz

Problem: Brain imaging studies of mental health and neurodevelopmental disorders have recently included machine learning approaches to identify patients based solely on their brain activation. The goal is to identify brain-related features that generalize from smaller samples of data to larger ones; in the case of neurodevelopmental disorders, finding these patterns can help explain differences in brain function and development that underpin early signs of risk for developmental dyslexia. The success of machine learning classification algorithms on neurofunctional data has been limited to typically homogeneous data sets of a few dozen participants. More recently, larger brain imaging data sets have allowed deep learning techniques to classify brain states and clinical groups solely from neurofunctional features. Indeed, deep learning techniques can provide helpful tools for classification in healthcare applications, including the classification of structural 3D brain images. The adoption of deep learning approaches allows for incremental improvements in the classification performance of larger functional brain imaging data sets, but it still lacks diagnostic insight into the underlying brain mechanisms associated with disorders; a related challenge is providing clinically relevant explanations from the neural features that inform classification.
Methods: We target this challenge by leveraging two network visualization techniques in the convolutional neural network layers responsible for learning high-level features. Using such techniques, we are able to provide meaningful images for expert-backed insights into the condition being classified. We address this challenge using a dataset that includes children diagnosed with developmental dyslexia and typically reading children.
Results: Our results show accurate classification of developmental dyslexia (94.8%) from brain imaging alone, while providing automatic visualizations of the features involved that match contemporary neuroscientific knowledge (brain regions involved in the reading process for the dyslexic reader group, and brain regions associated with strategic control and attention processes for the typical reader group).
Conclusions: Our visual explanations of deep learning models turn the accurate yet opaque conclusions of the models into evidence about the condition being studied.
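
The abstract does not name its two visualization techniques, but gradient-based saliency is a representative member of this family: differentiate the class score with respect to the input volume to see which voxels drive the decision. A minimal sketch with a toy 3D-CNN (the architecture and shapes are placeholders, not the authors' model):

```python
import torch
import torch.nn as nn

# Toy 3D-CNN classifier standing in for the trained model.
model = nn.Sequential(
    nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(8, 2),
)
model.eval()

# Gradient-based saliency: how strongly each voxel influences the class score.
volume = torch.randn(1, 1, 32, 32, 32, requires_grad=True)
score = model(volume)[0, 1]   # score of one class (toy "dyslexic" label)
score.backward()
saliency = volume.grad.abs().squeeze()  # (32, 32, 32) voxelwise relevance map
print(saliency.shape, float(saliency.max()))
```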


2019 ◽  
Author(s):  
Linfeng Yang ◽  
Rajarshi. P. Ghosh ◽  
J. Matthew Franklin ◽  
Chenyu You ◽  
Jan T. Liphardt

Abstract
Segmenting cell nuclei within microscopy images is a ubiquitous task in biological research and clinical applications. Unfortunately, segmenting low-contrast, overlapping objects that may be tightly packed is a major bottleneck for standard deep learning-based models. We report a Nuclear Segmentation Tool (NuSeT) based on deep learning that accurately segments nuclei across multiple types of fluorescence imaging data. Using a hybrid network consisting of a U-Net and a Region Proposal Network (RPN), followed by a watershed step, we have achieved superior performance in detecting and delineating nuclear boundaries in 2D and 3D images of varying complexity. By using foreground normalization and additional training on synthetic images containing non-cellular artifacts, NuSeT improves nuclear detection and reduces false positives. NuSeT addresses common challenges in nuclear segmentation such as variability in nuclear signal and shape, limited training sample size, and sample preparation artifacts. Compared to other segmentation models, NuSeT consistently fares better in generating accurate segmentation masks and assigning boundaries for touching nuclei.
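
The watershed step mentioned above is a standard recipe for splitting touching nuclei: compute a distance transform of the foreground mask, seed one marker per local maximum, and flood. A generic scikit-image version (not NuSeT's exact implementation) on a toy mask of two touching disks:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

# Toy binary mask with two touching "nuclei" (stand-in for network output).
yy, xx = np.ogrid[:64, :64]
mask = ((yy - 30) ** 2 + (xx - 25) ** 2 < 120) | \
       ((yy - 30) ** 2 + (xx - 40) ** 2 < 120)

# Distance transform + local maxima yield one marker per nucleus.
distance = ndi.distance_transform_edt(mask)
peaks = peak_local_max(distance, labels=mask.astype(int), min_distance=5)
markers = np.zeros(mask.shape, dtype=int)
markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)

# Watershed splits touching objects along the ridge of the distance map.
labels = watershed(-distance, markers, mask=mask)
print(labels.max(), "objects found")  # expect 2
```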


2021 ◽  
Vol 54 (4) ◽  
pp. 1-36 ◽  
Author(s):  
Yunbo Tang ◽  
Dan Chen ◽  
Xiaoli Li

The past century has witnessed the grand success of brain imaging technologies, such as electroencephalography and magnetic resonance imaging, in probing cognitive states and pathological brain dynamics for neuroscience research and neurology practice. The human brain is "the most complex object in the universe," and brain imaging data (BID) are routinely of multiple/many attributes and highly non-stationary. These properties stem from the nature of BID as recordings, from various views, of the evolving processes of the brain(s) under examination. Driven by increasingly high demands for precision, efficiency, and reliability in neuroscience and engineering tasks, dimensionality reduction has become a priority issue in BID analysis, needed to handle the notoriously high dimensionality and large scale of big BID sets as well as the enormously complicated interdependencies among data elements. This has become particularly urgent and challenging in the big data era. Dimensionality reduction theories and methods manifest unrivaled potential in revealing key insights into BID by offering low-dimensional representations/features that may preserve critical characterizations of massive neuronal activities and brain functional and/or malfunctional states of interest. This study surveys the most salient work along this direction, conforming to a three-dimensional taxonomy with respect to (1) the scale of BID, for which design with this consideration is important for potential applications; (2) the order of BID, in which a higher order denotes more BID attributes manipulatable by the method; and (3) linearity, in which the method's degree of linearity largely determines the "fidelity" of BID exploration. This study defines criteria for qualitative evaluation of these works in terms of effectiveness, interpretability, efficiency, and scalability. The classifications and evaluations based on the taxonomy provide comprehensive guides to (1) how existing research and development efforts are distributed and (2) their performance, features, and potential in influential applications, especially those involving big data. Finally, this study crystallizes the open technical issues and proposes research challenges that must be solved to enable further research in this area of great potential.
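
As a concrete instance at the linear end of this taxonomy, PCA compresses many correlated channels into a few components. A minimal sketch on simulated recordings (the 64-channel, 3-source setup is an illustrative stand-in for BID such as EEG):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)

# Simulated recordings: 64 channels x 10,000 samples driven by 3 latent
# sources, i.e., high-dimensional but highly redundant data.
sources = rng.normal(size=(3, 10_000))
mixing = rng.normal(size=(64, 3))
data = (mixing @ sources + 0.1 * rng.normal(size=(64, 10_000))).T

# Three components recover nearly all of the variance.
pca = PCA(n_components=3).fit(data)
print("variance explained:", pca.explained_variance_ratio_.sum())
```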


GigaScience ◽  
2020 ◽  
Vol 9 (12) ◽  
Author(s):  
Ariel Rokem ◽  
Kendrick Kay

Abstract
Background: Ridge regression is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using ridge regression is the need to set a hyperparameter (α) that controls the amount of regularization. Cross-validation is typically used to select the best α from a set of candidates. However, efficient and appropriate selection of α can be challenging and becomes prohibitive when large amounts of data are analyzed. Because the selected α depends on the scale of the data and the correlations across predictors, it is also not straightforwardly interpretable.
Results: The present work addresses these challenges through a novel approach to ridge regression. We propose to reparameterize ridge regression in terms of the ratio γ between the L2-norms of the regularized and unregularized coefficients. We provide an algorithm that efficiently implements this approach, called fractional ridge regression, as well as open-source software implementations in Python and MATLAB (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems. On brain imaging data, we demonstrate that this approach delivers results that are straightforward to interpret and compare across models and datasets.
Conclusion: Fractional ridge regression has several benefits: the solutions obtained for different γ are guaranteed to vary, guarding against wasted calculations, and they automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. These properties make fractional ridge regression particularly suitable for the analysis of large, complex datasets.
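
The γ reparameterization can be sketched directly from ridge regression's SVD form: for each candidate α, compute the coefficient norm, divide by the OLS norm to obtain γ, then look up the α achieving a desired fraction. The fracridge package does this efficiently; the numpy sketch below is a simplified grid-based illustration of the idea, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated regression problem.
X = rng.normal(size=(200, 20))
y = X @ rng.normal(size=20) + rng.normal(size=200)

# Ridge solutions over a log-spaced alpha grid, via the SVD of X:
# beta(alpha) = V diag(s / (s^2 + alpha)) U'y.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Uty = U.T @ y
alphas = np.logspace(-4, 6, 200)
coefs = np.array([Vt.T @ (s / (s**2 + a) * Uty) for a in alphas])

# gamma: L2-norm of each regularized solution relative to the OLS solution.
ols_norm = np.linalg.norm(Vt.T @ (Uty / s))
gammas = np.linalg.norm(coefs, axis=1) / ols_norm

# Nearest-grid lookup of the alpha that shrinks coefficients to gamma ≈ 0.5.
target = 0.5
alpha_star = alphas[np.argmin(np.abs(gammas - target))]
print(f"alpha for gamma ≈ {target}: {alpha_star:.3g}")
```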

