Sign-consistency based variable importance for machine learning in brain imaging

2017 ◽  
Author(s):  
Vanessa Gómez-Verdejo ◽  
Emilio Parrado-Hernández ◽  
Jussi Tohka ◽  

Abstract: An important problem that hinders the use of supervised classification algorithms for brain imaging is that the number of variables per subject far exceeds the number of training subjects available. Deriving multivariate measures of variable importance becomes a challenge in such scenarios. This paper proposes a new measure of variable importance termed sign-consistency bagging (SCB). SCB captures variable importance by analyzing the sign consistency of the corresponding weights in an ensemble of linear support vector machine (SVM) classifiers. Further, the SCB variable importances are enhanced by means of transductive conformal analysis. This extra step is important when the data can be assumed to be heterogeneous. Finally, the proposal of these SCB variable importance measures is completed with the derivation of a parametric hypothesis test of variable importance. The new importance measures were compared with a t-test-based univariate variable importance and an SVM-based multivariate variable importance using anatomical and functional magnetic resonance imaging data. The results demonstrated that the new SCB-based importance measures were superior to the compared methods in terms of reproducibility and classification accuracy.
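The sign-consistency idea in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the Pegasos-style trainer stands in for whichever linear SVM solver the paper uses, the function names `train_linear_svm` and `sign_consistency` are hypothetical, and the score `|2p - 1|` (where `p` is the fraction of ensemble members assigning a positive weight) is one plausible way to summarize sign agreement. The transductive conformal step and the hypothesis test are not covered.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=20, rng=None):
    """Pegasos-style subgradient descent for a bias-free linear SVM.
    Illustrative stand-in; labels in y must be +1 or -1."""
    rng = rng or random.Random(0)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def sign_consistency(X, y, n_models=50, seed=0):
    """Train one linear SVM per bootstrap resample and score each variable
    by how consistently its weight keeps the same sign (assumed score:
    |2p - 1|, ranging from 0 for a coin flip to 1 for full agreement)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    positive = [0] * d
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        w = train_linear_svm([X[i] for i in idx], [y[i] for i in idx], rng=rng)
        for j, wj in enumerate(w):
            if wj > 0:
                positive[j] += 1
    return [abs(2.0 * p / n_models - 1.0) for p in positive]
```

In this sketch, a variable whose weight takes the same sign in every bagged classifier scores 1, while a variable whose sign flips about half the time scores near 0.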


2019 ◽  
Author(s):  
Noah Lewis ◽  
Harshvardhan Gazula ◽  
Sergey M. Plis ◽  
Vince D. Calhoun

Abstract

Background: In this age of big data, large data stores allow researchers to compose robust models that are accurate and informative. In many cases, the data are stored in separate locations, requiring data transfer between local sites, which can cause various practical hurdles, such as privacy concerns or heavy network load. This is especially true for medical imaging data, which can be constrained by the Health Insurance Portability and Accountability Act (HIPAA). Medical imaging datasets can also contain many thousands or millions of features, which adds to the network load.

New Method: Our research expands upon current decentralized classification research by implementing a new single-shot method for both neural networks and support vector machines. Our approach is to estimate the statistical distribution of the data at each local site and pass this information to the other local sites, where each site resamples from the individual distributions and trains a model on both locally available data and the resampled data.

Results: We show applications of our approach to handwritten digit classification as well as to multi-subject classification of brain imaging data collected from patients with schizophrenia and healthy controls. Overall, the results showed comparable classification accuracy to the centralized model with lower network load than multi-shot methods.

Comparison with Existing Methods: Many decentralized classifiers are multi-shot, requiring heavy network traffic. Our model attempts to alleviate this load while preserving prediction accuracy.

Conclusions: We show that our proposed approach performs comparably to a centralized approach while minimizing network traffic compared to multi-shot methods.

Highlights:
- A novel yet simple approach to decentralized classification
- Reduces total network load compared to current multi-shot algorithms
- Maintains a prediction accuracy comparable to the centralized approach
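The single-shot exchange described in this abstract can be sketched roughly as follows. The abstract does not say which distribution is estimated, so this sketch assumes an independent Gaussian per class and per feature; the function names `summarize_site`, `resample`, and `augment_local_data` are hypothetical. The final model training step (a neural network or SVM in the paper) is left out: the sketch stops at the augmented training set a site would train on.

```python
import random
import statistics

def summarize_site(X, y):
    """Per-class, per-feature mean and standard deviation.
    In this sketch, these summaries are the only thing a site shares."""
    summary = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        cols = list(zip(*rows))  # transpose rows into feature columns
        summary[label] = {
            "n": len(rows),
            "mean": [statistics.fmean(c) for c in cols],
            "std": [statistics.pstdev(c) for c in cols],
        }
    return summary

def resample(summary, rng):
    """Draw synthetic samples from a remote site's summary
    (assumption: independent Gaussian per feature)."""
    X, y = [], []
    for label, s in summary.items():
        for _ in range(s["n"]):
            X.append([rng.gauss(m, sd) for m, sd in zip(s["mean"], s["std"])])
            y.append(label)
    return X, y

def augment_local_data(local_X, local_y, remote_summaries, seed=0):
    """Combine local data with samples drawn from every remote summary.
    Summaries are exchanged once, hence 'single-shot'."""
    rng = random.Random(seed)
    X, y = list(local_X), list(local_y)
    for summary in remote_summaries:
        sx, sy = resample(summary, rng)
        X.extend(sx)
        y.extend(sy)
    return X, y
```

Each site would call `summarize_site` on its own data, send the result to its peers, and then train any classifier on the output of `augment_local_data`. Only summary statistics cross the network, which is where the reduced traffic relative to multi-shot schemes would come from.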



2021 ◽  
Author(s):  
Oualid Benkarim ◽  
Casey Paquola ◽  
Bo-yong Park ◽  
Valeria Kebets ◽  
Seok-Jun Hong ◽  
...  

Brain-imaging research enjoys increasing adoption of supervised machine learning for single-subject disease classification. Yet, the success of these algorithms likely depends on population diversity, including demographic differences and other factors that may be outside of primary scientific interest. Here, we capitalize on propensity scores as a composite confound index to quantify diversity due to major sources of population stratification. We delineate the impact of population heterogeneity on the predictive accuracy and pattern stability in two separate clinical cohorts: the Autism Brain Imaging Data Exchange (ABIDE, n=297) and the Healthy Brain Network (HBN, n=551). Across various analysis scenarios, our results uncover the extent to which cross-validated prediction performances are interlocked with diversity. The instability of extracted brain patterns attributable to diversity is preferentially located in the default mode network. Our collective findings highlight the limitations of prevailing deconfounding practices in mitigating the full consequences of population diversity.
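A propensity score collapses several confounds into one number per subject: the estimated probability of belonging to the patient group given the confounds alone. The sketch below assumes logistic regression fitted by plain gradient descent, which is a common way to estimate propensity scores but is not confirmed by the abstract. The function names and the choice of confounds passed in are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_propensity(confounds, group, lr=0.1, epochs=500):
    """Fit logistic regression of group membership (0 or 1) on confounds
    using batch gradient descent. Returns weights and intercept."""
    n, d = len(confounds), len(confounds[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, g in zip(confounds, group):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - g
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def propensity_scores(confounds, w, b):
    """One score per subject: P(group = 1 | confounds) under the fitted model."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in confounds]
```

Subjects with similar scores are similar in their overall confound profile, so the spread of scores within a sample gives one way to quantify how diverse that sample is. Confounds should be standardized before fitting so that the fixed learning rate behaves sensibly.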



2019 ◽  
Vol 17 (4) ◽  
pp. 593-609 ◽  
Author(s):  
Vanessa Gómez-Verdejo ◽  
Emilio Parrado-Hernández ◽  
Jussi Tohka


2019 ◽  
Author(s):  
Carlos Sevilla-Salcedo ◽  
Vanessa Gómez-Verdejo ◽  
Jussi Tohka ◽  

Abstract: A fundamental problem of supervised learning algorithms for brain imaging applications is that the number of features far exceeds the number of subjects. In this paper, we propose a combined feature selection and extraction approach for multiclass problems. This method starts with a bagging procedure which calculates the sign consistency of the multivariate analysis (MVA) projection matrix feature-wise to determine the relevance of each feature. This relevance measure provides a parsimonious matrix, which is combined with a hypothesis test to automatically determine the number of selected features. Then, a novel MVA regularized with the sign and magnitude consistency of the features is used to generate a reduced set of summary components providing a compact data description. We evaluated the proposed method with two multiclass brain imaging problems: 1) the classification of elderly subjects into four classes (cognitively normal, stable mild cognitive impairment (MCI), MCI converting to Alzheimer's disease (AD) in 3 years, and AD) based on structural brain imaging data from the ADNI cohort; 2) the classification of children into three classes (typically developing, and two types of Attention Deficit/Hyperactivity Disorder (ADHD)) based on functional connectivity. Experimental results confirmed that each brain image (defined by 29,852 features in the ADNI database and 61,425 in the ADHD database) could be represented with only 30-45% of the original features. Furthermore, this information could be condensed into two or three summary components, providing not only a gain in interpretability but also classification rate improvements when compared to state-of-the-art reference methods.



2010 ◽  
Vol 20 (5) ◽  
pp. 1123-1138
Author(s):  
Xiao-Jie ZHAO ◽  
Zhi-Ying LONG ◽  
Xiao-Juan GUO ◽  
Li YAO


Author(s):  
Asterios Toutios ◽  
Tanner Sorensen ◽  
Krishna Somandepalli ◽  
Rachel Alexander ◽  
Shrikanth S. Narayanan


2019 ◽  
Vol 13 ◽  
Author(s):  
Christoph Vogelbacher ◽  
Miriam H. A. Bopp ◽  
Verena Schuster ◽  
Peer Herholz ◽  
Andreas Jansen ◽  
...  

