Functional brain segmentation using inter-subject correlation in fMRI

2016 ◽  
Author(s):  
Jukka-Pekka Kauppi ◽  
Juha Pajula ◽  
Jari Niemi ◽  
Riitta Hari ◽  
Jussi Tohka

The human brain continuously processes massive amounts of rich sensory information. To better understand such highly complex brain processes, modern neuroimaging studies are increasingly utilizing experimental setups that better mimic daily-life situations. We propose a new exploratory data-analysis approach, functional segmentation intersubject correlation analysis (FuSeISC), to facilitate the analysis of functional magnetic resonance imaging (fMRI) data sets collected in these experiments. The method provides a new type of functional segmentation of brain areas, characterizing not only areas that display similar processing across subjects but also areas in which processing across subjects is highly variable.

We tested FuSeISC using fMRI data sets collected during traditional block-design stimuli (37 subjects) as well as naturalistic auditory narratives (19 subjects). The method identified spatially local and/or bilaterally symmetric clusters in several cortical areas, many of which are known to process the types of stimuli used in the experiments. The method is promising not only for spatial exploration of large fMRI data sets obtained with naturalistic stimuli, but also for other applications such as the generation of functional brain atlases that include both lower- and higher-order processing areas.

Finally, as part of FuSeISC, we propose a criterion-based sparsification of the shared nearest-neighbor graph for detecting clusters in noisy data. In our tests with synthetic data, this technique was superior to well-known clustering methods such as Ward's method, affinity propagation, and K-means++.
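
The shared nearest-neighbor graph that FuSeISC sparsifies can be illustrated with a minimal sketch. The abstract does not give the authors' sparsification criterion, so the mutual-neighbor requirement and the `min_shared` threshold below are assumptions chosen for illustration only:

```python
import numpy as np

def snn_graph(X, k=5, min_shared=2):
    """Build a sparsified shared nearest-neighbor (SNN) graph: an edge
    (i, j) is kept only if i and j appear in each other's k-nearest-neighbor
    lists AND share at least `min_shared` neighbors (illustrative criterion)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances; exclude self-matches.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    # Boolean k-nearest-neighbor membership sets for each point.
    knn = np.argsort(d2, axis=1)[:, :k]
    member = np.zeros((n, n), dtype=bool)
    for i in range(n):
        member[i, knn[i]] = True
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if member[i, j] and member[j, i]:  # mutual neighbors
                shared = np.count_nonzero(member[i] & member[j])
                if shared >= min_shared:
                    adj[i, j] = adj[j, i] = True
    return adj
```

Points in different well-separated groups never become mutual neighbors, so the sparsified graph naturally splits into cluster-sized components.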

2013 ◽  
Vol 748 ◽  
pp. 590-594
Author(s):  
Li Liao ◽  
Yong Gang Lu ◽  
Xu Rong Chen

We propose a novel density-estimation method that uses both the k-nearest neighbor (KNN) graph and the potential field of the data points to capture local and global data-distribution information, respectively. Clustering is performed on the computed density values: a forest of trees is built with each data point as a tree node, and clusters are formed according to the trees in the forest. The new clustering method is evaluated against three popular clustering methods: K-means++, Mean Shift, and DBSCAN. Experiments on two synthetic data sets and one real data set show that our approach effectively improves the clustering results.
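
The two ingredients of such a density estimate can be sketched as follows. The mixing rule, the Gaussian potential kernel, and the weight `alpha` are assumptions for illustration, not the authors' exact formulation:

```python
import numpy as np

def density_estimate(X, k=5, sigma=1.0, alpha=0.5):
    """Illustrative density score mixing a local term (inverse mean distance
    to the k nearest neighbors) with a global term (a Gaussian potential
    field summed over all points). `alpha` weights the two terms."""
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    np.fill_diagonal(d, np.inf)                 # ignore self-distances
    knn_d = np.sort(d, axis=1)[:, :k].mean(axis=1)
    local = 1.0 / (knn_d + 1e-12)               # local: KNN-based density
    potential = np.exp(-(d ** 2) / (2.0 * sigma ** 2)).sum(axis=1)  # global
    def norm(v):                                # rescale each term to [0, 1]
        return (v - v.min()) / (v.max() - v.min() + 1e-12)
    return alpha * norm(local) + (1.0 - alpha) * norm(potential)
```

A point deep inside a dense region scores high on both terms, while an isolated outlier scores near zero on both.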


2019 ◽  
Vol 28 (06) ◽  
pp. 1960002 ◽  
Author(s):  
Brankica Bratić ◽  
Michael E. Houle ◽  
Vladimir Kurbalija ◽  
Vincent Oria ◽  
Miloš Radovanović

The K-nearest neighbor graph (K-NNG) is a data structure used by many machine-learning algorithms. Naive computation of the K-NNG has quadratic time complexity, which in many cases is not efficient enough, creating the need for fast and accurate approximation algorithms. NN-Descent is one such algorithm that is highly efficient, but it has a major drawback: its K-NNG approximations are accurate only on data of low intrinsic dimensionality. This paper presents an experimental analysis of this behavior and investigates possible solutions. Experimental results show that there is a link between the performance of NN-Descent and the phenomenon of hubness, defined as the tendency of intrinsically high-dimensional data to contain hubs – points with large in-degrees in the K-NNG. First, we explain how the presence of the hubness phenomenon causes poor NN-Descent performance. In light of that, we propose four NN-Descent variants to alleviate the observed negative influence of hubs. By evaluating the proposed approaches on several real and synthetic data sets, we conclude that they are more accurate, but often at the cost of higher scan rates.
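
Hubness as described here can be quantified directly from the in-degree distribution of the directed K-NNG; skewness of that distribution is a commonly used hubness score. This sketch uses brute-force K-NNG construction (the quadratic baseline the paper contrasts with NN-Descent):

```python
import numpy as np

def knng_in_degrees(X, k=5):
    """In-degree of each point in the directed k-NN graph. Hubness shows up
    as a heavy right tail in this distribution for high-dimensional data."""
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbor
    knn = np.argsort(d, axis=1)[:, :k]
    deg = np.zeros(X.shape[0], dtype=int)
    for row in knn:                      # count how often each point is chosen
        deg[row] += 1
    return deg

def hubness_skew(deg):
    """Skewness of the in-degree distribution (positive = hubs present)."""
    mu, sd = deg.mean(), deg.std()
    return ((deg - mu) ** 3).mean() / (sd ** 3 + 1e-12)
```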


Author(s):  
Amit Saxena ◽  
John Wang

This paper presents a two-phase scheme that selects a reduced number of features from a dataset using a Genetic Algorithm (GA) and tests the classification accuracy (CA) of the dataset with the reduced feature set. In the first phase, an unsupervised approach to selecting a subset of features is applied: the GA stochastically selects a reduced number of features with Sammon Error as the fitness function, yielding different subsets of features. In the second phase, each reduced feature set is used to test the CA of the dataset, which is validated using the supervised k-nearest neighbor (k-NN) algorithm. The novelty of the proposed scheme is that each reduced feature set obtained in the first phase is investigated for CA using k-NN classification with different Minkowski metrics, i.e., non-Euclidean norms, instead of the conventional Euclidean norm (L2). Final results are presented with extensive simulations on seven real data sets and one synthetic data set. The investigation reveals that using different norms produces better CA and hence offers scope for better feature-subset selection.
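
The second-phase classifier is standard k-NN, parameterized by the Minkowski order p. A minimal sketch (majority vote, ties broken by `np.unique` order; the GA phase is not shown):

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3, p=2.0):
    """k-NN majority-vote classifier under the Minkowski L_p metric.
    p=2 is the usual Euclidean norm; p=1 (Manhattan) or fractional p
    give the non-Euclidean norms explored in the paper."""
    preds = []
    for x in X_test:
        d = (np.abs(X_train - x) ** p).sum(axis=1) ** (1.0 / p)
        nn = np.argsort(d)[:k]                      # k nearest training points
        vals, counts = np.unique(y_train[nn], return_counts=True)
        preds.append(vals[np.argmax(counts)])       # majority vote
    return np.array(preds)
```

Sweeping `p` over, say, {0.5, 1, 2, 4} and comparing held-out accuracy reproduces the kind of norm comparison the paper performs.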


2019 ◽  
Vol 11 (3) ◽  
pp. 350 ◽  
Author(s):  
Qiang Li ◽  
Qi Wang ◽  
Xuelong Li

A hyperspectral image (HSI) has many bands, which leads to high correlation between adjacent bands, so it is necessary to find representative subsets before further analysis. To address this issue, band selection is considered an effective approach for removing redundant bands from an HSI. Recently, many band selection methods have been proposed, but the majority of them have extremely poor accuracy when the number of bands is small and require multiple iterations, which defeats the purpose of band selection. Therefore, we propose an efficient clustering method based on shared nearest neighbors (SNNC) for hyperspectral optimal band selection, with the following contributions: (1) the local density of each band is obtained from shared nearest neighbors, which more accurately reflects the local distribution characteristics; (2) to acquire a band subset containing a large amount of information, information entropy is taken as one of the weight factors; (3) a method for automatically selecting the optimal band subset is designed based on the slope change. The experimental results reveal that, compared with other methods, the proposed method has competitive computational time and its selected bands achieve higher overall classification accuracy on different data sets, especially when the number of bands is small.
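
The information-entropy weight factor in contribution (2) is the Shannon entropy of a band's intensity histogram. A minimal sketch (the bin count of 64 is an assumption, not the paper's setting):

```python
import numpy as np

def band_entropy(band, bins=64):
    """Shannon entropy (in bits) of a band's intensity histogram; bands
    with flat, spread-out histograms carry more information and score higher."""
    hist, _ = np.histogram(band, bins=bins)
    p = hist / hist.sum()        # empirical probability per bin
    p = p[p > 0]                 # 0 * log(0) is defined as 0
    return float(-(p * np.log2(p)).sum())
```

A nearly constant band scores close to 0 bits, while a band with well-spread intensities approaches log2(bins).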


2007 ◽  
Vol 17 (01) ◽  
pp. 71-103 ◽  
Author(s):  
NARGESS MEMARSADEGHI ◽  
DAVID M. MOUNT ◽  
NATHAN S. NETANYAHU ◽  
JACQUELINE LE MOIGNE

Clustering is central to many image processing and remote sensing applications. ISODATA is one of the most popular and widely used clustering methods in geoscience applications, but it can run slowly, particularly with large data sets. We present a more efficient approach to ISODATA clustering, which achieves better running times by storing the points in a kd-tree and through a modification of the way in which the algorithm estimates the dispersion of each cluster. We also present an approximate version of the algorithm which allows the user to further improve the running time, at the expense of lower fidelity in computing the nearest cluster center to each point. We provide both theoretical and empirical justification that our modified approach produces clusterings that are very similar to those produced by the standard ISODATA approach. We also provide empirical studies on both synthetic data and remotely sensed Landsat and MODIS images that show that our approach has significantly lower running times.
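
The step the kd-tree accelerates is the nearest-cluster-center search inside each ISODATA iteration. The sketch below shows one brute-force assignment/update step of the kind ISODATA iterates; the paper's contribution replaces this O(n·k) search with a kd-tree over the points (ISODATA's split/merge logic is omitted here):

```python
import numpy as np

def assign_and_update(X, centers):
    """One assignment/update step: label each point with its nearest center
    (brute force here; a kd-tree makes this search sublinear in practice),
    then move each center to the mean of its assigned points."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    labels = d.argmin(axis=1)
    new_centers = np.array([
        X[labels == j].mean(axis=0) if (labels == j).any() else centers[j]
        for j in range(len(centers))
    ])
    return labels, new_centers
```

The approximate variant described in the abstract relaxes exactly this step, tolerating a slightly wrong nearest center in exchange for speed.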


Entropy ◽  
2018 ◽  
Vol 20 (11) ◽  
pp. 830 ◽  
Author(s):  
Xulun Ye ◽  
Jieyu Zhao ◽  
Yu Chen

Multi-manifold clustering is among the most fundamental tasks in signal processing and machine learning. Although existing multi-manifold clustering methods are quite powerful, learning the cluster number automatically from data is still a challenge. In this paper, a novel unsupervised generative clustering approach within the Bayesian nonparametric framework is proposed. Specifically, our manifold method automatically selects the cluster number with a Dirichlet Process (DP) prior. Then, a DP-based mixture model with a constrained Mixture of Gaussians (MoG) is constructed to handle the manifold data. Finally, we integrate our model with the k-nearest neighbor graph to capture the manifold geometric information. An efficient optimization algorithm has also been derived to perform model inference and optimization. Experimental results on synthetic datasets and real-world benchmark datasets exhibit the effectiveness of this new DP-based manifold method.
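
The mechanism by which a DP prior "selects the cluster number" is easiest to see through the Chinese restaurant process, the sequential sampling scheme equivalent to a DP prior over partitions: a new cluster is opened with probability proportional to the concentration `alpha`, so the number of clusters grows with the data rather than being fixed. A minimal sketch (not the paper's inference algorithm):

```python
import numpy as np

def crp_assignments(n, alpha, rng):
    """Draw a partition of n items from the Chinese restaurant process.
    Item i joins existing cluster c with prob. count[c] / (i + alpha),
    or opens a new cluster with prob. alpha / (i + alpha)."""
    counts = []          # cluster sizes so far
    z = []               # cluster index of each item
    for _ in range(n):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        c = rng.choice(len(probs), p=probs)
        if c == len(counts):
            counts.append(1)     # a new cluster is born
        else:
            counts[c] += 1
        z.append(c)
    return np.array(z), counts
```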


2017 ◽  
Vol 29 (7) ◽  
pp. 1902-1918 ◽  
Author(s):  
De Cheng ◽  
Feiping Nie ◽  
Jiande Sun ◽  
Yihong Gong

Graph-based clustering methods perform clustering on a fixed input data graph. Thus, such clustering results are sensitive to the particular graph construction: if the initial construction is of low quality, the resulting clustering may also be of low quality. We address this drawback by allowing the data graph itself to be adaptively adjusted during the clustering procedure. In particular, our proposed weight adaptive Laplacian (WAL) method learns a new data similarity matrix that can adaptively adjust the initial graph according to the similarity weights in the input data graph. We develop three versions of the method, based on the L2-norm, a fuzzy entropy regularizer, and an exponential-based weight strategy, yielding three new graph-based clustering objectives, and derive optimization algorithms to solve them. Experimental results on synthetic data sets and real-world benchmark data sets exhibit the effectiveness of these new graph-based clustering methods.
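
The object all such methods operate on is the graph Laplacian of the similarity matrix W; WAL's contribution is to re-learn W rather than fix it. A minimal sketch of the standard symmetric normalized Laplacian, whose zero eigenvalues count the graph's connected components (the property spectral clustering exploits):

```python
import numpy as np

def normalized_laplacian(W):
    """Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2} of a
    similarity matrix W with non-negative weights and zero diagonal."""
    d = W.sum(axis=1)                                  # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))   # guard isolated nodes
    return np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
```

If the (learned or fixed) W splits into c disconnected blocks, L has exactly c zero eigenvalues, which is why the quality of W so directly determines the quality of the clustering.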


2011 ◽  
Vol 148-149 ◽  
pp. 258-261
Author(s):  
Zhi Kai Zhao ◽  
Jian Sheng Qian

This paper considers a special kind of data called multimodal data, which has the property that samples in a class come from several separate clusters. Locality Preserving Projection (LPP) works well with multimodal data due to its locality-preserving property; however, because LPP is unsupervised, label information is not used to improve learning performance. In this paper, we propose a method called Locality Sensitive Semi-Supervised Dimensionality Reduction (semi-LSDR), which takes both discriminant information and geometric structure into account. Specifically, we construct a between-class graph on labeled samples and a nearest neighbor graph, both from the perspective of locality. A direct mapping can be obtained by solving a generalized eigenvalue problem. The effectiveness of the proposed method is shown through simulations on benchmark data sets.
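
The generalized eigenvalue problem A v = λ B v that such graph-based projections reduce to can be solved by whitening with B^{-1/2}. A minimal sketch, assuming B is symmetric positive definite (the specific A and B of semi-LSDR are not given in the abstract):

```python
import numpy as np

def generalized_eig(A, B):
    """Solve A v = lambda B v for symmetric A and symmetric PD B by
    whitening: with M = B^{-1/2} A B^{-1/2}, eigenpairs of M map back
    via v = B^{-1/2} y. Returns eigenvalues ascending and eigenvectors
    as columns."""
    w, U = np.linalg.eigh(B)
    B_inv_sqrt = U @ np.diag(1.0 / np.sqrt(w)) @ U.T
    M = B_inv_sqrt @ A @ B_inv_sqrt
    lam, Y = np.linalg.eigh(M)
    return lam, B_inv_sqrt @ Y
```

The projection directions of an LPP-style method are the eigenvectors attached to the smallest (or largest, depending on formulation) of these eigenvalues.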


2015 ◽  
Vol 11 (3) ◽  
pp. 26-48 ◽  
Author(s):  
Guilherme Moreira ◽  
Maribel Yasmina Santos ◽  
João Moura Pires ◽  
João Galvão

Huge amounts of data are available for analysis in today's organizations, which face several challenges when trying to analyze the generated data with the aim of extracting useful information. This analytical capability needs to be enhanced with tools capable of dealing with big data sets without making the analytical process an arduous task. Clustering is often used in data analysis, as this technique does not require any prior knowledge about the data. However, clustering algorithms usually require one or more input parameters that influence the clustering process and the results that can be obtained. This work analyzes the relation between the three input parameters of the SNN (Shared Nearest Neighbor) clustering algorithm, providing a comprehensive understanding of the relationships identified between k, Eps, and MinPts. Moreover, it also proposes specific guidelines for defining appropriate input parameters, optimizing processing time, as the number of trials needed to achieve appropriate results can be substantially reduced.
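
The three parameters interact as follows in SNN clustering: k sets the neighborhood size, Eps thresholds the shared-neighbor similarity, and MinPts decides which points are dense enough to be core points. A minimal sketch of that chain (using a Jarvis-Patrick-style mutual-neighbor requirement, which is one common variant, not necessarily the exact one studied):

```python
import numpy as np

def snn_core_points(X, k=7, eps=2, min_pts=3):
    """SNN density sketch: similarity of two mutual k-NN points = number
    of shared k-nearest neighbors (0 if not mutual); a point is a core
    point if at least `min_pts` points have SNN similarity >= `eps` with it."""
    n = len(X)
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)
    knn = np.argsort(d, axis=1)[:, :k]
    member = np.zeros((n, n), dtype=bool)
    for i in range(n):
        member[i, knn[i]] = True
    mutual = member & member.T                                  # k depends here
    shared = (member[:, None, :] & member[None, :, :]).sum(-1)  # shared-NN counts
    sim = np.where(mutual, shared, 0)
    density = (sim >= eps).sum(axis=1)                          # Eps enters here
    return density >= min_pts                                   # MinPts enters here
```

This dependency chain is why the paper can relate the three parameters: a change in k shifts the whole similarity distribution, which shifts the Eps and MinPts values that produce sensible clusters.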

