Clustering of fMRI data: the elusive optimal number of clusters

PeerJ
2018
Vol 6
pp. e5416
Author(s):  
Mohamed L. Seghier

Model-free methods are widely used for the processing of brain fMRI data collected under natural stimulations, sleep, or rest. Among them is the popular fuzzy c-means algorithm, commonly combined with cluster validity (CV) indices to identify the ‘true’ number of clusters (components) in an unsupervised way. CV indices may, however, reveal different optimal c-partitions for the same fMRI data, and their effectiveness can be hindered by the high data dimensionality, the limited signal-to-noise ratio, the small proportion of relevant voxels, and the presence of artefacts or outliers. Here, the author investigated the behaviour of seven robust CV indices. A new CV index that incorporates both compactness and separation measures is also introduced. Using both artificial and real fMRI data, the findings highlight the importance of looking at the behaviour of different compactness and separation measures, defined here as building blocks of CV indices, to obtain a full description of the data structure, in particular when no agreement is found between CV indices. Overall, for fMRI, it makes sense to relax the assumption that only one unique c-partition exists, and to appreciate that different c-partitions (with different optimal numbers of clusters) can be useful explanations of the data, given the hierarchical organization of many brain networks.
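As a rough illustration of how fuzzy c-means can be paired with a validity index, the sketch below clusters one-dimensional toy data for several values of c and reports Bezdek's partition coefficient for each. The function names, the evenly spaced initial centres, the fuzzifier m = 2, and the toy data are all assumptions made for this example; the partition coefficient is a classic membership-based index chosen for brevity and is not claimed to be one of the seven indices studied in the paper.

```python
def fuzzy_c_means_1d(data, c, m=2.0, n_iter=100, tol=1e-9):
    """Fuzzy c-means on 1-D data (illustrative sketch, requires c >= 2).

    Returns (centers, u), where u[k][i] is the membership of point k
    in cluster i, computed from the centres of the final iteration.
    """
    lo, hi = min(data), max(data)
    # Assumption: deterministic start with centres spread evenly over the range.
    centers = [lo + (hi - lo) * i / (c - 1) for i in range(c)]
    power = 2.0 / (m - 1.0)
    u = []
    for _ in range(n_iter):
        # Membership update: u_ki = 1 / sum_j (d_ki / d_kj)^(2 / (m - 1)).
        u = []
        for x in data:
            d = [max(abs(x - v), 1e-12) for v in centers]  # avoid division by zero
            u.append([1.0 / sum((d[i] / d[j]) ** power for j in range(c))
                      for i in range(c)])
        # Centre update: mean of the data weighted by u^m.
        new_centers = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(len(data))]
            new_centers.append(sum(wk * x for wk, x in zip(w, data)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def partition_coefficient(u):
    """Bezdek's partition coefficient: mean over points of the sum of squared memberships."""
    return sum(val ** 2 for row in u for val in row) / len(u)


# Hypothetical toy data with two well-separated groups.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
for c in (2, 3, 4):
    centers, u = fuzzy_c_means_1d(data, c)
    print(c, partition_coefficient(u))
```

Comparing the index across candidate values of c is the usual way such indices are used; as the abstract notes, different indices may favour different values of c on the same data.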

Knowing the exact number of clusters in a digital image makes it considerably easier to cluster the image precisely. This paper proposes a new technique for extracting the exact number of clusters from grey-scale images. It analyzes the contents of the input image and adaptively reserves one distinct cluster for each distinct grey value. The total count of distinct grey values found in an image determines the exact number of clusters. Because it depends on the image contents, this number changes from image to image. Once obtained, it is given as input to a Gaussian Mixture Model (GMM), which clusters the image. GMM works with a finite number of clusters and forms a mixture of the various spectral densities contained in the image. The proposed method enables GMM to adapt to the changing number of clusters. The proposed model, together with GMM, is therefore named the Adaptive Finite Gaussian Mixture Model (AFGMM). The clustering performance of AFGMM is evaluated through Mean Squared Error (MSE) and Peak Signal to Noise Ratio (PSNR). Both performance measures confirmed that the exact number of clusters is essential for reliably analyzing an image.
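The counting step described above can be sketched in a few lines. This is only an illustration of "one cluster per distinct grey value"; the function name and the tiny sample image are hypothetical, and the paper's actual implementation is not shown in the abstract.

```python
def count_grey_levels(image):
    """Return the number of distinct grey values in a 2-D image given as nested lists."""
    return len({pixel for row in image for pixel in row})


# Hypothetical 3x3 grey-scale image containing the values 0, 64, 128 and 255.
image = [
    [0, 0, 128],
    [128, 255, 255],
    [0, 64, 64],
]

n_clusters = count_grey_levels(image)
print(n_clusters)  # 4
```

The resulting count would then be supplied as the number of mixture components to whichever GMM implementation is used.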


Author(s):  
Hussain A. Jaber
Ilyas Çankaya
Hadeel K. Aljobouri
Orhan M. Koçak
Oktay Algin

Background: Cluster analysis is a robust tool for exploring the underlying structures in data and grouping similar objects together. In functional magnetic resonance imaging (fMRI) research, clustering approaches attempt to group voxels whose time-course signals show a similar hemodynamic response over time. Objective: In this work, a novel unsupervised learning approach is proposed that relies on the Enhanced Neural Gas (ENG) algorithm applied to fMRI data, in comparison with the Neural Gas (NG) method, which has yet to be utilized for that aim. The ENG algorithm builds on the network structure of NG and focuses on an efficient prototype-based clustering approach. Methods: The comparison outcomes on real auditory fMRI data show that ENG outperforms the NG and statistical parametric mapping (SPM) methods owing to its insensitivity to the ordering of the input data sequence, to different initializations when selecting a set of neurons, and to the presence of extreme values (outliers). The findings also prove its capability to discover the exact and real value of the cluster number effectively. Results: Four validation indices are applied to evaluate the performance of the proposed ENG method with fMRI and to compare it with a clustering approach (the NG algorithm) and a model-based data analysis (SPM). These validation indices include the Jaccard Coefficient (JC), Receiver Operating Characteristic (ROC), Minimum Description Length (MDL) value, and Minimum Square Error (MSE). Conclusion: The ENG technique can tackle all shortcomings of NG application with fMRI data, identify the active area of the human brain effectively, and determine the locations of the cluster centers based on the MDL value during the process of network learning.


2018
Vol 14 (1)
pp. 11-23
Author(s):  
Lin Zhang
Yanling He
Huaizhi Wang
Hui Liu
Yufei Huang
...  

Background: RNA methylome has been discovered as an important layer of gene regulation and can be profiled directly with count-based measurements from high-throughput sequencing data. Although the detailed regulatory circuit of the epitranscriptome remains uncharted, clustering effect in methylation status among different RNA methylation sites can be identified from transcriptome-wide RNA methylation profiles and may reflect the epitranscriptomic regulation. Count-based RNA methylation sequencing data has unique features, such as low reads coverage, which calls for novel clustering approaches. Objective: Besides the low reads coverage, it is also necessary to keep the integer property to approach clustering analysis of count-based RNA methylation sequencing data. Method: We proposed a nonparametric generative model together with its Gibbs sampling solution for clustering analysis. The proposed approach implements a beta-binomial mixture model to capture the clustering effect in methylation level with the original count-based measurements rather than an estimated continuous methylation level. Besides, it adopts a nonparametric Dirichlet process to automatically determine an optimal number of clusters so as to avoid the common model selection problem in clustering analysis. Results: When tested on the simulated system, the method demonstrated improved clustering performance over hierarchical clustering, K-means, MClust, NMF and EMclust. It also revealed on real dataset two novel RNA N6-methyladenosine (m6A) co-methylation patterns that may be induced directly by METTL14 and WTAP, which are two known regulatory components of the RNA m6A methyltransferase complex. Conclusion: Our proposed DPBBM method not only properly handles the count-based measurements of RNA methylation data from sites of very low reads coverage, but also learns an optimal number of clusters adaptively from the data analyzed. Availability: The source code and documents of DPBBM R package are freely available through the Comprehensive R Archive Network (CRAN): https://cran.r-project.org/web/packages/DPBBM/.


2021
Vol 13 (3)
pp. 409
Author(s):  
Howard Zebker

Atmospheric propagational phase variations are the dominant source of error for InSAR (interferometric synthetic aperture radar) time series analysis, generally exceeding uncertainties from poor signal-to-noise ratio or signal correlation. The spatial properties of these errors have been well studied, but, to date, their temporal dependence and correction have received much less attention. Here, we present an evaluation of the magnitude of tropospheric artifacts in derived time series after compensation using an algorithm that requires only the InSAR data. The level of artifact reduction equals or exceeds that from many weather model-based methods, while avoiding the need to globally access fine-scale atmosphere parameters at all times. Our method consists of identifying all points in an InSAR stack with consistently high correlation and computing, and then removing, a fit of the phase at each of these points with respect to elevation. A comparison with GPS truth yields a reduction by a factor of about three, from an rms misfit of 5–6 cm to ~2 cm over time. This algorithm can be readily incorporated into InSAR processing flows without the need for outside information.
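The "fit and remove" step can be illustrated with an ordinary least-squares line of phase against elevation. The abstract does not state the functional form of the fit, so the linear model here is an assumption, as are the function name and the synthetic sample values; this is a sketch of the general idea, not the paper's algorithm.

```python
def remove_elevation_trend(phase, elevation):
    """Fit phase = slope * elevation + intercept by least squares and subtract the fit.

    Assumption: a straight-line dependence of phase on elevation.
    Returns (slope, intercept, residual_phase).
    """
    n = len(phase)
    mean_h = sum(elevation) / n
    mean_p = sum(phase) / n
    var_h = sum((h - mean_h) ** 2 for h in elevation)
    cov = sum((h - mean_h) * (p - mean_p) for h, p in zip(elevation, phase))
    slope = cov / var_h
    intercept = mean_p - slope * mean_h
    residual = [p - (slope * h + intercept) for p, h in zip(phase, elevation)]
    return slope, intercept, residual


# Hypothetical high-correlation points: elevation in metres, phase in radians.
elevation = [0.0, 500.0, 1000.0, 1500.0]
phase = [0.1 + 0.002 * h for h in elevation]  # purely elevation-dependent signal

slope, intercept, residual = remove_elevation_trend(phase, elevation)
print(slope, intercept, residual)
```

With real data the residual would retain whatever phase is not explained by elevation; in this synthetic case the input is exactly linear, so the residual is zero up to rounding.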

