Swarm-Based Clustering for Gene Expression Data

Author(s):  
P. K. Nizar Banu ◽  
S. Andrews Samraj

Clustering is an important technique that groups genes with similar expression patterns into a small number of meaningful, homogeneous groups, or clusters. Gene expression data has special characteristics that make clustering it a challenging research problem, and clustering such data has many applications. When clustering is applied to genes, it is called gene clustering. Hard clustering places each gene in exactly one cluster and tends to converge to a local optimum. Soft clustering allows a gene to belong to every cluster with some membership value. Because hard clustering can converge to a local optimum, an evolutionary computation technique such as swarm clustering is needed to search for the global optimum. This chapter studies swarm clustering techniques for gene expression data, including Particle Swarm Clustering K-Means, Cuckoo Search Clustering, Cuckoo Search Clustering with Lévy flight, Harmony Search, Fuzzy PSO, and Ant Colony Optimization based Clustering. Evaluation measures for clustering gene expression data are also discussed.
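The difference between hard and soft clustering described above can be illustrated with a minimal sketch. It is not taken from the chapter: the function names, the toy centroids, and the fuzzifier m = 2 are assumptions, and the soft memberships use the standard fuzzy c-means membership formula for fixed centroids.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hard_assign(profile, centroids):
    """Hard clustering: the gene goes to exactly one cluster (the nearest)."""
    distances = [euclidean(profile, c) for c in centroids]
    return distances.index(min(distances))

def soft_memberships(profile, centroids, m=2.0):
    """Soft clustering: one membership value per cluster, summing to 1.

    Uses the fuzzy c-means membership formula with fuzzifier m (assumed 2.0).
    """
    distances = [euclidean(profile, c) for c in centroids]
    # A profile that coincides with a centroid gets full membership there.
    if any(d == 0.0 for d in distances):
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_k / d_j) ** exponent for d_j in distances)
        for d_k in distances
    ]

# Hypothetical toy data: two cluster centroids and one gene profile.
centroids = [[0.0, 0.0, 0.0], [4.0, 4.0, 4.0]]
gene = [1.0, 1.0, 1.0]

hard = hard_assign(gene, centroids)
soft = soft_memberships(gene, centroids)
print(hard, soft)
```

In this toy case the gene is assigned to the first cluster under hard clustering, while soft clustering still gives it a small membership in the second cluster.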

2005 ◽  
Vol 14 (04) ◽  
pp. 577-597 ◽  
Author(s):  
CHUN TANG ◽  
AIDONG ZHANG

Microarray technologies can simultaneously measure the signals for thousands of messenger RNAs and large numbers of proteins from single samples. Arrays are now widely used in basic biomedical research for mRNA expression profiling and are increasingly used to explore patterns of gene expression in clinical research. Most research has focused on interpreting microarray data, which are transformed into gene expression matrices in which the rows usually represent genes and the columns represent samples. Samples can be clustered by identifying and eliminating irrelevant genes. However, most methods are supervised (or assisted by domain knowledge); less attention has been paid to unsupervised approaches, which are important when little domain knowledge is available. In this paper, we present a new framework for unsupervised analysis of gene expression data, which applies an interrelated two-way clustering approach to the gene expression matrices. The goal of the clustering is to identify important genes and perform cluster discovery on samples. The advantage of this approach is that the relationship between the gene clusters and sample groups can be manipulated dynamically while clustering iterates over both. The performance of the proposed method on various gene expression data sets is also illustrated.
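The matrix layout described above (rows as genes, columns as samples) and the idea of dropping irrelevant genes before comparing samples can be sketched as follows. This is not the interrelated two-way clustering framework of the paper; it is a naive single filtering pass, and the gene names, expression values, and variance threshold are all hypothetical.

```python
from statistics import pvariance

# Toy gene expression matrix: each key is a gene (row),
# each list position is a sample (column). Values are made up.
expression = {
    "gene_a": [1.0, 1.1, 5.0, 5.2],
    "gene_b": [2.0, 2.0, 2.1, 2.0],  # nearly constant across samples
    "gene_c": [0.5, 0.4, 3.9, 4.1],
}

def filter_genes(matrix, min_variance):
    """Keep genes whose expression varies across samples.

    Low variance is used here as a stand-in for 'irrelevant';
    the threshold is an assumption, not a value from the paper.
    """
    return {g: v for g, v in matrix.items() if pvariance(v) >= min_variance}

def sample_profiles(matrix):
    """Transpose the matrix so that each sample becomes one vector."""
    return [list(column) for column in zip(*matrix.values())]

kept = filter_genes(expression, min_variance=0.5)
profiles = sample_profiles(kept)
print(sorted(kept), profiles)
```

The resulting per-sample vectors could then be passed to any sample clustering method; an iterative approach would alternate between refining the gene set and regrouping the samples.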

