Model-based clustering and data transformations for gene expression data

2001, Vol 17 (10), pp. 977-987
Author(s): K. Y. Yeung, C. Fraley, A. Murua, A. E. Raftery, W. L. Ruzzo

2005, Vol 03 (04), pp. 821-836
Author(s): Fang-Xiang Wu, W. J. Zhang, Anthony J. Kusalik

Microarray technology has produced a huge body of time-course gene expression data. Such data have proved useful in genomic disease diagnosis and genomic drug design. The challenge is how to uncover useful information in such data. Cluster analysis has played an important role in analyzing gene expression data. Many distance/correlation-based and static model-based clustering techniques have been applied to time-course expression data. However, these techniques are unable to account for the dynamics of such data. It is the dynamics that characterize the data and that should be considered in cluster analysis so as to obtain high-quality clustering. This paper proposes a dynamic model-based clustering method for time-course gene expression data. The proposed method regards a time-course gene expression dataset as a set of time series generated by a number of stochastic processes. Each stochastic process defines a cluster and is described by an autoregressive model. A relocation-iteration algorithm is proposed to identify the model parameters, and posterior probabilities are employed to assign each gene to an appropriate cluster. A bootstrapping method and an average adjusted Rand index (AARI) are employed to measure the quality of clustering. Computational experiments are performed on one synthetic and three real time-course gene expression datasets to investigate the proposed method. The results show that the proposed method yields better-quality clustering than other clustering methods (e.g. k-means) for time-course gene expression data, and thus it is a useful and powerful tool for analyzing such data.
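
The general idea of the abstract (each cluster is an autoregressive process, genes are relocated to the best-fitting cluster, and posterior probabilities drive the assignment) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give its details, so the sketch assumes a fixed AR order `p`, pooled least-squares fitting per cluster, Gaussian residuals, equal cluster priors, and hard reassignment until the labels stop changing. All function names (`ar_design`, `fit_cluster`, `log_likelihood`, `ar_cluster`) are hypothetical.

```python
import numpy as np

def ar_design(series, p):
    """Lagged design matrix X (with intercept) and target y for an AR(p) fit."""
    T = len(series)
    lags = [series[p - k: T - k] for k in range(1, p + 1)]  # x_{t-1}, ..., x_{t-p}
    X = np.column_stack([np.ones(T - p)] + lags)
    y = series[p:]
    return X, y

def fit_cluster(designs, members):
    """Pooled least-squares AR coefficients and residual variance for one cluster."""
    X = np.vstack([designs[i][0] for i in members])
    y = np.concatenate([designs[i][1] for i in members])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    var = max(resid @ resid / len(y), 1e-8)  # floor avoids a zero variance
    return coef, var

def log_likelihood(design, coef, var):
    """Gaussian log-likelihood of one gene's series under a cluster's AR model."""
    X, y = design
    resid = y - X @ coef
    return -0.5 * (len(y) * np.log(2 * np.pi * var) + resid @ resid / var)

def ar_cluster(data, k, p=1, n_iter=50, seed=0):
    """Assumed relocation loop: refit each cluster, then move genes to the best model."""
    rng = np.random.default_rng(seed)
    designs = [ar_design(row, p) for row in data]
    labels = rng.integers(k, size=len(data))  # random initial partition
    for _ in range(n_iter):
        params = []
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if len(members) == 0:  # reseed an empty cluster with one random gene
                members = rng.integers(len(data), size=1)
            params.append(fit_cluster(designs, members))
        loglik = np.array([[log_likelihood(d, *prm) for prm in params]
                           for d in designs])
        new_labels = loglik.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    # Posterior probabilities under equal cluster priors (an assumption here).
    shifted = loglik - loglik.max(axis=1, keepdims=True)
    post = np.exp(shifted)
    post /= post.sum(axis=1, keepdims=True)
    return labels, post
```

Calling `ar_cluster(data, k=2, p=1)` on an array with one row per gene and one column per time point returns a hard label per gene and a matrix of posterior cluster probabilities. Agreement between such labelings on bootstrap resamples could then be scored with an adjusted Rand index, which is the role the abstract gives to the AARI.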

