EFFICIENTLY FINDING REGULATORY ELEMENTS USING CORRELATION WITH GENE EXPRESSION

2004 ◽  
Vol 02 (02) ◽  
pp. 273-288 ◽  
Author(s):  
HIDEO BANNAI ◽  
SHUNSUKE INENAGA ◽  
AYUMI SHINOHARA ◽  
MASAYUKI TAKEDA ◽  
SATORU MIYANO

We present an efficient algorithm for detecting putative regulatory elements in the upstream DNA sequences of genes, using gene expression information obtained from microarray experiments. Based on a generalized suffix tree, our algorithm looks for motif patterns whose appearance in the upstream region is most correlated with the expression levels of the genes. We are able to find the optimal pattern in time linear in the total length of the upstream sequences. We implement our algorithm and apply it to publicly available microarray gene expression data, and show that our method is able to discover biologically significant motifs, including various motifs that have been reported previously using the same data set. We further discuss applications for which the efficiency of the method is essential, as well as possible extensions to our algorithm.
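The search described above can be illustrated with a small sketch. It is not the authors' suffix-tree algorithm and does not run in linear time: it naively enumerates fixed-length substrings and, as an assumption made here for illustration, scores each one by the Pearson correlation between its presence in a gene's upstream region and that gene's expression level. The function name `best_correlated_kmer`, the fixed length `k`, and the toy data are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def best_correlated_kmer(upstream, expression, k):
    """Return (pattern, r) for the length-k substring whose presence/absence
    across genes has the largest absolute Pearson correlation with expression.

    Naive illustration only: the scoring function and the fixed pattern
    length are assumptions, not the method of the paper.
    """
    genes = sorted(upstream)
    n = len(genes)

    # Map each k-mer to the set of genes whose upstream sequence contains it.
    occurrences = defaultdict(set)
    for gene in genes:
        seq = upstream[gene]
        for i in range(len(seq) - k + 1):
            occurrences[seq[i:i + k]].add(gene)

    mean_y = sum(expression[g] for g in genes) / n
    var_y = sum((expression[g] - mean_y) ** 2 for g in genes)

    best_pattern, best_r = None, 0.0
    for pattern in sorted(occurrences):  # sorted for deterministic tie-breaking
        present = occurrences[pattern]
        m = len(present)
        if m == n or var_y == 0:
            continue  # an indicator with no variation cannot be correlated
        # For a 0/1 indicator x: sum (x - mean_x)(y - mean_y) reduces to the
        # sum of (y - mean_y) over genes containing the pattern, and
        # sum (x - mean_x)^2 equals m * (1 - m / n).
        cov = sum(expression[g] - mean_y for g in present)
        var_x = m * (1 - m / n)
        r = cov / sqrt(var_x * var_y)
        if abs(r) > abs(best_r):
            best_pattern, best_r = pattern, r
    return best_pattern, best_r


# Toy data (made up): two highly expressed genes share the substring "CGT".
upstream = {"g1": "ACGTGA", "g2": "TTCGTG", "g3": "AAAAAA", "g4": "TTTTTT"}
expression = {"g1": 2.0, "g2": 2.0, "g3": 0.0, "g4": 0.0}
pattern, r = best_correlated_kmer(upstream, expression, k=3)
```

A generalized suffix tree avoids this per-length enumeration by representing all substrings of all sequences at once, which is what makes the linear-time bound stated in the abstract possible.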

2018 ◽  
Vol 7 (2.21) ◽  
pp. 201 ◽  
Author(s):  
K Yuvaraj ◽  
D Manjula

Recent advances in microarray technology permit simultaneous monitoring of the expression levels of a large number of genes over various time points. Microarrays have gained considerable importance in the field of bioinformatics. A microarray comprises an ordered set of a large number of different deoxyribonucleic acid (DNA) sequences that can be used to measure differences in both DNA and ribonucleic acid (RNA). The gene expression (GE) profile aids in understanding the underlying causes of gene activity and the development of genes, in identifying disorders such as cancer, and in analysing their molecular pharmacology. Clustering is an important tool for analyzing such microarray gene expression data and has become a major part of gene expression analysis. Grouping genes that have similar expression patterns is known as gene clustering. A number of clustering algorithms have been applied to the analysis of microarray gene expression data. The aim of this paper is to analyze the precision achieved on microarray data by various clustering algorithms.
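As a rough illustration of gene clustering, the sketch below groups expression profiles with k-means. The abstract does not say which algorithms the paper compares, so k-means is chosen here only as a familiar example; the function name `kmeans`, the deterministic initialization from the first `k` profiles, and the toy data are assumptions.

```python
def kmeans(profiles, k, iterations=20):
    """Cluster expression profiles (one list of floats per gene) with k-means.

    Illustrative only: centroids are initialized from the first k profiles
    so that the result is deterministic.
    """
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each gene joins the nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in profiles
        ]
        # Update step: each centroid becomes the mean profile of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data (made up): four genes measured at three time points.
profiles = [
    [0.0, 0.1, 0.2],
    [5.0, 5.1, 5.2],
    [0.1, 0.0, 0.3],
    [5.2, 5.0, 5.1],
]
labels, centroids = kmeans(profiles, k=2)
```

Genes that receive the same label have similar expression patterns across the measured time points, which is the sense of "gene clustering" used above.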


2010 ◽  
Vol 08 (04) ◽  
pp. 645-659 ◽  
Author(s):  
YICHUAN ZHAO ◽  
GUOSHEN WANG

To predict future patients' survival time from their microarray gene expression data, one question of interest is how to relate genes to survival outcomes. In this paper, by applying a semi-parametric additive risk model in survival analysis, we propose a new approach to the analysis of gene expression data that focuses on the model's predictive ability. In the proposed method, we apply correlation principal component regression to handle right-censored survival data within the semi-parametric additive risk model framework with high-dimensional covariates. We also employ the time-dependent area under the receiver operating characteristic curve and the root mean squared error of prediction to assess how well the model can predict survival time. Furthermore, the proposed method is able to identify genes that are significantly related to the disease. Finally, the proposed approach is illustrated using a diffuse large B-cell lymphoma data set and a breast cancer data set. The results show that the model fits the data sets very well.
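The abstract does not write out the model. For orientation, the semi-parametric additive risk model is commonly stated in the following form, shown next to the proportional hazards (Cox) model for contrast. The symbols are assumptions introduced here: $\lambda_0(t)$ is an unspecified baseline hazard, $Z$ is a patient's covariate vector (for example, gene expression components), and $\beta$ is the regression coefficient vector.

```latex
% Additive risk model: covariates shift the hazard additively.
\lambda(t \mid Z) = \lambda_0(t) + \beta^{\top} Z

% Proportional hazards (Cox) model: covariates scale the hazard multiplicatively.
\lambda(t \mid Z) = \lambda_0(t) \exp\!\left(\beta^{\top} Z\right)
```

Under the additive form, $\beta$ is read as a difference in hazard per unit change in a covariate, rather than as a log hazard ratio.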


Author(s):  
Qiang Zhao ◽  
Jianguo Sun

Statistical analysis of microarray gene expression data has recently attracted a great deal of attention. One problem of interest is to relate genes to the survival outcomes of patients, with the purpose of building regression models for predicting future patients' survival from their gene expression data. For this, several authors have discussed the use of the proportional hazards (Cox) model after reducing the dimension of the gene expression data. This paper presents a new approach to Cox survival analysis of microarray gene expression data that focuses on the models' predictive ability. The method modifies correlation principal component regression (Sun, 1995) to handle the censoring problem of survival data. Results based on simulated data and on a set of publicly available data on diffuse large B-cell lymphoma show that the proposed method works well in terms of robustness and predictive ability in comparison with some existing partial least squares approaches. The new approach is also simpler and easier to implement.

