Article Commentary: Microarray Data Analysis of Gene Expression Evolution

2009 ◽  
Vol 3 ◽  
pp. GRSB.S2997
Author(s):  
Honghuang Lin
2003 ◽  
Vol 01 (03) ◽  
pp. 541-586 ◽  
Author(s):  
Tero Aittokallio ◽  
Markus Kurki ◽  
Olli Nevalainen ◽  
Tuomas Nikula ◽  
Anne West ◽  
...  

Microarray analysis has become a widely used method for generating gene expression data on a genomic scale. Microarrays have been enthusiastically applied in many fields of biological research, even though several open questions remain about the analysis of such data. A wide range of approaches is available for computational analysis, but no general consensus exists on a standard protocol for microarray data analysis. Consequently, the choice of data analysis technique is crucial and depends both on the data and on the goals of the experiment. A basic understanding of bioinformatics is therefore required for optimal experimental design and meaningful interpretation of the results. This review summarizes some of the common themes in DNA microarray data analysis, including data normalization and detection of differential expression. Algorithms are demonstrated by analyzing cDNA microarray data from an experiment monitoring gene expression in T helper cells. Several computational biology strategies, along with their relative merits, are overviewed, and potential areas for additional research are discussed. The goal of the review is to provide a computational framework for applying and evaluating such bioinformatics strategies. Solid knowledge of microarray informatics contributes to the implementation of more efficient computational protocols for data obtained through microarray experiments.
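The abstract names data normalization and detection of differential expression without saying which methods the review uses. As a rough illustration only, the sketch below assumes two simple and common choices: global median centering of log2 ratios, and a fixed fold-change threshold. The function names, the threshold, and the intensity values are all hypothetical and are not taken from the review.

```python
import numpy as np

def median_normalize(log_ratios):
    """Center log2 ratios so that the array-wide median is zero."""
    log_ratios = np.asarray(log_ratios, dtype=float)
    return log_ratios - np.median(log_ratios)

def flag_differential(log_ratios, threshold=1.0):
    """Flag genes whose absolute normalized log2 ratio meets the threshold.

    A threshold of 1.0 corresponds to a two-fold change (an assumed cutoff).
    """
    return np.abs(np.asarray(log_ratios, dtype=float)) >= threshold

# Hypothetical two-channel intensities for five genes (illustrative values).
red = np.array([200.0, 400.0, 1600.0, 400.0, 100.0])
green = np.array([100.0, 200.0, 200.0, 200.0, 200.0])

m = np.log2(red / green)        # raw log2 ratios per gene
m_norm = median_normalize(m)    # remove the array-wide dye bias
flags = flag_differential(m_norm)
```

Median centering assumes that most genes are not differentially expressed, so the typical log ratio should be zero. A fixed fold-change cutoff ignores measurement variance, which is one reason statistical tests are often preferred in practice.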


Author(s):  
Lei Yu ◽  
Huan Liu

The advent of gene expression microarray technology enables the simultaneous measurement of expression levels for thousands or tens of thousands of genes in a single experiment (Schena, et al., 1995). Analysis of gene expression microarray data presents unprecedented opportunities and challenges for data mining in areas such as gene clustering (Eisen, et al., 1998; Tamayo, et al., 1999), sample clustering and class discovery (Alon, et al., 1999; Golub, et al., 1999), sample class prediction (Golub, et al., 1999; Wu, et al., 2003), and gene selection (Xing, Jordan, & Karp, 2001; Yu & Liu, 2004). This article introduces the basic concepts of gene expression microarray data and describes relevant data-mining tasks. It briefly reviews the state-of-the-art methods for each data-mining task and identifies emerging challenges and future research directions in microarray data analysis.


2011 ◽  
pp. 877-884
Author(s):  
Amira Djebbari ◽  
Aedín C. Culhane ◽  
Alice J. Armstrong ◽  
John Quackenbush

Biological systems can be viewed as information management systems, with a basic instruction set stored in each cell’s DNA as “genes.” For most genes, their information is enabled when they are transcribed into RNA which is subsequently translated into the proteins that form much of a cell’s machinery. Although details of the process for individual genes are known, more complex interactions between elements are yet to be discovered. What we do know is that diseases can result if there are changes in the genes themselves, in the proteins they encode, or if RNAs or proteins are made at the wrong time or in the wrong quantities. Recent advances in biotechnology led to the development of DNA microarrays, which quantitatively measure the expression of thousands of genes simultaneously and provide a snapshot of a cell’s response to a particular condition. Finding patterns of gene expression that provide insight into biological endpoints offers great opportunities for revolutionizing diagnostic and prognostic medicine and providing mechanistic insight in data-driven research in the life sciences, an area with a great need for advances, given the urgency associated with diseases. However, microarray data analysis presents a number of challenges, from noisy data to the curse of dimensionality (large number of features, small number of instances) to problems with no clear solutions (e.g. real world mappings of genes to traits or diseases that are not yet known). Finding patterns of gene expression in microarray data poses problems of class discovery, comparison, prediction, and network analysis which are often approached with AI methods. Many of these methods have been successfully applied to microarray data analysis in a variety of applications ranging from clustering of yeast gene expression patterns (Eisen et al., 1998) to classification of different types of leukemia (Golub et al., 1999). Unsupervised learning methods (e.g. hierarchical clustering) explore clusters in data and have been used for class discovery of distinct forms of diffuse large B-cell lymphoma (Alizadeh et al., 2000). Supervised learning methods (e.g. artificial neural networks) utilize a previously determined mapping between biological samples and classes (i.e. labels) to generate models for class prediction. A k-nearest neighbor (k-NN) approach was used to train a gene expression classifier of different forms of brain tumors and its predictions were able to distinguish biopsy samples with different prognosis suggesting that microarray profiles can predict clinical outcome and direct treatment (Nutt et al., 2003). Bayesian networks constructed from microarray data hold promise for elucidating the underlying biological mechanisms of disease (Friedman et al., 2000).
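The k-NN class prediction mentioned above can be illustrated with a minimal sketch. It assumes Euclidean distance and a simple majority vote. The expression values, class labels, and function name are hypothetical, and this does not reproduce the classifier of Nutt et al. (2003).

```python
import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Predict a class label by majority vote among the k nearest samples."""
    train_x = np.asarray(train_x, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query profile to every training profile.
    distances = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(distances)[:k]
    labels, counts = np.unique(np.asarray(train_y)[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Hypothetical expression profiles: rows are samples, columns are genes.
train_x = np.array([
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.1],
    [1.1, 0.0, 0.2],
    [0.1, 1.0, 0.9],
    [0.0, 0.9, 1.1],
    [0.2, 1.1, 1.0],
])
train_y = np.array(["A", "A", "A", "B", "B", "B"])

pred = knn_predict(train_x, train_y, [0.95, 0.15, 0.05])
```

Real microarray data has far more genes than samples, so a gene selection step usually precedes distance-based classification. That step is omitted here for brevity.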


2002 ◽  
Vol 18 (9) ◽  
pp. 1207-1215 ◽  
Author(s):  
Y. Chen ◽  
V. Kamat ◽  
E. R. Dougherty ◽  
M. L. Bittner ◽  
P. S. Meltzer ◽  
...  

2006 ◽  
Vol 87 (3) ◽  
pp. 195-206 ◽  
Author(s):  
DABAO ZHANG ◽  
MIN ZHANG ◽  
MARTIN T. WELLS

We propose a simple approach, the multiplicative background correction, to solve a perplexing problem in spotted microarray data analysis: correcting the foreground intensities for the background noise, especially for spots with genes that are weakly expressed or not expressed at all. The conventional approach, the additive background correction, directly subtracts the background intensities from foreground intensities. When the foreground intensities marginally dominate the background intensities, the additive background correction provides unreliable estimates of the differential gene expression levels and usually presents M–A plots with ‘fishtails’ or fans. Unreliable additive background correction makes it preferable to ignore the background noise, which may increase the number of false positives. Based on the more realistic multiplicative assumption instead of the conventional additive assumption, we propose to logarithmically transform the intensity readings before the background correction, with the logarithmic transformation symmetrizing the skewed intensity readings. This approach not only precludes the ‘fishtails’ and fans in the M–A plots, but also provides highly reproducible background-corrected intensities for both strongly and weakly expressed genes. The superiority of the multiplicative background correction over both the additive correction and no background correction is justified by publicly available self-hybridization datasets.
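The contrast between the two corrections can be sketched numerically. The abstract does not give the exact estimator, so the code below assumes the simplest reading of "log-transform before background correction": subtract the log2 background from the log2 foreground. The function names and intensity values are hypothetical, and the authors' actual procedure may differ.

```python
import numpy as np

def additive_log_intensity(fg, bg):
    """Subtract background, then take log2; nonpositive differences become NaN."""
    diff = np.asarray(fg, dtype=float) - np.asarray(bg, dtype=float)
    out = np.full(diff.shape, np.nan)
    positive = diff > 0
    out[positive] = np.log2(diff[positive])
    return out

def log_scale_intensity(fg, bg):
    """Take log2 first, then subtract the log2 background (assumed form)."""
    return np.log2(np.asarray(fg, dtype=float)) - np.log2(np.asarray(bg, dtype=float))

def ma_values(log_r, log_g):
    """Return M (log ratio) and A (mean log intensity) for two channels."""
    return log_r - log_g, 0.5 * (log_r + log_g)

# Hypothetical spots: one strong, one barely above background, one below it.
red_fg = np.array([1088.0, 66.0, 60.0])
red_bg = np.array([64.0, 64.0, 64.0])
green_fg = np.array([320.0, 65.0, 70.0])
green_bg = np.array([64.0, 64.0, 64.0])

m_add, a_add = ma_values(additive_log_intensity(red_fg, red_bg),
                         additive_log_intensity(green_fg, green_bg))
m_log, a_log = ma_values(log_scale_intensity(red_fg, red_bg),
                         log_scale_intensity(green_fg, green_bg))
```

For the second spot the raw channels differ by a single intensity unit, yet the additive correction reports a two-fold change (M = 1). For the third spot it yields no value at all, because the red foreground falls below its background. The log-scale version stays finite and near zero for both weak spots, which is consistent with the fishtail behavior the abstract describes at low intensities.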

