Microarray Probe Expression Measures, Data Normalization and Statistical Validation

DNA microarray technology is a high-throughput method for gaining information on gene function. It is based on the ordered deposition or synthesis of thousands of EST sequences, genes, or oligonucleotides on a solid surface. Because of the large number of data points generated, computational tools are essential in microarray data analysis and mining to extract knowledge from experimental results. In this review, we focus on some of the methodologies currently available for defining gene expression intensity measures, normalizing microarray data, and statistically validating differential expression.


CIDA: An integrated software for the design, characterisation and global comparison of microarrays

Journal of Integrative Bioinformatics ◽

10.1515/jib-2007-78 ◽

2007 ◽

Vol 4 (3) ◽

pp. 224-242

Author(s):

Sabah Khalid ◽

Mohsin Khan ◽

Alistair Symonds ◽

Karl Fraser ◽

Ping Wang ◽

...

Keyword(s):

Human Genome ◽

Microarray Data ◽

Expression Profiles ◽

Human Life ◽

Microarray Data Analysis ◽

Microarray Technology ◽

Gene Chips ◽

Global Comparison ◽

Effective Manner ◽

Microarray Gene

Abstract: Microarray technology has had a significant impact in the field of systems biology, which investigates the biological systems that regulate human life. Identifying genes of significant interest within any given disease on an individual basis is time-consuming and inefficient given the complexity of the human genome. Thus, the genetic profiling of the entire human genome in a single experiment has resulted in microarray technology becoming a widely used experimental tool. However, without tools for several aspects of microarray data analysis, the technology is limited. To date, no tool has been developed that allows the integration of numerous microarray results from different research laboratories as well as the design of customised gene chips in a cost-effective manner. In light of this, we have designed the first integrated and automated software, called Chip Integration, Design and Annotation (CIDA), for the cross comparison, design and functional annotation of microarray gene chips. The software gives molecular biologists the control to cross compare the biological signatures generated from multiple microarray studies, design custom microarray gene chips based on their research requirements, and characterise microarray data in the context of immunogenomics. Through the relative comparison of related microarray experiments we have identified 258 genes with common gene expression profiles that are upregulated not only in anergic T cells, but also in cells over-expressing the transcription factor Egr2, which has been identified as playing a role in T cell anergy. Using the gene chip design aspect of CIDA we have designed and subsequently fabricated immuno-tolerance gene chips consisting of 1758 genes for further research. The software and database schema are freely available at ftp://ftp.brunel.ac.uk/cspgssk/CIDA/.
Additional material is available online at http://www.brunel.ac.uk/about/acad/health/healthres/researchgroups/mi/publications/supplementary/cida


Faculty Opinions recommendation of Resampling-based multiple testing for microarray data analysis.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726695013.793522568 ◽

2016 ◽

Author(s):

Tian Zheng

Keyword(s):

Data Analysis ◽

Microarray Data ◽

Multiple Testing ◽

Microarray Data Analysis


SEMIPARAMETRIC CLUSTERING METHOD FOR MICROARRAY DATA ANALYSIS

Journal of Bioinformatics and Computational Biology ◽

10.1142/s021972000800345x ◽

2008 ◽

Vol 06 (02) ◽

pp. 261-282 ◽

Cited By ~ 2

Author(s):

AO YUAN ◽

WENQING HE

Keyword(s):

Data Analysis ◽

Microarray Data ◽

Mixture Distribution ◽

Information Criterion ◽

Optimal Number ◽

Microarray Data Analysis ◽

Parametric Methods ◽

Clustering Methods ◽

Microarray Gene Expression ◽

Data Set

Clustering is a major tool for microarray gene expression data analysis. The existing clustering methods fall mainly into two categories: parametric and nonparametric. The parametric methods generally assume a mixture of parametric subdistributions. When the mixture distribution approximately fits the true data-generating mechanism, the parametric methods perform well, but they perform poorly when the deviation between the two is non-negligible. The nonparametric methods, which usually make no distributional assumptions, are robust but pay for that robustness with a loss of efficiency. In an attempt to use the known mixture form to increase efficiency, and to drop assumptions about the unknown subdistributions to enhance robustness, we propose a semiparametric method for clustering. The proposed approach has the form of a parametric mixture, with no assumptions about the subdistributions. The subdistributions are estimated nonparametrically, with constraints imposed only on the modes. An expectation-maximization (EM) algorithm along with a classification step is invoked to cluster the data, and a modified Bayesian information criterion (BIC) is employed to guide the determination of the optimal number of clusters. Simulation studies are conducted to assess the performance and the robustness of the proposed method. The results show that the proposed method yields a reasonable partition of the data. As an illustration, the proposed method is applied to a real microarray data set to cluster genes.
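To make the EM, classification, and BIC steps named in this abstract concrete, here is a minimal sketch of the fully parametric baseline: a one-dimensional Gaussian mixture fitted by EM, with the standard (unmodified) BIC used to compare cluster counts. It is not the authors' semiparametric method, which estimates the subdistributions nonparametrically. The function names, the quantile initialisation, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def weighted_densities(x, weights, means, sds):
    """Weighted density of each mixture component at x."""
    return [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]


def fit_gmm_1d(data, k, n_iter=200, min_sd=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative only).

    Returns (weights, means, sds, log_likelihood).
    """
    n = len(data)
    xs = sorted(data)
    # Assumed initialisation: means at evenly spaced quantiles.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_sd = max(math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n), min_sd)
    sds = [grand_sd] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = weighted_densities(x, weights, means, sds)
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and standard deviations.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            sds[j] = max(math.sqrt(var), min_sd)
    loglik = sum(
        math.log(sum(weighted_densities(x, weights, means, sds)) or 1e-300)
        for x in data
    )
    return weights, means, sds, loglik


def bic(loglik, k, n):
    """Standard BIC; a k-component 1-D mixture has 3k - 1 free parameters."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)


def classify(x, weights, means, sds):
    """Classification step: index of the component with the highest posterior."""
    dens = weighted_densities(x, weights, means, sds)
    return max(range(len(dens)), key=dens.__getitem__)


if __name__ == "__main__":
    random.seed(0)
    # Simulated expression values from two well-separated groups.
    sample = [random.gauss(0.0, 1.0) for _ in range(150)]
    sample += [random.gauss(5.0, 1.0) for _ in range(150)]
    for k in (1, 2):
        fit = fit_gmm_1d(sample, k)
        print(k, round(bic(fit[3], k, len(sample)), 1))
```

The cluster count with the lowest BIC is preferred. When the true subdistributions are far from Gaussian, this baseline is the case the abstract describes as performing poorly.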


Methods of Microarray Data Analysis III

Journal of Microbiological Methods ◽

10.1016/j.mimet.2004.02.001 ◽

2004 ◽

Vol 57 (2) ◽

pp. 293

Author(s):

Mareike Viebahn

Keyword(s):

Data Analysis ◽

Microarray Data ◽

Microarray Data Analysis


Multiclass Decision Forest—A Novel Pattern Recognition Method for Multiclass Classification in Microarray Data Analysis

DNA and Cell Biology ◽

10.1089/dna.2004.23.685 ◽

2004 ◽

Vol 23 (10) ◽

pp. 685-694 ◽

Cited By ~ 30

Author(s):

Huixiao Hong ◽

Weida Tong ◽

Roger Perkins ◽

Hong Fang ◽

Qian Xie ◽

...

Keyword(s):

Pattern Recognition ◽

Data Analysis ◽

Microarray Data ◽

Multiclass Classification ◽

Microarray Data Analysis ◽

Pattern Recognition Method ◽

Recognition Method ◽

Decision Forest


Computational Strategies for Analyzing Data in Gene Expression Microarray Experiments

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720003000319 ◽

2003 ◽

Vol 01 (03) ◽

pp. 541-586 ◽

Cited By ~ 33

Author(s):

Tero Aittokallio ◽

Markus Kurki ◽

Olli Nevalainen ◽

Tuomas Nikula ◽

Anne West ◽

...

Keyword(s):

Gene Expression ◽

Data Analysis ◽

Microarray Data ◽

Microarray Data Analysis ◽

Biological Research ◽

Microarray Experiments ◽

DNA Microarray Data ◽

Open Questions ◽

Analysis Technique ◽

Wide Range

Microarray analysis has become a widely used method for generating gene expression data on a genomic scale. Microarrays have been enthusiastically applied in many fields of biological research, even though several open questions remain about the analysis of such data. A wide range of approaches is available for computational analysis, but no general consensus exists on a standard microarray data analysis protocol. Consequently, the choice of data analysis technique is a crucial element that depends both on the data and on the goals of the experiment. A basic understanding of bioinformatics is therefore required for optimal experimental design and meaningful interpretation of the results. This review summarizes some of the common themes in DNA microarray data analysis, including data normalization and detection of differential expression. Algorithms are demonstrated by analyzing cDNA microarray data from an experiment monitoring gene expression in T helper cells. Several computational biology strategies are reviewed along with their relative merits, and potential areas for additional research are discussed. The goal of the review is to provide a computational framework for applying and evaluating such bioinformatics strategies. Solid knowledge of microarray informatics contributes to the implementation of more efficient computational protocols for data obtained through microarray experiments.
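As a rough illustration of the two themes this abstract highlights, the sketch below applies global median normalization to the log2 ratios of each array and then computes a per-gene one-sample t-statistic across replicate arrays. These are generic textbook choices, not necessarily the algorithms the review demonstrates. The function names and the toy data (three arrays, five genes) are assumptions.

```python
import math
import statistics


def median_normalize(log_ratios):
    """Centre one array's log2 ratios so that their median is zero."""
    med = statistics.median(log_ratios)
    return [v - med for v in log_ratios]


def one_sample_t(values):
    """t-statistic for the null hypothesis that the mean log ratio is zero."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        # Degenerate case: identical replicates give no variance estimate.
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / (sd / math.sqrt(n))


# Hypothetical log2(Cy5/Cy3) ratios: rows are arrays, columns are genes.
# Each array carries its own global offset; gene 0 is truly up-regulated.
arrays = [
    [2.6, 0.6, 0.3, 0.5, 0.8],
    [1.7, -0.4, -0.1, -0.3, -0.5],
    [2.3, 0.1, 0.0, 0.3, 0.2],
]

normalized = [median_normalize(a) for a in arrays]
# Transpose so that each entry holds one gene's replicate values.
per_gene = list(zip(*normalized))
t_stats = [one_sample_t(list(values)) for values in per_gene]

for gene, t in enumerate(t_stats):
    print(f"gene {gene}: t = {t:.2f}")
```

With only three replicates the t-statistics are unstable, and testing thousands of genes at once would additionally require a multiple-testing correction.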


Microarray data analysis to identify differentially expressed genes and biological pathways associated with asthma

Experimental and Therapeutic Medicine ◽

10.3892/etm.2018.6366 ◽

2018 ◽

Author(s):

Shanshan Qi ◽

Guanghui Liu ◽

Xiang Dong ◽

Nan Huang ◽

Wenjing Li ◽

...

Keyword(s):

Data Analysis ◽

Differentially Expressed Genes ◽

Microarray Data ◽

Biological Pathways ◽

Differentially Expressed ◽

Microarray Data Analysis


Coex-Rank: An approach incorporating co-expression information for combined analysis of microarray data

Journal of Integrative Bioinformatics ◽

10.1515/jib-2012-208 ◽

2012 ◽

Vol 9 (1) ◽

pp. 32-43 ◽

Cited By ~ 1

Author(s):

Jinlu Cai ◽

Henry L. Keen ◽

Curt D. Sigmund ◽

Thomas L. Casavant

Keyword(s):

Microarray Data ◽

Statistical Power ◽

Meta Analysis ◽

Rank Aggregation ◽

Microarray Data Analysis ◽

Combined Approach ◽

Linear Modeling ◽

Modeling Process ◽

Genome Wide ◽

Gene Level

Summary: Microarrays have been widely used to study differential gene expression at the genomic level. They can also provide genome-wide co-expression information. Biologically related datasets from independent studies are publicly available, which calls for robust combined approaches for integration and validation. Previously, meta-analysis has been adopted to solve this problem. As an alternative to meta-analysis, for microarray data with high similarity in biological experimental design, a more direct combined approach is possible. Gene-level normalization across datasets is motivated by the different scale and distribution of data due to separate origins. However, there has been limited discussion about this point in the past. Here we describe a combined approach for microarray analysis, including gene-level normalization and the Coex-Rank approach. After normalization, a linear modeling process is used to identify lists of differentially expressed genes. The Coex-Rank approach incorporates co-expression information into a rank-aggregation procedure. We applied this computational approach to our data, which illustrated an improvement in statistical power and a complementary advantage of the Coex-Rank approach from a biological perspective. Our combined approach for microarray data analysis (Coex-Rank) is based on normalization, which is naturally driven. The Coex-Rank process not only takes advantage of merging the power of multiple methods regarding normalization but also assists in the discovery of functional clusters of genes.
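The two building blocks named in this summary, gene-level normalization and rank aggregation, can be sketched generically as follows. This is not the Coex-Rank procedure: it uses a plain per-gene z-score within each dataset and a simple mean-rank (Borda-style) aggregation, and it ignores co-expression information entirely. The function names, gene labels, and values are hypothetical.

```python
import statistics


def gene_level_zscore(values):
    """Standardise one gene's expression values within a single dataset."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mean) / sd for v in values]


def mean_rank_aggregate(ranked_lists):
    """Combine ranked gene lists (best first) by average rank.

    A gene missing from a list is given the rank len(list) + 1.
    Ties are broken alphabetically so the output is deterministic.
    """
    genes = set().union(*ranked_lists)
    scores = {}
    for gene in genes:
        total = 0
        for ranked in ranked_lists:
            total += ranked.index(gene) + 1 if gene in ranked else len(ranked) + 1
        scores[gene] = total / len(ranked_lists)
    return sorted(genes, key=lambda g: (scores[g], g))


# The same gene measured in two datasets on very different scales.
dataset_a = [2.0, 4.0, 6.0]
dataset_b = [200.0, 400.0, 600.0]
z_a = gene_level_zscore(dataset_a)
z_b = gene_level_zscore(dataset_b)

# Hypothetical differential-expression rankings from three analyses.
rankings = [
    ["geneA", "geneB", "geneC"],
    ["geneB", "geneA", "geneC"],
    ["geneA", "geneC", "geneB"],
]
combined = mean_rank_aggregate(rankings)
print(combined)
```

After the z-score step the two datasets are directly comparable despite their different scales, which is the motivation the summary gives for normalizing at the gene level before combining results.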
