Data Mining and Meta-Analysis on DNA Microarray Data

Microarray technology enables high-throughput, parallel gene expression analysis, and its use has grown exponentially thanks to the development of a variety of applications for expression, genetic and epigenetic studies. A wealth of data is now available from public repositories, providing unprecedented opportunities for meta-analysis approaches, which could generate new biological information unrelated to the original scope of individual studies. This study provides a guideline for identifying the biological significance of statistically selected, differentially expressed genes derived from gene expression arrays, and suggests further analysis pathways. The authors review the prerequisites for data mining and meta-analysis, summarize the conceptual methods for deriving biological information from microarray data, and suggest software for each category of data mining or meta-analysis.

Data mining methods for gene selection on the basis of gene expression arrays

International Journal of Applied Mathematics and Computer Science ◽

10.2478/amcs-2014-0048 ◽

2014 ◽

Vol 24 (3) ◽

pp. 657-668 ◽

Cited By ~ 4

Author(s):

Michał Muszyński ◽

Stanisław Osowski

Keyword(s):

Gene Expression ◽

Prostate Cancer ◽

Data Mining ◽

Support Vector Machine ◽

Gene Selection ◽

Support Vector ◽

Expression Arrays ◽

Gene Expression Arrays ◽

Statistical Hypotheses ◽

Mining Methods

Abstract: The paper presents data mining methods applied to gene selection for the recognition of a particular type of prostate cancer on the basis of gene expression arrays. Several gene selection methods, including the Fisher method, correlation of a gene with a class, application of the support vector machine and statistical hypotheses, are compared on the basis of clustering measures. The results of these individual selection methods are combined to identify the most often selected genes forming the required pattern, best associated with the cancerous cases. The resulting pattern of selected genes is treated as the input data to the classifier, which performs the final recognition of the patterns. The numerical results of distinguishing prostate cancer from normal (reference) cases using the selected genes and the support vector machine confirm the good performance of the proposed gene selection approach.
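The idea of combining several gene rankings by counting how often each gene is selected can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses two of the named criteria (a Fisher score in one common form, and the absolute Pearson correlation with the class label), omits the support vector machine and hypothesis-test selectors, and runs on invented toy data. The function names, the `k` and `min_votes` parameters and the gene names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev, pvariance


def fisher_score(values, labels):
    """Fisher score of one gene for two classes (labels 0/1); one common form."""
    a = [v for v, y in zip(values, labels) if y == 1]
    b = [v for v, y in zip(values, labels) if y == 0]
    denom = pvariance(a) + pvariance(b)
    return (mean(a) - mean(b)) ** 2 / denom if denom > 0 else 0.0


def class_correlation(values, labels):
    """Absolute Pearson correlation between expression values and class labels."""
    mv, ml = mean(values), mean(labels)
    cov = mean([(v - mv) * (y - ml) for v, y in zip(values, labels)])
    sd = pstdev(values) * pstdev(labels)
    return abs(cov / sd) if sd > 0 else 0.0


def top_k(expr, labels, scorer, k):
    """Return the k genes with the highest score under one criterion."""
    ranked = sorted(expr, key=lambda g: scorer(expr[g], labels), reverse=True)
    return ranked[:k]


def vote(expr, labels, scorers, k, min_votes):
    """Keep genes that appear in the top-k list of at least min_votes criteria."""
    votes = Counter()
    for scorer in scorers:
        votes.update(top_k(expr, labels, scorer, k))
    return sorted(g for g, n in votes.items() if n >= min_votes)


# Invented toy data: 1 = cancer sample, 0 = reference sample.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "gene_a": [5.1, 5.3, 4.9, 1.0, 1.2, 0.9],
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "gene_c": [0.5, 0.7, 0.6, 3.0, 3.2, 2.9],
    "gene_d": [1.0, 3.0, 2.0, 2.5, 1.5, 2.0],
}
selected = vote(expr, labels, [fisher_score, class_correlation], k=2, min_votes=2)
print(selected)
```

In a fuller pipeline the list in `selected` would then define the feature columns passed to a classifier such as a support vector machine.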

Immune modulators in disease: integrating knowledge from the biomedical literature and gene expression

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocv166 ◽

2015 ◽

Vol 23 (3) ◽

pp. 617-626 ◽

Cited By ~ 1

Author(s):

Nophar Geifman ◽

Sanchita Bhattacharya ◽

Atul J Butte

Keyword(s):

Gene Expression ◽

Large Scale ◽

Biomedical Literature ◽

Cytokine Gene Expression ◽

Future Research ◽

Cytokine Gene ◽

Medical Subject Headings ◽

Expression Arrays ◽

Gene Expression Arrays ◽

Subject Headings

Abstract

Objective: Cytokines play a central role in both health and disease, modulating immune responses and acting as diagnostic markers and therapeutic targets. This work takes a systems-level approach for integration and examination of immune patterns, such as cytokine gene expression with information from biomedical literature, and applies it in the context of disease, with the objective of identifying potentially useful relationships and areas for future research.

Results: We present herein the integration and analysis of immune-related knowledge, namely, information derived from biomedical literature and gene expression arrays. Cytokine-disease associations were captured from over 2.4 million PubMed records, in the form of Medical Subject Headings descriptor co-occurrences, as well as from gene expression arrays. Clustering of cytokine-disease co-occurrences from biomedical literature is shown to reflect current medical knowledge as well as potentially novel relationships between diseases. A correlation analysis of cytokine gene expression in a variety of diseases revealed compelling relationships. Finally, a novel analysis comparing cytokine gene expression in different diseases to parallel associations captured from the biomedical literature was used to examine which associations are interesting for further investigation.

Discussion: We demonstrate the usefulness of capturing Medical Subject Headings descriptor co-occurrences from biomedical publications in the generation of valid and potentially useful hypotheses. Furthermore, integrating and comparing descriptor co-occurrences with gene expression data was shown to be useful in detecting new, potentially fruitful, and unaddressed areas of research.

Conclusion: Using integrated large-scale data captured from the scientific literature and experimental data, a better understanding of the immune mechanisms underlying disease can be achieved and applied to research.
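Counting descriptor co-occurrences, as described in the abstract above, can be illustrated with a minimal sketch. It assumes each literature record has already been reduced to a list of descriptor strings; the records below are invented for illustration and are not real PubMed data, and the function name and the two descriptor sets are hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence_counts(records, cytokines, diseases):
    """Count, per (cytokine, disease) pair, the records annotated with both."""
    counts = Counter()
    for descriptors in records:
        present = set(descriptors)  # a pair is counted at most once per record
        for pair in product(present & cytokines, present & diseases):
            counts[pair] += 1
    return counts


# Hypothetical descriptor sets and invented records.
cytokines = {"Interleukin-6", "Interleukin-4", "Tumor Necrosis Factor-alpha"}
diseases = {"Arthritis, Rheumatoid", "Asthma"}
records = [
    ["Interleukin-6", "Arthritis, Rheumatoid", "Humans"],
    ["Interleukin-6", "Tumor Necrosis Factor-alpha", "Arthritis, Rheumatoid"],
    ["Interleukin-4", "Asthma"],
    ["Interleukin-6", "Asthma", "Arthritis, Rheumatoid"],
]

counts = cooccurrence_counts(records, cytokines, diseases)
for (cytokine, disease), n in sorted(counts.items()):
    print(f"{cytokine} / {disease}: {n}")
```

The resulting cytokine-by-disease count table is the kind of matrix on which clustering or comparison with expression data could then be run.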

Computational Strategies for Analyzing Data in Gene Expression Microarray Experiments

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720003000319 ◽

2003 ◽

Vol 01 (03) ◽

pp. 541-586 ◽

Cited By ~ 33

Author(s):

Tero Aittokallio ◽

Markus Kurki ◽

Olli Nevalainen ◽

Tuomas Nikula ◽

Anne West ◽

...

Keyword(s):

Gene Expression ◽

Data Analysis ◽

Microarray Data ◽

Microarray Data Analysis ◽

Biological Research ◽

Microarray Experiments ◽

DNA Microarray Data ◽

Open Questions ◽

Analysis Technique ◽

Wide Range

Microarray analysis has become a widely used method for generating gene expression data on a genomic scale. Microarrays have been enthusiastically applied in many fields of biological research, even though several open questions remain about the analysis of such data. A wide range of approaches is available for computational analysis, but no general consensus exists on a standard protocol for microarray data analysis. Consequently, the choice of data analysis technique is a crucial element that depends both on the data and on the goals of the experiment. A basic understanding of bioinformatics is therefore required for optimal experimental design and meaningful interpretation of the results. This review summarizes some of the common themes in DNA microarray data analysis, including data normalization and detection of differential expression. Algorithms are demonstrated by analyzing cDNA microarray data from an experiment monitoring gene expression in T helper cells. Several computational biology strategies, along with their relative merits, are reviewed, and potential areas for additional research are discussed. The goal of the review is to provide a computational framework for applying and evaluating such bioinformatics strategies. Solid knowledge of microarray informatics contributes to the implementation of more efficient computational protocols for data obtained through microarray experiments.
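The two themes named above, normalization and detection of differential expression, can be made concrete with a small sketch. The choices here are assumptions made for illustration and are not taken from the review: per-array median centering of log2 ratios stands in for normalization, and Welch's t-statistic stands in for a differential expression test. All values are invented.

```python
from math import log2, sqrt
from statistics import mean, median, variance


def median_center(log_ratios):
    """Shift one array's log2 ratios so that their median becomes zero."""
    m = median(log_ratios)
    return [x - m for x in log_ratios]


def welch_t(a, b):
    """Welch's t-statistic for one gene measured in two groups of samples."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# One hypothetical two-channel array: red/green intensity ratios for three genes.
ratios = [2.0, 8.0, 4.0]
centered = median_center([log2(r) for r in ratios])
print(centered)

# One hypothetical gene: normalized log ratios in treated vs. control samples.
t = welch_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(round(t, 4))
```

Genes would then be ranked by the absolute value of the statistic, and a multiple-testing correction is normally needed before any gene is called differentially expressed.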

Transcriptomic responses to wounding: meta-analysis of gene expression microarray data

BMC Genomics ◽

10.1186/s12864-017-4202-8 ◽

2017 ◽

Vol 18 (1) ◽

Cited By ~ 3

Author(s):

Piotr Andrzej Sass ◽

Michał Dąbrowski ◽

Agata Charzyńska ◽

Paweł Sachadyn

Keyword(s):

Gene Expression ◽

Microarray Data ◽

Meta Analysis ◽

Gene Expression Microarray ◽

Expression Microarray ◽

Gene Expression Microarray Data ◽

Transcriptomic Responses

A comparison of Affymetrix gene expression arrays

BMC Bioinformatics ◽

10.1186/1471-2105-8-449 ◽

2007 ◽

Vol 8 (1) ◽

Cited By ~ 40

Author(s):

Mark D Robinson ◽

Terence P Speed

Keyword(s):

Gene Expression ◽

Expression Arrays ◽

Gene Expression Arrays ◽

Affymetrix Gene Expression

Gene Expression Arrays

Encyclopedia of Database Systems ◽

10.1007/978-1-4899-7993-3_1435-2 ◽

2016 ◽

pp. 1-4

Author(s):

Mehmet M. Dalkiliç

Keyword(s):

Gene Expression ◽

Expression Arrays ◽

Gene Expression Arrays

RNA-seq: An assessment of technical reproducibility and comparison with gene expression arrays

Genome Research ◽

10.1101/gr.079558.108 ◽

2008 ◽

Vol 18 (9) ◽

pp. 1509-1517 ◽

Cited By ~ 1817

Author(s):

J. C. Marioni ◽

C. E. Mason ◽

S. M. Mane ◽

M. Stephens ◽

Y. Gilad

Keyword(s):

Gene Expression ◽

RNA-seq ◽

Expression Arrays ◽

Gene Expression Arrays ◽

Technical Reproducibility

Data Mining in Gene Expression Analysis

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch096 ◽

2008 ◽

pp. 1643-1673

Author(s):

Jilin Han ◽

Le Gruenwald ◽

Tyrrell Conway

Keyword(s):

Gene Expression ◽

Data Mining ◽

Biological Information ◽

Future Research ◽

Gene Expression Data Analysis ◽

Experimental Conditions ◽

Single Experiment ◽

Important Approach ◽

Future Research Directions ◽

Gene Expression Levels

The study of gene expression levels under defined experimental conditions is an important approach to understanding how a living cell works. High-throughput microarray technology is a very powerful tool for simultaneously studying thousands of genes in a single experiment. This revolutionary technology produces an extensive amount of data, which raises an important question: how can meaningful biological information be extracted from these data? In this chapter, we survey data mining techniques that have been used for clustering, classification and association rule mining in gene expression data analysis. In addition, we provide a comprehensive list of currently available commercial and academic data mining software together with their features. Lastly, we suggest future research directions.

Data Mining Methods for Microarray Data Analysis

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch054 ◽

2011 ◽

pp. 283-287

Author(s):

Lei Yu ◽

Huan Liu

Keyword(s):

Gene Expression ◽

Data Mining ◽

Data Analysis ◽

Microarray Data ◽

Microarray Data Analysis ◽

Future Research ◽

Gene Expression Microarray ◽

Expression Microarray ◽

Class Prediction ◽

Gene Expression Microarray Data

The advent of gene expression microarray technology enables the simultaneous measurement of expression levels for thousands or tens of thousands of genes in a single experiment (Schena et al., 1995). Analysis of gene expression microarray data presents unprecedented opportunities and challenges for data mining in areas such as gene clustering (Eisen et al., 1998; Tamayo et al., 1999), sample clustering and class discovery (Alon et al., 1999; Golub et al., 1999), sample class prediction (Golub et al., 1999; Wu et al., 2003), and gene selection (Xing, Jordan, & Karp, 2001; Yu & Liu, 2004). This article introduces the basic concepts of gene expression microarray data and describes relevant data-mining tasks. It briefly reviews the state-of-the-art methods for each data-mining task and identifies emerging challenges and future research directions in microarray data analysis.
