Microarray Data Mining

Information Extraction from Microarray Data

Business Intelligence ◽ 10.4018/978-1-4666-9562-7.ch060 ◽ 2016 ◽ pp. 1180-1211 ◽ Cited By ~ 1

Author(s): Alessandro Fiori, Alberto Grand, Giulia Bruno, Francesco Gavino Brundu, Domenico Schioppa, ...

Keyword(s): Gene Expression, Data Mining, Microarray Data, Regulatory Networks, Molecular Data, Experimental Conditions, Single Experiment, Building Models, Critical Issues, Highly Correlated

A huge amount of high-throughput molecular data is now available for analysis, providing novel and useful insights into complex biological systems through a high-resolution picture of their molecular status under defined experimental conditions. In this context, microarrays are a powerful tool for analyzing thousands of gene expression values in a single experiment. A number of approaches have been developed for detecting genes highly correlated with diseases, selecting genes that exhibit similar behavior under specific conditions, building models to predict disease outcome based on genetic profiles, and inferring regulatory networks. This paper discusses popular and recent data mining techniques (i.e., Feature Selection, Clustering, Classification, and Association Rule Mining) applied to microarray data. The main characteristics of microarray data and preprocessing procedures are presented to explain the critical issues raised by the analysis of gene expression values. Each technique is analyzed, and relevant examples from the literature are reported. Real use cases of analytic pipelines that use these methods are also introduced. Finally, future directions of data mining research on microarray data are outlined.

Data Mining in Gene Expression Analysis

Data Warehousing and Mining ◽ 10.4018/978-1-59904-951-9.ch096 ◽ 2008 ◽ pp. 1643-1673

Author(s): Jilin Han, Le Gruenwald, Tyrrell Conway

Keyword(s): Gene Expression, Data Mining, Biological Information, Future Research, Gene Expression Data Analysis, Experimental Conditions, Single Experiment, Important Approach, Future Research Directions, Gene Expression Levels

The study of gene expression levels under defined experimental conditions is an important approach to understanding how a living cell works. High-throughput microarray technology is a very powerful tool for simultaneously studying thousands of genes in a single experiment. This revolutionary technology produces an extensive amount of data, which raises an important question: how can meaningful biological information be extracted from these data? In this chapter, we survey data mining techniques that have been used for clustering, classification, and association rule mining in gene expression data analysis. In addition, we provide a comprehensive list of currently available commercial and academic data mining software together with their features. Lastly, we suggest future research directions.

Data Mining Methods for Microarray Data Analysis

Encyclopedia of Data Warehousing and Mining ◽ 10.4018/978-1-59140-557-3.ch054 ◽ 2011 ◽ pp. 283-287

Author(s): Lei Yu, Huan Liu

Keyword(s): Gene Expression, Data Mining, Data Analysis, Microarray Data, Microarray Data Analysis, Future Research, Gene Expression Microarray, Expression Microarray, Class Prediction, Gene Expression Microarray Data

The advent of gene expression microarray technology enables the simultaneous measurement of expression levels for thousands or tens of thousands of genes in a single experiment (Schena, et al., 1995). Analysis of gene expression microarray data presents unprecedented opportunities and challenges for data mining in areas such as gene clustering (Eisen, et al., 1998; Tamayo, et al., 1999), sample clustering and class discovery (Alon, et al., 1999; Golub, et al., 1999), sample class prediction (Golub, et al., 1999; Wu, et al., 2003), and gene selection (Xing, Jordan, & Karp, 2001; Yu & Liu, 2004). This article introduces the basic concepts of gene expression microarray data and describes relevant data-mining tasks. It briefly reviews the state-of-the-art methods for each data-mining task and identifies emerging challenges and future research directions in microarray data analysis.

Oligonucleotide microarray data mining: search for age-dependent gene expression

Biochemical and Biophysical Research Communications ◽ 10.1016/s0006-291x(02)02563-9 ◽ 2002 ◽ Vol 298 (5) ◽ pp. 772-778 ◽ Cited By ~ 15

Author(s): Marc Kirschner, Gemma Pujol, Aurelian Radu

Keyword(s): Gene Expression, Data Mining, Microarray Data, Oligonucleotide Microarray, Age Dependent

Data Mining and Meta-Analysis on DNA Microarray Data

International Journal of Systems Biology and Biomedical Technologies ◽ 10.4018/ijsbbt.2012070101 ◽ 2012 ◽ Vol 1 (3) ◽ pp. 1-39

Author(s): Triantafyllos Paparountas, Maria Nefeli Nikolaidou-Katsaridou, Gabriella Rustici, Vasilis Aidinis

Keyword(s): Gene Expression, Data Mining, Microarray Data, Meta Analysis, Biological Significance, Biological Information, Expression Arrays, DNA Microarray Data, Gene Expression Arrays, Expression Genetics

Microarray technology enables high-throughput parallel gene expression analysis, and its use has grown exponentially thanks to the development of a variety of applications for expression, genetics, and epigenetic studies. A wealth of data is now available from public repositories, providing unprecedented opportunities for meta-analysis approaches, which could generate new biological information unrelated to the original scope of individual studies. This study provides a guideline for identifying the biological significance of statistically selected, differentially expressed genes derived from gene expression arrays, and suggests further analysis pathways. The authors review the prerequisites for data mining and meta-analysis, summarize the conceptual methods for deriving biological information from microarray data, and suggest software for each category of data mining or meta-analysis.

Data Mining in Gene Expression Analysis

Processing and Managing Complex Data for Decision Support ◽ 10.4018/978-1-59140-655-6.ch013 ◽ 2011 ◽ pp. 375-418

Author(s): Jilin Han, Le Gruenwald, Tyrrell Conway

Keyword(s): Gene Expression, Data Mining, Biological Information, Future Research, Gene Expression Data Analysis, Experimental Conditions, Single Experiment, Important Approach, Future Research Directions, Gene Expression Levels

The study of gene expression levels under defined experimental conditions is an important approach to understanding how a living cell works. High-throughput microarray technology is a very powerful tool for simultaneously studying thousands of genes in a single experiment. This revolutionary technology produces an extensive amount of data, which raises an important question: how can meaningful biological information be extracted from these data? In this chapter, we survey data mining techniques that have been used for clustering, classification, and association rule mining in gene expression data analysis. In addition, we provide a comprehensive list of currently available commercial and academic data mining software together with their features. Lastly, we suggest future research directions.

Pattern Discovery in Gene Expression Data

Intelligent Data Analysis ◽ 10.4018/978-1-59904-982-3.ch003 ◽ 2009 ◽ pp. 45-64

Author(s): Gráinne Kerr, Heather Ruskin, Martin Crane

Keyword(s): Gene Expression, Data Mining, Cluster Analysis, Gene Regulation, Data Analysis, Gene Expression Data, mRNA Levels, Expression Data, Single Experiment, Gene Regulation Networks

Microarray technology provides an opportunity to monitor the mRNA expression levels of thousands of genes simultaneously in a single experiment. The enormous amount of data produced by this high-throughput approach presents a challenge for data analysis: to extract meaningful patterns, to evaluate its quality, and to interpret the results. The most commonly used method of identifying such patterns is cluster analysis. Approaches that are common and sufficient for many data-mining problems, for example hierarchical and K-means clustering, do not address the properties of “typical” gene expression data well and fail, in significant ways, to account for its profile. This chapter clarifies some of the issues and provides a framework to evaluate clustering in gene expression analysis. Methods are categorised explicitly in the context of application to data of this type, providing a basis for reverse engineering of gene regulation networks. Finally, areas for possible future development are highlighted.

Text Mining Perspectives in Microarray Data Mining

ISRN Computational Biology ◽ 10.1155/2013/159135 ◽ 2013 ◽ Vol 2013 ◽ pp. 1-5 ◽ Cited By ~ 1

Author(s): Jeyakumar Natarajan

Keyword(s): Gene Expression, Data Mining, Text Mining, Gene Expression Data, Microarray Data, Machine Learning Algorithms, Microarray Data Analysis, Expression Data, Related Data, Mining Methods

Current microarray data mining methods such as clustering, classification, and association analysis rely heavily on statistical and machine learning algorithms for the analysis of large sets of gene expression data. In recent years, there has been growing interest in methods that attempt to discover patterns based on multiple but related data sources. Gene expression data and the corresponding literature data are one such example. This paper suggests a new approach to microarray data mining as a combination of text mining (TM) and information extraction (IE). TM is concerned with identifying patterns in natural language text, and IE is concerned with locating specific entities, relations, and facts in text. The paper surveys the state of the art of data mining methods for microarray data analysis. We show the limitations of current microarray data mining methods and outline how text mining could address these limitations.
