Gene Expression Analysis using Markov Chains extracted from RNNs

Igor Lorenzato Almeida; Denise Regina Pechmann; Adelmo Luis Cechin

doi:10.19153/cleiej.10.2.8

CLEI electronic journal ◽

2007 ◽

Vol 10 (2) ◽

Author(s):

Igor Lorenzato Almeida ◽

Denise Regina Pechmann ◽

Adelmo Luis Cechin

Keyword(s):

Gene Expression ◽

Neural Networks ◽

Markov Chains ◽

Microarray Data ◽

Recurrent Neural Networks ◽

Gene Expression Analysis ◽

Data Sets ◽

New Approach ◽

Several Variables ◽

Gene Expression Levels

This paper presents a new approach for the analysis of gene expression, by extracting a Markov Chain from trained Recurrent Neural Networks (RNNs). A lot of microarray data is being generated, since array technologies have been widely used to monitor simultaneously the expression pattern of thousands of genes. Microarray data is highly specialized and involves several variables that are complex to express and analyze. The challenge is to discover how to extract useful information from these data sets. This work therefore proposes the use of RNNs for data modeling, due to their ability to learn complex temporal non-linear data. Once a model is obtained for the data, it is possible to extract the acquired knowledge and to represent it through a Markov Chain model. Markov Chains are easily visualized in the form of state graphs, which show the influences among the gene expression levels and their changes in time.
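The abstract does not say how states are derived from the trained RNN, so the following is only a minimal sketch of the final step. It assumes the network's behavior has already been discretized into a sequence of state labels. The function name `estimate_transition_matrix` and the example labels are hypothetical and do not come from the paper. The sketch estimates first-order transition probabilities by counting consecutive state pairs, which gives the edge weights of a state graph.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(states):
    """Estimate first-order Markov transition probabilities from a state sequence.

    Returns a nested dict: matrix[current][next] = P(next | current).
    """
    # Count how often each state is followed by each other state.
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1

    # Normalize each row of counts into a probability distribution.
    matrix = {}
    for current, next_counts in counts.items():
        total = sum(next_counts.values())
        matrix[current] = {nxt: n / total for nxt, n in next_counts.items()}
    return matrix


# Hypothetical sequence of discretized expression states (assumed input,
# for example obtained by quantizing activations; not data from the paper).
sequence = ["low", "low", "high", "high", "low", "high", "high", "high"]
transitions = estimate_transition_matrix(sequence)

for state, row in sorted(transitions.items()):
    print(state, row)
```

Each row is a probability distribution over next states. A state that only ever appears as the last element of the sequence has no outgoing row.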

Integrated gene expression analysis of multiple microarray data sets based on a normalization technique and on adaptive connectionist model

Proceedings of the International Joint Conference on Neural Networks, 2003. ◽

10.1109/ijcnn.2003.1223667 ◽

2004 ◽

Cited By ~ 1

Author(s):

Liang Goh ◽

N. Kasabov

Keyword(s):

Gene Expression ◽

Expression Analysis ◽

Microarray Data ◽

Gene Expression Analysis ◽

Data Sets ◽

Connectionist Model

Predicting Gene Expression Levels from Histone Modification Signals with Convolutional Recurrent Neural Networks

EMBEC & NBC 2017 - IFMBE Proceedings ◽

10.1007/978-981-10-5122-7_139 ◽

2017 ◽

pp. 555-558

Author(s):

Lingyu Zhu ◽

Juha Kesseli ◽

Matti Nykter ◽

Heikki Huttunen

Keyword(s):

Gene Expression ◽

Neural Networks ◽

Histone Modification ◽

Recurrent Neural Networks ◽

Expression Levels ◽

Gene Expression Levels

Neural networks for gene expression analysis and gene selection from DNA microarray

Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005. ◽

10.1109/ijcnn.2005.1555883 ◽

2006 ◽

Cited By ~ 7

Author(s):

J.C. Patra ◽

Qin Zhen ◽

Ee Luang Ang ◽

Amitabha Das

Keyword(s):

Gene Expression ◽

Neural Networks ◽

DNA Microarray ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Gene Selection

Recurrent Neural Networks for Learning Mixed kth-Order Markov Chains

Neural Information Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-540-30499-9_73 ◽

2004 ◽

pp. 477-482

Author(s):

Wang Xiangrui ◽

Narendra S. Chaudhari

Keyword(s):

Neural Networks ◽

Markov Chains ◽

Recurrent Neural Networks

High-throughput single-cell RNA-seq data imputation and characterization with surrogate-assisted automated deep learning

Briefings in Bioinformatics ◽

10.1093/bib/bbab368 ◽

2021 ◽

Author(s):

Xiangtao Li ◽

Shaochuan Li ◽

Lei Huang ◽

Shixiong Zhang ◽

Ka-chun Wong

Keyword(s):

Gene Expression ◽

Neural Networks ◽

Single Cell ◽

Deep Neural Networks ◽

Expression Profiles ◽

Marker Gene ◽

Gene Expression Profiles ◽

Underlying Mechanisms ◽

Cell Data ◽

Gene Expression Levels

Single-cell RNA sequencing (scRNA-seq) technologies have been heavily developed to probe gene expression profiles at single-cell resolution. Deep imputation methods have been proposed to address the related computational challenges (e.g. the gene sparsity in single-cell data). In particular, the neural architectures of those deep imputation models have been proven to be critical for performance. However, deep imputation architectures are difficult to design and tune for those without rich knowledge of deep neural networks and scRNA-seq. Therefore, Surrogate-assisted Evolutionary Deep Imputation Model (SEDIM) is proposed to automatically design the architectures of deep neural networks for imputing gene expression levels in scRNA-seq data without any manual tuning. Moreover, the proposed SEDIM constructs an offline surrogate model, which can accelerate the computational efficiency of the architectural search. Comprehensive studies show that SEDIM significantly improves the imputation and clustering performance compared with other benchmark methods. In addition, we also extensively explore the performance of SEDIM in other contexts and platforms including mass cytometry and metabolic profiling in a comprehensive manner. Marker gene detection, gene ontology enrichment and pathological analysis are conducted to provide novel insights into cell-type identification and the underlying mechanisms. The source code is available at https://github.com/li-shaochuan/SEDIM.

Analyzing Large Gene Expression Data Sets

Computational Text Analysis ◽

10.1093/oso/9780198567400.003.0014 ◽

2006 ◽

Author(s):

Soumya Raychaudhuri

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Data Sets ◽

Expression Data ◽

Clustering Methods ◽

Biologically Relevant ◽

Large Gene ◽

Functional Coherence

The most interesting and challenging gene expression data sets to analyze are large multidimensional data sets that contain expression values for many genes across multiple conditions. In these data sets the use of scientific text can be particularly useful, since there are a myriad of genes examined under vastly different conditions, each of which may induce or repress expression of the same gene for different reasons. There is an enormous complexity to the data that we are examining: each gene is associated with dozens if not hundreds of expression values as well as multiple documents built up from vocabularies consisting of thousands of words. In Section 2.4 we reviewed common gene expression strategies, most of which revolve around defining groups of genes based on common profiles. A limitation of many gene expression analytic approaches is that they do not incorporate comprehensive background knowledge about the genes into the analysis. We present computational methods that leverage the peer-reviewed literature in the automatic analysis of gene expression data sets. Including the literature in gene expression data analysis offers an opportunity to incorporate background functional information about the genes when defining expression clusters. In Chapter 5 we saw how literature-based approaches could help in the analysis of single condition experiments. Here we will apply the strategies introduced in Chapter 6 to assess the coherence of groups of genes to enhance gene expression analysis approaches. The methods proposed here could, in fact, be applied to any multivariate genomics data type. The key concepts discussed in this chapter are listed in the frame box. We begin with a discussion of gene groups and their role in expression analysis; we briefly discuss strategies to assign keywords to groups and strategies to assess their functional coherence. We apply functional coherence measures to gene expression analysis; as an example, we focus on a yeast expression data set. We first demonstrate how functional coherence can be used to focus in on the key biologically relevant gene groups derived by clustering methods such as self-organizing maps and k-means clustering.

Biomedical Data Mining Using RBF Neural Networks

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch021 ◽

2011 ◽

pp. 106-111

Author(s):

Fang Chu ◽

Lipo Wang

Keyword(s):

Neural Networks ◽

Microarray Data ◽

High Accuracy ◽

Accurate Diagnosis ◽

Data Sets ◽

Biomedical Data ◽

RBF Neural Networks ◽

Proper Treatment ◽

Pattern Recognition Problem ◽

Very High

Accurate diagnosis of cancers is of great importance for doctors to choose a proper treatment. Furthermore, it also plays a key role in the search for the pathology of cancers and in drug discovery. Recently, this problem has attracted great attention in the context of microarray technology. Here, we apply radial basis function (RBF) neural networks to this pattern recognition problem. Our experimental results on some well-known microarray data sets indicate that our method can obtain very high accuracy with a small number of genes.

Comparison of Linear Weighting Schemes for Perfect Match and Mismatch Gene Expression Levels from Microarray Data

American Journal of PharmacoGenomics ◽

10.2165/00129785-200505030-00006 ◽

2005 ◽

Vol 5 (3) ◽

pp. 197-205 ◽

Cited By ~ 2

Author(s):

T Mark Beasley ◽

Janet K Holt ◽

David B Allison

Keyword(s):

Gene Expression ◽

Microarray Data ◽

Perfect Match ◽

Weighting Schemes ◽

Expression Levels ◽

Gene Expression Levels

Systematic Variation Normalization in Microarray Data to Get Gene Expression Comparison Unbiased

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720005001028 ◽

2005 ◽

Vol 03 (02) ◽

pp. 225-241 ◽

Cited By ~ 13

Author(s):

Jeff W. Chou ◽

Richard S. Paules ◽

Pierre R. Bushel

Keyword(s):

Gene Expression ◽

Linear Regression ◽

Microarray Data ◽

Expression Patterns ◽

Microarray Gene Expression Data ◽

Systematic Variation ◽

Data Sets ◽

Microarray Gene Expression ◽

Pixel Intensity ◽

Non Linear

Normalization removes or minimizes the biases of systematic variation that exists in experimental data sets. This study presents a systematic variation normalization (SVN) procedure for removing systematic variation in two-channel microarray gene expression data. Based on an analysis of how systematic variation contributes to variability in microarray data sets, our normalization procedure includes background subtraction determined from the distribution of pixel intensity values from each data acquisition channel and log conversion, linear or non-linear regression, restoration or transformation, and multiarray normalization. In the case when a non-linear regression is required, an empirical polynomial approximation approach is used. Either the high terminated points or their averaged values in the distributions of the pixel intensity values observed in control channels may be used for rescaling multiarray datasets. These pre-processing steps remove systematic variation in the data attributable to variability in microarray slides, assay-batches, the array process, or experimenters. Biologically meaningful comparisons of gene expression patterns between control and test channels or among multiple arrays are therefore unbiased using normalized but not unnormalized datasets.
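The first three steps named in the abstract (background subtraction, log conversion, linear regression) can be illustrated with a minimal sketch. This is not the SVN procedure itself: the abstract derives the background from the pixel intensity distribution, whereas here it is simply passed in as a number, and the restoration, non-linear, and multiarray steps are omitted. All function names and the example intensities are hypothetical.

```python
import math


def background_subtract(intensities, background, floor=1.0):
    """Subtract a background level, clamping at `floor` so logs stay defined."""
    return [max(value - background, floor) for value in intensities]


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def normalize_two_channel(control, test, bg_control, bg_test):
    """Return test-channel log2 intensities with the linear trend against
    the control channel removed (regression residuals)."""
    log_c = [math.log2(v) for v in background_subtract(control, bg_control)]
    log_t = [math.log2(v) for v in background_subtract(test, bg_test)]
    slope, intercept = fit_line(log_c, log_t)
    return [t - (slope * c + intercept) for c, t in zip(log_c, log_t)]


# Hypothetical raw intensities: the test channel is uniformly twice as
# bright as the control channel after background removal.
control = [110.0, 210.0, 410.0, 810.0]
test = [220.0, 420.0, 820.0, 1620.0]
residuals = normalize_two_channel(control, test, bg_control=10.0, bg_test=20.0)
print([round(r, 6) for r in residuals])
```

In this toy input the only difference between channels is a constant scale factor, so the regression absorbs it and the residuals are close to zero. With real data the residuals would carry the gene-specific differences left after the linear trend is removed.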

Spectral Clustering on Gene Expression Profile to Identify Cancer Types or Subtypes

Jurnal Teknologi ◽

10.11113/jt.v76.4036 ◽

2015 ◽

Vol 76 (1) ◽

Author(s):

Ang Jun Chin ◽

Andri Mirzal ◽

Habibollah Haron

Keyword(s):

Gene Expression ◽

Gene Expression Profile ◽

Expression Profile ◽

Microarray Data ◽

Spectral Clustering ◽

Data Sets ◽

Clustering Methods ◽

Microarray Gene Expression ◽

Cancer Types ◽

Microarray Gene

Gene expression profiling is well known for its broad applications and achievements in disease discovery and analysis, especially in cancer research. Spectral clustering is robust to irrelevant features, which makes it appropriate for gene expression analysis. However, previous works offer only limited performance comparison with other clustering methods, and only a few microarray data sets were analyzed in each study. In this study, we demonstrate the use of spectral clustering in identifying cancer types or subtypes from microarray gene expression profiling. Spectral clustering was applied to eleven microarray data sets and its clustering performance was compared with the results in the literature. Overall, spectral clustering slightly outperformed the corresponding results in the literature. Spectral clustering can also offer more stable clustering performance, as it has a smaller standard deviation. Moreover, spectral clustering outperformed the corresponding methods in the literature for six of the eleven data sets. It can therefore be stated that spectral clustering is a promising method for identifying cancer types or subtypes in microarray gene expression data sets.
