Adversarial deconfounding autoencoder for learning robust gene expression embeddings

Ayse B Dincer; Joseph D Janizek; Su-In Lee

doi:10.1093/bioinformatics/btaa796


Bioinformatics ◽ 10.1093/bioinformatics/btaa796 ◽ 2020 ◽ Vol 36 (Supplement_2) ◽ pp. i573-i582

Author(s): Ayse B Dincer ◽ Joseph D Janizek ◽ Su-In Lee

Keyword(s): Gene Expression ◽ Neural Networks ◽ Expression Profiles ◽ Gene Expression Profiles ◽ Supplementary Information ◽ Biological Variables ◽ Large Numbers ◽ Latent Space ◽ Complex Models ◽ Unsupervised Neural Networks

Abstract Motivation: Increasing number of gene expression profiles has enabled the use of complex models, such as deep unsupervised neural networks, to extract a latent space from these profiles. However, expression profiles, especially when collected in large numbers, inherently contain variations introduced by technical artifacts (e.g. batch effects) and uninteresting biological variables (e.g. age) in addition to the true signals of interest. These sources of variations, called confounders, produce embeddings that fail to transfer to different domains, i.e. an embedding learned from one dataset with a specific confounder distribution does not generalize to different distributions. To remedy this problem, we attempt to disentangle confounders from true signals to generate biologically informative embeddings. Results: In this article, we introduce the Adversarial Deconfounding AutoEncoder (AD-AE) approach to deconfounding gene expression latent spaces. The AD-AE model consists of two neural networks: (i) an autoencoder to generate an embedding that can reconstruct original measurements, and (ii) an adversary trained to predict the confounder from that embedding. We jointly train the networks to generate embeddings that can encode as much information as possible without encoding any confounding signal. By applying AD-AE to two distinct gene expression datasets, we show that our model can (i) generate embeddings that do not encode confounder information, (ii) conserve the biological signals present in the original space and (iii) generalize successfully across different confounder domains. We demonstrate that AD-AE outperforms standard autoencoder and other deconfounding approaches. Availability and implementation: Our code and data are available at https://gitlab.cs.washington.edu/abdincer/ad-ae. Supplementary information: Supplementary data are available at Bioinformatics online.
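The two-network setup described in the abstract above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the linear encoder, decoder and adversary, the squared-error losses, the weight `lam` and the form `reconstruction - lam * adversary` are all assumptions chosen for illustration, since the abstract does not state the exact objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumption): 64 samples x 20 genes, plus one continuous confounder such as age.
n_samples, n_genes, n_latent = 64, 20, 4
x = rng.normal(size=(n_samples, n_genes))
confounder = rng.normal(size=(n_samples, 1))

# Hypothetical linear weights; a real model would use deeper networks.
w_enc = rng.normal(scale=0.1, size=(n_genes, n_latent))   # encoder
w_dec = rng.normal(scale=0.1, size=(n_latent, n_genes))   # decoder
w_adv = rng.normal(scale=0.1, size=(n_latent, 1))         # adversary


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


def losses(x, confounder, w_enc, w_dec, w_adv, lam=1.0):
    """Return (reconstruction loss, adversary loss, autoencoder objective).

    The adversary would be trained to minimize its own loss (predict the
    confounder from the embedding). The autoencoder would be trained to
    minimize the combined objective, which rewards good reconstruction and
    penalizes embeddings from which the adversary predicts the confounder well.
    """
    z = x @ w_enc                          # embedding
    recon_loss = mse(z @ w_dec, x)         # how well the embedding reconstructs x
    adv_loss = mse(z @ w_adv, confounder)  # how well the adversary predicts the confounder
    autoencoder_objective = recon_loss - lam * adv_loss
    return recon_loss, adv_loss, autoencoder_objective


recon, adv, objective = losses(x, confounder, w_enc, w_dec, w_adv, lam=1.0)
print(recon, adv, objective)
```

In a full training loop the two objectives would typically be optimized in alternation, one step for the adversary and one for the autoencoder; setting `lam=0.0` reduces the sketch to a plain autoencoder.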


Adversarial Deconfounding Autoencoder for Learning Robust Gene Expression Embeddings

10.1101/2020.04.28.065052 ◽ 2020

Author(s): Ayse B. Dincer ◽ Joseph D. Janizek ◽ Su-In Lee

Keyword(s): Gene Expression ◽ Neural Networks ◽ Expression Profiles ◽ Gene Expression Profiles ◽ Batch Effects ◽ Biological Variables ◽ Large Numbers ◽ Latent Space ◽ Complex Models ◽ Unsupervised Neural Networks

Abstract Motivation: Increasing number of gene expression profiles has enabled the use of complex models, such as deep unsupervised neural networks, to extract a latent space from these profiles. However, expression profiles, especially when collected in large numbers, inherently contain variations introduced by technical artifacts (e.g., batch effects) and uninteresting biological variables (e.g., age) in addition to the true signals of interest. These sources of variations, called confounders, produce embeddings that fail to transfer to different domains, i.e., an embedding learned from one dataset with a specific confounder distribution does not generalize to different distributions. To remedy this problem, we attempt to disentangle confounders from true signals to generate biologically informative embeddings. Results: In this paper, we introduce the AD-AE (Adversarial Deconfounding AutoEncoder) approach to deconfounding gene expression latent spaces. The AD-AE model consists of two neural networks: (i) an autoencoder to generate an embedding that can reconstruct original measurements, and (ii) an adversary trained to predict the confounder from that embedding. We jointly train the networks to generate embeddings that can encode as much information as possible without encoding any confounding signal. By applying AD-AE to two distinct gene expression datasets, we show that our model can (1) generate embeddings that do not encode confounder information, (2) conserve the biological signals present in the original space, and (3) generalize successfully across different confounder domains. We demonstrate that AD-AE outperforms standard autoencoder and other deconfounding approaches. Availability: Our code and data are available at https://gitlab.cs.washington.edu/abdincer/[email protected]; [email protected]


Diagnostic Prediction Based on Gene Expression Profiles and Artificial Neural Networks

Soft Computing for Biological Systems ◽ 10.1007/978-981-10-7455-4_2 ◽ 2018 ◽ pp. 13-22

Author(s): Eugene Lin ◽ Shih-Jen Tsai

Keyword(s): Gene Expression ◽ Neural Networks ◽ Artificial Neural Networks ◽ Expression Profiles ◽ Gene Expression Profiles ◽ Artificial Neural


High-throughput single-cell RNA-seq data imputation and characterization with surrogate-assisted automated deep learning

Briefings in Bioinformatics ◽ 10.1093/bib/bbab368 ◽ 2021

Author(s): Xiangtao Li ◽ Shaochuan Li ◽ Lei Huang ◽ Shixiong Zhang ◽ Ka-chun Wong

Keyword(s): Gene Expression ◽ Neural Networks ◽ Single Cell ◽ Deep Neural Networks ◽ Expression Profiles ◽ Marker Gene ◽ Gene Expression Profiles ◽ Underlying Mechanisms ◽ Cell Data ◽ Gene Expression Levels

Abstract Single-cell RNA sequencing (scRNA-seq) technologies have been heavily developed to probe gene expression profiles at single-cell resolution. Deep imputation methods have been proposed to address the related computational challenges (e.g. the gene sparsity in single-cell data). In particular, the neural architectures of those deep imputation models have been proven to be critical for performance. However, deep imputation architectures are difficult to design and tune for those without rich knowledge of deep neural networks and scRNA-seq. Therefore, Surrogate-assisted Evolutionary Deep Imputation Model (SEDIM) is proposed to automatically design the architectures of deep neural networks for imputing gene expression levels in scRNA-seq data without any manual tuning. Moreover, the proposed SEDIM constructs an offline surrogate model, which can accelerate the computational efficiency of the architectural search. Comprehensive studies show that SEDIM significantly improves the imputation and clustering performance compared with other benchmark methods. In addition, we also extensively explore the performance of SEDIM in other contexts and platforms including mass cytometry and metabolic profiling in a comprehensive manner. Marker gene detection, gene ontology enrichment and pathological analysis are conducted to provide novel insights into cell-type identification and the underlying mechanisms. The source code is available at https://github.com/li-shaochuan/SEDIM.


Cell Identity Codes: Understanding Cell Identity from Gene Expression Profiles using Deep Neural Networks

Scientific Reports ◽ 10.1038/s41598-019-38798-y ◽ 2019 ◽ Vol 9 (1) ◽ Cited By ~ 5

Author(s): Farzad Abdolhosseini ◽ Behrooz Azarkhalili ◽ Abbas Maazallahi ◽ Aryan Kamal ◽ Seyed Abolfazl Motahari ◽ ...

Keyword(s): Gene Expression ◽ Neural Networks ◽ Deep Neural Networks ◽ Expression Profiles ◽ Gene Expression Profiles ◽ Cell Identity


A New Gene Expression Profiles Classifying Approach Based on Neighborhood Rough Set and Probabilistic Neural Networks Ensemble

Neural Information Processing - Lecture Notes in Computer Science ◽ 10.1007/978-3-642-42042-9_60 ◽ 2013 ◽ pp. 484-489 ◽ Cited By ~ 2

Author(s): Jiang Yun ◽ Xie Guocheng ◽ Chen Na ◽ Chen Shan

Keyword(s): Gene Expression ◽ Neural Networks ◽ Rough Set ◽ Expression Profiles ◽ Gene Expression Profiles ◽ Probabilistic Neural Networks ◽ Neighborhood Rough Set ◽ New Gene ◽ Neural Networks Ensemble


Cancer classification of single-cell gene expression data by neural network

Bioinformatics ◽ 10.1093/bioinformatics/btz772 ◽ 2019 ◽ Cited By ~ 3

Author(s): Bong-Hyun Kim ◽ Kijin Yu ◽ Peter C W Lee

Keyword(s): Neural Network ◽ Gene Expression ◽ Single Cell ◽ Expression Profiles ◽ Gene Expression Profiles ◽ Cancer Classification ◽ Supplementary Information ◽ Support Vector ◽ K Nearest Neighbors ◽ Normal Tissues

Abstract Motivation: Cancer classification based on gene expression profiles has provided insight on the causes of cancer and cancer treatment. Recently, machine learning-based approaches have been attempted in downstream cancer analysis to address the large differences in gene expression values, as determined by single-cell RNA sequencing (scRNA-seq). Results: We designed cancer classifiers that can identify 21 types of cancers and normal tissues based on bulk RNA-seq as well as scRNA-seq data. Training was performed with 7398 cancer samples and 640 normal samples from 21 tumors and normal tissues in TCGA based on the 300 most significant genes expressed in each cancer. Then, we compared neural network (NN), support vector machine (SVM), k-nearest neighbors (kNN) and random forest (RF) methods. The NN performed consistently better than other methods. We further applied our approach to scRNA-seq transformed by kNN smoothing and found that our model successfully classified cancer types and normal samples. Availability and implementation: Cancer classification by neural network. Supplementary information: Supplementary data are available at Bioinformatics online.


Prediction of Lymph Node Metastasis with Use of Artificial Neural Networks Based on Gene Expression Profiles in Esophageal Squamous Cell Carcinoma

Annals of Surgical Oncology ◽ 10.1245/aso.2004.03.007 ◽ 2004 ◽ Vol 11 (12) ◽ pp. 1070-1078 ◽ Cited By ~ 39

Author(s): Takatsugu Kan ◽ Yutaka Shimada ◽ Fumiaki Sato ◽ Tetsuo Ito ◽ Kan Kondo ◽ ...

Keyword(s): Gene Expression ◽ Neural Networks ◽ Squamous Cell Carcinoma ◽ Artificial Neural Networks ◽ Lymph Node ◽ Lymph Node Metastasis ◽ Esophageal Squamous Cell Carcinoma ◽ Expression Profiles ◽ Gene Expression Profiles ◽ Node Metastasis


On transformative adaptive activation functions in neural networks for gene expression inference

PLoS ONE ◽ 10.1371/journal.pone.0243915 ◽ 2021 ◽ Vol 16 (1) ◽ pp. e0243915

Author(s): Vladimír Kunc ◽ Jiří Kléma

Keyword(s): Gene Expression ◽ Neural Networks ◽ Mean Absolute Error ◽ Expression Profiles ◽ Gene Expression Profiles ◽ Cost Effective ◽ Absolute Error ◽ Activation Function ◽ Training Procedure ◽ Activation Functions

Gene expression profiling was made more cost-effective by the NIH LINCS program that profiles only ∼1,000 selected landmark genes and uses them to reconstruct the whole profile. The D–GEX method employs neural networks to infer the entire profile. However, the original D–GEX can be significantly improved. We propose a novel transformative adaptive activation function that improves the gene expression inference even further and which generalizes several existing adaptive activation functions. Our improved neural network achieves an average mean absolute error of 0.1340, which is a significant improvement over our reimplementation of the original D–GEX, which achieves an average mean absolute error of 0.1637. The proposed transformative adaptive function enables a significantly more accurate reconstruction of the full gene expression profiles with only a small increase in the complexity of the model and its training procedure compared to other methods.
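To put the two reported error figures in context, the following sketch shows how a mean absolute error is computed and what relative reduction the reported values imply. The toy arrays and the helper names `mean_absolute_error` and `relative_reduction` are illustrative assumptions; the abstract does not say how the average over genes or samples was taken, so only the two reported numbers come from the source.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Average of |y_true - y_pred| over all entries."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


# Toy example (made-up values): absolute errors are 0.5, 0.0 and 1.0.
toy_mae = mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])

# Values reported in the abstract: reimplemented D-GEX vs. the improved network.
reduction = relative_reduction(0.1637, 0.1340)

print(f"toy MAE: {toy_mae:.4f}")
print(f"relative MAE reduction: {reduction:.1%}")
```

With the reported values, the relative reduction works out to roughly 18%.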


IRIS-DGE: An integrated RNA-seq data analysis and interpretation system for differential gene expression

10.1101/283341 ◽ 2018 ◽ Cited By ~ 4

Author(s): Brandon Monier ◽ Adam McDermaid ◽ Jing Zhao ◽ Anne Fennell ◽ Qin Ma

Keyword(s): Gene Expression ◽ Differential Gene Expression ◽ Large Scale ◽ Expression Profiles ◽ Gene Expression Profiles ◽ Supplementary Information ◽ Analysis Tool ◽ Rna Seq ◽ Differential Gene ◽ User Friendly

Abstract Motivation: Next-Generation Sequencing has made available much more large-scale genomic and transcriptomic data. Studies with RNA-sequencing (RNA-seq) data typically involve generation of gene expression profiles that can be further analyzed, many times involving differential gene expression (DGE). This process enables comparison across samples of two or more factor levels. A recurring issue with DGE analyses is the complicated nature of the comparisons to be made, in which a variety of factor combinations, pairwise comparisons, and main or blocked main effects need to be tested. Results: Here we present a tool called IRIS-DGE, which is a server-based DGE analysis tool developed using Shiny. It provides a straightforward, user-friendly platform for performing comprehensive DGE analysis, and crucial analyses that help design hypotheses and to determine key genomic features. IRIS-DGE integrates the three most commonly used R-based DGE tools to determine differentially expressed genes (DEGs) and includes numerous methods for performing preliminary analysis on user-provided gene expression information. Additionally, this tool integrates a variety of visualizations, in a highly interactive manner, for improved interpretation of preliminary and DGE analyses. Availability: IRIS-DGE is freely available at http://bmbl.sdstate.edu/IRIS/[email protected] information Supplementary data are available at Bioinformatics online.


Constructive Neural Networks to Predict Breast Cancer Outcome by Using Gene Expression Profiles

Trends in Applied Intelligent Systems - Lecture Notes in Computer Science ◽ 10.1007/978-3-642-13022-9_32 ◽ 2010 ◽ pp. 317-326 ◽ Cited By ~ 3

Author(s): Daniel Urda ◽ José Luis Subirats ◽ Leo Franco ◽ José Manuel Jerez

Keyword(s): Breast Cancer ◽ Gene Expression ◽ Neural Networks ◽ Expression Profiles ◽ Gene Expression Profiles ◽ Constructive Neural Networks ◽ Breast Cancer Outcome ◽ Cancer Outcome
