VEGA is an interpretable generative model for inferring biological network activity in single-cell transcriptomics

Abstract: Deep learning architectures such as variational autoencoders have revolutionized the analysis of transcriptomics data. However, the latent space of these variational autoencoders offers little to no interpretability. To provide further biological insights, we introduce a novel sparse Variational Autoencoder architecture, VEGA (VAE Enhanced by Gene Annotations), whose decoder wiring mirrors user-provided gene modules, providing direct interpretability to the latent variables. We demonstrate the performance of VEGA in diverse biological contexts using pathways, gene regulatory networks and cell type identities as the gene modules that define its latent space. VEGA successfully recapitulates the mechanism of cellular-specific response to treatments and the status of master regulators, and jointly reveals the cell type and cellular state identity in developing cells. We envision the approach could serve as an explanatory biological model for development and drug treatment experiments.
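
The idea of a decoder whose wiring mirrors gene modules can be illustrated with a masked linear layer. The following is a minimal NumPy sketch under stated assumptions, not the VEGA implementation: the gene names, the two modules, and the helper names `build_mask` and `masked_linear_decode` are all hypothetical, and the weights are random rather than trained.

```python
import numpy as np

# Hypothetical toy inputs: a gene list and user-provided gene modules.
genes = ["IFNG", "STAT1", "IRF1", "MYC", "CCND1"]
modules = {
    "interferon_signaling": ["IFNG", "STAT1", "IRF1"],
    "cell_cycle": ["MYC", "CCND1"],
}


def build_mask(genes, modules):
    """Return an (n_modules, n_genes) 0/1 matrix: 1 where a gene belongs to a module."""
    gene_index = {g: i for i, g in enumerate(genes)}
    mask = np.zeros((len(modules), len(genes)))
    for row, members in enumerate(modules.values()):
        for g in members:
            if g in gene_index:
                mask[row, gene_index[g]] = 1.0
    return mask


def masked_linear_decode(z, weights, mask, bias):
    """Linear decoder whose weights are zeroed wherever the mask is 0,
    so each latent variable can only influence the genes of its own module."""
    return z @ (weights * mask) + bias


rng = np.random.default_rng(0)
mask = build_mask(genes, modules)
weights = rng.normal(size=mask.shape)  # untrained, for illustration only
bias = np.zeros(len(genes))

# One cell in which only the first latent variable (first module) is active.
z = np.array([[1.0, 0.0]])
reconstruction = masked_linear_decode(z, weights, mask, bias)
print(reconstruction)
```

Because the second module's latent variable is zero and the first module has no connection to `MYC` or `CCND1`, the reconstructed values for those two genes are exactly zero, which is what makes each latent variable readable as the activity of one module.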

Biological network-inspired interpretable variational autoencoder

10.1101/2020.12.17.423310 ◽

2020 ◽

Author(s):

Lucas Seninge ◽

Ioannis Anastopoulos ◽

Hongxu Ding ◽

Joshua Stuart

Keyword(s):

Latent Variables ◽

Regulatory Networks ◽

A Priori ◽

Specific Response ◽

Cell Type ◽

State Identity ◽

Latent Space ◽

Variational Autoencoder ◽

The Status ◽

Transcriptomics Data

Deep learning architectures such as variational autoencoders have revolutionized the analysis of transcriptomics data. However, the latent space of these variational autoencoders offers little to no interpretability. To provide further biological insights, we introduce a novel sparse Variational Autoencoder architecture, VEGA (VAE Enhanced by Gene Annotations), whose decoder wiring is inspired by a priori characterized biological abstractions, providing direct interpretability to the latent variables. We demonstrate the interpretability and flexibility of VEGA in diverse biological contexts, by integrating various sources of biological abstractions such as pathways, gene regulatory networks and cell type identities in the latent space of our model. We show that our model could recapitulate the mechanism of cellular-specific response to treatments and the status of master regulators, as well as jointly investigate the cell type and cellular state identity in developing cells. We envision the approach could serve as an explanatory biological model in contexts such as development and drug treatment experiments.

Predictive learning as a network mechanism for extracting low-dimensional latent space representations

Nature Communications ◽

10.1038/s41467-021-21696-1 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Stefano Recanatesi ◽

Matthew Farrell ◽

Guillaume Lajoie ◽

Sophie Deneve ◽

Mattia Rigotti ◽

...

Keyword(s):

Latent Variables ◽

Latent Structure ◽

Network Activity ◽

Semantic Organization ◽

Sequential Processing ◽

Predictive Learning ◽

Neural Representations ◽

Latent Space ◽

Low Dimensional ◽

Sensory Prediction

Abstract: Artificial neural networks have recently achieved many successes in solving sequential processing and planning tasks. Their success is often ascribed to the emergence of the task’s low-dimensional latent structure in the network activity – i.e., in the learned neural representations. Here, we investigate the hypothesis that a means for generating representations with easily accessed low-dimensional latent structure, possibly reflecting an underlying semantic organization, is through learning to predict observations about the world. Specifically, we ask whether and when network mechanisms for sensory prediction coincide with those for extracting the underlying latent variables. Using a recurrent neural network model trained to predict a sequence of observations we show that network dynamics exhibit low-dimensional but nonlinearly transformed representations of sensory inputs that map the latent structure of the sensory environment. We quantify these results using nonlinear measures of intrinsic dimensionality and linear decodability of latent variables, and provide mathematical arguments for why such useful predictive representations emerge. We focus throughout on how our results can aid the analysis and interpretation of experimental data.
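
Linear decodability of latent variables can be made concrete with a least-squares readout. The sketch below is an assumption-laden illustration, not the authors' analysis: the "network activity" is synthetic (a `tanh` of randomly mixed 2-D latents), the function name `linear_decodability` is hypothetical, and the score is an in-sample R², where a real analysis would normally score held-out data.

```python
import numpy as np


def linear_decodability(activity, latents):
    """In-sample R^2 of a least-squares linear readout (with intercept)
    that predicts the latent variables from the network activity."""
    X = np.column_stack([activity, np.ones(len(activity))])
    coef, *_ = np.linalg.lstsq(X, latents, rcond=None)
    residual = latents - X @ coef
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((latents - latents.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
latents = rng.uniform(-1, 1, size=(500, 2))  # hypothetical 2-D latent variables
mixing = rng.normal(size=(2, 20))            # random embedding into 20 units
activity = np.tanh(latents @ mixing)         # synthetic, nonlinearly transformed activity

score = linear_decodability(activity, latents)
print(f"linear decodability (R^2): {score:.3f}")
```

A score near 1 means the latent variables can be read out linearly even though the representation itself is a nonlinear transform of them.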

Predictive learning as a network mechanism for extracting low-dimensional latent space representations

10.1101/471987 ◽

2018 ◽

Cited By ~ 1

Author(s):

Stefano Recanatesi ◽

Matthew Farrell ◽

Guillaume Lajoie ◽

Sophie Deneve ◽

Mattia Rigotti ◽

...

Keyword(s):

Latent Variables ◽

Latent Structure ◽

Network Activity ◽

Semantic Organization ◽

Sequential Processing ◽

Predictive Learning ◽

Neural Representations ◽

Latent Space ◽

Low Dimensional ◽

Sensory Prediction

Artificial neural networks have recently achieved many successes in solving sequential processing and planning tasks. Their success is often ascribed to the emergence of the task’s low-dimensional latent structure in the network activity – i.e., in the learned neural representations. Here, we investigate the hypothesis that a means for generating representations with easily accessed low-dimensional latent structure, possibly reflecting an underlying semantic organization, is through learning to predict observations about the world. Specifically, we ask whether and when network mechanisms for sensory prediction coincide with those for extracting the underlying latent variables. Using a recurrent neural network model trained to predict a sequence of observations we show that network dynamics exhibit low-dimensional but nonlinearly transformed representations of sensory inputs that map the latent structure of the sensory environment. We quantify these results using nonlinear measures of intrinsic dimensionality and linear decodability of latent variables, and provide mathematical arguments for why such useful predictive representations emerge. We focus throughout on how our results can aid the analysis and interpretation of experimental data.

Delineating Psychopathy from Cognitive Empathy

European journal of analytic philosophy ◽

10.31820/ejap.14.1.3 ◽

2018 ◽

Vol 14 (1) ◽

pp. 53-62 ◽

Cited By ~ 2

Author(s):

Janko Međedović ◽

Nikola Đuričić

Keyword(s):

Latent Variables ◽

Community Sample ◽

Cognitive Empathy ◽

Psychopathic Personality ◽

Factor Solution ◽

Online Study ◽

Model Research ◽

Latent Space ◽

The Status ◽

Novel Model

There is an ongoing debate regarding the content of psychopathy, especially about the status of antisocial behavior and disinhibition characteristics as core psychopathy features. The Psychopathic Personality Traits Scale (PPTS) represents a novel model of psychopathy based on core psychopathy markers such as Interpersonal manipulation, Egocentricity and Affective responsiveness. However, this model presupposes another narrow trait of psychopathy: cognitive responsiveness, which represents a lack of cognitive empathy. Since previous models of psychopathy do not depict this feature as a core psychopathy trait, the goal of this study was to empirically evaluate whether the lack of cognitive empathy is a narrow psychopathy trait or its correlate. The research was conducted on a community sample via an online study (N = 342; mean age = 23.7 years; 31% males). Results showed that the correlations between Cognitive responsiveness and other psychopathy features were significantly lower than the intercorrelations of the other three traits. Factor analysis, conducted on PPTS items, provided a two-factor solution, where Cognitive responsiveness emerged as a factor separate from other psychopathy indicators. Finally, the exploration of the shared latent space of psychopathy and cognitive empathy resulted in a two-factor solution where psychopathy and the lack of cognitive empathy were extracted as correlated but separate latent variables. The data clearly supported the former model. Research results showed that the lack of cognitive empathy should not be considered an indicator of psychopathy but its correlate. The findings emphasize the need to be cautious in conceptualization of the psychopathy construct.

Cell-type-specific profiling of loaded miRNAs from Caenorhabditis elegans reveals spatial and temporal flexibility in Argonaute loading

Nature Communications ◽

10.1038/s41467-021-22503-7 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Christopher A. Brosnan ◽

Alexander J. Palmer ◽

Steven Zuryn

Keyword(s):

Caenorhabditis Elegans ◽

Regulatory Networks ◽

Regulatory Mechanisms ◽

Cell Type ◽

A Genome ◽

Temporal Flexibility ◽

Small Regulatory Rnas ◽

Cell Type Specific ◽

Single Cell Type ◽

Regulated Gene Expression

Abstract: Multicellularity has coincided with the evolution of microRNAs (miRNAs), small regulatory RNAs that are integrated into cellular differentiation and homeostatic gene-regulatory networks. However, the regulatory mechanisms underpinning miRNA activity have remained largely obscured because of the precise, and thus difficult to access, cellular contexts under which they operate. To resolve these, we have generated a genome-wide map of active miRNAs in Caenorhabditis elegans by revealing cell-type-specific patterns of miRNAs loaded into Argonaute (AGO) silencing complexes. Epitope-labelled AGO proteins were selectively expressed and immunoprecipitated from three distinct tissue types and associated miRNAs sequenced. In addition to providing information on biological function, we define adaptable miRNA:AGO interactions with single-cell-type and AGO-specific resolution. We demonstrate spatial and temporal dynamicism, flexibility of miRNA loading, and suggest miRNA regulatory mechanisms via AGO selectivity in different tissues and during ageing. Additionally, we resolve widespread changes in AGO-regulated gene expression by analysing translatomes specifically in neurons.

Interpretable Variational Graph Autoencoder with Noninformative Prior

Future Internet ◽

10.3390/fi13020051 ◽

2021 ◽

Vol 13 (2) ◽

pp. 51

Author(s):

Lili Sun ◽

Xueyan Liu ◽

Min Zhao ◽

Bo Yang

Keyword(s):

Latent Variables ◽

Latent Variable ◽

Expert Knowledge ◽

Structural Information ◽

Standard Normal Distribution ◽

Noninformative Prior ◽

Latent Space ◽

Distribution Parameters ◽

Standard Normal ◽

Low Dimensional

The variational graph autoencoder, which can encode structural information and attribute information in the graph into low-dimensional representations, has become a powerful method for studying graph-structured data. However, most existing methods based on the variational (graph) autoencoder assume that the prior of latent variables obeys the standard normal distribution, which encourages all nodes to gather around 0. That leads to the inability to fully utilize the latent space. Choosing a suitable prior without incorporating additional expert knowledge therefore becomes a challenge. Given this, we propose a novel noninformative prior-based interpretable variational graph autoencoder (NPIVGAE). Specifically, we exploit the noninformative prior as the prior distribution of latent variables. This prior enables the posterior distribution parameters to be learned almost entirely from the sample data. Furthermore, we regard each dimension of a latent variable as the probability that the node belongs to each block, thereby improving the interpretability of the model. The correlation within and between blocks is described by a block–block correlation matrix. We compare our model with state-of-the-art methods on three real datasets, verifying its effectiveness and superiority.
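
Reading each latent dimension as a block-membership probability, together with a block–block matrix, can be sketched in the style of a mixed-membership block model. This is a hedged illustration only and not necessarily the decoder NPIVGAE uses: the softmax normalization, the numbers in `block_matrix`, and the formula `membership @ block_matrix @ membership.T` are assumptions made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax so each row becomes a probability vector."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


rng = np.random.default_rng(0)
n_nodes, n_blocks = 6, 3

# Hypothetical unnormalized latent codes, one row per node.
logits = rng.normal(size=(n_nodes, n_blocks))
membership = softmax(logits)  # entry [i, k]: probability that node i belongs to block k

# Assumed block-block matrix: strong within-block, weak between-block connectivity.
block_matrix = np.array([
    [0.90, 0.10, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.10, 0.70],
])

# Expected link probability between every pair of nodes.
edge_prob = membership @ block_matrix @ membership.T
print(np.round(edge_prob, 2))
```

Since every row of `membership` sums to 1 and every entry of `block_matrix` lies in [0, 1], each entry of `edge_prob` is a valid probability, and each latent dimension keeps a direct reading as "how much this node belongs to block k".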

An analytical method for the identification of cell type-specific disease gene modules

Journal of Translational Medicine ◽

10.1186/s12967-020-02690-5 ◽

2021 ◽

Vol 19 (1) ◽

Author(s):

Jinting Guan ◽

Yiping Lin ◽

Yang Wang ◽

Junchao Gao ◽

Guoli Ji

Keyword(s):

Disease Gene ◽

Gene Interaction ◽

Cell Types ◽

Autism Spectrum ◽

Specific Gene ◽

Cell Type ◽

Specific Disease ◽

Cell Type Specific ◽

Gene Modules ◽

Disease Associated Genes

Abstract

Background: Genome-wide association studies have identified genetic variants associated with the risk of brain-related diseases, such as neurological and psychiatric disorders, while the causal variants and the specific vulnerable cell types often remain to be studied. Many disease-associated genes are expressed in multiple cell types of human brains, while the pathologic variants affect primarily specific cell types. We hypothesize a model in which what determines the manifestation of a disease in a cell type is the presence of a disease module composed of disease-associated genes, instead of individual genes. Therefore, it is essential to identify the presence/absence of disease gene modules in cells.

Methods: To characterize the cell type-specificity of brain-related diseases, we construct human brain cell type-specific gene interaction networks integrating human brain nucleus gene expression data with a referenced tissue-specific gene interaction network. Then from the cell type-specific gene interaction networks, we identify significant cell type-specific disease gene modules by performing statistical tests.

Results: Between neurons and glial cells, the constructed cell type-specific gene networks and their gene functions are distinct. Then we identify cell type-specific disease gene modules associated with autism spectrum disorder and find that different gene modules are formed and distinct gene functions may be dysregulated in different cells. We also study the similarity and dissimilarity in cell type-specific disease gene modules among autism spectrum disorder, schizophrenia and bipolar disorder. The functions of neuron-specific disease gene modules are associated with synapse for all three diseases, while those in glial cells are different. To facilitate the use of our method, we develop an R package, CtsDGM, for the identification of cell type-specific disease gene modules.

Conclusions: The results support our hypothesis that a disease manifests itself in a cell type through forming a statistically significant disease gene module. The identification of cell type-specific disease gene modules can promote the development of more targeted biomarkers and treatments for the disease. Our method can be applied for depicting the cell type heterogeneity of a given disease, and also for studying the similarity and dissimilarity between different disorders, providing new insights into the molecular mechanisms underlying the pathogenesis and progression of diseases.
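
One way to test whether a set of disease genes forms a significant module in a network is a permutation test on the number of edges among them. The sketch below is a generic illustration in Python using only the standard library; it is not the CtsDGM R package, and the abstract does not specify which statistical tests the authors use. The toy network, the gene labels, and the function names `edges_within` and `module_p_value` are hypothetical.

```python
import random


def edges_within(nodes, edges):
    """Count edges whose two endpoints both lie in the given node set."""
    node_set = set(nodes)
    return sum(1 for a, b in edges if a in node_set and b in node_set)


def module_p_value(disease_genes, all_genes, edges, n_perm=2000, seed=0):
    """Permutation p-value: how often does a random gene set of the same size
    contain at least as many internal edges as the disease gene set?"""
    rng = random.Random(seed)
    observed = edges_within(disease_genes, edges)
    k = len(disease_genes)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_genes, k)
        if edges_within(sample, edges) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy cell type-specific network: a 30-gene ring plus extra edges that make
# g0..g4 fully connected, standing in for a dense disease module.
all_genes = [f"g{i}" for i in range(30)]
edges = [(f"g{i}", f"g{(i + 1) % 30}") for i in range(30)]
edges += [(f"g{i}", f"g{j}") for i in range(5) for j in range(i + 2, 5)]

disease_genes = ["g0", "g1", "g2", "g3", "g4"]
observed, p_value = module_p_value(disease_genes, all_genes, edges)
print(observed, p_value)
```

Running the same test on networks built for different cell types would indicate in which cell types the disease genes are unusually interconnected, which is the sense of "presence/absence of a disease gene module" described above.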

Jointly leveraging spatial transcriptomics and deep learning models for pathology image annotation improves cell type identification over either approach alone.

10.1101/2021.11.10.468082 ◽

2021 ◽

Author(s):

Asif Zubair ◽

Richard H. Chapple ◽

Sivaraman Natarajan ◽

William C. Wright ◽

Min Pan ◽

...

Keyword(s):

Immune Cell ◽

Image Annotation ◽

Cell Types ◽

Tissue Cell ◽

Cell Type ◽

Spatially Resolved ◽

Transcriptomics Data ◽

Diagnostic Applications ◽

The Many ◽

Level Performance

The disorganization of cell types within tissues underlies many human diseases and has been studied for over a century using the conventional tools of pathology, including tissue-marking dyes such as the H&E stain. Recently, spatial transcriptomics technologies were developed that can measure spatially resolved gene expression directly in pathology-stained tissue sections, revealing cell types and their dysfunction in unprecedented detail. In parallel, artificial intelligence (AI) has approached pathologist-level performance in computationally annotating H&E images of tissue sections. However, spatial transcriptomics technologies are limited in their ability to separate transcriptionally similar cell types, and AI-based pathology has performed less impressively outside its training datasets. Here, we describe a methodology that can computationally integrate AI-annotated pathology images with spatial transcriptomics data to markedly improve inferences of tissue cell type composition over those made from either class of data alone. We show that this methodology can identify regions of clinically relevant tumor immune cell infiltration, which is predictive of response to immunotherapy and was missed by an initial pathologist's manual annotation. Thus, combining spatial transcriptomics and AI-based image annotation has the potential to exceed pathologist-level performance in clinical diagnostic applications and to improve the many applications of spatial transcriptomics that rely on accurate cell type annotations.

Genomic Architecture of Cells in Tissues (GeACT): Study of Human Mid-gestation Fetus

10.1101/2020.04.12.038000 ◽

2020 ◽

Author(s):

Feng Tian ◽

Fan Zhou ◽

Xiang Li ◽

Wenping Ma ◽

Honggui Wu ◽

...

Keyword(s):

Transcription Factors ◽

Single Cell ◽

Human Cell ◽

Expression Profiles ◽

Single Cells ◽

Cell Types ◽

List Type ◽

Cell Type ◽

Genomic Architecture ◽

Gene Modules

Summary: By circumventing cellular heterogeneity, single cell omics have now been widely utilized for cell typing in human tissues, culminating with the undertaking of the Human Cell Atlas, aimed at characterizing all human cell types. However, more important are the probing of gene regulatory networks, underlying chromatin architecture and critical transcription factors for each cell type. Here we report the Genomic Architecture of Cells in Tissues (GeACT), a comprehensive genomic database that collectively addresses the above needs with the goal of understanding the functional genome in action. GeACT was made possible by our novel single-cell RNA-seq (MALBAC-DT) and ATAC-seq (METATAC) methods of high detectability and precision. We exemplified GeACT by first studying representative organs in human mid-gestation fetus. In particular, correlated gene modules (CGMs) are observed and found to be cell-type-dependent. We linked gene expression profiles to the underlying chromatin states, and found the key transcription factors for representative CGMs.

Highlights:

Genomic Architecture of Cells in Tissues (GeACT) data for human mid-gestation fetus

Determining correlated gene modules (CGMs) in different cell types by MALBAC-DT

Measuring chromatin open regions in single cells with high detectability by METATAC

Integrating transcriptomics and chromatin accessibility to reveal key TFs for a CGM

Integrative multi-omics analyses identify cell-type disease genes and regulatory networks across schizophrenia and Alzheimer’s disease

10.1101/2020.06.11.147314 ◽

2020 ◽

Author(s):

Mufang Ying ◽

Peter Rehani ◽

Panagiotis Roussos ◽

Daifeng Wang

Keyword(s):

Alzheimer's Disease ◽

Gene Regulatory Networks ◽

Regulatory Networks ◽

Regulatory Elements ◽

Disease Genes ◽

Omics Data ◽

Cell Type ◽

Cellular Resolution ◽

Gene Regulatory

Abstract: Strong phenotype-genotype associations have been reported across brain diseases. However, understanding underlying gene regulatory mechanisms remains challenging, especially at the cellular level. To address this, we integrated the multi-omics data at the cellular resolution of the human brain: cell-type chromatin interactions, epigenomics and single cell transcriptomics, and predicted cell-type gene regulatory networks linking transcription factors, distal regulatory elements and target genes (e.g., excitatory and inhibitory neurons, microglia, oligodendrocyte). Using these cell-type networks and disease risk variants, we further identified the cell-type disease genes and regulatory networks for schizophrenia and Alzheimer’s disease. The cell-type regulatory elements (e.g., enhancers) in the networks were also found to be potential pleiotropic regulatory loci for a variety of diseases. Further enrichment analyses including gene ontology and KEGG pathways revealed potential novel cross-disease and disease-specific molecular functions, advancing knowledge on the interplays among genetic, transcriptional and epigenetic risks at the cellular resolution between neurodegenerative and neuropsychiatric diseases. Finally, we summarized our computational analyses as a general-purpose pipeline for predicting gene regulatory networks via multi-omics data.
