Cell type-specific interpretation of noncoding variants using deep learning-based methods

2022 ◽  
Author(s):  
Maria Sindeeva ◽  
Nikolay Chekanov ◽  
Manvel Avetisian ◽  
Nikita Baranov ◽  
Elian Malkin ◽  
...  

Interpretation of non-coding genomic variants is one of the most important challenges in human genetics. Machine learning methods have emerged recently as a powerful tool to solve this problem. State-of-the-art approaches allow prediction of transcriptional and epigenetic effects caused by non-coding mutations. However, these approaches require specific experimental data for training and cannot generalize to cell types in which the required features were not experimentally measured. We show here that available epigenetic characteristics of human cell types are extremely sparse, limiting those approaches that rely on specific epigenetic input. We propose a new neural network architecture, DeepCT, which can learn complex interconnections of epigenetic features and infer unmeasured data from any available input. Furthermore, we show that DeepCT can learn cell type-specific properties, build biologically meaningful vector representations of cell types and utilize these representations to generate cell type-specific predictions of the effects of non-coding variations in the human genome.

2019 ◽  
Author(s):  
Hongyang Li ◽  
Yuanfang Guan

Abstract: Decoding the cell type-specific transcription factor (TF) binding landscape at single-nucleotide resolution is crucial for understanding the regulatory mechanisms underlying many fundamental biological processes and human diseases. However, limits on time and resources restrict the high-resolution experimental measurement of TF binding profiles for all possible TF-cell type combinations. Previous computational approaches either cannot distinguish the cell-context-dependent TF binding profiles across diverse cell types, or provide only a relatively low-resolution prediction. Here we present a novel deep learning approach, Leopard, for predicting TF-binding sites at single-nucleotide resolution, achieving a median area under the receiver operating characteristic curve (AUROC) of 0.994. Our method substantially outperformed the state-of-the-art methods Anchor and FactorNet, improving the performance by 19% and 27%, respectively, despite being evaluated at a lower resolution. Meanwhile, by leveraging a many-to-many neural network architecture, Leopard achieves a hundred-fold to thousand-fold speedup compared to current many-to-one machine learning methods.
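As a reading aid for the AUROC figure quoted in this abstract, the following is a minimal Python sketch of how the metric can be computed from binary labels and prediction scores. It is not taken from Leopard; the function name `auroc` and the toy labels and scores are assumptions made purely for illustration. It uses the pairwise (Mann-Whitney) formulation: the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 values (1 = bound site, 0 = unbound site).
    scores: iterable of predicted scores, same length as labels.
    Hypothetical helper for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (made-up numbers): 3 of the 4 positive/negative pairs are ordered correctly.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.1, 0.4, 0.6]
print(auroc(toy_labels, toy_scores))  # 0.75
```

The quadratic pairwise loop is chosen for clarity; for genome-scale, per-nucleotide predictions a rank-based implementation would be the more practical choice.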


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Houri Hintiryan ◽  
Ian Bowman ◽  
David L. Johnson ◽  
Laura Korobkova ◽  
Muye Zhu ◽  
...  

Abstract: The basolateral amygdalar complex (BLA) is implicated in behaviors ranging from fear acquisition to addiction. Optogenetic methods have enabled the association of circuit-specific functions with uniquely connected BLA cell types. Thus, a systematic and detailed connectivity profile of BLA projection neurons to inform granular, cell type-specific interrogations is warranted. Here, we apply machine learning-based computational and informatics analysis techniques to the results of circuit-tracing experiments to create a foundational, comprehensive BLA connectivity map. The analyses identify three distinct domains within the anterior BLA (BLAa) that house target-specific projection neurons with distinguishable morphological features. We identify brain-wide targets of projection neurons in the three BLAa domains, as well as in the posterior BLA, ventral BLA, posterior basomedial, and lateral amygdalar nuclei. Inputs to each nucleus are also identified via retrograde tracing. The data suggest that connectionally unique, domain-specific BLAa neurons are associated with distinct behavior networks.


Author(s):  
Hee-Dae Kim ◽  
Jing Wei ◽  
Tanessa Call ◽  
Nicole Teru Quintus ◽  
Alexander J. Summers ◽  
...  

Abstract: Depression is the leading cause of disability and produces enormous health and economic burdens. Current treatment approaches for depression are largely ineffective and leave more than 50% of patients symptomatic, mainly because of the non-selective and broad action of antidepressants. Thus, there is an urgent need to design and develop novel therapeutics to treat depression. Given the heterogeneity and complexity of the brain, identification of molecular mechanisms within specific cell types responsible for producing depression-like behaviors will advance development of therapies. In the reward circuitry, the nucleus accumbens (NAc) is a key brain region of depression pathophysiology, possibly based on differential activity of D1- or D2-medium spiny neurons (MSNs). Here we report a circuit- and cell type-specific molecular target for depression, Shisa6, recently defined as an AMPAR component, which is increased only in D1-MSNs in the NAc of susceptible mice. Using the Ribotag approach, we dissected the transcriptional profile of D1- and D2-MSNs by RNA sequencing following a mouse model of depression, chronic social defeat stress (CSDS). Bioinformatic analyses identified cell type-specific genes that may contribute to the pathogenesis of depression, including Shisa6. We found that selective optogenetic activation of the ventral tegmental area (VTA) to NAc circuit increases Shisa6 expression in D1-MSNs. Shisa6 is specifically located in excitatory synapses of D1-MSNs and increases excitability of neurons, which promotes anxiety- and depression-like behaviors in mice. Cell type- and circuit-specific action of Shisa6, which directly modulates excitatory synapses that convey aversive information, identifies the protein as a potential rapid-antidepressant target for aberrant circuit function in depression.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
John A. Halsall ◽  
Simon Andrews ◽  
Felix Krueger ◽  
Charlotte E. Rutledge ◽  
Gabriella Ficz ◽  
...  

Abstract: Chromatin configuration influences gene expression in eukaryotes at multiple levels, from individual nucleosomes to chromatin domains several Mb long. Post-translational modifications (PTM) of core histones seem to be involved in chromatin structural transitions, but how remains unclear. To explore this, we used ChIP-seq and two cell types, HeLa and lymphoblastoid (LCL), to define how changes in chromatin packaging through the cell cycle influence the distributions of three transcription-associated histone modifications, H3K9ac, H3K4me3 and H3K27me3. We show that chromosome regions (bands) of 10–50 Mb, detectable by immunofluorescence microscopy of metaphase (M) chromosomes, are also present in G1 and G2. They comprise 1–5 Mb sub-bands that differ between HeLa and LCL but remain consistent through the cell cycle. The same sub-bands are defined by H3K9ac and H3K4me3, while H3K27me3 spreads more widely. We found little change between cell cycle phases, whether compared by 5 kb rolling windows or when analysis was restricted to functional elements such as transcription start sites and topologically associating domains. Only a small number of genes showed cell cycle-related changes: at genes encoding proteins involved in mitosis, H3K9 became highly acetylated in G2M, possibly because of ongoing transcription. In conclusion, modified histone isoforms H3K9ac, H3K4me3 and H3K27me3 exhibit a characteristic genomic distribution at resolutions of 1 Mb and below that differs between HeLa and lymphoblastoid cells but remains remarkably consistent through the cell cycle. We suggest that this cell-type-specific chromosomal bar-code is part of a homeostatic mechanism by which cells retain their characteristic gene expression patterns, and hence their identity, through multiple mitoses.


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Jinting Guan ◽  
Yiping Lin ◽  
Yang Wang ◽  
Junchao Gao ◽  
Guoli Ji

Abstract
Background: Genome-wide association studies have identified genetic variants associated with the risk of brain-related diseases, such as neurological and psychiatric disorders, but the causal variants and the specific vulnerable cell types often remain to be identified. Many disease-associated genes are expressed in multiple cell types of the human brain, while the pathologic variants primarily affect specific cell types. We hypothesize a model in which the manifestation of a disease in a cell type is determined by the presence of a disease module composed of disease-associated genes, rather than by individual genes. Therefore, it is essential to identify the presence or absence of disease gene modules in cells.
Methods: To characterize the cell type-specificity of brain-related diseases, we construct human brain cell type-specific gene interaction networks by integrating human brain nucleus gene expression data with a reference tissue-specific gene interaction network. Then, from the cell type-specific gene interaction networks, we identify significant cell type-specific disease gene modules by performing statistical tests.
Results: The constructed cell type-specific gene networks and their gene functions are distinct between neurons and glial cells. We identify cell type-specific disease gene modules associated with autism spectrum disorder and find that different gene modules are formed and distinct gene functions may be dysregulated in different cells. We also study the similarity and dissimilarity in cell type-specific disease gene modules among autism spectrum disorder, schizophrenia and bipolar disorder. The functions of neuron-specific disease gene modules are associated with the synapse for all three diseases, while those in glial cells are different. To facilitate the use of our method, we develop an R package, CtsDGM, for the identification of cell type-specific disease gene modules.
Conclusions: The results support our hypothesis that a disease manifests itself in a cell type through forming a statistically significant disease gene module. The identification of cell type-specific disease gene modules can promote the development of more targeted biomarkers and treatments for the disease. Our method can be applied for depicting the cell type heterogeneity of a given disease, and also for studying the similarity and dissimilarity between different disorders, providing new insights into the molecular mechanisms underlying the pathogenesis and progression of diseases.


1989 ◽  
Vol 92 (2) ◽  
pp. 231-239
Author(s):  
P.I. Francz ◽  
K. Bayreuther ◽  
H.P. Rodemann

Methods for the selective enrichment of various subpopulations of the human skin fibroblast cell line HH-8 have been developed. These methods permit the selection of homogeneous populations of the three mitotic fibroblast cell types MF I, II and III, and the four postmitotic cell types PMF IV, V, VI and VII. These seven cell types exhibit differentiation-dependent and cell-type-specific patterns of [35S]methionine-labelled polypeptides in total soluble cytoplasmic and nuclear proteins, in membrane-bound proteins, and in secreted proteins. In the differentiation sequence MF II-MF III-PMF IV-PMF V-PMF VI, 14 cell-type-specific marker proteins have been found in the cytoplasmic and nuclear fraction, 24 in the membrane-bound protein fraction, and 11 in the secreted protein fraction. Markers in spontaneously arising and experimentally selected or induced populations of a single fibroblast cell type were found to be identical.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Sinisa Hrvatin ◽  
Christopher P Tzeng ◽  
M Aurel Nagy ◽  
Hume Stroud ◽  
Charalampia Koutsioumpa ◽  
...  

Enhancers are the primary DNA regulatory elements that confer cell type specificity of gene expression. Recent studies characterizing individual enhancers have revealed their potential to direct heterologous gene expression in a highly cell-type-specific manner. However, it has not yet been possible to systematically identify and test the function of enhancers for each of the many cell types in an organism. We have developed PESCA, a scalable and generalizable method that leverages ATAC- and single-cell RNA-sequencing protocols, to characterize cell-type-specific enhancers that should enable genetic access and perturbation of gene function across mammalian cell types. Focusing on the highly heterogeneous mammalian cerebral cortex, we apply PESCA to find enhancers and generate viral reagents capable of accessing and manipulating a subset of somatostatin-expressing cortical interneurons with high specificity. This study demonstrates the utility of this platform for developing new cell-type-specific viral reagents, with significant implications for both basic and translational research.


2020 ◽  
Author(s):  
Emily A. McGlade ◽  
Gerardo G. Herrera ◽  
Kalli K. Stephens ◽  
Sierra L. W. Olsen ◽  
Sarayut Winuthayanon ◽  
...  

Abstract: One of the endogenous estrogens, 17β-estradiol (E2), is a female steroid hormone secreted from the ovary. It is well established that E2 causes biochemical and histological changes in the uterus. The oviductal response to E2 in vivo, however, is virtually unknown. In this study, we assessed the effect of E2 on each oviductal cell type, using an ovariectomized-hormone-replacement mouse model, single cell RNA-sequencing (scRNA-seq), in situ hybridization, and cell-type-specific deletion in mice. We found that each cell type in the oviduct responded to E2 distinctly, especially ciliated and secretory epithelial cells. Treatment with exogenous E2 did not drastically alter the transcriptomic profile from that of endogenous E2 produced during estrus. Moreover, we have identified and validated genes of interest in our datasets that may be used as cell- and region-specific markers in the oviduct. Insulin-like growth factor 1 (Igf1) was characterized as an E2-target gene in the mouse oviduct and was also expressed in human Fallopian tubes. Deletion of Igf1 in progesterone receptor (Pgr)-expressing cells resulted in female subfertility, partially due to an embryo developmental defect and embryo retention within the oviduct. In summary, we have shown that oviductal cell types are differentially regulated by E2 and support gene expression changes that are required for normal embryo development and transport in mouse models.


2021 ◽  
Author(s):  
Guoxun Wang ◽  
Christina Zarek ◽  
Tyron Chang ◽  
Lili Tao ◽  
Alexandria Lowe ◽  
...  

Gammaherpesviruses, such as Epstein-Barr virus (EBV), Kaposi’s sarcoma-associated herpesvirus (KSHV), and murine γ-herpesvirus 68 (MHV68), establish latent infection in B cells, macrophages, and non-lymphoid cells, and can induce both lymphoid and non-lymphoid cancers. Research on these viruses has relied heavily on immortalized B cell and endothelial cell lines. Therefore, we know very little about the cell type-specific regulation of virus infection. We have previously shown that treatment of MHV68-infected macrophages with the cytokine interleukin-4 (IL-4) or challenge of MHV68-infected mice with an IL-4-inducing parasite leads to virus reactivation. However, we do not know if all latent reservoirs of the virus, including B cells, reactivate the virus in response to IL-4. Here we used an in vivo approach to address the question of whether all latently infected cell types reactivate MHV68 in response to a particular stimulus. We found that IL-4 receptor expression on macrophages was required for IL-4 to induce virus reactivation, but that it was dispensable on B cells. We further demonstrated that the transcription factor STAT6, which is downstream of the IL-4 receptor and binds the virus gene 50 N4/N5 promoter in macrophages, did not bind to the virus gene 50 N4/N5 promoter in B cells. These data suggest that stimuli that promote herpesvirus reactivation may affect latent virus only in particular cell types, and not in others.
Importance: Herpesviruses establish life-long quiescent infections in specific cells in the body, and only reactivate to produce infectious virus when precise signals induce them to do so. The signals that induce herpesvirus reactivation are often studied only in one particular cell type infected with the virus. However, herpesviruses establish latency in multiple cell types in their hosts. Using murine gammaherpesvirus-68 (MHV68) and conditional knockout mice, we examined the cell type specificity of a particular reactivation signal, interleukin-4 (IL-4). We found that IL-4 induced herpesvirus reactivation only from macrophages, not from B cells. This work indicates that regulation of virus latency and reactivation is cell type-specific. This has important implications for therapies aimed at either promoting or inhibiting reactivation for the control or elimination of chronic viral infections.


2020 ◽  
Author(s):  
Mohit Goyal ◽  
Guillermo Serrano ◽  
Ilan Shomorony ◽  
Mikel Hernaez ◽  
Idoia Ochoa

Abstract: Single-cell RNA-seq is a powerful tool in the study of the cellular composition of different tissues and organisms. A key step in the analysis pipeline is the annotation of cell types based on the expression of specific marker genes. Since manual annotation is labor-intensive and does not scale to large datasets, several methods for automated cell type annotation have been proposed based on supervised learning. However, these methods generally require feature extraction and batch alignment prior to classification, and their performance may become unreliable in the presence of cell types with very similar transcriptomic profiles, such as differentiating cells. We propose JIND, a framework for automated cell type identification based on neural networks that directly learns a low-dimensional representation (latent code) in which cell types can be reliably determined. To account for batch effects, JIND performs a novel asymmetric alignment in which the transcriptomic profile of unseen cells is mapped onto the previously learned latent space, hence avoiding the need to retrain the model whenever a new dataset becomes available. JIND also learns cell-type-specific confidence thresholds to identify and reject cells that cannot be reliably classified. We show on datasets with and without batch effects that JIND classifies cells more accurately than previously proposed methods while rejecting only a small proportion of cells. Moreover, JIND batch alignment is parallelizable, being more than five or six times faster than Seurat integration. Availability: https://github.com/mohit1997/JIND.
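The idea of rejecting cells with per-cell-type confidence thresholds, as described in this abstract, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions and does not reproduce JIND's actual code or API: the function name `classify_with_rejection`, the `"Unassigned"` label, and the example probabilities and thresholds are all hypothetical, and the thresholds here are supplied by hand rather than learned.

```python
def classify_with_rejection(probs, thresholds, labels, rejected="Unassigned"):
    """Assign each cell to its most probable type, or reject it.

    probs:      list of per-cell probability lists, one entry per cell type.
    thresholds: one confidence threshold per cell type (assumed given here;
                in a real pipeline they would be learned from data).
    labels:     cell type names, aligned with the columns of probs.
    Hypothetical helper for illustration only.
    """
    results = []
    for row in probs:
        # Index of the most probable cell type for this cell.
        best = max(range(len(row)), key=row.__getitem__)
        # Keep the call only if it clears that cell type's own threshold.
        if row[best] >= thresholds[best]:
            results.append(labels[best])
        else:
            results.append(rejected)
    return results


# Made-up example: two cell types, each with its own threshold.
cell_types = ["T cell", "B cell"]
per_type_thresholds = [0.8, 0.6]
predicted = [[0.9, 0.1], [0.55, 0.45], [0.3, 0.7]]
print(classify_with_rejection(predicted, per_type_thresholds, cell_types))
```

Using a separate threshold per cell type, rather than one global cutoff, lets easily confused types demand higher confidence without forcing rejections among well-separated types.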

