EAGLE: an algorithm that utilizes a small number of genomic features to predict tissue/cell type-specific enhancer-gene interactions

2019 ◽  
Author(s):  
Tianshun Gao ◽  
Jiang Qian

Abstract: Long-range regulation by distal enhancers is crucial for many biological processes. Existing methods for enhancer-target gene prediction often require many genomic features, which makes them difficult to apply to the many cell types for which the relevant datasets are not always available. Here, we present EAGLE, an enhancer and gene learning ensemble method for identifying Enhancer-Gene (EG) interactions. Unlike existing tools, EAGLE uses only six features derived from the genomic features of enhancers and gene expression datasets. Cross-validation showed that EAGLE outperformed other existing methods. Enrichment analyses of special transcription factors, epigenetic modifications, and eQTLs demonstrated that EAGLE could distinguish interacting pairs from non-interacting ones. Finally, EAGLE was applied to the mouse and human genomes and identified 7,680,203 and 7,437,255 EG interactions involving 31,375 and 43,724 genes and 138,547 and 177,062 enhancers across 89 and 110 tissue/cell types in mouse and human, respectively. The obtained interactions are accessible through an interactive database, enhanceratlas.org. The EAGLE method is available at https://github.com/EvansGao/EAGLE and the predicted datasets are available at http://www.enhanceratlas.org/.

Author summary: Enhancers are DNA sequences that interact with promoters and activate target genes. Because enhancers are often located far from their target genes, and the nearest genes are not always the targets, predicting enhancer-target gene relationships is a major challenge. Although a few computational tools have been designed for this prediction, they are difficult to apply in most tissue/cell types because sufficient genomic datasets are lacking. Here we propose a new method, EAGLE, which utilizes a small number of genomic features to predict tissue/cell type-specific enhancer-gene interactions. Compared with other existing tools, EAGLE performed better in 10-fold cross-validation and in a cross-sample test. Moreover, the predictions by EAGLE were validated by independent evidence such as the enrichment of relevant transcription factors, epigenetic modifications, and eQTLs. Finally, we integrated the enhancer-target relationships obtained from the human and mouse genomes into an interactive database, EnhancerAtlas (http://www.enhanceratlas.org/).
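The 10-fold cross-validation mentioned in this entry can be illustrated with a minimal sketch. It only shows how a set of labelled enhancer-gene pairs might be partitioned into ten train/test splits; the function name `k_fold_splits`, the seed, and the example size of 25 pairs are assumptions for illustration and are not taken from the EAGLE code.

```python
import random

def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Training set: every index that is not in the held-out fold.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# Hypothetical example: 25 labelled enhancer-gene pairs, 10 folds.
splits = list(k_fold_splits(25, k=10))
```

Each pair is held out exactly once, so a classifier trained on the remaining nine folds is always scored on pairs it has not seen.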

2018 ◽  
Author(s):  
Xuran Wang ◽  
Jihwan Park ◽  
Katalin Susztak ◽  
Nancy R. Zhang ◽  
Mingyao Li

Abstract: We present MuSiC, a method that utilizes cell-type specific gene expression from single-cell RNA sequencing (RNA-seq) data to characterize cell type compositions from bulk RNA-seq data in complex tissues. When applied to pancreatic islet and whole kidney expression data in human, mouse, and rat, MuSiC outperformed existing methods, especially for tissues with closely related cell types. MuSiC enables characterization of cellular heterogeneity of complex tissues for identification of disease mechanisms.
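The general idea of estimating cell type composition from bulk expression can be shown with a toy calculation. The sketch below is not MuSiC itself: it assumes only two cell types and a plain least-squares fit, and the function name, signature values, and the 0.3/0.7 mixture are invented for illustration.

```python
def estimate_proportion(bulk, sig_a, sig_b):
    """Least-squares fraction of cell type A in a two-cell-type mixture.

    Model: bulk[g] ~ p * sig_a[g] + (1 - p) * sig_b[g] for every gene g.
    """
    num = sum((b - sb) * (sa - sb) for b, sa, sb in zip(bulk, sig_a, sig_b))
    den = sum((sa - sb) ** 2 for sa, sb in zip(sig_a, sig_b))
    if den == 0:
        raise ValueError("signatures are identical; proportion is not identifiable")
    # Clip to [0, 1] so the estimate is a valid proportion.
    return min(1.0, max(0.0, num / den))

# Hypothetical per-gene signatures, as might be averaged from single-cell data.
sig_alpha = [10.0, 1.0, 5.0, 0.5]
sig_beta = [1.0, 8.0, 5.0, 2.0]
# Synthetic bulk sample: 30% type alpha, 70% type beta.
bulk = [0.3 * a + 0.7 * b for a, b in zip(sig_alpha, sig_beta)]
p_alpha = estimate_proportion(bulk, sig_alpha, sig_beta)
```

Genes whose signatures are the same in both cell types (the third gene here) contribute nothing to the estimate, which hints at why closely related cell types are harder to separate.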


2019 ◽  
Author(s):  
Tom Aharon Hait ◽  
Ran Elkon ◽  
Ron Shamir

Abstract: Spatiotemporal gene expression patterns are governed to a large extent by enhancer elements, typically located distally from their target genes. Identification of enhancer-promoter (EP) links that are specific and functional in individual cell types is a key challenge in understanding gene regulation. We introduce CT-FOCS, a new statistical inference method that utilizes multiple replicates per cell type to infer cell type-specific EP links. Computationally predicted EP links are usually benchmarked against experimentally determined chromatin interactions measured by ChIA-PET and promoter-capture HiC techniques. We expand this validation scheme by also using loops that overlap in their anchor sites. In analyzing 1,366 samples from ENCODE, Roadmap Epigenomics and FANTOM5, CT-FOCS inferred highly cell type-specific EP links more accurately than state-of-the-art methods. We illustrate how our inferred EP links drive cell type-specific gene expression and regulation.
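The basic benchmark described in this entry, checking a predicted EP link against experimentally measured loops, can be sketched as an interval-overlap test. This is an illustrative assumption about how such a check might look, not the CT-FOCS implementation; it covers only the direct case where both ends of a link fall on the two anchors of one loop, and all coordinates are made up.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def link_supported(enhancer, promoter, loops):
    """True if some loop has one anchor on the enhancer and the other on the promoter."""
    for anchor1, anchor2 in loops:
        if (overlaps(enhancer, anchor1) and overlaps(promoter, anchor2)) or \
           (overlaps(enhancer, anchor2) and overlaps(promoter, anchor1)):
            return True
    return False

# Hypothetical loop set with a single loop (two anchors).
loops = [(("chr1", 1000, 2000), ("chr1", 50000, 51000))]
# Hypothetical predicted links as (enhancer, promoter) intervals.
predicted = [
    (("chr1", 1500, 1800), ("chr1", 50500, 50700)),  # both ends on anchors
    (("chr1", 1500, 1800), ("chr1", 90000, 90500)),  # promoter off-anchor
]
supported = [link_supported(e, p, loops) for e, p in predicted]
```

The fraction of predicted links with `supported` set to true gives a simple precision-style score against the loop data.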


2017 ◽  
Author(s):  
Jimmy Vandel ◽  
Océane Cassan ◽  
Sophie Lèbre ◽  
Charles-Henri Lecellier ◽  
Laurent Bréhélin

In eukaryotic cells, transcription factors (TFs) are thought to act in a combinatorial way, by competing and collaborating to regulate common target genes. However, several questions remain regarding the conservation of these combinations among different gene classes, regulatory regions and cell types. We propose a new approach named TFcoop to infer the TF combinations involved in the binding of a target TF in a particular cell type. TFcoop aims to predict the binding sites of the target TF based on the binding affinity of all identified cooperating TFs. The set of cooperating TFs and model parameters are learned from ChIP-seq data of the target TF. We used TFcoop to investigate the TF combinations involved in the binding of 106 TFs on 41 cell types and in four regulatory regions: promoters of mRNAs, lncRNAs and pri-miRNAs, and enhancers. We first show that TFcoop is accurate and outperforms simple PWM methods for predicting TF binding sites. Next, analysis of the learned models sheds light on important properties of TF combinations in different promoter classes and in enhancers. First, we show that combinations governing TF binding on enhancers are more cell-type specific than those governing binding in promoters. Second, for a given TF and cell type, we observe that TF combinations are different between promoters and enhancers, but similar for promoters of mRNAs, lncRNAs and pri-miRNAs. Analysis of the TFs cooperating with the different targets shows over-representation of pioneer TFs and a clear preference for TFs with binding motif composition similar to that of the target. Lastly, our models accurately distinguish promoters associated with specific biological processes.


2019 ◽  
Vol 217 (1) ◽  
Author(s):  
Hiroyuki Hosokawa ◽  
Maile Romero-Wolf ◽  
Qi Yang ◽  
Yasutaka Motomura ◽  
Ditsa Levanon ◽  
...  

The zinc finger transcription factor, Bcl11b, is expressed in T cells and group 2 innate lymphoid cells (ILC2s) among hematopoietic cells. In early T-lineage cells, Bcl11b directly binds and represses the gene encoding the E protein antagonist, Id2, preventing pro-T cells from adopting innate-like fates. In contrast, ILC2s co-express both Bcl11b and Id2. To address this contradiction, we have directly compared Bcl11b action mechanisms in pro-T cells and ILC2s. We found that Bcl11b binding to regions across the genome shows distinct cell type–specific motif preferences. Bcl11b occupies functionally different sites in lineage-specific patterns and controls totally different sets of target genes in these cell types. In addition, Bcl11b bears cell type–specific post-translational modifications and organizes different cell type–specific protein complexes. However, both cell types use the same distal enhancer region to control timing of Bcl11b activation. Therefore, although pro-T cells and ILC2s both need Bcl11b for optimal development and function, Bcl11b works substantially differently in these two cell types.


Author(s):  
Tianshun Gao ◽  
Jiang Qian

Abstract: Enhancers are distal cis-regulatory elements that activate the transcription of their target genes. They regulate a wide range of important biological functions and processes, including embryogenesis, development, and homeostasis. As more and more large-scale technologies have been developed for enhancer identification, a comprehensive database is highly desirable for enhancer annotation based on various genome-wide profiling datasets across different species. Here, we present an updated database, EnhancerAtlas 2.0 (http://www.enhanceratlas.org/indexv2.php), covering 586 tissue/cell types that include a large number of normal tissues, cancer cell lines, and cells at different development stages across nine species. Overall, the database contains 13 494 603 enhancers, which were obtained from 16 055 datasets using 12 high-throughput experiment methods (e.g. H3K4me1/H3K27ac, DNase-seq/ATAC-seq, P300, POLR2A, CAGE, ChIA-PET, GRO-seq, STARR-seq and MPRA). The updated version is a huge expansion of the first version, which only contains the enhancers in human cells. In addition, we predicted enhancer–target gene relationships in human, mouse and fly. Finally, users can search enhancers and enhancer–target gene relationships through five user-friendly, interactive modules. We believe the new annotation of enhancers in EnhancerAtlas 2.0 will help users perform useful functional analysis of enhancers in various genomes.


2019 ◽  
Author(s):  
Ashley G. Anderson ◽  
Ashwinikumar Kulkarni ◽  
Matthew Harper ◽  
Genevieve Konopka

Abstract: The striatum is a critical forebrain structure for integrating cognitive, sensory, and motor information from diverse brain regions into meaningful behavioral output. However, the transcriptional mechanisms that underlie striatal development and organization at single-cell resolution remain unknown. Here, we show that Foxp1, a transcription factor strongly linked to autism and intellectual disability, regulates organizational features of striatal circuitry in a cell-type-dependent fashion. Using single-cell RNA-sequencing, we examine the cellular diversity of the early postnatal striatum and find that cell-type-specific deletion of Foxp1 in striatal projection neurons alters the cellular composition and neurochemical architecture of the striatum. Importantly, using this approach, we identify the non-cell autonomous effects produced by disrupting Foxp1 in one cell-type and the molecular compensation that occurs in other populations. Finally, we identify Foxp1-regulated target genes within distinct cell-types and connect these molecular changes to functional and behavioral deficits relevant to phenotypes described in patients with FOXP1 loss-of-function mutations. These data reveal cell-type-specific transcriptional mechanisms underlying distinct features of striatal circuitry and identify Foxp1 as a key regulator of striatal development.


2019 ◽  
Author(s):  
Xin Wang ◽  
Lingling Ye ◽  
Robertas Ursache ◽  
Ari Pekka Mähönen

Abstract: Conditional manipulation of gene expression is a key approach to investigating the primary function of a gene in a biological process. While conditional and cell-type specific overexpression systems exist for plants, there are currently no systems available to disable a gene completely and conditionally. Here, we present a novel tool with which target genes can be efficiently and conditionally knocked out at any developmental stage. The target gene is manipulated using the CRISPR-Cas9 genome editing technology, and conditionality is achieved with the well-established estrogen-inducible XVE system. Target genes can also be knocked out in a cell-type specific manner. Our tool is easy to construct and will be particularly useful for studying genes which have null alleles that are non-viable or show strong developmental defects.


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Suvi Linna Kuosmanen ◽  
Eloi Schmauch ◽  
Kyriakitsa Galani ◽  
Carles Boix ◽  
Yongjin P Park ◽  
...  

Genome-wide association studies have uncovered over 200 genetic loci underlying coronary artery disease (CAD), providing great hope for a deeper understanding of the causal mechanisms leading to this disease. However, in order to understand CAD at the molecular level, it is necessary to uncover cell-type-specific circuits and to use these circuits to dissect driver variants, genes, pathways, and cell types, in normal and diseased tissues. Here, we provide the most detailed single-cell dissection of human heart cell types, using cardiac biopsies collected during open-heart surgery from healthy, CAD, and CAD-related heart failure donors, and profiling both transcriptional (scRNA-seq) and epigenomic (scATAC-seq) changes. Using this approach, we identify 12 major heart cell types, including typical cardiovascular cells (cardiomyocytes, endothelial cells, fibroblasts), rarer cell types (B cells, neurons, Schwann cells), and previously unrecognized layer-specific epithelial and endothelial cell types. We define markers for each cell type, providing the first extensive reference set for the living human heart. In addition, we define differential gene expression patterns in CAD relative to control samples, revealing substantial differences in cell-type-specific expression of disease-related genes, emphasizing, for example, the importance of the vascular endothelium in the pathogenesis of CAD. Strikingly, further clustering of the cell types based on specific subtypes revealed important differences in their expression patterns of disease-associated genes. These changes are enriched in known CAD genetic loci, enabling us to recognize their likely target genes from scRNA-seq expression changes, candidate driver variants based on scATAC-seq localization and differential DNA accessibility, and candidate upstream regulators based on their enriched motif occurrences in scATAC loci. Overall, our results highlight the relevance and potential of single-cell transcriptional and epigenomic analyses to gain new biological insights into cardiovascular disease, and to recognize novel therapeutic target genes, pathways, and the cell types where they act.


2019 ◽  
Author(s):  
Martin Jinye Zhang ◽  
Angela Oliveira Pisco ◽  
Spyros Darmanis ◽  
James Zou

Abstract: Aging is associated with complex molecular and cellular processes that are poorly understood. Here we leveraged the Tabula Muris Senis single-cell RNA-seq dataset to systematically characterize gene expression changes during aging across diverse cell types in the mouse. We identified aging-dependent genes in 76 tissue-cell types from 23 tissues and characterized both shared and tissue-cell-specific aging behaviors. We found that the aging-related genes shared by multiple tissue-cell types also change their expression congruently in the same direction during aging in most tissue-cell types, suggesting a coordinated global aging behavior at the organismal level. Scoring cells based on these shared aging genes allowed us to contrast the aging status of different tissues and cell types from a transcriptomic perspective. In addition, we identified genes that exhibit age-related expression changes specific to each functional category of tissue-cell types. Altogether, our analyses provide one of the most comprehensive and systematic characterizations of the molecular signatures of aging across diverse tissue-cell types in a mammalian system.

