Cross-species analysis of melanoma enhancer logic using deep learning

Author(s):  
Liesbeth Minnoye ◽  
Ibrahim Ihsan Taskiran ◽  
David Mauduit ◽  
Maurizio Fazio ◽  
Linde Van Aerschot ◽  
...  

Abstract: Genomic enhancers form the central nodes of gene regulatory networks by harbouring combinations of transcription factor binding sites. Deciphering the combinatorial code by which these binding sites are assembled within enhancers is indispensable to understand their regulatory involvement in establishing a cell’s phenotype, especially within biological systems with dysregulated gene regulatory networks, such as melanoma. In order to unravel the enhancer logic of the two most common melanoma cell states, namely the melanocytic and mesenchymal-like state, we combined comparative epigenomics with machine learning. By profiling chromatin accessibility using ATAC-seq on a cohort of 27 melanoma cell lines across six different species, we demonstrate the conservation of the two main melanoma states and their underlying master regulators. To perform an in-depth analysis of the enhancer architecture, we trained a deep neural network, called DeepMEL, to classify melanoma enhancers not only in the human genome, but also in other species. DeepMEL revealed the presence, organisation and positional specificity of important transcription factor binding sites. Together, this extensive analysis of the melanoma enhancer code allowed us to propose the concept of a core regulatory complex binding to melanocytic enhancers, consisting of SOX10, TFAP2A, MITF and RUNX, and to disentangle their individual roles in regulating enhancer accessibility and activity.

2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Abhijeet Rajendra Sonawane ◽  
Dawn L. DeMeo ◽  
John Quackenbush ◽  
Kimberly Glass

Abstract: The biological processes that drive cellular function can be represented by a complex network of interactions between regulators (transcription factors) and their targets (genes). A cell’s epigenetic state plays an important role in mediating these interactions, primarily by influencing chromatin accessibility. However, how to effectively use epigenetic data when constructing a gene regulatory network remains an open question. Almost all existing network reconstruction approaches focus on estimating transcription factor to gene connections using transcriptomic data. In contrast, computational approaches for analyzing epigenetic data generally focus on improving transcription factor binding site predictions rather than deducing regulatory network relationships. We bridged this gap by developing SPIDER, a network reconstruction approach that incorporates epigenetic data into a message-passing framework to estimate gene regulatory networks. We validated SPIDER’s predictions using ChIP-seq data from ENCODE and found that SPIDER networks are both highly accurate and include cell-line-specific regulatory interactions. Notably, SPIDER can recover ChIP-seq verified transcription factor binding events in the regulatory regions of genes that do not have a corresponding sequence motif. The networks estimated by SPIDER have the potential to identify novel hypotheses that will allow us to better characterize cell-type and phenotype specific regulatory mechanisms.


2020 ◽  
Author(s):  
Abhijeet Rajendra Sonawane ◽  
Dawn L. DeMeo ◽  
John Quackenbush ◽  
Kimberly Glass

Abstract: The biological processes that drive cellular function can be represented by a complex network of interactions between regulators (transcription factors) and their targets (genes). A cell’s epigenetic state plays an important role in mediating these interactions, primarily by influencing chromatin accessibility. However, effectively leveraging epigenetic information when constructing regulatory networks remains a challenge. We developed SPIDER, which incorporates epigenetic information (DNase-seq) into a message-passing framework in order to estimate gene regulatory networks. We validated SPIDER’s predictions using ChIP-seq data from ENCODE and found that SPIDER networks were more accurate than other publicly available, epigenetically informed regulatory networks as well as networks based on methods that leverage epigenetic data to predict transcription factor binding sites. SPIDER was also able to improve the detection of cell-line-specific regulatory interactions. Notably, SPIDER can recover ChIP-seq verified transcription factor binding events in the regulatory regions of genes that do not have a corresponding sequence motif. Constructing biologically interpretable, epigenetically informed networks using SPIDER will allow us to better understand gene regulation as well as aid in the identification of cell-specific drivers and biomarkers of cellular phenotypes.


2018 ◽  
Author(s):  
Ali Shariati ◽  
Antonia Dominguez ◽  
Marius Wernig ◽  
Lei S. Qi ◽  
Jan M. Skotheim

Abstract: The control of gene expression by transcription factor binding sites frequently determines phenotype. However, it has been difficult to assay the function of single transcription factor binding sites within larger transcription networks. Here, we developed such a method by using deactivated Cas9 to disrupt binding to specific sites on the genome. Since CRISPR guide RNAs are longer than transcription factor binding sites, flanking sequence can be used to target specific sites. Targeting deactivated Cas9 to a specific Oct4 binding site in the Nanog promoter blocked Oct4 binding, reduced Nanog expression, and slowed division. Multiple guide RNAs allow simultaneous inhibition of multiple binding sites, and conditionally destabilized dCas9 allows rapid reversibility. The method is a novel high-throughput approach to systematically interrogate cis-regulatory function within complex regulatory networks.
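The point about guide RNAs being longer than binding sites can be sketched as follows. This is only an illustration, not the authors' design pipeline: it assumes an SpCas9-style guide (20-nt protospacer followed by an NGG PAM), searches the forward strand only, and uses a hypothetical function name and half-open coordinates. None of these details are stated in the abstract.

```python
SPACER_LEN = 20  # assumption: SpCas9-style 20-nt protospacer


def guides_overlapping_site(seq, site_start, site_end):
    """List (start, protospacer) for forward-strand protospacers that are
    followed by an NGG PAM and overlap the half-open interval
    [site_start, site_end) of a binding site."""
    guides = []
    # pam_start is the index of the 'N' in the NGG PAM.
    for pam_start in range(SPACER_LEN, len(seq) - 2):
        if seq[pam_start + 1:pam_start + 3] != "GG":
            continue
        start = pam_start - SPACER_LEN
        # The protospacer spans [start, pam_start); keep it if it overlaps the site.
        if start < site_end and pam_start > site_start:
            guides.append((start, seq[start:pam_start]))
    return guides
```

Because each candidate spans 20 nt while a typical binding site is shorter, every returned protospacer necessarily includes sequence flanking the site, which is what lets a guide distinguish one copy of a motif from another.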


2019 ◽  
Author(s):  
Ningxin Ouyang ◽  
Alan P. Boyle

Abstract: Transcription is tightly regulated by cis-regulatory DNA elements where transcription factors can bind. Thus, identification of transcription factor binding sites is key to understanding gene expression and whole regulatory networks within a cell. The standard approaches for transcription factor binding site (TFBS) prediction, such as position weight matrices (PWMs) and chromatin immunoprecipitation followed by sequencing (ChIP-seq), are widely used but have drawbacks, such as high false positive rates and limited antibody availability, respectively. Several computational footprinting algorithms have been developed to detect TFBSs by investigating chromatin accessibility patterns, but these also have limitations. To improve on these methods, we have developed a footprinting method to predict Transcription factor footpRints in Active Chromatin Elements (TRACE). TRACE incorporates DNase-seq data and PWMs within a multivariate Hidden Markov Model (HMM) to detect footprint-like regions with matching motifs. TRACE is an unsupervised method that accurately annotates binding sites for specific TFs automatically, with no requirement for pre-generated candidate binding sites or ChIP-seq training data. Compared to published footprinting algorithms, TRACE has the best overall performance, with the distinct advantage of targeting multiple motifs in a single model.
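As a rough illustration of the PWM scanning mentioned in the abstract (not the TRACE model itself, which combines PWMs with DNase-seq data in a multivariate HMM), the sketch below scores each window of a sequence by log-odds against a uniform background. The motif matrix, background frequency, threshold, and function names are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical 4-position motif: per-position base probabilities (illustrative only).
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumption: uniform background base frequency


def log_odds(window):
    """Sum of log2(p_motif / p_background) over the positions of one window."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PWM, window))


def scan(sequence, threshold=0.0):
    """Return (start, score) for every window that scores above the threshold."""
    width = len(PWM)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = log_odds(sequence[start:start + width])
        if score > threshold:
            hits.append((start, score))
    return hits
```

With this toy matrix, `scan("TTACGTTT")` reports a single hit at position 2, the exact match ACGT. A window with one mismatch, such as ACGA, still scores above zero, which hints at why motif scanning alone tends to produce many false positives.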

