regulatory genome
Recently Published Documents


Total documents: 56 (five years: 26)
H-index: 9 (five years: 4)

2021 ◽  
Author(s):  
Rohan Singh Ghotra ◽  
Nicholas Keone Lee ◽  
Rohit Tripathy ◽  
Peter K Koo

Hybrid networks that build upon convolutional layers with attention mechanisms have demonstrated improved performance relative to pure convolutional networks across many regulatory genome analysis tasks. Their inductive bias to learn long-range interactions provides an avenue to identify learned motif-motif interactions. For attention maps to be interpretable, the convolutional layer(s) must learn identifiable motifs. Here we systematically investigate the extent to which architectural choices in convolution-based hybrid networks influence learned motif representations in first-layer filters, as well as the reliability of their attribution maps generated by saliency analysis. We find that design principles previously identified in standard convolutional networks also generalize to hybrid networks. This work provides an avenue to narrow the spectrum of architectural choices when designing hybrid networks such that they are amenable to commonly used interpretability methods in genomics.


Insects ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 591
Author(s):  
Hasiba Asma ◽  
Marc S. Halfon

An ever-growing number of insect genomes is being sequenced across the evolutionary spectrum. Comprehensive annotation of not only genes but also regulatory regions is critical for reaping the full benefits of this sequencing. Driven by developments in sequencing technologies and in both empirical and computational discovery strategies, the past few decades have witnessed dramatic progress in our ability to identify cis-regulatory modules (CRMs), sequences such as enhancers that play a major role in regulating transcription. Nevertheless, providing a timely and comprehensive regulatory annotation of newly sequenced insect genomes is an ongoing challenge. We review here the methods being used to identify CRMs in both model and non-model insect species, and focus on two tools that we have developed, REDfly and SCRMshaw. These resources can be paired together in a powerful combination to facilitate insect regulatory annotation over a broad range of species, with an accuracy equal to or better than that of other state-of-the-art methods.


2021 ◽  
Vol 33 (2) ◽  
pp. 167-177
Author(s):  
Samuele Garda ◽  
Jana Marie Schwarz ◽  
Markus Schuelke ◽  
Ulf Leser ◽  
Dominik Seelow

High-throughput technologies have led to a continuously growing amount of information about regulatory features in the genome. A wealth of data generated by large international research consortia is available from online databases. Disease-driven studies provide details on specific DNA elements or epigenetic modifications regulating gene expression in specific cellular and developmental contexts, but these results are usually only published in scientific articles. All this information can be helpful in interpreting variants in the regulatory genome. This review describes a selection of high-profile data sources providing information on the non-coding genome, as well as pitfalls and techniques to search and capture information from the literature.


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Pengyu Ni ◽  
Zhengchang Su

cis-regulatory modules (CRMs) formed by clusters of transcription factor (TF) binding sites (TFBSs) are as important as coding sequences in specifying phenotypes of humans. It is essential to categorize all CRMs and constituent TFBSs in the genome. In contrast to most existing methods that predict CRMs in specific cell types using epigenetic marks, we predict a largely cell type agnostic but more comprehensive map of CRMs and constituent TFBSs in the genome by integrating all available TF ChIP-seq datasets. Our method is able to partition 77.47% of genome regions covered by the 6092 available datasets into a CRM candidate (CRMC) set (56.84%) and a non-CRMC set (43.16%). Intriguingly, the predicted CRMCs are under strong evolutionary constraints, while the non-CRMCs are largely selectively neutral, strongly suggesting that the CRMCs are likely cis-regulatory, while the non-CRMCs are not. Our predicted CRMs are under stronger evolutionary constraints than three state-of-the-art predictions (GeneHancer, EnhancerAtlas and ENCODE phase 3) and substantially outperform them for recalling VISTA enhancers and non-coding ClinVar variants. We estimated that the human genome might encode about 1.47M CRMs and 68M TFBSs, comprising about 55% and 22% of the genome, respectively; we predicted 80% of each. Therefore, the cis-regulatory genome appears to be more prevalent than originally thought.
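The percentages quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is not from the paper. It assumes that the 56.84% and 43.16% figures are shares of the covered 77.47% of the genome (they sum to 100%), and that the "80%" applies to the estimated 55% CRM share. All variable and function names are illustrative.

```python
# Cross-check of the percentages quoted in the abstract above.
# Assumption: the CRMC and non-CRMC shares are fractions of the covered regions.

COVERED = 0.7747          # fraction of the genome covered by the datasets
CRMC_SHARE = 0.5684       # share of covered regions called CRM candidates
NON_CRMC_SHARE = 0.4316   # share of covered regions called non-CRMC
EST_CRM_GENOME = 0.55     # estimated fraction of the genome made up of CRMs
PREDICTED_SHARE = 0.80    # share of the estimated CRMs reported as predicted


def genome_fraction(covered: float, share: float) -> float:
    """Convert a share of the covered regions into a fraction of the genome."""
    return covered * share


crmc_genome = genome_fraction(COVERED, CRMC_SHARE)
non_crmc_genome = genome_fraction(COVERED, NON_CRMC_SHARE)
predicted_crm_genome = EST_CRM_GENOME * PREDICTED_SHARE

print(f"CRMC fraction of genome:      {crmc_genome:.4f}")
print(f"non-CRMC fraction of genome:  {non_crmc_genome:.4f}")
print(f"80% of the estimated 55%:     {predicted_crm_genome:.4f}")
```

Under these assumptions both routes land near 44% of the genome, so the partition figures and the "predicted 80%" statement appear mutually consistent.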


2021 ◽  
Vol 17 (3) ◽  
pp. e1008789
Author(s):  
Grace Hui Ting Yeo ◽  
Oscar Juez ◽  
Qing Chen ◽  
Budhaditya Banerjee ◽  
Lendy Chu ◽  
...  

We introduce poly-adenine CRISPR gRNA-based single-cell RNA-sequencing (pAC-Seq), a method that enables the direct observation of guide RNAs (gRNAs) in scRNA-seq. We use pAC-Seq to assess the phenotypic consequences of CRISPR/Cas9-based alterations of gene cis-regulatory regions. We show that pAC-Seq is able to detect cis-regulatory-induced alteration of target gene expression even when biallelic loss of target gene expression occurs in only ~5% of cells. This low rate of biallelic loss significantly increases the number of cells required to detect the consequences of changes to the regulatory genome, but can be ameliorated by transcript-targeted sequencing. Based on our experimental results, we model the power to detect regulatory-genome-induced transcriptomic effects based on the rate of mono/biallelic loss, baseline gene expression, and the number of cells per target gRNA.
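To make the dependence of power on loss rate, baseline expression, and cell number concrete, here is a minimal sketch. It is not the authors' model. It assumes Poisson-distributed counts, that monoallelic loss halves expression and biallelic loss abolishes it, and a one-sided z-test with a normal approximation. The function names and the 10% monoallelic rate in the usage loop are hypothetical; the 5% biallelic rate is the figure quoted in the abstract.

```python
from statistics import NormalDist


def perturbed_moments(baseline: float, biallelic: float, monoallelic: float):
    """Mean and variance of counts in cells carrying the target gRNA.

    Assumes Poisson counts whose rate is 0 (biallelic loss),
    baseline / 2 (monoallelic loss) or baseline (no edit).
    """
    unedited = 1.0 - biallelic - monoallelic
    if unedited < 0:
        raise ValueError("loss rates must sum to at most 1")
    mean = baseline * (monoallelic / 2.0 + unedited)
    second_moment_of_rate = baseline**2 * (monoallelic / 4.0 + unedited)
    # Law of total variance: E[Var | state] + Var(E | state).
    variance = mean + (second_moment_of_rate - mean**2)
    return mean, variance


def detection_power(baseline, biallelic, monoallelic, n_cells,
                    n_control=None, alpha=0.05):
    """Approximate power of a one-sided z-test for reduced mean expression."""
    n_control = n_cells if n_control is None else n_control
    mean, variance = perturbed_moments(baseline, biallelic, monoallelic)
    effect = baseline - mean
    # Control cells are assumed Poisson, so their variance equals the baseline.
    std_err = (baseline / n_control + variance / n_cells) ** 0.5
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    return 1.0 - NormalDist().cdf(z_crit - effect / std_err)


# Illustrative sweep over the number of cells per target gRNA.
for n in (100, 1000, 10000):
    power = detection_power(baseline=1.0, biallelic=0.05,
                            monoallelic=0.10, n_cells=n)
    print(n, round(power, 3))
```

In this toy setting, power rises with the number of cells per gRNA and falls back to the significance level when no cells are edited, which matches the qualitative point of the abstract that low loss rates demand many more cells.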


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
William T Ireland ◽  
Suzannah M Beeler ◽  
Emanuel Flores-Bautista ◽  
Nicholas S McCarty ◽  
Tom Röschinger ◽  
...  

Advances in DNA sequencing have revolutionized our ability to read genomes. However, even in the most well-studied of organisms, the bacterium Escherichia coli, for ≈65% of promoters we remain ignorant of their regulation. Until we crack this regulatory Rosetta Stone, efforts to read and write genomes will remain haphazard. We introduce a new method, Reg-Seq, that links massively parallel reporter assays with mass spectrometry to produce a base pair resolution dissection of more than 100 E. coli promoters in 12 growth conditions. We demonstrate that the method recapitulates known regulatory information. Then, we examine regulatory architectures for more than 80 promoters which previously had no known regulatory information. In many cases, we also identify which transcription factors mediate their regulation. This method clears a path for highly multiplexed investigations of the regulatory genome of model organisms, with the potential of moving to an array of microbes of ecological and medical relevance.


2020 ◽  
Vol 5 (51) ◽  
pp. eabd6427
Author(s):  
Xun Wang ◽  
Ellen V. Rothenberg

E2A specifies adaptive immunity by instructing large-scale topological changes for Rag gene super-enhancer formation (see the related Research Article by Miyazaki et al.).


Cell ◽  
2020 ◽  
Vol 182 (6) ◽  
pp. 1674-1674.e1
Author(s):  
X.Y. Bing ◽  
P.J. Batut ◽  
M. Levo ◽  
M. Levine ◽  
J. Raimundo

2020 ◽  
Author(s):  
William T Ireland ◽  
Suzannah M Beeler ◽  
Emanuel Flores-Bautista ◽  
Nicholas S McCarty ◽  
Tom Röschinger ◽  
...  
