Nuclear organization and gene expression: homologous pairing and long-range interactions

1997 ◽  
Vol 9 (3) ◽  
pp. 388-395 ◽  
Author(s):  
Steven Henikoff


2019 ◽  
Vol 2019 ◽  
pp. 1-12
Author(s):  
Livia Eiselleova ◽  
Viktor Lukjanov ◽  
Simon Farkas ◽  
David Svoboda ◽  
Karel Stepka ◽  
...  

The eukaryotic nucleus is a highly complex structure that carries out multiple functions primarily needed for gene expression, and among them, transcription seems to be the most fundamental. Diverse approaches have demonstrated that transcription takes place at discrete sites known as transcription factories, wherein RNA polymerase II (RNAP II) is attached to the factory and immobilized while transcribing DNA. It has been proposed that transcription factories promote chromatin loop formation, creating long-range interactions in which relatively distant genes can be transcribed simultaneously. In this study, we examined long-range interactions between the POU5F1 gene and genes previously identified as being POU5F1 enhancer-interacting, namely, CDYL, TLE2, RARG, and MSX1 (all involved in transcriptional regulation), in human pluripotent stem cells (hPSCs) and their early differentiated counterparts. RUNX1, which is expressed during hematopoietic differentiation and is not associated with pluripotency, was used as a control gene. To reveal how these long-range interactions between POU5F1 and the selected genes change with the onset of differentiation and upon RNAP II inhibition, we performed three-dimensional fluorescence in situ hybridization (3D-FISH) followed by computational simulation analysis. Our analysis showed that the numbers of long-range interactions between specific genes decrease during differentiation, suggesting that the transcription of the monitored genes is associated with pluripotency. In addition, we showed that upon inhibition of RNAP II, long-range associations do not disintegrate but remain constant. We also analyzed the distance distributions of these genes in the context of their positions in the nucleus and found that they tend to follow similar patterns resembling a normal distribution. Furthermore, we compared data created in vitro and in silico to assess the biological relevance of our results.
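As a rough illustration of how 3D-FISH measurements like those described above might be scored, the sketch below counts pairs of signals from two genes that lie within a chosen distance of each other. The abstract does not state how interactions were defined, so the function name `count_close_pairs`, the coordinates, and the 0.5 µm threshold are all hypothetical and are not the authors' method.

```python
import math

def count_close_pairs(loci_a, loci_b, threshold_um):
    """Count signal pairs (one from each gene) whose 3D distance is <= threshold_um."""
    count = 0
    for a in loci_a:
        for b in loci_b:
            # Euclidean distance between two (x, y, z) centroids
            if math.dist(a, b) <= threshold_um:
                count += 1
    return count

# Hypothetical signal centroids (x, y, z) in micrometres, two alleles per gene.
pou5f1 = [(1.0, 2.0, 0.5), (4.0, 4.0, 1.0)]
partner = [(1.2, 2.1, 0.6), (7.0, 1.0, 2.0)]

# Assumed cutoff; the value used in the study is not given in the abstract.
close = count_close_pairs(pou5f1, partner, threshold_um=0.5)
print(close)
```

Comparing such counts between undifferentiated and differentiated cells, or against randomly simulated positions, would be one way to ask whether close pairs occur more often than expected by chance.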


2008 ◽  
Vol 205 (4) ◽  
pp. 747-750 ◽  
Author(s):  
Adam Williams ◽  
Richard A. Flavell

The spatial organization of the genome is thought to play an important part in the coordination of gene regulation. New techniques have been used to identify specific long-range interactions between distal DNA sequences, revealing an ever-increasing complexity to nuclear organization. CCCTC-binding factor (CTCF) is a versatile zinc finger protein with diverse regulatory functions. New data now help define how CTCF mediates both long-range intrachromosomal and interchromosomal interactions, and highlight CTCF as an important factor in determining the three-dimensional structure of the genome.


2020 ◽  
Author(s):  
Jeremy Bigness ◽  
Xavi Loinaz ◽  
Shalin Patel ◽  
Erica Larschan ◽  
Ritambhara Singh

Long-range spatial interactions among genomic regions are critical for regulating gene expression and their disruption has been associated with a host of diseases. However, when modeling the effects of regulatory factors on gene expression, most deep learning models either neglect long-range interactions or fail to capture the inherent 3D structure of the underlying biological system. This prevents the field from obtaining a more comprehensive understanding of gene regulation and from fully leveraging the structural information present in the data sets. Here, we propose a graph convolutional neural network (GCNN) framework to integrate measurements probing spatial genomic organization and measurements of local regulatory factors, specifically histone modifications, to predict gene expression. This formulation enables the model to incorporate crucial information about long-range interactions via a natural encoding of spatial interaction relationships into a graph representation. Furthermore, we show that our model is interpretable in terms of the observed biological regulatory factors, highlighting both the histone modifications and the interacting genomic regions that contribute to a gene's predicted expression. We apply our GCNN model to datasets for GM12878 (lymphoblastoid) and K562 (myelogenous leukemia) cell lines and demonstrate its state-of-the-art prediction performance. We also obtain importance scores corresponding to the histone mark features and interacting regions for some exemplar genes and validate them with evidence from the literature. Our model presents a novel setup for predicting gene expression by integrating multimodal datasets.
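To make the graph encoding idea concrete, here is a minimal sketch of the neighbor-aggregation step at the heart of a graph convolution, assuming genomic bins as nodes, per-bin histone-mark signals as node features, and spatial contacts as edges. It is not the authors' architecture: it omits learned weights and nonlinearities, and the function name `graph_conv_mean` and the toy data are hypothetical.

```python
def graph_conv_mean(features, edges):
    """Replace each node's feature vector with the mean over itself and its neighbors."""
    n = len(features)
    # Self-loops keep each node's own signal in the average.
    neighbors = {i: {i} for i in range(n)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    dim = len(features[0])
    out = []
    for i in range(n):
        group = neighbors[i]
        out.append([sum(features[j][k] for j in group) / len(group)
                    for k in range(dim)])
    return out

# Hypothetical toy data: 3 genomic bins, 2 histone-mark signals per bin.
features = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]
# Bins 0 and 2 are assumed to be in spatial contact despite being non-adjacent.
edges = [(0, 2)]

smoothed = graph_conv_mean(features, edges)
print(smoothed)
```

After one step, bin 0 carries information from bin 2 even though bin 1 sits between them in sequence, which is the sense in which a graph representation lets a model see long-range interactions.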


2017 ◽  
Author(s):  
Yan Kai ◽  
Jaclyn Andricovich ◽  
Zhouhao Zeng ◽  
Jun Zhu ◽  
Alexandros Tzatsos ◽  
...  

The CCCTC-binding zinc finger protein (CTCF)-mediated network of long-range chromatin interactions is important for genome organization and function. Although this network has been considered largely invariant, we found that it exhibits extensive cell-type-specific interactions that contribute to cell identity. Here we present Lollipop, a machine-learning framework that predicts CTCF-mediated long-range interactions using genomic and epigenomic features. Using ChIA-PET data as a benchmark, we demonstrated that Lollipop accurately predicts CTCF-mediated chromatin interactions both within and across cell types, and outperforms other methods based only on CTCF motif orientation. Predictions were confirmed computationally and experimentally by Chromatin Conformation Capture (3C). Moreover, our approach reveals novel determinants of CTCF-mediated chromatin wiring, such as gene expression within the loops. Our study contributes to a better understanding of the underlying principles of CTCF-mediated chromatin interactions and their impact on gene expression.


2021 ◽  
Author(s):  
Žiga Avsec ◽  
Vikram Agarwal ◽  
Daniel Visentin ◽  
Joseph R. Ledsam ◽  
Agnieszka Grabska-Barwinska ◽  
...  

The next phase of genome biology research requires understanding how DNA sequence encodes phenotypes, from the molecular to organismal levels. How noncoding DNA determines gene expression in different cell types is a major unsolved problem, and critical downstream applications in human genetics depend on improved solutions. Here, we report substantially improved gene expression prediction accuracy from DNA sequence through the use of a new deep learning architecture called Enformer that is able to integrate long-range interactions (up to 100 kb away) in the genome. This improvement yielded more accurate variant effect predictions on gene expression for both natural genetic variants and saturation mutagenesis measured by massively parallel reporter assays. Notably, Enformer outperformed the best team on the Critical Assessment of Genome Interpretation (CAGI5) challenge for noncoding variant interpretation with no additional training. Furthermore, Enformer learned to predict promoter-enhancer interactions directly from DNA sequence competitively with methods that take direct experimental data as input. We expect that these advances will enable more effective fine-mapping of growing human disease associations to cell-type-specific gene regulatory mechanisms and provide a framework to interpret cis-regulatory evolution. To foster these downstream applications, we have made the pre-trained Enformer model openly available, and provide pre-computed effect predictions for all common variants in the 1000 Genomes dataset.

One-sentence summary: Improved noncoding variant effect prediction and candidate enhancer prioritization from a more accurate sequence-to-expression model driven by extended long-range interaction modelling.


2021 ◽  
Vol 18 (10) ◽  
pp. 1196-1203 ◽  
Author(s):  
Žiga Avsec ◽  
Vikram Agarwal ◽  
Daniel Visentin ◽  
Joseph R. Ledsam ◽  
Agnieszka Grabska-Barwinska ◽  
...  

How noncoding DNA determines gene expression in different cell types is a major unsolved problem, and critical downstream applications in human genetics depend on improved solutions. Here, we report substantially improved gene expression prediction accuracy from DNA sequences through the use of a deep learning architecture, called Enformer, that is able to integrate information from long-range interactions (up to 100 kb away) in the genome. This improvement yielded more accurate variant effect predictions on gene expression for both natural genetic variants and saturation mutagenesis measured by massively parallel reporter assays. Furthermore, Enformer learned to predict enhancer–promoter interactions directly from the DNA sequence competitively with methods that take direct experimental data as input. We expect that these advances will enable more effective fine-mapping of human disease associations and provide a framework to interpret cis-regulatory evolution.

