Identification of Enhancers and Promoters in the Genome by Multidimensional Scaling

The positions of enhancers and promoters on genomic DNA remain poorly understood. Chromosomes cannot be observed during the cell division cycle because the genome forms a chromatin structure and spreads within the nucleus. However, high-throughput chromosome conformation capture (Hi-C) measures the physical interactions of genomes. In previous studies, DNA extrusion loops were derived directly from Hi-C heat maps. In this study, we use multidimensional scaling (MDS) to locate enhancers and promoters more precisely. MDS is a multivariate analysis method that reconstructs the original coordinates of a set of elements from the matrix of distances between them. We used Hi-C data from cultured osteosarcoma cells and applied MDS, treating the Hi-C data as the distance matrix of the genome. We then selected columns 2 and 3 of the orthogonal matrix U as the desired structure. The DNA loops in the reconstructed genome structure were associated with bioprocesses involved in transcription, such as the pre-transcriptional initiation complex and the RNA polymerase II initiation complex, and with transcription factors involved in cancer, such as Foxm1 and CREB3. Our results are therefore consistent with known biological findings, and our method is suitable for identifying enhancers and promoters in the genome.
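
As an illustration of how MDS recovers coordinates from a distance matrix, the following is a minimal sketch of classical (Torgerson) MDS using NumPy. It is not the authors' pipeline: the function name `classical_mds`, the toy four-point input, and the choice of the leading eigenvectors are assumptions made for illustration. The abstract does not say how Hi-C contacts were converted to distances, and the authors' selection of columns 2 and 3 of U is not reproduced here.

```python
import numpy as np

def classical_mds(dist, n_dims=2):
    """Classical (Torgerson) MDS: recover coordinates from a distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances to obtain a Gram matrix B.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    vals, vecs = np.linalg.eigh(gram)
    order = np.argsort(vals)[::-1][:n_dims]
    # Clip tiny negative eigenvalues caused by rounding before taking the root.
    top_vals = np.clip(vals[order], 0.0, None)
    return vecs[:, order] * np.sqrt(top_vals)

# Toy input (hypothetical): four points on the corners of a unit square.
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(dist, n_dims=2)
# The recovered layout may be rotated or reflected, so compare distances.
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

The recovered coordinates are determined only up to rotation, reflection, and translation, which is why the check compares pairwise distances rather than the coordinates themselves.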


Chromatin 3D structure reconstruction with consideration of adjacency relationship among genomic loci

10.1101/741447 ◽

2019 ◽

Author(s):

Fang-Zhen Li ◽

Zhi-E Liu ◽

Xiu-Yuan Li ◽

Li-Mei Bu ◽

Hong-Xia Bu ◽

...

Keyword(s):

Multidimensional Scaling ◽

High Speed ◽

Optimization Problem ◽

Genome Structure ◽

3D Structure ◽

Restriction Enzymes ◽

Contact Map ◽

Scaling Method ◽

Chromosome Conformation ◽

Contact Frequency

Abstract: Chromatin 3D conformation plays important roles in regulating gene or protein functions. High-throughput chromosome conformation capture (3C)-based technologies, such as Hi-C, have been exploited to acquire the contact frequencies among genomic loci at genome scale. Various computational tools have been proposed to recover the underlying chromatin 3D structures from in situ Hi-C contact map data. As connected residues in a polymer, neighboring genomic loci have intrinsic mutual dependencies in building a 3D conformation. However, current methods seldom take this feature into account. We present a method called ShNeigh, which combines the classical MDS technique with the local dependence of neighboring loci, modelled by a Gaussian formula, to infer the best 3D structure from noisy and incomplete contact frequency matrices. Results on simulations and real Hi-C data show that, while keeping the high speed of classical MDS, ShNeigh is more accurate and robust than existing methods, especially for sparse contact maps. A Matlab implementation of the proposed method is available at https://github.com/fangzhen-li/ShNeigh.

Author summary: We propose a new method to infer a consensus 3D genome structure from a Hi-C contact map. The novelty of our method is that it takes into account the adjacency of genomic loci along chromosomes. Specifically, the proposed method penalizes the optimization problem of the classical multidimensional scaling method with a smoothness constraint weighted by a function of the genomic distance between pairs of genomic loci. We demonstrate that this optimization problem can still be solved efficiently by a classical multidimensional scaling method. We then show that the method can recover stable structures in high-noise settings. We also show that it can reconstruct similar structures from data obtained using different restriction enzymes.
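
The idea of a smoothness constraint weighted by genomic distance can be sketched as follows. This is only an illustration and is not taken from the ShNeigh code: the function names, the exact Gaussian form, the `sigma` value, and the toy structures are all assumptions. The sketch computes a penalty that grows when loci close together along the chromosome are placed far apart in 3D.

```python
import math

def gaussian_weight(i, j, sigma=2.0):
    """Weight that decays with genomic separation |i - j| (assumed Gaussian form)."""
    return math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2))

def smoothness_penalty(coords, sigma=2.0):
    """Sum of weighted squared 3D distances over all pairs of loci."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            total += gaussian_weight(i, j, sigma) * sq_dist
    return total

# Two hypothetical six-locus structures: a straight chain and a zigzag chain.
smooth = [(float(k), 0.0, 0.0) for k in range(6)]
jagged = [(float(k), 3.0 * (-1) ** k, 0.0) for k in range(6)]

penalty_smooth = smoothness_penalty(smooth)
penalty_jagged = smoothness_penalty(jagged)
```

In a full method such a term would be added to the MDS objective so that the optimizer favours structures in which neighbouring loci stay close; here it is only evaluated on fixed coordinates.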


Structure of human Mediator–RNA polymerase II pre-initiation complex

Nature ◽

10.1038/s41586-021-03555-7 ◽

2021 ◽

Author(s):

Srinivasan Rengachari ◽

Sandra Schilbach ◽

Shintaro Aibara ◽

Christian Dienemann ◽

Patrick Cramer

Keyword(s):

Rna Polymerase ◽

Rna Polymerase Ii ◽

Initiation Complex


Enhancement of formation of the initiation complex by a factor stimulating RNA polymerase II from Ehrlich ascites tumor cells

Biochimica et Biophysica Acta (BBA) - Nucleic Acids and Protein Synthesis ◽

10.1016/0005-2787(77)90138-1 ◽

1977 ◽

Vol 479 (2) ◽

pp. 180-187 ◽

Cited By ~ 4

Author(s):

Kazuhisa Sekimizu ◽

Den'Ichi Mizuno ◽

Shunji Natori

Keyword(s):

Rna Polymerase ◽

Rna Polymerase Ii ◽

Tumor Cells ◽

Ehrlich Ascites Tumor ◽

Ehrlich Ascites Tumor Cells ◽

Ascites Tumor ◽

Initiation Complex ◽

Ehrlich Ascites


Promoter-dependent phosphorylation of RNA polymerase II by a template-bound kinase. Association with transcriptional initiation

Journal of Biological Chemistry ◽

10.1016/s0021-9258(18)92939-x ◽

1991 ◽

Vol 266 (13) ◽

pp. 8055-8061

Author(s):

J.A. Arias ◽

S.R. Peterson ◽

W.S. Dynan

Keyword(s):

Rna Polymerase ◽

Rna Polymerase Ii ◽

Transcriptional Initiation


3D genome structure reconstruction from chromosomal contact data

10.32469/10355/67541 ◽

2017 ◽

Author(s):

Tuan Anh Trieu

Keyword(s):

High Resolution ◽

Genome Structure ◽

Cell Types ◽

3D Models ◽

Graphic User Interface ◽

Soft Constraints ◽

Chromosome Conformation ◽

3D Genome ◽

Manual Adjustment ◽

The Relationship

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] Different cell types of an organism have the same DNA sequence, but they can function differently because differences in their 3D organization allow them to express different genes and have different cellular functions. Understanding the 3D organization of the genome is the key to understanding the functions of the cell. Chromosome conformation capture techniques such as Hi-C and TCC, which capture interactions between proximal chromosome fragments, have allowed the study of 3D genome organization at high resolution and high throughput. My work focuses on developing computational methods to reconstruct 3D genome structures from Hi-C data. I present three methods to reconstruct 3D genome and chromosome structures. The first method builds 3D genome models from soft constraints of contacts and non-contacts. It uses the concept of contact and non-contact to reconstruct 3D models without translating interaction frequencies into physical distances. That translation is commonly used by other methods even though it makes a strong assumption about the relationship between interaction frequencies and physical distances. On a synthetic dataset, where the relationship was known, my method performed comparably with other methods that assume the relationship. This shows the potential of my method for real Hi-C datasets, where the relationship is unknown. The limitation of the method is that it has parameters requiring manual adjustment. I developed the second method to reconstruct 3D genome models. This method uses a commonly used function to translate interaction frequencies into physical distances to build 3D models. I proposed a novel way to derive soft constraints to handle inconsistency in the data and to make the method robust. Building 3D models at high resolution is a more challenging problem, as the number of constraints is small and the feasible space is larger.
I introduced a third method to build 3D chromosome models at high resolution. The method reconstructs models at low resolution and then uses them to guide the reconstruction of models at high resolution. The last part of my work is the development of a comprehensive tool with an intuitive graphical user interface to analyze Hi-C data and to reconstruct and analyze 3D models.
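
The translation of interaction frequencies into physical distances mentioned above can be sketched with an inverse power law, a form often used in the Hi-C reconstruction literature. The abstract does not name the function used in the thesis, so the formula d = 1 / f^alpha, the default `alpha=1.0`, the function name, and the toy contact matrix below are all assumptions for illustration.

```python
def frequency_to_distance(freq, alpha=1.0):
    """Map an interaction frequency to a distance via d = 1 / freq**alpha.

    Returns None for non-positive frequencies, where no contact was observed
    and the inverse power law gives no usable distance.
    """
    if freq <= 0:
        return None
    return 1.0 / (freq ** alpha)

# Hypothetical symmetric contact matrix for three loci.
contacts = [
    [0, 8, 2],
    [8, 0, 4],
    [2, 4, 0],
]

# Convert every off-diagonal entry; a locus is at distance 0 from itself.
distances = [
    [0.0 if i == j else frequency_to_distance(f) for j, f in enumerate(row)]
    for i, row in enumerate(contacts)
]
```

The exponent `alpha` encodes the strong assumption the abstract refers to: different values give different distance matrices from the same contact data.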


Genome-wide regulations of the pre-initiation complex formation and elongating RNA polymerase II by an E3 ubiquitin ligase, San1.

Molecular and Cellular Biology ◽

10.1128/mcb.00368-21 ◽

2021 ◽

Author(s):

Priyanka Barman ◽

Rwik Sen ◽

Amala Kaja ◽

Jannatul Ferdoush ◽

Shalini Guha ◽

...

Keyword(s):

Quality Control ◽

Rna Polymerase ◽

Rna Polymerase Ii ◽

Ubiquitin Ligase ◽

Nuclear Protein ◽

Protein Quality ◽

Initiation Complex ◽

Pol Ii ◽

Intrinsically Disordered ◽

Genome Wide

San1 ubiquitin ligase is involved in nuclear protein quality control via its interaction with intrinsically disordered proteins for ubiquitylation and proteasomal degradation. Since several transcription/chromatin regulatory factors contain intrinsically disordered domains and can be inhibitory to transcription when in excess, San1 might be involved in transcription regulation. To address this, we analyzed the role of San1 in genome-wide association of TBP [that nucleates pre-initiation complex (PIC) formation for transcription initiation] and RNA polymerase II (Pol II). Our results reveal the roles of San1 in regulating TBP recruitment to the promoters and Pol II association with the coding sequences, and hence PIC formation and coordination of elongating Pol II, respectively. Consistently, transcription is altered in the absence of San1. Such transcriptional alteration is associated with impaired ubiquitylation and proteasomal degradation of Spt16 and gene association of Paf1, but not the incorporation of centromeric histone, Cse4, into the active genes in Δsan1. Collectively, our results demonstrate distinct functions of a nuclear protein quality control factor in regulating the genome-wide PIC formation and elongating Pol II (and hence transcription), thus unraveling new gene regulatory mechanisms.


A downstream-element-binding factor facilitates assembly of a functional preinitiation complex at the simian virus 40 major late promoter

Molecular and Cellular Biology ◽

10.1128/mcb.10.7.3635-3645.1990 ◽

1990 ◽

Vol 10 (7) ◽

pp. 3635-3645

Author(s):

D E Ayer ◽

W S Dynan

Keyword(s):

Rna Polymerase Ii ◽

Simian Virus 40 ◽

Initiation Site ◽

Simian Virus ◽

Recognition Sequence ◽

Base Pairs ◽

Transcriptional Initiation ◽

Nuclear Extracts ◽

Major Late Promoter ◽

Time Required

Recent work has shown that many promoters recognized by eucaryotic RNA polymerase II contain essential sequences located downstream of the transcriptional initiation site. We show here that the activity of a promoter element centered 28 base pairs downstream of the simian virus 40 major late initiation site appears to be mediated by a DNA-binding protein, which was isolated by affinity chromatography from HeLa cell nuclear extracts. In the absence of the other components of the transcriptional machinery, the protein bound specifically but weakly to its recognition sequence, with a Kd of approximately 10^-8 M. Analysis of kinetic data showed that mutation of the downstream element decreased the number of functional preinitiation complexes assembled at the promoter without significantly altering the time required for half the complexes to assemble. This suggests that in the absence of the downstream activating protein, preinitiation complexes are at least partially assembled but are not transcriptionally competent.


N-myc mRNA forms an RNA-RNA duplex with endogenous antisense transcripts

Molecular and Cellular Biology ◽

10.1128/mcb.10.8.4180-4191.1990 ◽

1990 ◽

Vol 10 (8) ◽

pp. 4180-4191

Author(s):

G W Krystal ◽

B C Armstrong ◽

J F Battey

Keyword(s):

Rna Polymerase Ii ◽

Antisense Transcription ◽

Initiation Site ◽

Antisense Transcripts ◽

Rnase Protection ◽

Transcriptional Initiation ◽

Intron 1 ◽

Exon 1 ◽

Duplex Formation

Nuclear runoff transcription studies revealed nearly equivalent sense and antisense transcription across exon 1 of the N-myc locus. Antisense primary transcription initiates at multiple sites in intron 1 and gives rise to stable polyadenylated and nonpolyadenylated transcripts. This pattern of antisense transcription, which is directed by RNA polymerase II, is independent of gene amplification and cell type. The nonpolyadenylated antisense transcripts have 5' ends which are complementary to the 5' ends of the N-myc sense mRNA. We determined, by using an RNase protection technique designed to detect in vivo duplexes, that most of the cytoplasmic nonpolyadenylated antisense RNA exists in an RNA-RNA duplex with approximately 5% of the sense N-myc mRNA. Duplex formation appeared to occur with only a subset of the multiple forms of the N-myc mRNA, with the precise transcriptional initiation site of the RNA playing a role in determining this selectivity. Cloning of each strand of the RNA-RNA duplex revealed that most duplexes included both exon 1 and intron 1 sequences, suggesting that duplex formation could modulate RNA processing by preserving a population of N-myc mRNA which retains intron 1.


Functions of the N- and C-Terminal Domains of Human RAP74 in Transcriptional Initiation, Elongation, and Recycling of RNA Polymerase II

Molecular and Cellular Biology ◽

10.1128/mcb.18.4.2130 ◽

1998 ◽

Vol 18 (4) ◽

pp. 2130-2142 ◽

Cited By ~ 30

Author(s):

Lei Lei ◽

Delin Ren ◽

Ann Finkelstein ◽

Zachary F. Burton

Keyword(s):

Rna Polymerase ◽

Rna Polymerase Ii ◽

Transcriptional Initiation ◽

Pol Ii ◽

Terminal Domain ◽

Multiple Round ◽

Major Late Promoter ◽

Central Loop ◽

Transcription Cycle ◽

Terminal Domains

Transcription factor IIF (TFIIF) cooperates with RNA polymerase II (pol II) during multiple stages of the transcription cycle including preinitiation complex assembly, initiation, elongation, and possibly termination and recycling. Human TFIIF appears to be an α2β2 heterotetramer of RNA polymerase II-associating protein 74- and 30-kDa subunits (RAP74 and RAP30). From inspection of its 517-amino-acid (aa) sequence, the RAP74 subunit appears to comprise separate N- and C-terminal domains connected by a flexible loop. In this study, we present functional data that strongly support this model for RAP74 architecture and further show that the N- and C-terminal domains and the central loop of RAP74 have distinct roles during separate phases of the transcription cycle. The N-terminal domain of RAP74 (minimally aa 1 to 172) is sufficient to deliver pol II into a complex formed on the adenovirus major late promoter with the TATA-binding protein, TFIIB, and RAP30. A more complete N-terminal domain fragment (aa 1 to 217) strongly stimulates both accurate initiation and elongation by pol II. The region of RAP74 between aa 172 and 205 and a subregion between aa 170 and 178 are critical for both accurate initiation and elongation, and mutations in these regions have similar effects on initiation and elongation. Based on these observations, RAP74 appears to have similar functions in initiation and elongation. The central region and the C-terminal domain of RAP74 do not contribute strongly to single-round accurate initiation or elongation stimulation but do stimulate multiple-round transcription in an extract system.


The RNA Polymerase II Kinase Ctk1 Regulates Positioning of a 5′ Histone Methylation Boundary along Genes

Molecular and Cellular Biology ◽

10.1128/mcb.01628-06 ◽

2006 ◽

Vol 27 (2) ◽

pp. 721-731 ◽

Cited By ~ 36

Author(s):

Tiaojiang Xiao ◽

Yoichiro Shibata ◽

Bhargavi Rao ◽

R. Nicholas Laribee ◽

Rose O'Rourke ◽

...

Keyword(s):

Rna Polymerase ◽

Rna Polymerase Ii ◽

Transcription Initiation ◽

Histone Deacetylation ◽

H3k4 Methylation ◽

H3k36 Methylation ◽

Transcriptional Initiation ◽

Spurious Transcription ◽

Transcriptional Stress ◽

Rnap Ii

In yeast and other eukaryotes, the histone methyltransferase Set1 mediates methylation of lysine 4 on histone H3 (H3K4me). This modification marks the 5′ end of transcribed genes in a 5′-to-3′ tri- to di- to monomethyl gradient and promotes association of chromatin-remodeling and histone-modifying enzymes. Here we show that Ctk1, the serine 2 C-terminal domain (CTD) kinase for RNA polymerase II (RNAP II), regulates H3K4 methylation. We found that CTK1 deletion nearly abolished H3K4 monomethylation yet caused a significant increase in H3K4 di- and trimethylation. Both in individual genes and genome-wide, loss of CTK1 disrupted the H3K4 methylation patterns normally observed. H3K4me2 and H3K4me3 spread 3′ into the bodies of genes, while H3K4 monomethylation was diminished. These effects were dependent on the catalytic activity of Ctk1 but were independent of Set2-mediated H3K36 methylation. Furthermore, these effects are not due to spurious transcription initiation in the bodies of genes, to changes in RNAP II occupancy, to changes in serine 5 CTD phosphorylation patterns, or to “transcriptional stress.” These data show that Ctk1 acts to restrict the spread of H3K4 methylation through a mechanism that is independent of a general transcription defect. The evidence presented suggests that Ctk1 controls the maintenance of suppressive chromatin in the coding regions of genes by both promoting H3K36 methylation, which leads to histone deacetylation, and preventing the 3′ spread of H3K4 trimethylation, a mark associated with transcriptional initiation.
