Deciphering the mechanical code of genome and epigenome

2020 ◽  
Author(s):  
Aakash Basu ◽  
Dmitriy G. Bobrovnikov ◽  
Basilio Cieza ◽  
Zan Qureshi ◽  
Taekjip Ha

Abstract. Sequence features have long been known to influence the local mechanical properties and shapes of DNA. However, a mechanical code (i.e. a comprehensive mapping between DNA sequence and mechanical properties), if it exists, has been difficult to determine experimentally because direct means of measuring the mechanical properties of DNA are typically limited in throughput. Here we use loop-seq, a recently developed technique that measures the intrinsic cyclizabilities (a proxy for bendability) of DNA fragments in genomic-scale throughput, to characterize the mechanical code. We tabulate how DNA sequence features (distribution patterns of all possible dinucleotides and dinucleotide pairs) influence intrinsic cyclizability, and build a linear model to predict intrinsic cyclizability from sequence. Using our model, we predict that the DNA mechanical landscape shapes nucleosome organization around the promoters of various organisms and at the binding site of the transcription factor CTCF, and that hyperperiodic DNA in C. elegans leads to globally curved DNA segments. By performing loop-seq on random libraries in the presence or absence of CpG methylation, we show that CpG methylation leads to global stiffening of DNA in a wide sequence context, and predict based on our model that CpG methylation widely changes the mechanical landscape around mouse promoters. This suggests how epigenetic modifications of DNA might alter gene expression and mediate cellular adaptation by affecting critical processes around promoters that require mechanical deformations of DNA, such as nucleosome organization and transcription initiation. Finally, we show that the genetic code and the mechanical code are linked: sequence-dependent mechanical properties of coding DNA constrain the amino acid sequence despite the degeneracy in the genetic code.
Our measurements explain why the pattern of nucleosome organization along genes influences the distribution of amino acids in the translated polypeptide.
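
The idea of a linear model that maps sequence features to a mechanical score can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the features are plain overlapping dinucleotide counts (the abstract describes richer distribution patterns of dinucleotides and dinucleotide pairs), and the names `dinucleotide_counts`, `predict_linear` and the values in `toy_weights` are hypothetical placeholders, not fitted parameters from the paper.

```python
from itertools import product

# All 16 dinucleotides over the DNA alphabet.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides in an uppercase A/C/G/T sequence."""
    counts = {d: 0 for d in DINUCLEOTIDES}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # silently skip pairs containing other symbols
            counts[pair] += 1
    return counts


def predict_linear(seq, weights, intercept=0.0):
    """Linear score: intercept plus the weighted sum of dinucleotide counts."""
    counts = dinucleotide_counts(seq)
    return intercept + sum(weights.get(d, 0.0) * n for d, n in counts.items())


# Hypothetical weights, chosen only to show the mechanics of the calculation.
toy_weights = {"AA": 0.5, "TT": 0.5, "CG": -0.25}
score = predict_linear("AACGTT", toy_weights)
```

In a real analysis the weights would be fitted to measured cyclizability values rather than set by hand, and the feature set would need to capture where along the fragment each dinucleotide occurs.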

2015 ◽  
Vol 32 (6) ◽  
pp. 835-842 ◽  
Author(s):  
Filippo Utro ◽  
Valeria Di Benedetto ◽  
Davide F.V. Corona ◽  
Raffaele Giancarlo

Abstract. Motivation: Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions about the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. Results: We help close this important methodological gap between the two models by providing three very simple formulas for the sequence specific one. They are all based on well-known formulas in Computer Science and Bioinformatics, and they give different quantifications of how complex a sequence is. In view of how remarkably well they perform, it is very surprising that measures of sequence complexity have not even been considered as candidates to close this gap. We provide experimental evidence that the intrinsic level of combinatorial organization and information-theoretic content of subsequences within a genome are strongly correlated with the level of DNA encoded nucleosome organization discovered by Kaplan et al. Our results establish an important connection between the intrinsic complexity of subsequences in a genome and the intrinsic, i.e. DNA encoded, nucleosome organization of eukaryotic genomes. It is a first step towards a mathematical characterization of this latter ‘encoding’. Supplementary information: Supplementary data are available at Bioinformatics online.
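
The abstract does not name its three complexity formulas, so the following is only a sketch of what an information-theoretic measure of sequence complexity can look like: the Shannon entropy of the k-mer distribution of a subsequence. The function name `kmer_entropy` and the choice of k are assumptions made for illustration, and this measure is not claimed to be one of the paper's three.

```python
import math
from collections import Counter


def kmer_entropy(seq, k=2):
    """Shannon entropy (in bits) of the overlapping k-mer distribution of seq.

    Low values indicate a repetitive, low-complexity sequence; the maximum
    for DNA is 2*k bits, reached when all 4**k k-mers are equally frequent.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


low = kmer_entropy("AAAAAAAA", k=2)   # a homopolymer has a single 2-mer
high = kmer_entropy("ACGT", k=1)      # four equally frequent symbols
```

Applied in a sliding window along a genome, such a score gives a per-position complexity profile that could then be compared against a nucleosome occupancy track.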


2007 ◽  
Vol 461 (1) ◽  
pp. 7-12 ◽  
Author(s):  
Hiroyuki Kamiya ◽  
Satoki Fukunaga ◽  
Takashi Ohyama ◽  
Hideyoshi Harashima

2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Clara L. Essmann ◽  
Daniel Martinez-Martinez ◽  
Rosina Pryor ◽  
Kit-Yi Leung ◽  
Kalaivani Bala Krishnan ◽  
...  

2012 ◽  
Vol 26 (21) ◽  
pp. 2374-2379 ◽  
Author(s):  
Y. Liu ◽  
H. Toh ◽  
H. Sasaki ◽  
X. Zhang ◽  
X. Cheng

Author(s):  
P. Kamala Kumari ◽  
J.B. Seventline

Applying signal processing techniques to identify exons in a deoxyribonucleic acid (DNA) sequence is a challenging task. The objective of this paper is to introduce a combinational window approach for locating exons in a DNA sequence. In contrast to the traditional use of a single window function to evaluate the short-time Fourier transform (STFT), this work proposes a novel method for evaluating STFT coefficients using a combinational window function comprising Gaussian, Lanczos and Chebyshev (GLC) windows. The chosen combinational GLC window has the highest relative side lobe attenuation among the window functions introduced by various researchers. The proposed algorithm uses the GLC window function both for evaluating STFT coefficients and in the design of an FIR bandpass filter. Simulation results demonstrate its effectiveness in improving evaluation parameters such as sensitivity, specificity, accuracy, area under the curve (AUC) and discrimination measure (DM). Furthermore, the proposed algorithm has been applied successfully to universal benchmark datasets such as C. elegans and Homo sapiens. The proposed method has been shown to be an efficient approach for predicting protein coding regions compared to other existing methods. All simulations were done using MATLAB 2016a.
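
A windowed Fourier measure of this kind can be sketched as follows. The sketch rests on several assumptions that are not stated in the abstract: the sequence is converted to four binary indicator sequences (one per base), the quantity evaluated is the spectral power at the period-3 frequency that is commonly associated with coding regions, and a plain Gaussian window stands in for the GLC window, whose exact construction the abstract does not give. The names `gaussian_window` and `period3_power` are hypothetical.

```python
import cmath
import math


def gaussian_window(n, sigma_frac=0.4):
    """Gaussian window of length n; sigma is sigma_frac * n / 2 samples."""
    center = (n - 1) / 2
    sigma = sigma_frac * n / 2
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n)]


def period3_power(seq, win_len=30, step=1, window=None):
    """Windowed spectral power at frequency 1/3 cycles per base.

    For every window position, the four base-indicator sequences are
    multiplied by the window, projected onto exp(-2*pi*i*n/3), and the
    squared magnitudes are summed over A, C, G and T. A custom `window`
    must contain exactly win_len values.
    """
    w = window if window is not None else gaussian_window(win_len)
    twiddle = [cmath.exp(-2j * math.pi * n / 3) for n in range(win_len)]
    powers = []
    for start in range(0, len(seq) - win_len + 1, step):
        total = 0.0
        for base in "ACGT":
            x = sum(w[n] * twiddle[n]
                    for n in range(win_len) if seq[start + n] == base)
            total += abs(x) ** 2
        powers.append(total)
    return powers


periodic = period3_power("ATG" * 40)   # strongly 3-periodic toy sequence
flat = period3_power("A" * 120)        # no 3-periodicity
```

Passing a different list through the `window` argument is the point at which a combined window would be substituted. Thresholding the resulting power profile to call exons, and the FIR bandpass filtering step mentioned in the abstract, are not covered here.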


2010 ◽  
Vol 32 (5) ◽  
pp. 6-7
Author(s):  
Jane Mellor

To correctly read the information stored in our DNA genomes (the genetic code), cells must read another language that overlays it, the epigenetic code, which controls access to that information. A process such as transcription can only retrieve this information according to the access granted by the epigenome. The term epigenetics was coined in the 1940s by British embryologist and geneticist Conrad Waddington to describe “the interaction of genes with their environment, which bring the phenotype into being”. Now the term epigenetics (literally over or above genetics) refers to the extra layers of instructions that influence gene activity without altering the DNA sequence. There are three main components to the epigenetic code: (i) methylated cytosine residues in DNA [1]; (ii) the range of post-translational modifications to the core histone proteins within the nucleosomes (referred to as the histone code) [2,3]; and (iii) RNA molecules, often non-coding RNA [4].


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 165 ◽  
Author(s):  
Anatoliy Zubritskiy ◽  
Yulia A. Medvedeva

The presence of H3K27me3 has been demonstrated to correlate with CpG content. In this work, we tested whether H3K27ac has similar sequence preferences. We performed a translocation of DNA sequences with various properties into a beta-globin locus to control for the local chromatin environment. Our results suggest that, in contrast to H3K27me3, H3K27ac gain is unlikely to be affected by the CpG content of the underlying DNA sequence, while extremely high GC content might contribute to the gain of H3K27ac.

