A biophysical approach to predicting protein-DNA binding energetics

2014 ◽  
Author(s):  
George Locke ◽  
Alexandre V Morozov

Sequence-specific interactions between proteins and DNA play a central role in DNA replication, repair, recombination, and control of gene expression. These interactions can be studied in vitro using microfluidics, protein-binding microarrays (PBMs), and other high-throughput techniques. Here we develop a biophysical approach to predicting protein-DNA binding specificities from high-throughput in vitro data. Our algorithm, called BindSter, accommodates multiple protein species competing for access to DNA and alternative binding modes of the same protein, while rigorously taking into account all sterically allowed configurations of DNA-bound particles. BindSter can be used with a hierarchy of protein-DNA interaction models of increasing complexity. We observe that the quality of BindSter predictions does not change significantly as some of the energy parameters vary over a sizable range. To take this degeneracy into account, we have developed a graphical representation of parameter uncertainties, called IntervalLogo. We find that our simplest model, in which each nucleotide in the binding site is treated independently, performs better than previous biophysical approaches. The extensions of this model, in which contributions of longer words are also considered, result in further improvements, underscoring the importance of higher-order effects in protein-DNA energetics. In contrast, we find little evidence for multiple binding modes for the transcription factors (TFs) in our dataset. Furthermore, there is limited consistency in predictions for the same TF utilizing microfluidics and PBM experimental platforms.
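The two ingredients named in this abstract, an independent-nucleotide energy model and a sum over all sterically allowed configurations of bound particles, can be illustrated with a short sketch. This is not the BindSter code: the energy matrix, the concentration parameter `conc`, and the function names are hypothetical, energies are in units of kT, and only one protein species with one binding mode is considered.

```python
import math

# Hypothetical per-position energies (units of kT) for a 3-bp binding site.
ENERGY = [
    {"A": 0.0, "C": 1.5, "G": 2.0, "T": 1.0},
    {"A": 2.0, "C": 0.0, "G": 1.0, "T": 1.5},
    {"A": 1.0, "C": 2.0, "G": 0.0, "T": 1.5},
]

def site_energy(site, energy=ENERGY):
    """Additive model: each nucleotide in the site contributes independently."""
    return sum(column[base] for column, base in zip(energy, site))

def partition_function(seq, conc=1.0, energy=ENERGY):
    """Sum Boltzmann weights over all non-overlapping placements of the protein.

    z[i] is the partition function of the first i bases. Base i-1 is either
    unbound, or it is the last base of a protein covering `width` bases.
    """
    width = len(energy)
    z = [1.0] * (len(seq) + 1)
    for i in range(1, len(seq) + 1):
        z[i] = z[i - 1]  # base i-1 left unbound
        if i >= width:
            weight = conc * math.exp(-site_energy(seq[i - width:i], energy))
            z[i] += z[i - width] * weight  # a protein ends at base i-1
    return z[-1]
```

The recursion runs in time linear in the sequence length, whereas enumerating configurations explicitly grows exponentially. Extending the sketch to several competing species would add one term per species inside the loop.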


eLife ◽  
2015 ◽  
Vol 4 ◽  
Author(s):  
Todd R Riley ◽  
Allan Lazarovici ◽  
Richard S Mann ◽  
Harmen J Bussemaker

Transcription factors are crucial regulators of gene expression. Accurate quantitative definition of their intrinsic DNA binding preferences is critical to understanding their biological function. High-throughput in vitro technology has recently been used to deeply probe the DNA binding specificity of hundreds of eukaryotic transcription factors, yet algorithms for analyzing such data have not yet fully matured. Here, we present a general framework (FeatureREDUCE) for building sequence-to-affinity models based on a biophysically interpretable and extensible model of protein-DNA interaction that can account for dependencies between nucleotides within the binding interface or multiple modes of binding. When training on protein binding microarray (PBM) data, we use robust regression and modeling of technology-specific biases to infer specificity models of unprecedented accuracy and precision. We provide quantitative validation of our results by comparing to gold-standard data when available.
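A sequence-to-affinity model with dependencies between nucleotides can be sketched as a free-energy sum that adds dinucleotide corrections to the independent per-position terms. The sketch below is an illustration only, not the FeatureREDUCE implementation: the parameter values, the 2-bp site width, and the names `MONO`, `DINUC`, `ddg`, and `relative_affinity` are all assumptions.

```python
import math

RT = 0.593  # kcal/mol, approximate value near 25 degrees C

# Hypothetical per-position free-energy penalties (kcal/mol) relative to
# the reference site "AC".
MONO = [
    {"A": 0.0, "C": 1.0, "G": 1.2, "T": 0.8},
    {"A": 0.9, "C": 0.0, "G": 1.1, "T": 0.7},
]

# Hypothetical correction for an adjacent pair, keyed by (start position, dinucleotide).
DINUC = {(0, "GT"): -0.5}

def ddg(site):
    """Free-energy difference from the reference site: mono terms plus dinucleotide corrections."""
    total = sum(MONO[i][base] for i, base in enumerate(site))
    for i in range(len(site) - 1):
        total += DINUC.get((i, site[i:i + 2]), 0.0)
    return total

def relative_affinity(site):
    """Affinity relative to the reference site, exp(-ddG / RT)."""
    return math.exp(-ddg(site) / RT)
```

With `DINUC` left empty the model reduces to the independent-nucleotide case, which makes it easy to compare fits with and without the dependency terms.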



1989 ◽  
Vol 9 (6) ◽  
pp. 2464-2476
Author(s):  
M Cockell ◽  
B J Stevenson ◽  
M Strubin ◽  
O Hagenbüchle ◽  
P K Wellauer

Footprint analysis of the 5'-flanking regions of the alpha-amylase 2, elastase 2, and trypsin a genes, which are expressed in the acinar pancreas, showed multiple sites of protein-DNA interaction for each gene. Competition experiments demonstrated that a region from each 5'-flanking region interacted with the same cell-specific DNA-binding activity. We show by in vitro binding assays that this DNA-binding activity also recognizes a sequence within the 5'-flanking regions of elastase 1, chymotrypsinogen B, carboxypeptidase A, and trypsin d genes. Methylation interference and protection studies showed that the DNA-binding activity recognized a bipartite motif, the subelements of which were separated by integral helical turns of DNA. The alpha-amylase 2 cognate sequence was found to enhance in vivo transcription of its own promoter in a cell-specific manner, which identified the DNA-binding activity as a transcription factor (PTF 1). The observation that PTF 1 bound to DNA sequences that have been defined as transcriptional enhancers by others suggests that this factor is involved in the coordinate expression of genes transcribed in the acinar pancreas.



2006 ◽  
Vol 26 (7) ◽  
pp. 2467-2478 ◽  
Author(s):  
Sungeun Kim ◽  
Christopher T. Denny ◽  
Ron Wisdom

A key molecular event in the genesis of Ewing's sarcoma is the consistent presence of chromosomal translocations that result in the formation of proteins in which the amino terminus of EWS is fused to the carboxyl terminus, including the DNA binding domain, of one of five different Ets family proteins. These fusion proteins function as deregulated transcription factors, resulting in aberrant control of gene expression. Recent data indicate that some EWS-Ets target promoters, including the uridine phosphorylase (UPP) promoter, harbor tandem binding sites for Ets and AP-1 proteins. Here we show that those Ets family proteins that participate in Ewing's sarcoma, including Fli1, ERG, and ETV1, cooperatively bind these tandem elements with Fos-Jun while other Ets family members do not. Analysis of this cooperativity in vitro shows that (i) many different spatial arrangements of the Ets and AP-1 sites support cooperative binding, (ii) the bZIP motifs of Fos and Jun are sufficient to support this cooperativity, and (iii) both the Ets domain and carboxy-terminal sequences of Fli1 are important for cooperative DNA binding. EWS-Fli1 activates the expression of UPP mRNA, is directly bound to the UPP promoter, and transforms 3T3 fibroblasts; in contrast, a C-terminally truncated mutant form of EWS-Fli1 that cannot cooperatively bind DNA with Fos-Jun is defective in all of these properties. The results show that the ability of EWS-Ets proteins to cooperatively bind DNA with Fos-Jun is critical to the biologic activities of these proteins. The results have implications for understanding the pathogenesis of Ewing's sarcoma. In addition, they may be relevant to the mechanisms of Ras-dependent activation of genes that harbor tandem Ets and AP-1 binding sites.



2018 ◽  
Author(s):  
Naomi Yamada ◽  
William K.M. Lai ◽  
Nina Farrell ◽  
B. Franklin Pugh ◽  
Shaun Mahony

Motivation: Regulatory proteins associate with the genome either by directly binding cognate DNA motifs or via protein-protein interactions with other regulators. Each recruitment mechanism may be associated with distinct motifs and may also result in distinct characteristic patterns in high-resolution protein-DNA binding assays. For example, the ChIP-exo protocol precisely characterizes protein-DNA crosslinking patterns by combining chromatin immunoprecipitation (ChIP) with 5’ → 3’ exonuclease digestion. Since different regulatory complexes will result in different protein-DNA crosslinking signatures, analysis of ChIP-exo tag enrichment patterns should enable detection of multiple protein-DNA binding modes for a given regulatory protein. However, current ChIP-exo analysis methods either treat all binding events as being of a uniform type or rely on motifs to cluster binding events into subtypes. Results: To systematically detect multiple protein-DNA interaction modes in a single ChIP-exo experiment, we introduce the ChIP-exo mixture model (ChExMix). ChExMix probabilistically models the genomic locations and subtype memberships of binding events using both ChIP-exo tag distribution patterns and DNA motifs. We demonstrate that ChExMix achieves accurate detection and classification of binding event subtypes using in silico mixed ChIP-exo data. We further demonstrate the unique analysis abilities of ChExMix using a collection of ChIP-exo experiments that profile the binding of key transcription factors in MCF-7 cells. In these data, ChExMix identifies possible recruitment mechanisms of FoxA1 and ERα, thus demonstrating that ChExMix can effectively stratify ChIP-exo binding events into biologically meaningful subtypes. Availability: ChExMix is available from https://github.com/seqcode/[email protected]



2021 ◽  
Author(s):  
Sankar Adhya ◽  
Subhash Verma

Conserved in bacteria, the histone-like protein HU is crucial for genome organization and expression of many genes. It binds DNA regardless of the sequence and exhibits two binding affinities in vitro, low-affinity to any B-DNA (non-specific) and high-affinity to DNA with distortions like kinks and cruciforms (structure-specific), but the physiological relevance of the two binding modes needed further investigation. We validated and defined the three conserved lysine residues, K3, K18, and K83, in Escherichia coli HU as critical amino acid residues for both non-specific and structure-specific binding and the conserved proline residue P63 additionally for only the structure-specific binding. By mutating these residues in vivo, we showed that two DNA binding modes of HU play separate physiological roles. The DNA structure-specific binding, occurring at specific sites in the E. coli genome, promotes higher-order DNA structure formation, regulating the expression of many genes, including those involved in chromosome maintenance and segregation. The non-specific binding participates in numerous associations of HU with the chromosomal DNA, dictating chromosome structure and organization. Our findings underscore the importance of DNA structure in transcription regulation and promiscuous DNA-protein interactions in a dynamic organization of a bacterial genome.



2018 ◽  
Vol 115 (16) ◽  
pp. E3692-E3701 ◽  
Author(s):  
Chaitanya Rastogi ◽  
H. Tomas Rube ◽  
Judith F. Kribelbauer ◽  
Justin Crocker ◽  
Ryan E. Loker ◽  
...  

Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes.



Applied Nano ◽  
2022 ◽  
Vol 3 (1) ◽  
pp. 16-41
Author(s):  
Aurimas Kopūstas ◽  
Mindaugas Zaremba ◽  
Marijonas Tutkus

Protein-DNA interactions are the core of the cell’s molecular machinery. For a long time, conventional biochemical methods served as a powerful basis for investigating protein-DNA interactions and target search mechanisms. More recently, single-molecule (SM) techniques have emerged as a complementary tool for studying these interactions and have revealed many previously obscured mechanistic details. Unlike traditional methods, SM methods allow direct monitoring of individual biomolecules and therefore reveal reactions that are otherwise hidden by the ensemble averaging inherent in conventional bulk methods. SM biophysical techniques that use nanobiotechnology methods to immobilize the studied molecules make it possible to monitor individual reaction trajectories of biomolecules. Next-generation in vitro SM biophysics approaches that enable high-throughput studies are considerably more complex than earlier ones. Several high-throughput DNA flow-stretch assays have now been published and have shown many benefits for mechanistic target search studies of various DNA-binding proteins, such as CRISPR-Cas, Argonaute, various ATP-fueled helicases and translocases, and others. This review focuses on SM techniques employing surface-immobilized and relatively long DNA molecules for studying protein-DNA interaction mechanisms.




