MotifMark: Finding Regulatory Motifs in DNA Sequences

2017
Author(s): Hamid Reza Hassanzadeh, Pushkar Kolhe, Charles L. Isbell, May D. Wang

Abstract: The interaction between proteins and DNA is a key driving force in many biological processes, such as transcriptional regulation, repair, recombination, splicing, and DNA modification. Identifying DNA-binding sites and determining the specificity with which target proteins bind to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short of characterizing motifs precisely; as a result, they require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning that can find binding sites on candidate probes and rank their specificity with respect to the underlying transcription factor. We developed a pipeline to analyze experimental data derived from compact universal protein binding microarrays and benchmarked it against two of the most accurate motif search methods. Our results indicate that MotifMark can be a viable alternative technique for predicting motifs from protein binding microarrays and possibly other related high-throughput techniques.

2017
Author(s): Hamid Reza Hassanzadeh, May D. Wang

Abstract: Transcription factors (TFs) are macromolecules that bind to specific cis-regulatory sub-regions of DNA promoters and initiate transcription. Finding the exact location of these binding sites (also known as motifs) is important in a variety of domains such as drug design and development. To address this need, several in vivo and in vitro techniques have been developed that try to characterize and predict the binding specificity of a protein to different DNA loci. The major problem with these techniques is that they are not accurate enough in predicting binding affinity and characterizing the corresponding motifs. As a result, downstream analysis is required to uncover the locations where proteins of interest bind. Here, we propose DeeperBind, a long short-term recurrent convolutional network for predicting protein binding specificities with respect to DNA probes. DeeperBind can model the positional dynamics of probe sequences and hence effectively accounts for the contributions made by individual sub-regions of DNA sequences. Moreover, it can be trained and tested on datasets containing varying-length sequences. We apply our pipeline to datasets derived from protein binding microarrays (PBMs), an in vitro high-throughput technology for quantifying protein-DNA binding preferences, and present promising results. To the best of our knowledge, this is the most accurate pipeline for predicting the binding specificities of DNA sequences from data produced by high-throughput technologies, using deep learning for feature generation and positional dynamics modeling.
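As a rough illustration of why a convolutional front end can handle varying-length probes, the sketch below one-hot encodes a DNA string and slides a small filter along it, producing one score per position. This is not the DeeperBind implementation: the function names, the hand-written `TGA_FILTER`, and the example probes are all hypothetical, and a real model would learn its filters and pass the resulting score sequence to a recurrent layer.

```python
# Illustrative sketch only; not the authors' code. All names are hypothetical.
from typing import List

BASES = "ACGT"


def one_hot(seq: str) -> List[List[float]]:
    """Encode a DNA string as rows of 4 values in A, C, G, T order.

    Any character outside ACGT (e.g. N) becomes an all-zero row.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def scan(seq: str, motif_filter: List[List[float]]) -> List[float]:
    """Slide a (width x 4) filter along the sequence (valid 1-D convolution).

    Returns one score per window, so the output length is
    len(seq) - width + 1 and grows with the probe length.
    """
    encoded = one_hot(seq)
    width = len(motif_filter)
    scores = []
    for start in range(len(encoded) - width + 1):
        window = encoded[start:start + width]
        score = sum(
            weight * value
            for filter_row, window_row in zip(motif_filter, window)
            for weight, value in zip(filter_row, window_row)
        )
        scores.append(score)
    return scores


# Hypothetical hand-written filter that rewards the 3-mer "TGA".
TGA_FILTER = [
    [0.0, 0.0, 0.0, 1.0],  # position 1 prefers T
    [0.0, 0.0, 1.0, 0.0],  # position 2 prefers G
    [1.0, 0.0, 0.0, 0.0],  # position 3 prefers A
]

short_scores = scan("ACTGAC", TGA_FILTER)     # 6-base probe, 4 windows
long_scores = scan("TTGACGTGA", TGA_FILTER)   # 9-base probe, 7 windows
```

Because the same filter is applied at every offset, probes of different lengths simply yield score sequences of different lengths, which is the kind of input a recurrent layer can consume position by position.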


2017, Vol 114 (29), pp. E5995-E6004
Author(s): Yan O. Zubo, Ivory Clabaugh Blakley, Maria V. Yamburenko, Jennifer M. Worthen, Ian H. Street, ...

The plant hormone cytokinin affects a diverse array of growth and development processes and responses to the environment. How a signaling molecule mediates such a diverse array of outputs and how these response pathways are integrated with other inputs remain fundamental questions in plant biology. To this end, we characterized the transcriptional network initiated by the type-B ARABIDOPSIS RESPONSE REGULATORs (ARRs) that mediate the cytokinin primary response, making use of chromatin immunoprecipitation sequencing (ChIP-seq), protein-binding microarrays, and transcriptomic approaches. By ectopic overexpression of ARR10, Arabidopsis lines hypersensitive to cytokinin were generated and used to clarify the role of cytokinin in the regulation of various physiological responses. ChIP-seq was used to identify the cytokinin-dependent targets of ARR10, thereby defining a crucial link between the cytokinin primary-response pathway and the transcriptional changes that mediate physiological responses to this phytohormone. Binding of ARR10 was induced by cytokinin, with binding sites enriched toward the transcriptional start sites of both induced and repressed genes. Three type-B ARR DNA-binding motifs, determined by use of protein-binding microarrays, were enriched at ARR10 binding sites, confirming their physiological relevance. WUSCHEL was identified as a direct target of ARR10, with its cytokinin-enhanced expression resulting in enhanced shooting in tissue culture. Results from our analyses shed light on the physiological role of the type-B ARRs in regulating the cytokinin response, the mechanism of type-B ARR activation, and the basis by which cytokinin regulates diverse aspects of growth and development as well as responses to biotic and abiotic factors.


1998, Vol 330 (1), pp. 335-343
Author(s): M. Bahaa FADEL, C. Stephane BOUTET, Thomas QUERTERMOUS

To investigate the molecular basis of endothelial cell-specific gene expression, we have examined the DNA sequences and the cognate DNA-binding proteins that mediate transcription of the murine tie2/tek gene. Reporter transfection experiments were consistent with earlier findings in transgenic mice, indicating that the upstream promoter of Tie2/Tek is capable of activating transcription in an endothelial cell-specific fashion. These experiments have also allowed the identification of a single upstream inhibitory region (region I) and two positive regulatory regions (regions U and A) in the proximal promoter. Electrophoretic mobility-shift assays have allowed further characterization of three novel DNA-binding sequences associated with these regions and have provided preliminary characterization of the protein factors binding to these elements. Two of the elements (U and A) confer increased transcription on a heterologous promoter, with element U functioning in an endothelial-cell-selective manner. By employing embryonic endothelial-like yolk sac cells in parallel with adult-derived endothelial cells, we have identified differences in functional activity and protein binding that may reflect mechanisms for specifying developmental regulation of tie2/tek expression. Further study of the DNA and protein elements characterized in these experiments is likely to provide new insight into the molecular basis of developmental- and cell-specific gene expression in the endothelium.

