DNA shape complements sequence-based representations of transcription factor binding sites

2019
Author(s): Peter DeFord, James Taylor

Abstract: The position weight matrix (PWM) has long been a useful tool for describing variation in the composition of regions of DNA such as transcription factor (TF) binding sites. It is difficult, however, to relate the sequence-based representation of a DNA motif to the biological features of the interaction of a TF with its binding site. Here we present an alternative strategy for representing DNA motifs, called the Structural Motif (StruM), that can easily represent different sets of structural features. Structural features are inferred from dinucleotide properties listed in the Dinucleotide Property Database. StruMs can specifically model TF binding sites, using an encoding strategy that is distinct from sequence-based models. This difference in encoding strategies makes StruMs complementary to sequence-based methods of TF binding site identification.
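
The idea of inferring structural features from dinucleotide properties can be illustrated with a small sketch. It is not the authors' implementation: the property table `TOY_PROPERTY` holds placeholder numbers rather than values from the Dinucleotide Property Database, the function names `encode`, `fit` and `score` are hypothetical, and the per-position Gaussian scoring is an assumption chosen only to show how a shape-based model differs from a PWM.

```python
import math
from itertools import product

BASES = "ACGT"

# Placeholder values for illustration only (AA=0.0, AC=1.0, ..., TT=15.0).
# Real values would come from the Dinucleotide Property Database.
TOY_PROPERTY = {
    a + b: float(i) for i, (a, b) in enumerate(product(BASES, repeat=2))
}


def encode(seq):
    """Translate a DNA sequence into one structural value per dinucleotide step."""
    return [TOY_PROPERTY[seq[i:i + 2]] for i in range(len(seq) - 1)]


def fit(sites, min_sd=0.1):
    """Estimate a (mean, sd) pair per position from aligned, equal-length sites.

    Assumption: each position is modeled as an independent Gaussian.
    min_sd keeps the standard deviation away from zero for invariant positions.
    """
    rows = [encode(s) for s in sites]
    model = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        var = sum((x - mean) ** 2 for x in col) / len(col)
        model.append((mean, max(math.sqrt(var), min_sd)))
    return model


def score(model, seq):
    """Sum of per-position Gaussian log densities for a candidate site."""
    total = 0.0
    for x, (mean, sd) in zip(encode(seq), model):
        total += -0.5 * math.log(2 * math.pi * sd * sd)
        total -= (x - mean) ** 2 / (2 * sd * sd)
    return total


if __name__ == "__main__":
    model = fit(["ACGT", "ACGT", "ACGA"])
    print(score(model, "ACGT"), score(model, "TTTT"))
```

Because the model scores continuous structural values rather than base identities, two sequences that differ in letters but share similar dinucleotide properties can receive similar scores, which is one way to read the claim that the two encodings are complementary.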
