3MOTIF: visualizing conserved protein sequence motifs in the protein structure database

2003 ◽  
Vol 19 (4) ◽  
pp. 541-542 ◽  
Author(s):  
S. P. Bennett ◽  
C. G. Nevill-Manning ◽  
D. L. Brutlag
2006 ◽  
Vol 52 (3-4) ◽  
pp. 375-387 ◽  
Author(s):  
Edward N. Trifonov

Four fundamentally novel, recent developments form the basis for the Theory of Early Molecular Evolution. The theory outlines the molecular events from the onset of the triplet code to the formation of the earliest sequence/structure/function modules of proteins. These developments are: (1) reconstruction of the evolutionary chart of codons; (2) discovery of omnipresent protein sequence motifs, apparently conserved since the last common ancestor; (3) discovery of closed loops, the standard structural modules of modern proteins; (4) construction of a protein sequence space of module-size fragments, with far-reaching evolutionary implications. The theory generates numerous predictions that are confirmed by massive nucleotide and protein sequence analyses, such as the existence of two distinct classes of amino acids and their periodic distribution along the sequences. The emerging picture of the earliest molecular evolutionary events is outlined: consecutive engagement of codons, formation of the earliest short peptides, and growth of the polypeptide chains to the size of loop closure, 25-30 residues.


2012 ◽  
Vol 86 (17) ◽  
pp. 9163-9174 ◽  
Author(s):  
R. Popa-Wagner ◽  
M. Porwal ◽  
M. Kann ◽  
M. Reuss ◽  
M. Weimer ◽  
...  

Author(s):  
Janice Glasgow ◽  
Evan Steeg

The field of knowledge discovery is concerned with the theory and processes involved in the representation and extraction of patterns or motifs from large databases. Discovered patterns can be used to group data into meaningful classes, to summarize data, or to reveal deviant entries. Motifs stored in a database can be brought to bear on difficult instances of structure prediction or determination from X-ray crystallography or nuclear magnetic resonance (NMR) experiments. Automated discovery techniques are central to understanding and analyzing the rapidly expanding repositories of protein sequence and structure data. This chapter deals with the discovery of protein structure motifs. A motif is an abstraction over a set of recurring patterns observed in a dataset; it captures the essential features shared by a set of similar or related objects. In many domains, such as computer vision and speech recognition, there exist special regularities that permit such motif abstraction. In the protein science domain, the regularities derive from evolutionary and biophysical constraints on amino acid sequences and structures. The identification of a known pattern in a new protein sequence or structure permits the immediate retrieval and application of knowledge obtained from the analysis of other proteins. The discovery and manipulation of motifs—in DNA, RNA, and protein sequences and structures—is thus an important component of computational molecular biology and genome informatics. In particular, identifying protein structure classifications at varying levels of abstraction allows us to organize and increase our understanding of the rapidly growing protein structure datasets. Discovered motifs are also useful for improving the efficiency and effectiveness of X-ray crystallographic studies of proteins, for drug design, for understanding protein evolution, and ultimately for predicting the structure of proteins from sequence data. 
Motifs may be designed by hand, based on expert knowledge. For example, the Chou-Fasman protein secondary structure prediction program (Chou and Fasman, 1978), which dominated the field for many years, depended on the recognition of predefined, user-encoded sequence motifs for α-helices and β-sheets. Several hundred sequence motifs have been cataloged in PROSITE (Bairoch, 1992); identifying one of these motifs in a novel protein often allows an immediate functional interpretation.
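To illustrate how a cataloged sequence motif can be identified in a new protein, the following is a minimal sketch that translates a PROSITE-style pattern into a regular expression and scans a sequence with it. It is not the PROSITE scanning software. The function names `prosite_to_regex` and `find_motif` and the test sequence are hypothetical, and only the basic pattern elements are handled (`x`, `[...]`, `{...}`, repeat counts, and `<`/`>` anchors at the ends of a pattern). The example pattern `N-{P}-[ST]-{P}` is the N-glycosylation site pattern.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Simplified sketch: anchors are only recognized at the very start or end
    of an element, not inside square brackets.
    """
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]

        # Split off an optional repeat count such as x(3) or x(2,4).
        repeat = ""
        counted = re.fullmatch(r"(.+?)\((\d+(?:,\d+)?)\)", element)
        if counted:
            element = counted.group(1)
            repeat = "{" + counted.group(2) + "}"

        if element == "x":               # any residue
            core = "."
        elif element.startswith("{") and element.endswith("}"):
            core = "[^" + element[1:-1] + "]"   # any residue except these
        else:
            core = element               # one residue or an [ABC] choice
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_motif(sequence: str, pattern: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for every occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical sequence, chosen only to exercise the pattern.
    print(find_motif("MKNGSAANPTL", "N-{P}-[ST]-{P}"))
```

In this sketch the asparagine at position 3 is reported, while the one at position 8 is skipped because it is followed by proline, which the pattern excludes. A regular expression captures only exact, hand-written patterns of this kind; it does not model the weighted or profile-based motifs that discovery methods may produce.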

