Extraction of Protein Sequence Motif Information using Bio-Inspired Computing

2020 ◽  
pp. 1306-1327
Author(s):  
Gowri Rajasekaran ◽  
Rathipriya R

Many people today are affected by genetic disorders and hereditary diseases. Protein complexes and their functions are studied in order to detect irregularities in gene expression. Within a group of related proteins there exist conserved sequence patterns (motifs) that are functionally or structurally similar. The main objective of this work is to extract motif information from a given protein sequence dataset. The functions of proteins are ideally determined from their motif information. Clustering is a core data mining technique, and biclustering is also used in many bioinformatics studies. This work proposes PSO K-Means clustering and biclustering approaches to extract the motif information. Motifs are extracted based on the structural homogeneity of the protein sequences. The resulting clusters and biclusters are compared on homogeneity and on the motif information extracted. The study shows that the biclustering approach yields better results than the clustering approach.
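
The abstract does not give the particle encoding or fitness function used by the authors, so the following is only a minimal sketch of one common way to combine particle swarm optimization (PSO) with the K-Means objective. It assumes each particle encodes k candidate centroids and that fitness is the within-cluster sum of squared distances. The function names, parameter values, and toy data are hypothetical and are not taken from the paper.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(centroids, points):
    """K-Means objective: squared distance of each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def pso_kmeans(points, k, n_particles=10, n_iter=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for k centroids with PSO (assumed encoding: one particle = k centroids)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Seed every particle with k centroids drawn from the data points.
    pos = [[list(rng.choice(points)) for _ in range(k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_fit = [sse(p, points) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    initial_fit = gbest_fit

    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Standard velocity update: inertia + cognitive + social terms.
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            fit = sse(pos[i], points)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in pos[i]], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in pos[i]], fit

    # Assign each point to its nearest centroid in the best solution found.
    labels = [min(range(k), key=lambda j: sq_dist(p, gbest[j])) for p in points]
    return gbest, labels, initial_fit, gbest_fit

# Toy 2-D feature vectors (hypothetical; real input would be encoded sequence segments).
toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, labels, start_fit, end_fit = pso_kmeans(toy, k=2)
print(labels, start_fit, end_fit)
```

Because the global best is only replaced by a strictly better solution, the final objective value can never exceed the initial one. A hybrid scheme could additionally refine the returned centroids with ordinary K-Means iterations; that step is omitted here.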



2009 ◽  
pp. 2632-2656
Author(s):  
Pedro Gabriel Ferreira ◽  
Paulo Jorge Azevedo

Protein sequence motifs describe, by means of an enhanced regular expression syntax, regions of amino acids that have been conserved across several functionally related proteins. These regions may have implications at the structural and functional level of the proteins. Sequence motif analysis can bring significant improvements towards a better understanding of the protein sequence-structure-function relation. In this chapter, we review the subject of mining deterministic motifs from protein sequence databases. We start by giving a formal definition of the different types of motifs and their respective specificities. Then, we explore the methods available to evaluate the quality and interest of such patterns. Examples of applications and motif repositories are described. We discuss the algorithmic aspects and different methodologies for motif extraction. A brief description of how sequence motifs can be used to extract structural level information patterns is also provided.
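
As an illustration of motifs written in a regular-expression-like syntax, the sketch below translates a pattern in PROSITE-style notation into a Python regular expression and scans a sequence with it. PROSITE notation is assumed here as a representative syntax; the abstract does not say which notation the chapter uses. The converter handles only the basic elements (`x`, `[...]`, `{...}`, `(n)` or `(n,m)`, `<`, `>`), and the helper names and the toy sequence are hypothetical.

```python
import re

def prosite_to_regex(pattern):
    """Translate a basic PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {ABC} means "any residue except A, B, C"; do this before touching parentheses.
        element = element.replace("{", "[^").replace("}", "]")
        # (n) or (n,m) is a repetition count, written {n} or {n,m} in a regex.
        element = element.replace("(", "{").replace(")", "}")
        # x stands for any residue.
        element = element.replace("x", ".")
        # < and > anchor the pattern to the N- and C-terminus.
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)

def find_matches(pattern, sequence):
    """Return (start, matched_text) for each non-overlapping match in the sequence."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group(0)) for m in regex.finditer(sequence)]

# N-{P}-[ST]-{P} is the widely cited N-glycosylation site pattern.
print(prosite_to_regex("N-{P}-[ST]-{P}"))               # N[^P][ST][^P]
print(find_matches("N-{P}-[ST]-{P}", "MKNGSAANPSL"))    # [(2, 'NGSA')]
```

In the toy sequence, the second asparagine is followed by a proline, so the excluded-residue element `{P}` rejects it and only the first site is reported.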


2008 ◽  
pp. 1722-1746
Author(s):  
Pedro Gabriel Ferreira ◽  
Paulo Jorge Azevedo

Protein sequence motifs describe, by means of an enhanced regular expression syntax, regions of amino acids that have been conserved across several functionally related proteins. These regions may have implications at the structural and functional level of the proteins. Sequence motif analysis can bring significant improvements towards a better understanding of the protein sequence-structure-function relation. In this chapter, we review the subject of mining deterministic motifs from protein sequence databases. We start by giving a formal definition of the different types of motifs and their respective specificities. Then, we explore the methods available to evaluate the quality and interest of such patterns. Examples of applications and motif repositories are described. We discuss the algorithmic aspects and different methodologies for motif extraction. A brief description of how sequence motifs can be used to extract structural level information patterns is also provided.


2007 ◽  
Vol 2007 ◽  
pp. 1-23 ◽  
Author(s):  
G. R. Hemalatha ◽  
D. Satyanarayana Rao ◽  
L. Guruprasad

We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region of fewer than 55 amino acid residues that occurs more than once in the protein sequence and is sometimes present in tandem. A “domain” corresponds to a conserved region of more than 55 amino acid residues that may be present as a single copy or as multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. Each repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure.


2003 ◽  
Vol 185 (14) ◽  
pp. 4144-4151 ◽  
Author(s):  
Sheng Ye ◽  
Frank von Delft ◽  
Alexei Brooun ◽  
Mark W. Knuth ◽  
Ronald V. Swanson ◽  
...  

Shikimate dehydrogenase catalyzes the NADPH-dependent reversible reduction of 3-dehydroshikimate to shikimate. We report the first X-ray structure of shikimate dehydrogenase from Haemophilus influenzae to 2.4-Å resolution and its complex with NADPH to 1.95-Å resolution. The molecule contains two domains, a catalytic domain with a novel open twisted α/β motif and an NADPH binding domain with a typical Rossmann fold. The enzyme contains a unique glycine-rich P-loop with a conserved sequence motif, GAGGXX, that results in NADPH adopting a nonstandard binding mode with the nicotinamide and ribose moieties disordered in the binary complex. A deep pocket with a narrow entrance between the two domains, containing strictly conserved residues primarily contributed by the catalytic domain, is identified as a potential 3-dehydroshikimate binding pocket. The flexibility of the nicotinamide mononucleotide portion of NADPH may be necessary for the substrate 3-dehydroshikimate to enter the pocket and for the release of the product shikimate.

