Search for Highly Divergent Tandem Repeats in Amino Acid Sequences

We report a Method to Search for Highly Divergent Tandem Repeats (MSHDTR) in protein sequences which considers pairwise correlations between adjacent residues. MSHDTR was compared with some previously developed methods for searching for tandem repeats (TRs) in amino acid sequences, such as T-REKS and XSTREAM, which focus on the identification of TRs with significant sequence similarity, whereas MSHDTR detects repeats that significantly diverged during evolution, accumulating deletions, insertions, and substitutions. The application of MSHDTR to a search of the Swiss-Prot databank revealed over 15 thousand TR-containing amino acid sequences that were difficult to find using the other methods. Among the detected TRs, the most representative were those with consensus lengths of two and seven residues; these TRs were subjected to cluster analysis and the classes of patterns were identified. All TRs detected in this study have been combined into a databank accessible over the WWW.
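
To make the contrast in this abstract concrete, the sketch below scans a sequence for exactly repeated units only. It is not MSHDTR, T-REKS, or XSTREAM; the function name, parameters, and toy sequences are all hypothetical. It only illustrates why a detector that relies on strong sequence similarity can miss repeats that have accumulated substitutions.

```python
def find_exact_tandem_repeats(seq, min_unit=2, max_unit=7, min_copies=3):
    """Return (start, unit, copies) for runs of exactly repeated units.

    Illustrative only: exact matching, no substitutions or indels allowed.
    """
    hits = []
    i, n = 0, len(seq)
    while i < n:
        best = None
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # not enough sequence left for a unit of this length
            copies = 1
            # Count how many times the unit is repeated back to back.
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            span = copies * k
            if copies >= min_copies and (best is None or span > best[2] * len(best[1])):
                best = (i, unit, copies)
        if best:
            hits.append(best)
            i += len(best[1]) * best[2]  # jump past the reported repeat
        else:
            i += 1
    return hits


# Hypothetical toy sequences: a perfect (GA)x4 repeat and a diverged copy of it.
perfect = "MKGAGAGAGALLQ"
diverged = "MKGAGSGAGTLLQ"
print(find_exact_tandem_repeats(perfect))
print(find_exact_tandem_repeats(diverged))
```

The second call finds nothing even though the period-2 pattern is still visible by eye, which is the kind of case the abstract says divergence-tolerant methods are meant to address.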

Diversity of Primary Structures of the Carboxy-terminal Regions of Mammalian Fibrinogen Aα-Chains

Thrombosis and Haemostasis ◽

10.1055/s-0038-1651611 ◽

1993 ◽

Vol 69 (04) ◽

pp. 351-360 ◽

Cited By ~ 26

Author(s):

Masahiro Murakawa ◽

Takashi Okamura ◽

Takumi Kamura ◽

Tsunefumi Shibuya ◽

Mine Harada ◽

...

Keyword(s):

Amino Acid ◽

Rhesus Monkey ◽

Syrian Hamster ◽

Tandem Repeats ◽

Mammalian Species ◽

Amino Acid Sequences ◽

Point Of View ◽

Cross Linking ◽

Amino Terminal ◽

Rgd Sequence

Summary: The partial amino acid sequences of fibrinogen Aα-chains from five mammalian species have been inferred by means of the polymerase chain reaction (PCR). From the genomic DNA of the rhesus monkey, pig, dog, mouse and Syrian hamster, the DNA fragments coding for α-C domains in the Aα-chains were amplified and sequenced. In all species examined, four cysteine residues were conserved at homologous positions. The carboxy- and amino-terminal portions of the α-C domains showed considerable homology among the species. However, the sizes of the middle portions, which corresponded to the internal repeat structures, showed an apparent variability because of several insertions and/or deletions. In the rhesus monkey, pig, mouse and Syrian hamster, 13-amino-acid tandem repeats fundamentally similar to those in humans and the rat were identified. In the dog, however, tandem repeats were found to consist of 18 amino acids, suggesting an independent multiplication of the canine repeats. The sites of the α-chain cross-linking acceptor and α2-plasmin inhibitor cross-linking donor were not always evolutionarily conserved. The arginyl-glycyl-aspartic acid (RGD) sequence was not found in the amplified region of either the rhesus monkey or the pig. In the canine α-C domain, two RGD sequences were identified at positions homologous to both the rat and human RGDS sequences. In the Syrian hamster, a single RGD sequence was found at the same position as in the rat. Triplication of the RGD sequences was seen in the murine fibrinogen α-C domain around the site homologous to the rat RGDS sequence. These findings are of some interest from the point of view of structure-function and evolutionary relationships in the mammalian fibrinogen Aα-chains.

Computational Analysis of Therapeutic Enzyme Uricase from Different Source Organisms

Current Proteomics ◽

10.2174/1570164616666190617165107 ◽

2020 ◽

Vol 17 (1) ◽

pp. 59-77

Author(s):

Anand Kumar Nelapati ◽

JagadeeshBabu PonnanEttiyappan

Keyword(s):

Uric Acid ◽

Amino Acid ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Protein Sequences ◽

Amino Acid Sequences ◽

Amino Acid Residues ◽

Multiple Sequence ◽

Physiochemical Properties ◽

Pharmaceutical Industries

Background: Hyperuricemia and gout are conditions that result from the accumulation of uric acid in the blood and urine. Uric acid is the product of the purine metabolic pathway in humans. Uricase is a therapeutic enzyme that enzymatically reduces the concentration of uric acid in serum and urine by converting it into the more soluble allantoin. Uricases are widely available from several sources, such as bacteria, fungi, yeast, plants and animals. Objective: The present study is aimed at elucidating the structure and physiochemical properties of uricase by in silico analysis. Methods: A total of sixty amino acid sequences of uricase from different sources were obtained from NCBI, and different analyses, such as multiple sequence alignment (MSA), homology search, phylogenetic relation, motif search, domain architecture and physiochemical properties including pI, EC, Ai and Ii, were performed. Results: Multiple sequence alignment of all the selected protein sequences exhibited distinct differences between bacterial, fungal, plant and animal sources based on the position-specific existence of conserved amino acid residues. The maximum homology of all the selected protein sequences is between 51-388. Within individual categories, homology is between 16-337 for bacterial uricase, 14-339 for fungal uricase, 12-317 for plant uricase, and 37-361 for animal uricase. The phylogenetic tree constructed based on the amino acid sequences revealed clusters indicating that the uricases come from different sources. The physiochemical features revealed that the uricase sequences contain between 300 and 338 amino acid residues, with a molecular weight of 33-39 kDa and a theoretical pI ranging from 4.95 to 8.88. The amino acid composition results showed that valine has a high average frequency of 8.79 percent compared to the other amino acids in all analyzed species. Conclusion: In the field of bioinformatics, this work might be informative and a stepping-stone for other researchers seeking an overview of the physicochemical features, evolutionary history and structural motifs of uricase, which can be widely used in the biotechnological and pharmaceutical industries. Therefore, the proposed in silico analysis can be considered for protein engineering work, as well as for gout therapy.
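
The amino acid composition statistic quoted in this abstract (an average frequency per residue across sequences) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical and the two short sequences are invented toy data, not uricase sequences.

```python
from collections import Counter


def amino_acid_composition(seq):
    """Return {residue: percentage of the sequence} for one protein sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def mean_frequency(seqs, residue):
    """Average the percentage of one residue over several sequences."""
    percentages = [amino_acid_composition(s).get(residue, 0.0) for s in seqs]
    return sum(percentages) / len(percentages)


# Hypothetical toy sequences, used only to show the arithmetic.
toy = ["VVAG", "VAAA"]
print(amino_acid_composition(toy[0]))
print(mean_frequency(toy, "V"))
```

Averaging per-sequence percentages, as done here, weights every sequence equally regardless of its length; pooling all residues first would be a different (also reasonable) choice, and the abstract does not say which one was used.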

Independent pseudogenizations and losses of sox15 during amniote diversification following asymmetric ohnolog evolution

BMC Ecology and Evolution ◽

10.1186/s12862-021-01864-z ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Yusaku Ogita ◽

Kei Tamura ◽

Shuuji Mawaribuchi ◽

Nobuhiko Takamatsu ◽

Michihiko Ito

Keyword(s):

Sequence Similarity ◽

The Other ◽

Vertebrate Evolution ◽

Skeletal Muscle Regeneration ◽

Cns Development ◽

Placental Development ◽

Relaxed Selection ◽

Significant Sequence Similarity ◽

Anuran Amphibians ◽

Divergent Gene

Background: Four ohnologous genes (sox1, sox2, sox3, and sox15) were generated by two rounds of whole-genome duplication in a vertebrate ancestor. In eutherian mammals, Sox1, Sox2, and Sox3 participate in central nervous system (CNS) development. Sox15 has a function in skeletal muscle regeneration and has little functional overlap with the other three ohnologs. In contrast, the frog Xenopus laevis and zebrafish orthologs of sox15 as well as sox1-3 function in CNS development. We previously reported that Sox15 is involved in mouse placental development as a neofunctionalization, but is pseudogenized in the marsupial opossum. These findings suggest that sox15 might have evolved with divergent gene fates during vertebrate evolution. However, knowledge concerning sox15 in vertebrate lineages other than therian mammals, anuran amphibians, and teleost fish is scarce. Our purpose in this study was to clarify the fate and molecular evolution of sox15 during vertebrate evolution. Results: We searched for sox15 orthologs in all vertebrate classes from agnathans to mammals by significant sequence similarity and synteny analyses using vertebrate genome databases. Interestingly, sox15 was independently pseudogenized at least twice during diversification of the marsupial mammals. Moreover, we observed independent gene loss of sox15 at least twice during reptile evolution in squamates and crocodile-bird diversification. Codon-based phylogenetic tree and selective analyses revealed an increased dN/dS ratio for sox15 compared to the other three ohnologs during jawed vertebrate evolution. Conclusions: The findings revealed an asymmetric evolution of sox15 among the four ohnologs during vertebrate evolution, which was supported by the increased dN/dS values in cartilaginous fishes, anuran amphibians, and amniotes. The increased dN/dS value of sox15 may have been caused mainly by relaxed selection. Notably, independent pseudogenizations and losses of sox15 were observed during marsupial and reptile evolution, respectively. Both might have been caused by strongly relaxed selection. The drastic gene fates of sox15, including neofunctionalization and pseudogenizations/losses during amniote diversification, might be caused by a release from evolutionary constraints.

A novel giant secretion polypeptide in Chironomus salivary glands: implications for another Balbiani ring gene.

The Journal of Cell Biology ◽

10.1083/jcb.101.3.1044 ◽

1985 ◽

Vol 101 (3) ◽

pp. 1044-1051 ◽

Cited By ~ 27

Author(s):

W Y Kao ◽

S T Case

Keyword(s):

Amino Acid ◽

Nucleotide Sequence ◽

Salivary Glands ◽

Cyanogen Bromide ◽

Tryptic Peptide ◽

Amino Acid Sequences ◽

The Other ◽

Balbiani Ring ◽

Sequence Organization ◽

Peptide Maps

Chironomus salivary glands contain a family of high Mr (approximately 1,000 × 10³) secretion polypeptides thought to consist of three components: sp-Ia, sp-Ib, and sp-Ic. The use of a new extraction protocol revealed a novel high Mr component, sp-Id. Results of a survey of individual salivary glands indicated that sp-Id was widespread in more than a dozen strains of C. tentans and C. pallidivittatus. Sp-Id was phosphorylated at Ser residues, and a comparison of cyanogen bromide and tryptic peptide maps of 32P-labeled polypeptides suggested that sp-Ia, sp-Ib, and sp-Id are comprised of similar but nonidentical tandemly repeated amino acid sequences. We concluded that sp-Id is encoded by an mRNA whose size and nucleotide sequence organization are similar to Balbiani ring (BR) mRNAs that code for the other sp-I components. Furthermore, parallel repression of sp-Ib and sp-Id synthesis by galactose led us to hypothesize that both of their genes exist within Balbiani ring 2.

An intergenic G-rich region in Leishmania tarentolae kinetoplast maxicircle DNA is a pan-edited cryptogene encoding ribosomal protein S12

Molecular and Cellular Biology ◽

10.1128/mcb.12.1.56-67.1992 ◽

1992 ◽

Vol 12 (1) ◽

pp. 56-67

Author(s):

D A Maslov ◽

N R Sturm ◽

B M Niner ◽

E S Gruszynski ◽

M Peris ◽

...

Keyword(s):

Amino Acid ◽

Ribosomal Protein ◽

Ribosomal Proteins ◽

Sequence Similarity ◽

Rich Region ◽

Leishmania Tarentolae ◽

Significant Sequence Similarity ◽

The Family ◽

Intergenic Regions ◽

Ribosomal Protein S12

Six short G-rich intergenic regions in the maxicircle of Leishmania tarentolae are conserved in location and polarity in two other kinetoplastid species. We show here that G-rich region 6 (G6) represents a pan-edited cryptogene which contains at least two domains edited independently in a 3'-to-5' manner connected by short unedited regions. In the completely edited RNA, 117 uridines are added at 49 sites and 32 uridines are deleted at 13 sites, creating a translated 85-amino-acid polypeptide. Similar polypeptides are probably encoded by pan-edited G6 transcripts in two other species. The G6 polypeptide has significant sequence similarity to the family of S12 ribosomal proteins. A minicircle-encoded gRNA overlaps 12 editing sites in G6 mRNA, and chimeric gRNA/mRNA molecules were shown to exist, in agreement with the transesterification model for editing.

Techniques for the verification of minimal phylogenetic trees illustrated with ten mammalian haemoglobin sequences

Biochemical Journal ◽

10.1042/bj1870065 ◽

1980 ◽

Vol 187 (1) ◽

pp. 65-74 ◽

Cited By ~ 12

Author(s):

D Penny ◽

M D Hendy ◽

L R Foulds

Keyword(s):

Amino Acid ◽

Phylogenetic Tree ◽

Protein Sequence ◽

Phylogenetic Trees ◽

Sequence Data ◽

Protein Sequences ◽

Nucleotide Sequences ◽

Amino Acid Sequences ◽

Minimal Tree ◽

Protein Sequence Data

We have recently reported a method to identify the shortest possible phylogenetic tree for a set of protein sequences [Foulds, Hendy & Penny (1979) J. Mol. Evol. 13, 127–150; Foulds, Penny & Hendy (1979) J. Mol. Evol. 13, 151–166]. The present paper discusses issues that arise during the construction of minimal phylogenetic trees from protein-sequence data. The conversion of the data from amino acid sequences into nucleotide sequences is shown to be advantageous. A new variation of a method for constructing a minimal tree is presented. Our previous methods have involved first constructing a tree and then either proving that it is minimal or transforming it into a minimal tree. The approach presented in the present paper progressively builds up a tree, taxon by taxon. We illustrate this approach by using it to construct a minimal tree for ten mammalian haemoglobin alpha-chain sequences. Finally, we define a measure of the complexity of the data and illustrate a method to derive a directed phylogenetic tree from the minimal tree.
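
To illustrate what the "length" of a phylogenetic tree means when searching for a minimal one, the sketch below scores a given rooted binary tree with Fitch's small-parsimony procedure. This is a swapped-in, standard technique used only for illustration; it is not the tree-building method of this paper, which the abstract does not spell out. The taxon names, the nested-tuple tree encoding, and the two-site alignment are all hypothetical.

```python
def fitch_site(tree, states):
    """Return (candidate state set, substitutions) for one alignment column.

    `tree` is a taxon name (leaf) or a pair (left_subtree, right_subtree).
    `states` maps each taxon name to its character at this column.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    lset, lcost = fitch_site(left, states)
    rset, rcost = fitch_site(right, states)
    common = lset & rset
    if common:
        # The children can agree on a state: no extra substitution needed.
        return common, lcost + rcost
    # The children disagree: keep both options and count one substitution.
    return lset | rset, lcost + rcost + 1


def tree_length(tree, alignment):
    """Sum the Fitch substitution counts over all columns of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Hypothetical four-taxon, two-site alignment and two candidate topologies.
alignment = {"a": "AC", "b": "AC", "c": "GC", "d": "GT"}
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = (("a", "c"), ("b", "d"))
print(tree_length(tree_1, alignment), tree_length(tree_2, alignment))
```

Under this scoring, the topology with the smaller total is the more parsimonious of the two; finding the minimal tree over all topologies is the harder search problem the abstract addresses.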

Molecular characterization and phylogenetic analysis of NBS-LRR genes in wild relatives of eggplant (Solanum melongena L.)

Indian Journal of Agricultural Research ◽

10.18805/ijare.a-4793 ◽

2018 ◽

Author(s):

Sona. S Dev ◽

P. Poornima ◽

Akhil Venu

Keyword(s):

Phylogenetic Analysis ◽

Amino Acid ◽

Sequence Similarity ◽

Interleukin 1 ◽

Preliminary Investigation ◽

Solanum Melongena ◽

Wild Relatives ◽

Amino Acid Sequences ◽

R Genes ◽

Multiple Sequence

Eggplant, or brinjal (Solanum melongena L.), is highly susceptible to various soil-borne diseases. The extensive use of chemical fungicides to combat these diseases can be minimized by identification of resistance gene analogs (RGAs) in wild species of cultivated plants. In the present study, degenerate PCR primers for the conserved regions of nucleotide binding site-leucine rich repeat (NBS-LRR) genes were used to amplify RGAs from wild relatives of eggplant (black nightshade (Solanum nigrum), Indian nightshade (Solanum violaceum) and Solanum incanum) which showed resistance to the bacterial wilt pathogen, Ralstonia solanacearum, in the preliminary investigation. The amino acid sequences of the amplicons, when compared to each other and to the amino acid sequences of known RGAs deposited in GenBank, revealed significant sequence similarity. The phylogenetic analysis indicated that they belonged to the Toll/interleukin-1 receptor (TIR)-NBS-LRR type of R genes. Multiple sequence alignment with other known R genes showed significant homology with the P-loop, Kinase 2 and GLPL domains of NBS-LRR class genes. There has been no report on R genes from these wild eggplants, and hence the diversity analysis of these novel RGAs can lead to the identification of other novel R genes within the germplasm of different brinjal plants as well as other species of Solanum.

History, Diagnosis, and Treatment of Coronavirus Disease 2019 (COVID-19)

Coronaviruses ◽

10.2174/2666796702666210805101958 ◽

2021 ◽

Vol 02 ◽

Author(s):

Amaresh Mishra ◽

Nisha Nair ◽

Vishwas Tripathi ◽

Yamini Pathak ◽

Jaseela Majeed

Keyword(s):

Amino Acid ◽

Severe Acute Respiratory Syndrome ◽

Sequence Similarity ◽

Neurological Complications ◽

Amino Acid Sequences ◽

World Health ◽

Amino Acid Sequence Similarity ◽

Effective Prevention ◽

Short Period ◽

The World

The Coronavirus Disease 2019 (COVID-19), also known as a novel coronavirus (2019-nCoV), reportedly originated from Wuhan City, Hubei Province, China. Coronavirus Disease 2019 rapidly spread all over the world within a short period. On January 30th, 2020, the World Health Organization (WHO) declared it a global epidemic. COVID-19 is caused by a virus related to severe acute respiratory syndrome coronavirus (SARS-CoV) and can progress to respiratory, hepatic, gastrointestinal, and neurological complications, and eventually death. The SARS-CoV and Middle East Respiratory Syndrome coronavirus (MERS-CoV) genome sequences share similar identity with 2019-nCoV, or severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, a few amino acid sequences of 2019-nCoV differ from those of SARS-CoV and MERS-CoV. COVID-19 shares about 90% amino acid sequence similarity with SARS-CoV. Effective prevention methods should be taken in order to control this pandemic situation. To date, there are no effective treatments available to treat COVID-19. This review provides information regarding COVID-19 history, epidemiology, pathogenesis, and molecular diagnosis. Also, we focus on the development of vaccines in the management of this COVID-19 pandemic and limiting the spread of the virus.

Unevolved De Novo Proteins Have Innate Tendencies to Bind Transition Metals

Life ◽

10.3390/life9010008 ◽

2019 ◽

Vol 9 (1) ◽

pp. 8 ◽

Cited By ~ 4

Author(s):

Michael S. Wang ◽

Kenric J. Hoegler ◽

Michael H. Hecht

Keyword(s):

Amino Acid ◽

Transition Metals ◽

Metal Binding ◽

Combinatorial Library ◽

De Novo ◽

Protein Sequences ◽

Amino Acid Sequences ◽

Ancestral Sequences ◽

Wide Range ◽

Catalytic Functions

Life as we know it would not exist without the ability of protein sequences to bind metal ions. Transition metals, in particular, play essential roles in a wide range of structural and catalytic functions. The ubiquitous occurrence of metalloproteins in all organisms leads one to ask whether metal binding is an evolved trait that occurred only rarely in ancestral sequences, or alternatively, whether it is an innate property of amino acid sequences, occurring frequently in unevolved sequence space. To address this question, we studied 52 proteins from a combinatorial library of novel sequences designed to fold into 4-helix bundles. Although these sequences were neither designed nor evolved to bind metals, the majority of them have innate tendencies to bind the transition metals copper, cobalt, and zinc with high nanomolar to low-micromolar affinity.

BIOPEP-UWM Database of Bioactive Peptides: Current Opportunities

International Journal of Molecular Sciences ◽

10.3390/ijms20235978 ◽

2019 ◽

Vol 20 (23) ◽

pp. 5978 ◽

Cited By ~ 49

Author(s):

Minkiewicz ◽

Iwaniak ◽

Darewicz

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Chronic Diseases ◽

Bioactive Peptides ◽

Protein Sequences ◽

Batch Processing ◽

Amino Acid Sequences ◽

Quantitative Parameters ◽

New Information

The BIOPEP-UWM™ database of bioactive peptides (formerly BIOPEP) has recently become a popular tool in research on bioactive peptides, especially those derived from foods and forming constituents of diets that prevent the development of chronic diseases. The database is continuously updated and modified. The addition of new peptides and the introduction of new information about the existing ones (e.g., chemical codes and references to other databases) are in progress. New opportunities include the possibility of annotating peptides containing D-enantiomers of amino acids, a batch processing option, conversion of amino acid sequences into SMILES code, new quantitative parameters characterizing the presence of bioactive fragments in protein sequences, and finding proteinases that release particular peptides.
