ProtParCon: A Framework for Processing Molecular Data and Identifying Parallel and Convergent Amino Acid Replacements

Genes ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 181
Author(s):  
Fei Yuan ◽  
Hoa Nguyen ◽  
Dan Graur

The study of parallel and convergent amino acid replacements in protein evolution is frequently used to assess adaptive evolution at the molecular level. Identifying parallel and convergent replacements involves multiple steps and computational routines, such as multiple sequence alignment, phylogenetic tree inference, ancestral state reconstruction, topology tests, and simulation of sequence evolution. Here, we present ProtParCon, a Python 3 package that provides a common interface for users to process molecular data and identify parallel and convergent amino acid replacements in orthologous protein sequences. By integrating several widely used programs for computational biology, ProtParCon implements general functions for handling multiple sequence alignment, ancestral-state reconstruction, maximum-likelihood phylogenetic tree inference, and sequence simulation. ProtParCon also contains a built-in pipeline that automates all these sequential steps and enables quick identification of observed and expected parallel and convergent amino acid replacements under different evolutionary assumptions. The most up-to-date version of ProtParCon, including scripts containing user tutorials, the full API reference, and documentation, is publicly and freely available under an open-source MIT License via GitHub. The latest stable release is also available on PyPI (the Python Package Index).
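
To make the distinction between the two kinds of replacement concrete, here is a minimal Python sketch. It does not use ProtParCon's API; the function names `classify_site` and `find_replacements` are hypothetical, and the sketch assumes the usual definitions: a replacement is parallel when two independent branches change from the same ancestral residue to the same derived residue, and convergent when they reach the same derived residue from different ancestral residues. The ancestral sequences are assumed to have been reconstructed already and aligned to the descendants.

```python
def classify_site(anc_a, desc_a, anc_b, desc_b):
    """Classify one aligned site on two independent branches.

    Returns "parallel", "convergent", or None when the site shows neither.
    """
    # Skip sites with gaps or ambiguous residues (an assumption of this sketch).
    if {anc_a, desc_a, anc_b, desc_b} & {"-", "X", "?"}:
        return None
    # Both branches must have changed, and must end in the same residue.
    if anc_a == desc_a or anc_b == desc_b or desc_a != desc_b:
        return None
    return "parallel" if anc_a == anc_b else "convergent"


def find_replacements(anc_a, desc_a, anc_b, desc_b):
    """Scan four aligned sequences and list parallel/convergent sites.

    Each hit is (1-based position, kind, change on branch A, change on branch B).
    """
    if len({len(anc_a), len(desc_a), len(anc_b), len(desc_b)}) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    hits = []
    for pos, states in enumerate(zip(anc_a, desc_a, anc_b, desc_b), start=1):
        kind = classify_site(*states)
        if kind is not None:
            a0, a1, b0, b1 = states
            hits.append((pos, kind, f"{a0}->{a1}", f"{b0}->{b1}"))
    return hits


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    for hit in find_replacements("MKTA", "MRTS", "MKTG", "MRTS"):
        print(hit)
```

Counting such sites gives the observed numbers; the expected numbers mentioned in the abstract would additionally require simulating sequence evolution, which this sketch does not attempt.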

2021 ◽  
Author(s):  
Jūlija Pečerska ◽  
Manuel Gil ◽  
Maria Anisimova

Multiple sequence alignment and phylogenetic tree inference are connected problems that are often solved as independent steps in the inference process. Several attempts at simultaneous inference have been made; however, the currently available methods are severely limited by their computational complexity and can handle only small datasets. In this manuscript, we introduce a combinatorial optimisation approach that will allow us to resolve the circularity of the problem and efficiently infer both alignments and trees under maximum likelihood.


2020 ◽  
Vol 17 (1) ◽  
pp. 59-77
Author(s):  
Anand Kumar Nelapati ◽  
JagadeeshBabu PonnanEttiyappan

Background: Hyperuricemia and gout are conditions that result from the accumulation of uric acid in the blood and urine. Uric acid is the product of the purine metabolic pathway in humans. Uricase is a therapeutic enzyme that reduces the concentration of uric acid in serum and urine by converting it into the more soluble allantoin. Uricases are found in many sources, including bacteria, fungi, yeast, plants and animals. Objective: The present study aimed to elucidate the structure and physicochemical properties of uricase by in silico analysis. Methods: A total of sixty uricase amino acid sequences from different sources were obtained from NCBI, and analyses including multiple sequence alignment (MSA), homology search, phylogenetic relationships, motif search, domain architecture and physicochemical properties (including pI, EC, Ai and Ii) were performed. Results: Multiple sequence alignment of all the selected protein sequences showed distinct differences between bacterial, fungal, plant and animal sources based on the position-specific presence of conserved amino acid residues. The maximum homology of all the selected protein sequences is between 51-388. Within individual categories, homology is between 16-337 for bacterial uricase, 14-339 for fungal uricase, 12-317 for plant uricase, and 37-361 for animal uricase. The phylogenetic tree constructed from the amino acid sequences revealed clusters indicating that the uricases come from different sources. The physicochemical features showed that the uricases are between 300 and 338 amino acid residues long, with molecular weights of 33-39 kDa and theoretical pI values ranging from 4.95 to 8.88. The amino acid composition results showed that valine has a high average frequency (8.79%) compared with the other amino acids in all analyzed species. Conclusion: In the field of bioinformatics, this work may be informative and a stepping stone for other researchers seeking to understand the physicochemical features, evolutionary history and structural motifs of uricase, which can be widely used in the biotechnological and pharmaceutical industries. Therefore, the proposed in silico analysis can be considered for protein engineering work, as well as for gout therapy.
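
An average amino acid frequency such as the valine figure above can be computed along the following lines. This is a minimal Python sketch, not the authors' method: the function names are hypothetical, and it assumes the average is taken over per-sequence percentages (pooling all residues before dividing would give a slightly different number when sequence lengths differ). The abstract does not say which was used.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Percentage of each standard amino acid in a single sequence."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def average_composition(seqs):
    """Mean of the per-sequence percentages (an assumption of this sketch)."""
    per_seq = [composition(s) for s in seqs]
    if not per_seq:
        raise ValueError("no sequences given")
    return {aa: sum(c[aa] for c in per_seq) / len(per_seq) for aa in AMINO_ACIDS}


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    avg = average_composition(["VVAL", "VAAA"])
    most_common = max(avg, key=avg.get)
    print(most_common, round(avg[most_common], 2))
```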


2020 ◽  
Vol 14 (3) ◽  
pp. 235-246
Author(s):  
Sara Abdollahi ◽  
Mohammad H. Morowvat ◽  
Amir Savardashtaki ◽  
Cambyz Irajie ◽  
Sohrab Najafipour ◽  
...  

Background: Arginine deiminase is a bacterial enzyme that degrades L-arginine. Some human cancers, such as hepatocellular carcinoma (HCC) and melanoma, are auxotrophic for arginine. Therefore, PEGylated arginine deiminase (ADI-PEG20) is a good anticancer candidate with antitumor effects. It causes local depletion of L-arginine and growth inhibition in arginine-auxotrophic tumor cells. The FDA and EMA have granted orphan status to this drug. Some recently published patents have dealt with this enzyme or its PEGylated form. Objective: Given the increasing attention to this enzyme, we aimed to evaluate and compare 30 arginine deiminase proteins from different bacterial species through in silico analysis. Methods: The analyses included investigation of physicochemical properties, multiple sequence alignment (MSA), and motif, superfamily, phylogenetic and 3D comparative analyses of arginine deiminase proteins through various bioinformatics tools. Results: The most abundant amino acid in the arginine deiminase proteins is leucine (10.13%), while the least abundant is cysteine (0.98%). Multiple sequence alignment showed 47 conserved patterns among the 30 arginine deiminase amino acid sequences. The results of sequence homology among the 30 different groups of arginine deiminase enzymes revealed that all the studied sequences belong to the amidinotransferase superfamily. Based on the phylogenetic analysis, two major clusters were identified. Considering the results of the various in silico studies, we selected the five best candidates for further investigation. The 3D structures of the five best arginine deiminase proteins were generated by the I-TASSER server and PyMOL. The RAMPAGE analysis revealed that 81.4%-91.4% of the selected sequences were located in the favored region of arginine deiminase proteins. Conclusion: The results of this study shed light on the basic physicochemical properties of thirty major arginine deiminase sequences. The obtained data could be employed for further in vivo and clinical studies and also for developing the related therapeutic enzymes.


2015 ◽  
Vol 28 (1) ◽  
pp. 46 ◽  
Author(s):  
David A. Morrison ◽  
Matthew J. Morgan ◽  
Scot A. Kelchner

Sequence alignment is just as much a part of phylogenetics as is tree building, although it is often viewed solely as a necessary tool to construct trees. However, alignment for the purpose of phylogenetic inference is primarily about homology, as it is the procedure that expresses homology relationships among the characters, rather than the historical relationships of the taxa. Molecular homology is rather vaguely defined and understood, despite its importance in the molecular age. Indeed, homology has rarely been evaluated with respect to nucleotide sequence alignments, in spite of the fact that nucleotides are the only data that directly represent genotype. All other molecular data represent phenotype, just as do morphology and anatomy. Thus, efforts to improve sequence alignment for phylogenetic purposes should involve a more refined use of the homology concept at a molecular level. To this end, we present examples of molecular-data levels at which homology might be considered, and arrange them in a hierarchy. The concept that we propose has many levels, which link directly to the developmental and morphological components of homology. Of note, there is no simple relationship between gene homology and nucleotide homology. We also propose terminology with which to better describe and discuss molecular homology at these levels. Our over-arching conceptual framework is then used to shed light on the multitude of automated procedures that have been created for multiple-sequence alignment. Sequence alignment needs to be based on aligning homologous nucleotides, without necessary reference to homology at any other level of the hierarchy. In particular, inference of nucleotide homology involves deriving a plausible scenario for molecular change among the set of sequences. 
Our clarifications should allow the development of a procedure that specifically addresses homology, which is required when performing alignment for phylogenetic purposes, but which does not yet exist.


Author(s):  
U. G. Adebo ◽  
J. O. Matthew

Multiple sequence analysis is one of the most widely used approaches for estimating similarity among genotypes. To obtain useful information for the utilization of bush mango genetic resources, nucleotide sequences of eight bush mango (Irvingia gabonensis) cultivars were retrieved from the NCBI database and evaluated for diversity and similarity using a computational biology approach. The highest alignment score (26.18), indicating the highest similarity, was shared by two sequence pairs, BM07:BM58 and BM12:BM69, while the lowest score (19.43) was between BM01:BM13. The phylogenetic tree broadly divided the cultivars into four distinct groups: BM07 and BM58 (cluster 1); BM01 (cluster 2); BM15, BM13 and BM35 (cluster 3); and BM12 and BM69 (cluster 4). The aligned sequences revealed only a few fully conserved regions, consisting of the single nucleotides A and T, which were consistent throughout evolution. The results of this study indicate that the bush mango cultivars are divergent and can be useful genetic resources for bush mango improvement through breeding.
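
Fully conserved positions in a multiple sequence alignment can be found with a simple column scan. The Python sketch below is illustrative only: `conserved_columns` is a hypothetical helper, the toy alignment is invented, and the sketch assumes the sequences are already aligned to equal length, with `-` marking gaps. It is not the tool used in the study.

```python
def conserved_columns(alignment):
    """Return (1-based position, symbol) for every fully conserved column.

    A column counts as conserved when every row carries the same symbol
    and that symbol is not a gap.
    """
    rows = [row.upper() for row in alignment]
    if not rows:
        raise ValueError("alignment is empty")
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all rows must have the same aligned length")
    conserved = []
    for pos, column in enumerate(zip(*rows), start=1):
        symbols = set(column)
        if len(symbols) == 1 and "-" not in symbols:
            conserved.append((pos, column[0]))
    return conserved


if __name__ == "__main__":
    # Toy nucleotide alignment, invented for illustration only.
    print(conserved_columns(["ATGCA", "ATG-A", "TTGCA"]))
```

When only a handful of columns pass this test, as the abstract reports, the sequences are divergent at most sites.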

