Correction to: Enhancing Statistical Multiple Sequence Alignment and Tree Inference Using Structural Information

Enhancing Statistical Multiple Sequence Alignment and Tree Inference Using Structural Information

Methods in Molecular Biology - Computational Methods in Protein Evolution ◽

10.1007/978-1-4939-8736-8_10 ◽

2018 ◽

pp. 183-214 ◽

Cited By ~ 1

Author(s):

Joseph L. Herman

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structural Information ◽

Multiple Sequence ◽

Tree Inference

MUMMALS: multiple sequence alignment improved by using hidden Markov models with local structural information

Nucleic Acids Research ◽

10.1093/nar/gkl514 ◽

2006 ◽

Vol 34 (16) ◽

pp. 4364-4374 ◽

Cited By ~ 74

Author(s):

Jimin Pei ◽

Nick V. Grishin

Keyword(s):

Hidden Markov Models ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Markov Models ◽

Structural Information ◽

Hidden Markov ◽

Multiple Sequence

Multiple Sequence Alignment Averaging Improves Phylogeny Reconstruction

Systematic Biology ◽

10.1093/sysbio/syy036 ◽

2018 ◽

Vol 68 (1) ◽

pp. 117-130 ◽

Cited By ~ 9

Author(s):

Haim Ashkenazy ◽

Itamar Sela ◽

Eli Levy Karin ◽

Giddy Landan ◽

Tal Pupko

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Sequence Data ◽

Phylogenetic Signal ◽

Large Set ◽

Multiple Sequence ◽

Extra Effort ◽

Alignment Algorithms ◽

Tree Inference ◽

Alignment Errors

The classic methodology of inferring a phylogenetic tree from sequence data is composed of two steps. First, a multiple sequence alignment (MSA) is computed. Then, a tree is reconstructed assuming the MSA is correct. Yet, inferred MSAs were shown to be inaccurate and alignment errors reduce tree inference accuracy. It was previously proposed that filtering unreliable alignment regions can increase the accuracy of tree inference. However, it was also demonstrated that the benefit of this filtering is often obscured by the resulting loss of phylogenetic signal. In this work we explore an approach, in which instead of relying on a single MSA, we generate a large set of alternative MSAs and concatenate them into a single SuperMSA. By doing so, we account for phylogenetic signals contained in columns that are not present in the single MSA computed by alignment algorithms. Using simulations, we demonstrate that this approach results, on average, in more accurate trees compared to 1) using an unfiltered MSA and 2) using a single MSA with weights assigned to columns according to their reliability. Next, we explore in which regions of the MSA space our approach is expected to be beneficial. Finally, we provide a simple criterion for deciding whether or not the extra effort of computing a SuperMSA and inferring a tree from it is beneficial. Based on these assessments, we expect our methodology to be useful for many cases in which diverged sequences are analyzed. The option to generate such a SuperMSA is available at http://guidance.tau.ac.il.
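
The concatenation step described in this abstract can be illustrated with a minimal sketch. It assumes each alternative MSA is held as a plain mapping from taxon name to gapped sequence; the function name `concatenate_msas`, the `Alignment` alias, and the toy sequences are hypothetical and are not taken from the paper or from the GUIDANCE server.

```python
from typing import Dict, List

# Hypothetical representation: taxon name -> aligned (gapped) sequence.
Alignment = Dict[str, str]


def concatenate_msas(msas: List[Alignment]) -> Alignment:
    """Join alternative alignments of the same sequences end to end."""
    if not msas:
        raise ValueError("need at least one alignment")
    taxa = sorted(msas[0])
    for i, msa in enumerate(msas):
        # Every alternative alignment must cover the same taxa.
        if sorted(msa) != taxa:
            raise ValueError(f"alignment {i} has a different taxon set")
        # Rows within one alignment must all have the same length.
        if len({len(seq) for seq in msa.values()}) != 1:
            raise ValueError(f"alignment {i} has rows of unequal length")
    # Column blocks are appended in the order the alignments are given.
    return {name: "".join(msa[name] for msa in msas) for name in taxa}


# Toy data (invented): two alternative alignments of three sequences.
alt_a = {"sp1": "AC-GT", "sp2": "ACCGT", "sp3": "A--GT"}
alt_b = {"sp1": "ACG-T", "sp2": "ACCGT", "sp3": "A-G-T"}
super_msa = concatenate_msas([alt_a, alt_b])
for name, row in super_msa.items():
    print(name, row)
```

The resulting rows could then be written out in any alignment format and passed to a tree inference program; how the alternative MSAs are generated is outside the scope of this sketch.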

T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension

Nucleic Acids Research ◽

10.1093/nar/gkr245 ◽

2011 ◽

Vol 39 (suppl) ◽

pp. W13-W17 ◽

Cited By ~ 568

Author(s):

P. Di Tommaso ◽

S. Moretti ◽

I. Xenarios ◽

M. Orobitg ◽

A. Montanyola ◽

...

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structural Information ◽

Web Server ◽

RNA Sequences ◽

Multiple Sequence

ProtParCon: A Framework for Processing Molecular Data and Identifying Parallel and Convergent Amino Acid Replacements

Genes ◽

10.3390/genes10030181 ◽

2019 ◽

Vol 10 (3) ◽

pp. 181

Author(s):

Fei Yuan ◽

Hoa Nguyen ◽

Dan Graur

Keyword(s):

Amino Acid ◽

Phylogenetic Tree ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Molecular Data ◽

Ancestral State Reconstruction ◽

Ancestral State ◽

Multiple Sequence ◽

State Reconstruction ◽

Tree Inference

Studying parallel and convergent amino acid replacements in protein evolution is frequently used to assess adaptive evolution at the molecular level. Identifying parallel and convergent replacements involves multiple steps and computational routines, such as multiple sequence alignment, phylogenetic tree inference, ancestral state reconstruction, topology tests, and simulation of sequence evolution. Here, we present ProtParCon, a Python 3 package that provides a common interface for users to process molecular data and identify parallel and convergent amino acid replacements in orthologous protein sequences. By integrating several widely used programs for computational biology, ProtParCon implements general functions for handling multiple sequence alignment, ancestral-state reconstruction, maximum-likelihood phylogenetic tree inference, and sequence simulation. ProtParCon also contains a built-in pipeline that automates all these sequential steps, and enables quick identification of observed and expected parallel and convergent amino acid replacements under different evolutionary assumptions. The most up-to-date version of ProtParCon, including scripts containing user tutorials, the full API reference and documentation are publicly and freely available under an open source MIT License via GitHub. The latest stable release is also available on PyPI (the Python Package Index).

Joint Alignment and Tree Inference

10.1101/2021.09.28.462230 ◽

2021 ◽

Author(s):

Jūlija Pečerska ◽

Manuel Gil ◽

Maria Anisimova

Keyword(s):

Computational Complexity ◽

Maximum Likelihood ◽

Phylogenetic Tree ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Combinatorial Optimisation ◽

Simultaneous Inference ◽

Inference Process ◽

Multiple Sequence ◽

Tree Inference

Multiple sequence alignment and phylogenetic tree inference are connected problems that are often solved as independent steps in the inference process. Several attempts at simultaneous inference have been made; however, the currently available methods are greatly limited by their computational complexity and can only handle small datasets. In this manuscript we introduce a combinatorial optimisation approach that will allow us to resolve the circularity of the problem and efficiently infer both alignments and trees under maximum likelihood.

Optimizing genetic algorithm parameters for multiple sequence alignment based on structural information

Advanced Studies in Biology ◽

10.12988/asb.2016.51250 ◽

2016 ◽

Vol 8 ◽

pp. 9-16 ◽

Cited By ~ 2

Author(s):

May Rashiele K. Sueno ◽

Joel M. Addawe

Keyword(s):

Genetic Algorithm ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structural Information ◽

Multiple Sequence

Multiple Sequence Alignment and Profile Analysis of Protein Family Using Hidden Markov Model

International Journal of Scientific Research ◽

10.15373/22778179/june2013/66 ◽

2012 ◽

Vol 2 (6) ◽

pp. 208-211

Author(s):

Navjot Kaur ◽

Rajbir Singh Cheema ◽

Harmandeep Singh

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Profile Analysis ◽

Hidden Markov ◽

Protein Family ◽

Multiple Sequence

Faculty Opinions recommendation of MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.731078852.793536612 ◽

2017 ◽

Author(s):

Feng Gao

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Online Service ◽

Multiple Sequence

Computational Analysis of Therapeutic Enzyme Uricase from Different Source Organisms

Current Proteomics ◽

10.2174/1570164616666190617165107 ◽

2020 ◽

Vol 17 (1) ◽

pp. 59-77

Author(s):

Anand Kumar Nelapati ◽

JagadeeshBabu PonnanEttiyappan

Keyword(s):

Uric Acid ◽

Amino Acid ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Protein Sequences ◽

Amino Acid Sequences ◽

Amino Acid Residues ◽

Multiple Sequence ◽

Physiochemical Properties ◽

Pharmaceutical Industries

Background: Hyperuricemia and gout are conditions that result from the accumulation of uric acid in the blood and urine. Uric acid is the product of the purine metabolic pathway in humans. Uricase is a therapeutic enzyme that can enzymatically reduce the concentration of uric acid in serum and urine by converting it into the more soluble allantoin. Uricases are widely available from several sources, such as bacteria, fungi, yeast, plants and animals. Objective: The present study aims to elucidate the structure and physiochemical properties of uricase by in silico analysis. Methods: A total of sixty amino acid sequences of uricase from different sources were obtained from NCBI, and different analyses were performed, including Multiple Sequence Alignment (MSA), homology search, phylogenetic relation, motif search, domain architecture and physiochemical properties including pI, EC, Ai, and Ii. Results: Multiple sequence alignment of all the selected protein sequences exhibited distinct differences between bacterial, fungal, plant and animal sources based on the position-specific existence of conserved amino acid residues. The maximum homology of all the selected protein sequences is between 51-388. Within individual categories, homology is between 16-337 for bacterial uricase, 14-339 for fungal uricase, 12-317 for plant uricase, and 37-361 for animal uricase. The phylogenetic tree constructed from the amino acid sequences disclosed clusters indicating that uricase comes from different sources. The physiochemical features revealed that the uricase sequences are between 300 and 338 amino acid residues long, with a molecular weight of 33-39 kDa and a theoretical pI ranging from 4.95 to 8.88. The amino acid composition results showed that valine has a high average frequency of 8.79 percent compared with the other amino acids in all analyzed species. Conclusion: In the field of bioinformatics, this work might be informative and a stepping-stone for other researchers to get an idea of the physicochemical features, evolutionary history and structural motifs of uricase, which can be widely used in the biotechnological and pharmaceutical industries. Therefore, the proposed in silico analysis can be considered for protein engineering work, as well as for gout therapy.
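
An average residue frequency of the kind reported in this abstract could be computed along the following lines. This is only a sketch under stated assumptions: it takes the per-sequence percentage of each residue and averages it over sequences, which may differ from the procedure the authors used. The function name `mean_residue_percentages` and the toy sequences are hypothetical and are not real uricase data.

```python
from collections import Counter
from typing import Dict, List


def mean_residue_percentages(sequences: List[str]) -> Dict[str, float]:
    """Average, over sequences, the percentage of each residue per sequence."""
    if not sequences or any(len(seq) == 0 for seq in sequences):
        raise ValueError("need at least one sequence, and none may be empty")
    per_sequence = []
    for seq in sequences:
        counts = Counter(seq.upper())
        # Percentage of each residue within this one sequence.
        per_sequence.append({aa: 100.0 * n / len(seq) for aa, n in counts.items()})
    residues = {aa for table in per_sequence for aa in table}
    # A residue absent from a sequence contributes 0 percent for that sequence.
    return {
        aa: sum(table.get(aa, 0.0) for table in per_sequence) / len(per_sequence)
        for aa in residues
    }


# Toy sequences (invented for illustration only).
composition = mean_residue_percentages(["MVVA", "MVAA"])
for aa in sorted(composition):
    print(aa, round(composition[aa], 2))
```

Averaging per-sequence percentages weights every sequence equally regardless of length; pooling all residues before dividing would be an alternative design and can give slightly different values.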
