Faculty Opinions recommendation of Generalized bootstrap supports for phylogenetic analyses of protein sequences incorporating alignment uncertainty.

Author(s):  
Rafael Zardoya ◽  
Diego San Mauro ◽  
Iker Irisarri
2018 ◽  
Vol 67 (6) ◽  
pp. 997-1009 ◽  
Author(s):  
Maria Chatzou ◽  
Evan W Floden ◽  
Paolo Di Tommaso ◽  
Olivier Gascuel ◽  
Cedric Notredame

2009 ◽  
Vol 14 (22) ◽  
Author(s):  
G M Nava ◽  
M S Attene-Ramos ◽  
J K Ang ◽  
M Escorcia

To gain insight into the possible origins of the 2009 outbreak of new influenza A(H1N1), we performed two independent analyses of genetic evolution of the new influenza A(H1N1) virus. Firstly, protein homology analyses of more than 400 sequences revealed that this virus most likely evolved from recent swine viruses. Secondly, phylogenetic analyses of 5,214 protein sequences of influenza A(H1N1) viruses (avian, swine and human) circulating in North America for the last two decades (from 1989 to 2009) indicated that the new influenza A(H1N1) virus possesses a distinctive evolutionary trait (genetic distinctness). This appears to be a particular characteristic in pig-human interspecies transmission of influenza A. Thus these analyses contribute to the evidence of the role of pig populations as “mixing vessels” for influenza A(H1N1) viruses.


2020 ◽  
Author(s):  
Cory D. Dunn

Phylogenetic analyses can take advantage of multiple sequence alignments as input. These alignments typically consist of homologous nucleic acid or protein sequences, and the inclusion of outlier or aberrant sequences can compromise downstream analyses. Here, I describe a program, SequenceBouncer, that uses the Shannon entropy values of alignment columns to identify outlier alignment sequences in a manner responsive to overall alignment context. I demonstrate the utility of this software using alignments of available mammalian mitochondrial genomes, bird cytochrome c oxidase-derived DNA barcodes, and COVID-19 sequences.
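The quantity this abstract relies on, the Shannon entropy of an alignment column, can be illustrated with a minimal sketch. This is not SequenceBouncer's code or its outlier-scoring procedure; the function name `column_entropies`, the choice to ignore gap characters, and the toy alignment are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropies(alignment, gap="-"):
    """Return the Shannon entropy (in bits) of each alignment column.

    Assumption for this sketch: gap characters are ignored, and a column
    that contains only gaps is given an entropy of 0.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    entropies = []
    for i in range(length):
        # Collect the non-gap residues observed in column i.
        residues = [seq[i] for seq in alignment if seq[i] != gap]
        total = len(residues)
        counts = Counter(residues)
        # H = sum over residues of p * log2(1 / p), with p = n / total.
        h = sum((n / total) * math.log2(total / n) for n in counts.values())
        entropies.append(h)
    return entropies


# Hypothetical toy alignment: four sequences, four columns.
toy_alignment = ["ACGT", "ACGA", "ACCA", "A-TA"]
entropies = column_entropies(toy_alignment)
```

A fully conserved column scores 0 bits, and the score rises as the residues in a column become more mixed, which is what makes per-column entropy a natural starting point for spotting sequences that disagree with the rest of an alignment.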


Author(s):  
Peng-Yeng Yin ◽  
Shyong-Jian Shyu ◽  
Guan-Shieng Huang ◽  
Shuang-Te Liao

With the advent of new sequencing technology for biological data, the number of sequenced proteins stored in public databases has exploded. The structural, functional, and phylogenetic analyses of proteins would benefit from exploring these databases with data mining techniques. Clustering algorithms can assign proteins to clusters such that proteins in the same cluster are more similar in homology than those in different clusters. This procedure not only simplifies the analysis task but also enhances the accuracy of the results. Most existing protein-clustering algorithms compute the similarity between proteins from one-to-one pairwise sequence


2011 ◽  
pp. 2259-2273
Author(s):  
Peng-Yeng Yin ◽  
Shyong-Jian Shyu ◽  
Guan-Shieng Huang ◽  
Shuang-Te Liao

With the advent of new sequencing technology for biological data, the number of sequenced proteins stored in public databases has exploded. The structural, functional, and phylogenetic analyses of proteins would benefit from exploring these databases with data mining techniques. Clustering algorithms can assign proteins to clusters such that proteins in the same cluster are more similar in homology than those in different clusters. This procedure not only simplifies the analysis task but also enhances the accuracy of the results. Most existing protein-clustering algorithms compute the similarity between proteins from one-to-one pairwise sequence alignment instead of multiple sequence alignment; the latter is computationally prohibitive. As a result, the accuracy of the clustering suffers. Further, traditional clustering methods are ad hoc, and the resulting clustering often converges to local optima. This chapter presents a Bayesian framework for improving the clustering accuracy of protein sequences based on association rules. The experimental results show that the proposed framework can significantly improve the performance of traditional clustering methods.


2008 ◽  
pp. 1091-1102
Author(s):  
Peng-Yeng Yin ◽  
Shyong-Jian Shyu ◽  
Guan-Shieng Huang ◽  
Shuang-Te Liao

With the advent of new sequencing technology for biological data, the number of sequenced proteins stored in public databases has exploded. The structural, functional, and phylogenetic analyses of proteins would benefit from exploring these databases with data mining techniques. Clustering algorithms can assign proteins to clusters such that proteins in the same cluster are more similar in homology than those in different clusters. This procedure not only simplifies the analysis task but also enhances the accuracy of the results. Most existing protein-clustering algorithms compute the similarity between proteins from one-to-one pairwise sequence


Genetika ◽  
2019 ◽  
Vol 51 (1) ◽  
pp. 69-80
Author(s):  
Emre Sevindik

The large subunit of the ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) protein, which plays an important role in photosynthesis, is encoded by the chloroplast genome. Sideritis L., a medicinal and aromatic plant group, belongs to the Lamiaceae family. In this study, we performed sequence, physicochemical, phylogenetic and three-dimensional (3D) bioinformatic analyses of RuBisCO large subunit (rbcL) proteins in Sideritis spp. using various bioinformatics tools. Physicochemical analyses were performed with ExPASy ProtParam. The putative phosphorylation sites of the rbcL proteins were determined with NetPhos 2.0 and NetPhos 3.1. Phylogenetic analyses were performed with the MEGA 6.0 software. The PyMOL program was used to estimate 3D protein structures. At the end of the study, it was found that the amino acid number of stilbene synthase proteins ranged between 171 and 456, molecular weight ranged between 19002.67 and 50420.44 Da, instability index ranged between 27.30 and 40.70, and GRAVY values ranged between -0.394 and -0.226. The most abundant amino acid in the rbcL proteins, on average, was Gly (10.00%), and the least abundant was Trp (1.4%). In phylogenetic analyses performed using protein sequences, the maximum likelihood (ML) tree consisted of 2 large clades. Pairwise distance analysis based on the rbcL protein sequences of the Sideritis species was performed using MEGA 6.0. The lowest pairwise distance was 0.000, while the highest pairwise distance was 0.024. When the estimated 3D structures of the proteins were examined, the Gly residue, which plays an important role in the structure and function of the proteins, was least abundant in S. libanotica subsp. kurdica and most abundant in S. cretica subsp. spicata. The results of our study provide insights into fundamental characteristics of rbcL proteins in Sideritis taxa.
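To make the pairwise distance values above concrete, here is a minimal sketch of one common measure, the p-distance (the proportion of compared sites at which two aligned sequences differ). The abstract does not state which distance model was selected in MEGA 6.0, so the use of p-distance here is an assumption; the function name `p_distance`, the gap handling, and the example sequences are likewise hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Assumption for this sketch: any site where either sequence has a gap
    is skipped (pairwise deletion), so it counts neither as a match nor
    as a difference.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip sites with a gap in either sequence
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable (gap-free) sites")
    return differing / compared


# Hypothetical aligned protein fragments, not taken from the study.
identical = p_distance("MSPQTETKAS", "MSPQTETKAS")
divergent = p_distance("MSPQ-ETKAS", "MSPQTDTKAG")
```

Under this measure a distance of 0.000 means the two sequences are identical at every compared site, and a distance of 0.024 would correspond to roughly 2.4% of compared sites differing.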


2004 ◽  
Vol 70 (9) ◽  
pp. 5522-5527 ◽  
Author(s):  
Aurélien Ginolhac ◽  
Cyrille Jarrin ◽  
Benjamin Gillet ◽  
Patrick Robe ◽  
Petar Pujic ◽  
...  

ABSTRACT The metagenomic approach provides direct access to diverse unexplored genomes, especially from uncultivated bacteria in a given environment. This diversity can conceal many new biosynthetic pathways. Type I polyketide synthases (PKSI) are modular enzymes involved in the biosynthesis of many natural products of industrial interest. Among the PKSI domains, the ketosynthase domain (KS) was used to screen a large soil metagenomic library containing more than 100,000 clones to detect those containing PKS genes. Over 60,000 clones were screened, and 139 clones containing KS domains were detected. A 700-bp fragment of the KS domain was sequenced for 40 of 139 randomly chosen clones. None of the 40 protein sequences were identical to those found in public databases, and the nucleic acid sequences were not redundant. Phylogenetic analyses were performed on the protein sequences of three metagenomic clones to select clones predicted to produce new compounds. Two PKS-positive clones do not belong to any of the 23 published PKSI included in the analysis, encouraging further analyses on these two clones identified by the selection process.


2020 ◽  
Vol 142 ◽  
pp. 83-97
Author(s):  
A Chandran ◽  
PU Zacharia ◽  
TV Sathianandan ◽  
NK Sanil

The present study describes a new species of myxosporean, Ellipsomyxa ariusi sp. nov., infecting the gallbladder of the threadfin sea catfish Arius arius (Hamilton, 1822). E. ariusi sp. nov. is characterized by bivalvular, ellipsoid or elongate-oval myxospores with smooth spore valves and a straight suture, arranged at an angle to the longitudinal spore axis. Mature myxospores measured 10.1 ± 0.8 µm in length, 6.8 ± 0.5 µm in width and 7.7 ± 0.7 µm in thickness. Polar capsules are equal in size and oval to pyriform in shape. They are positioned at an angle to the longitudinal myxospore axis and open in opposite directions. Polar capsules measured 2.8 ± 0.3 µm in length and 2.5 ± 0.4 µm in width; polar filaments formed 4-5 coils and extended to 32.2 ± 2.1 µm in length. Monosporic and disporic plasmodial stages were attached to the wall of the gallbladder. Molecular analysis of the type specimen generated a 1703 bp partial SSU rDNA sequence (MN892546), which was identical to the isolates from 3 other locations. In phylogenetic analyses, the genus Ellipsomyxa appeared monophyletic, and E. ariusi sp. nov. occupied an independent position in maximum likelihood and Bayesian inference trees with high bootstrap values. The overall prevalence of infection was 54.8%, and multiway ANOVA revealed that it varied significantly with location, year, season, sex and size of the fish host. Histopathological changes associated with E. ariusi sp. nov. infection included swelling, vacuolation and detachment of the epithelial layer, reduced mucus production, and altered consistency and colour of bile. Based on the morphologic, morphometric and molecular differences from known species of Ellipsomyxa, and considering differences in host and geographic locations, the present species is treated as new and the name Ellipsomyxa ariusi sp. nov. is proposed.

