Phylogenetic Tree-based Pipeline for Uncovering Mutational Patterns during Influenza Virus Evolution

2019 ◽  
Author(s):  
Fransiskus Xaverius Ivan ◽  
Akhila Deshpande ◽  
Chun Wei Lim ◽  
Xinrui Zhou ◽  
Jie Zheng ◽  
...  

Abstract Various computational and statistical approaches have been proposed to uncover the mutational patterns of rapidly evolving influenza viral genes. Nonetheless, the approaches mainly rely on sequence alignments which could potentially lead to spurious mutations obtained by comparing sequences from different clades that coexist during particular periods of time. To address this issue, we propose a phylogenetic tree-based pipeline that takes into account the evolutionary structure in the sequence data. Assuming that the sequences evolve progressively under a strict molecular clock, considering a competitive model that is based on a certain Markov model, and using a resampling approach to obtain robust estimates, we could capture statistically significant single-mutations and co-mutations during the sequence evolution. Moreover, by considering the results obtained from analyses that consider all paths and the longest path in the resampled trees, we can categorize the mutational sites and suggest their relevance. Here we applied the pipeline to investigate the 50 years of evolution of the HA sequences of influenza A/H3N2 viruses. In addition to confirming previous knowledge on the A/H3N2 HA evolution, we also demonstrate the use of the pipeline to classify mutational sites according to whether they are able to enhance antigenic drift, compensate other mutations that enhance antigenic drift, or both.


2015 ◽  
Author(s):  
Remco Bouckaert ◽  
Peter Lockhart

Most methods for performing a phylogenetic analysis based on sequence alignments of gene data assume that the mechanism of evolution is constant through time. It is recognised that some sites do evolve somewhat faster than others, and this can be captured using a (gamma) rate heterogeneity model. Further, some species have shorter replication times than others, and this results in faster rates of substitution in some lineages. This feature of lineage specific rate variation can be captured to some extent, by using relaxed clock models. However, it is also clear that there are additional poorly characterised features of sequence data that can sometimes lead to extreme differences in lineage specific rates. This variation is poorly captured by constant time reversible substitution models. The significance of extreme lineage specific rate differences is that they lead both to errors in reconstructing evolutionary relationships as well as biased estimates for the age of ancestral nodes. We propose a new model that allows gamma rate heterogeneity to change on branches, thus offering a more realistic model of sequence evolution. It adds negligible computational cost to likelihood calculations. We illustrate its effectiveness with an example of green algae and land-plants. For many real world data sets, we find a much better fit with multi-gamma sites models as well as substantial differences in ancestral node date estimates.



2020 ◽  
Author(s):  
Michael A. Zeller ◽  
Phillip C. Gauger ◽  
Zebulun W. Arendsee ◽  
Carine K. Souza ◽  
Amy L. Vincent ◽  
...  

ABSTRACT The antigenic diversity of influenza A virus (IAV) circulating in swine challenges the development of effective vaccines, increasing zoonotic threat and pandemic potential. High throughput sequencing technologies are able to quantify IAV genetic diversity, but there are no accurate approaches to adequately describe antigenic phenotypes. This study evaluated an ensemble of non-linear regression models to estimate virus phenotype from genotype. Regression models were trained with a phenotypic dataset of pairwise hemagglutination inhibition (HI) assays, using genetic sequence identity and pairwise amino acid mutations as predictor features. The model identified amino acid identity, ranked the relative importance of mutations in the hemagglutinin (HA) protein, and demonstrated good prediction accuracy. Four previously untested IAV strains were selected to experimentally validate model predictions by HI assays. Errors between predicted and measured distances of uncharacterized strains were 0.34, 0.70, 2.19, and 0.17 antigenic units. These empirically trained regression models can be used to estimate antigenic distances between different strains of IAV in swine using sequence data. By ranking the importance of mutations in the HA, we provide criteria for identifying antigenically advanced IAV strains that may not be controlled by existing vaccines and can inform strain updates to vaccines to better control this pathogen.



2017 ◽  
Author(s):  
Stephen M Crotty ◽  
Bui Quang Minh ◽  
Nigel G Bean ◽  
Barbara R Holland ◽  
Jonathan Tuke ◽  
...  

Abstract Molecular sequence data that have evolved under the influence of heterotachous evolutionary processes are known to mislead phylogenetic inference. We introduce the General Heterogeneous evolution On a Single Topology (GHOST) model of sequence evolution, implemented under a maximum-likelihood framework in the phylogenetic program IQ-TREE (http://www.iqtree.org). Simulations show that using the GHOST model, IQ-TREE can accurately recover the tree topology, branch lengths and substitution model parameters from heterotachously-evolved sequences. We develop a model selection algorithm based on simulation results, and investigate the performance of the GHOST model on empirical data by sampling phylogenomic alignments of varying lengths from a plastome alignment. We then carry out inference under the GHOST model on a phylogenomic dataset composed of 248 genes from 16 taxa, where we find the GHOST model concurs with the currently accepted view, placing turtles as a sister lineage of archosaurs, in contrast to results obtained using traditional variable rates-across-sites models. Finally, we apply the model to a dataset composed of a sodium channel gene of 11 fish taxa, finding that the GHOST model is able to infer a subtle component of the historical signal, linked to the previously established convergent evolution of the electric organ in two geographically distinct lineages of electric fish. We compare inference under the GHOST model to partitioning by codon position and show that, owing to the minimization of model constraints, the GHOST model is able to offer unique biological insights when applied to empirical data.



mSphere ◽  
2021 ◽  
Vol 6 (2) ◽  
Author(s):  
Michael A. Zeller ◽  
Phillip C. Gauger ◽  
Zebulun W. Arendsee ◽  
Carine K. Souza ◽  
Amy L. Vincent ◽  
...  

ABSTRACT The antigenic diversity of influenza A viruses (IAV) circulating in swine challenges the development of effective vaccines, increasing zoonotic threat and pandemic potential. High-throughput sequencing technologies can quantify IAV genetic diversity, but there are no accurate approaches to adequately describe antigenic phenotypes. This study evaluated an ensemble of nonlinear regression models to estimate virus phenotype from genotype. Regression models were trained with a phenotypic data set of pairwise hemagglutination inhibition (HI) assays, using genetic sequence identity and pairwise amino acid mutations as predictor features. The model identified amino acid identity, ranked the relative importance of mutations in the hemagglutinin (HA) protein, and demonstrated good prediction accuracy. Four previously untested IAV strains were selected to experimentally validate model predictions by HI assays. Errors between predicted and measured distances of uncharacterized strains were 0.35, 0.61, 1.69, and 0.13 antigenic units. These empirically trained regression models can be used to estimate antigenic distances between different strains of IAV in swine by using sequence data. By ranking the importance of mutations in the HA, we provide criteria for identifying antigenically advanced IAV strains that may not be controlled by existing vaccines and can inform strain updates to vaccines to better control this pathogen. IMPORTANCE Influenza A viruses (IAV) in swine constitute a major economic burden to an important global agricultural sector, impact food security, and are a public health threat. Despite significant improvement in surveillance for IAV in swine over the past 10 years, sequence data have not been integrated into a systematic vaccine strain selection process for predicting antigenic phenotype and identifying determinants of antigenic drift. 
To overcome this, we developed nonlinear regression models that predict antigenic phenotype from genetic sequence data by training the model on hemagglutination inhibition assay results. We used these models to predict antigenic phenotype for previously uncharacterized IAV, ranked the importance of genetic features for antigenic phenotype, and experimentally validated our predictions. Our model predicted virus antigenic characteristics from genetic sequence data and provides a rapid and accurate method linking genetic sequence data to antigenic characteristics. This approach also provides support for public health by identifying viruses that are antigenically advanced from strains used as pandemic preparedness candidate vaccine viruses.



Author(s):  
Stephen M Crotty ◽  
Bui Quang Minh ◽  
Nigel G Bean ◽  
Barbara R Holland ◽  
Jonathan Tuke ◽  
...  

Abstract Molecular sequence data that have evolved under the influence of heterotachous evolutionary processes are known to mislead phylogenetic inference. We introduce the General Heterogeneous evolution On a Single Topology (GHOST) model of sequence evolution, implemented under a maximum-likelihood framework in the phylogenetic program IQ-TREE (http://www.iqtree.org). Simulations show that using the GHOST model, IQ-TREE can accurately recover the tree topology, branch lengths, and substitution model parameters from heterotachously evolved sequences. We investigate the performance of the GHOST model on empirical data by sampling phylogenomic alignments of varying lengths from a plastome alignment. We then carry out inference under the GHOST model on a phylogenomic data set composed of 248 genes from 16 taxa, where we find the GHOST model concurs with the currently accepted view, placing turtles as a sister lineage of archosaurs, in contrast to results obtained using traditional variable rates-across-sites models. Finally, we apply the model to a data set composed of a sodium channel gene of 11 fish taxa, finding that the GHOST model is able to elucidate a subtle component of the historical signal, linked to the previously established convergent evolution of the electric organ in two geographically distinct lineages of electric fish. We compare inference under the GHOST model to partitioning by codon position and show that, owing to the minimization of model constraints, the GHOST model offers unique biological insights when applied to empirical data.



Virology ◽  
1981 ◽  
Vol 113 (2) ◽  
pp. 656-662 ◽  
Author(s):  
Setsuko Nakajima ◽  
Alan P. Kendal


Viruses ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 977
Author(s):  
Kobporn Boonnak ◽  
Chayasin Mansanguan ◽  
Dennis Schuerch ◽  
Usa Boonyuen ◽  
Hatairat Lerdsamran ◽  
...  

Influenza viruses continue to be a major public health threat due to the possible emergence of more virulent influenza virus strains resulting from dynamic changes in virus adaptability, a consequence of functional mutations and antigenic drift in surface proteins, especially hemagglutinin (HA) and neuraminidase (NA). In this study, we describe the genetic and evolutionary characteristics of H1N1, H3N2, and influenza B strains detected in severe cases of seasonal influenza in Thailand from 2018 to 2019. We genetically characterized seven A/H1N1 isolates, seven A/H3N2 isolates, and six influenza B isolates. Five of the seven A/H1N1 viruses were found to belong to clade 6B.1 and were antigenically similar to A/Switzerland/3330/2017 (H1N1), whereas two isolates belonged to clade 6B.1A1 and clustered with A/Brisbane/02/2018 (H1N1). Interestingly, we observed additional mutations at antigenic sites (S91R, S181T, T202I) as well as a unique mutation at a receptor binding site (S200P). Three-dimensional (3D) protein structure analysis of the hemagglutinin protein reveals that this unique mutation may lead to altered binding of the HA protein to a sialic acid receptor. A/H3N2 isolates were found to belong to clades 3C.2a2 and 3C.2a1b, clustering with A/Switzerland/8060/2017 (H3N2) and A/South Australia/34/2019 (H3N2), respectively. Amino acid sequence analysis revealed 10 mutations at antigenic sites including T144A/I, T151K, Q213R, S214P, T176K, D69N, Q277R, N137K, N187K, and E78K/G. All influenza B isolates in this study belong to the Victoria lineage. Five out of six isolates belong to clade 1A3-DEL, which relates closely to B/Washington/02/2009, with one isolate lacking the three-amino-acid deletion on the HA segment at positions K162, N163, and D164. In comparison to B/Colorado/06/2017, the representative influenza B Victoria lineage vaccine strain, the substitutions include G129D, G133R, K136E, and V180R in the HA protein.
Importantly, the susceptibility to oseltamivir of influenza B isolates, but not of A/H1N1 and A/H3N2 isolates, was reduced, as assessed by the phenotypic assay. This study demonstrates the importance of monitoring genetic variation in influenza viruses regarding how acquired mutations could be associated with an improved adaptability for efficient transmission.


