HIV Protease and Integrase Empirical Substitution Models of Evolution: Protein-Specific Models Outperform Generalist Models

Genes ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 61
Author(s):  
Roberto Del Amparo ◽  
Miguel Arenas

Diverse phylogenetic methods require a substitution model of evolution that should mimic, as accurately as possible, the real substitution process. At the protein level, empirical substitution models have traditionally been based on a large number of different proteins from particular taxonomic levels. However, these models assume that all of the proteins of a taxonomic level evolve under the same substitution patterns. We believe that this assumption is highly unrealistic and should be relaxed by considering protein-specific substitution models that account for protein-specific selection processes. In order to test this hypothesis, we inferred and evaluated four new empirical substitution models for the protease and integrase of HIV and other viruses. We found that these models more accurately fit, compared with any of the currently available empirical substitution models, the evolutionary process of these proteins. We conclude that evolutionary inferences from protein sequences are more accurate if they are based on protein-specific substitution models rather than taxonomic-specific (generalist) substitution models. We also present four new empirical substitution models of protein evolution that could be useful for phylogenetic inferences of viral protease and integrase.
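As a rough illustration of what comparing the fit of substitution models can look like, the sketch below ranks candidate models by the Akaike information criterion (AIC). This is an assumption made for illustration: the abstract does not state which fit criterion was used, and the model names and log-likelihood values in the usage note are hypothetical placeholders, not results from the study.

```python
def aic(log_likelihood, n_free_params):
    """Akaike information criterion: lower values indicate better fit."""
    return 2 * n_free_params - 2 * log_likelihood


def rank_models(fits):
    """Rank models from best (lowest AIC) to worst.

    fits maps a model name to a tuple (log_likelihood, n_free_params).
    Returns a list of (name, aic_score) pairs sorted by score.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical values for illustration only; not taken from the paper.
    example_fits = {
        "generalist_model": (-10250.0, 0),
        "protein_specific_model": (-10100.0, 0),
    }
    for name, score in rank_models(example_fits):
        print(f"{name}\tAIC = {score:.1f}")
```

Empirical models have fixed exchangeabilities, so when two such models are compared on the same alignment and tree-search settings, the number of free parameters is typically equal and the ranking is driven by the log-likelihood alone.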

2020 ◽  
Author(s):  
Huihui Chang ◽  
Yimeng Nie ◽  
Nan Zhang ◽  
Xue Zhang ◽  
Huimin Sun ◽  
...  

Background: Amino acid substitution models play an important role in inferring phylogenies from mitochondrial proteins. Although different amino acid substitution models have been proposed, only a few were estimated from mitochondrial protein sequences for specific taxa, such as the mtArt model for Arthropoda. The growing amount of mitochondrial genome data from a broad range of Orthoptera taxa provides an opportunity to estimate an Orthoptera-specific empirical model of mitochondrial amino acid substitution. Results: We sequenced complete mitochondrial genomes of 54 Orthoptera species and then estimated an amino acid substitution model (named mtOrt) by the maximum likelihood method, based on the 283 complete mitochondrial genomes currently available. The results indicate that there are obvious differences between mtOrt and the existing models, and that the new model fits the Orthoptera mitochondrial protein datasets better. Moreover, the topologies of trees constructed using mtOrt and existing models are frequently different, so mtOrt affects both likelihood and tree topology. Comparisons between these topologies show that the new model outperforms the existing models in inferring phylogenies from Orthoptera mitochondrial protein data. Conclusions: The new mitochondrial amino acid substitution model for Orthoptera shows obvious differences from the existing models and outperforms them in inferring phylogenies from Orthoptera mitochondrial protein sequences.


2013 ◽  
Vol 11 (06) ◽  
pp. 1343003 ◽  
Author(s):  
JING-DOO WANG

In this paper, three genomic materials (DNA sequences, protein sequences, and regions, i.e. domains) are used to compare methods of virus classification. Virus classes (categories) are divided by taxonomic level into three datasets covering 6 orders, 42 families, and 33 genera. To increase the robustness and comparability of the experimental results, only classes that contain at least 10 instances are selected, and each instance must contain at least one region name. Experimental results show that the approach using region names achieved the best accuracies, reaching 99.9%, 97.3%, and 99.0% for 6 orders, 42 families, and 33 genera, respectively. This paper not only reports exhaustive experiments that compare virus classifications using different genomic materials, but also proposes a novel approach to biological classification based on molecular biology instead of traditional morphology.


Author(s):  
Samir Okasha

In a standard Darwinian explanation, natural selection takes place at the level of the individual organism, i.e. some organisms enjoy a survival or reproduction advantage over others, which results in evolutionary change. In principle however, natural selection could operate at other hierarchical levels too, above and below that of the organism, for example the level of genes, cells, groups, colonies or even whole species. This possibility gives rise to the ‘levels of selection’ question in evolutionary biology. Group and colony-level selection have been proposed, originally by Darwin, as a means by which altruism can evolve. (In biology, ‘altruism’ refers to behaviour which entails a fitness cost to the individual so behaving, but benefits others.) Though this idea is still alive today, many theorists regard kin selection as a superior explanation for the existence of altruism. Kin selection arises from the fact that relatives share genes, so if an organism behaves altruistically towards its relatives, there is a greater than random chance that the beneficiary of the altruistic action will itself be an altruist. Kin selection is closely bound up with the ‘gene’s eye view’ of evolution, which holds that genes, not organisms, are the true beneficiaries of the evolutionary process. The gene’s eye approach to evolution, though heuristically valuable, does not in itself resolve the levels of selection question, because selection processes that occur at many hierarchical levels can all be seen from a gene’s eye viewpoint. In recent years, the levels of selection discussion has been re-invigorated, and subtly transformed, by the important new work on the ‘major evolutionary transitions’. These transitions occur when a number of free-living biological units, originally capable of surviving and reproducing alone, become integrated into a larger whole, giving rise to a new biological unit at a higher level of organization. 
Evolutionary transitions are intimately bound up with the levels of selection issue, because during a transition the potential exists for selection to operate simultaneously at two different hierarchical levels.


2019 ◽  
Author(s):  
Xiaodong Jian ◽  
Scott V. Edwards ◽  
Liang Liu

A statistical framework of model comparison and model validation is essential to resolving the debates over concatenation and coalescent models in phylogenomic data analysis. A set of statistical tests is developed and applied here to evaluate and compare the adequacy of substitution, concatenation, and multispecies coalescent (MSC) models across 47 phylogenomic data sets collected across the tree of life. Tests for substitution models and the concatenation assumption of topologically concordant gene trees suggest that a poor fit of substitution models (44% of loci rejecting the substitution model) and concatenation models (38% of loci rejecting the hypothesis of topologically congruent gene trees) is widespread. Logistic regression shows that the proportions of GC content and informative sites are both negatively correlated with the fit of substitution models across loci. Moreover, a substantial violation of the concatenation assumption of congruent gene trees is consistently observed across 6 major groups (birds, mammals, fish, insects, reptiles, and others, including other invertebrates). In contrast, among those loci adequately described by a given substitution model, the proportion of loci rejecting the MSC model is 11%, significantly lower than those rejecting the substitution and concatenation models, and Bayesian model comparison strongly favors the MSC over concatenation across all data sets. Species tree inference suggests that loci rejecting the MSC have little effect on species tree estimation. Due to computational constraints, the Bayesian model validation and comparison analyses were conducted on the reduced data sets. A complete analysis of phylogenomic data requires the development of efficient algorithms for phylogenetic inference. Nevertheless, the concatenation assumption of congruent gene trees rarely holds for phylogenomic data with more than 10 loci. 
Thus, for large phylogenomic data sets, model comparison analyses are expected to consistently and more strongly favor the coalescent model over the concatenation model. Our analysis reveals the value of model validation and comparison in phylogenomic data analysis, as well as the need for further improvements of multilocus models and computational tools for phylogenetic inference.


2020 ◽  
Author(s):  
Hsiuying Wang ◽  
Yi-Hau Chen

Background: The coronavirus pandemic has been a wake-up call for the world. A dispute over the origin of SARS-CoV-2 has been raised. Study results show that all SARS-CoV-2 sequences around the world share a common ancestor towards the end of 2019. Nevertheless, it is hard to reach a conclusion regarding the origin of SARS-CoV-2. Objective: In this study, we compare the divergence of SARS-CoV-2 sequences from three areas: China, the USA, and Europe. Methods: We download SARS-CoV-2 sequences from China, the USA, and Europe from the National Center for Biotechnology Information (NCBI). To investigate the diversity of the sequences from these three areas, we apply 17 different nucleotide substitution models. Within each of the three groups, we calculate the pairwise nucleotide substitution distance between every two sequences and then compare the distances across the groups. Results: The results are consistent across most of the 17 substitution models. Under 14 substitution models, China has the lowest diversity, followed by Europe and then the USA. Of the other 3 models, one ranks China lowest, followed by the USA and then Europe; another ranks the USA lowest, followed by China and then Europe; and the last ranks Europe lowest, followed by China and then the USA. Conclusions: We compare the diversity of SARS-CoV-2 samples from China, Europe, and the USA using different substitution models. Our results show that China has the smallest mean distance, followed by Europe and then the USA, which is consistent with the order of virus transmission: SARS-CoV-2 started in China, then broke out in Europe, and finally in the USA.
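The within-group comparison described in the Methods can be sketched as follows. This is a minimal illustration that assumes the Jukes-Cantor (JC69) model as a stand-in; the abstract does not name its 17 substitution models, and the function names here are hypothetical. Sequences are assumed to be already aligned and of equal length.

```python
import math
from itertools import combinations

VALID_BASES = set("ACGT")


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in VALID_BASES and y in VALID_BASES:
            compared += 1
            if x != y:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def jc69_distance(seq_a, seq_b):
    """JC69 distance: d = -(3/4) * ln(1 - (4/3) * p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined at saturation; report infinity.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def mean_pairwise_distance(sequences):
    """Mean JC69 distance over all unordered pairs in one group."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(jc69_distance(a, b) for a, b in pairs) / len(pairs)
```

Calling `mean_pairwise_distance` once per regional group and comparing the resulting means mirrors the comparison in the abstract; swapping `jc69_distance` for another model's distance function would repeat the analysis under a different substitution model.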


2015 ◽  
Vol 370 (1678) ◽  
pp. 20140329 ◽  
Author(s):  
Richard Gouy ◽  
Denis Baurain ◽  
Hervé Philippe

This article aims to shed light on difficulties in rooting the tree of life (ToL) and to explore the (sociological) reasons underlying the limited interest in accurately addressing this fundamental issue. First, we briefly review the difficulties plaguing phylogenetic inference and the ways to improve the modelling of the substitution process, which is highly heterogeneous, both across sites and over time. We further observe that enriched taxon samplings, better gene samplings and clever data removal strategies have led to numerous revisions of the ToL, and that these improved shallow phylogenies nearly always relocate simple organisms higher in the ToL provided that long-branch attraction artefacts are kept at bay. Then, we note that, despite the flood of genomic data available since 2000, there has been a surprisingly low interest in inferring the root of the ToL. Furthermore, the rare studies dealing with this question were almost always based on methods dating from the 1990s that have been shown to be inaccurate for much more shallow issues! This leads us to argue that the current consensus about a bacterial root for the ToL can be traced back to the prejudice of Aristotle's Great Chain of Beings, in which simple organisms are ancestors of more complex life forms. Finally, we demonstrate that even the best models cannot yet handle the complexity of the evolutionary process encountered both at shallow depth, when the outgroup is too distant, and at the level of the inter-domain relationships. Altogether, we conclude that the commonly accepted bacterial root is still unproven and that the root of the ToL should be revisited using phylogenomic supermatrices to ensure that new evidence for eukaryogenesis, such as the recently described Lokiarcheota, is interpreted in a sound phylogenetic framework.


2020 ◽  
Author(s):  
Mohamed Mahdi ◽  
János András Mótyán ◽  
Zsófia Ilona Szojka ◽  
Mária Golda ◽  
Márió Miczi ◽  
...  

Background: The pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has resulted in millions of infections worldwide. While the search for an effective antiviral is still ongoing, experimental therapies based on repurposing available antivirals are being attempted, of which HIV protease inhibitors (PIs) have gained considerable interest. Inhibition profiling of the PIs directly against the viral protease has never been attempted in vitro, and while a few studies have reported efficacy of lopinavir and ritonavir in the SARS-CoV-2 context, the mechanism of action of the drugs remains to be validated. Methods: We carried out an in-depth analysis of the efficacy of HIV PIs against the main protease of SARS-CoV-2 (Mpro) in cell culture and in vitro enzymatic assays, using a methodology that enabled us to focus solely on any potential inhibitory effects of the inhibitors against the viral protease. Results: Lopinavir, ritonavir, darunavir, saquinavir, and atazanavir were able to inhibit the viral protease in cell culture, albeit at concentrations much higher than their achievable plasma levels, given their current drug formulations. While inhibition by lopinavir was attributed to its cytotoxicity, ritonavir was the most effective of the panel, with an IC50 of 13.7 µM. Atazanavir, on the other hand, was the only PI to inhibit the viral protease both in cell culture and in our in vitro enzymatic assay. Conclusion: Targeting of SARS-CoV-2 Mpro by some of the HIV PIs might be of limited clinical potential, given the high concentration of the drugs required to achieve significant inhibition. Therefore, given their weak inhibition of the viral protease, any potential beneficial effect of the PIs in the COVID-19 context might be attributable to action on other molecular target(s) rather than on SARS-CoV-2 Mpro.


2017 ◽  
Author(s):  
Sarah K. Hilton ◽  
Michael B Doud ◽  
Jesse D Bloom

Background: The evolution of protein-coding genes can be quantitatively modeled using phylogenetic methods. Recently, it has been shown that high-throughput experimental measurements of mutational effects made via deep mutational scanning can inform site-specific phylogenetic substitution models of gene evolution. However, there is currently no software tailored for such analyses. Results: We describe software that efficiently performs phylogenetic analyses with substitution models informed by deep mutational scanning. This software, phydms, is ∼100-fold faster than existing programs that accommodate such substitution models. It can be used to compare the results of deep mutational scanning experiments to the selection on genes in nature. For instance, phydms enables rigorous comparison of how well different experiments on the same gene describe natural selection. It also enables the re-scaling of deep mutational scanning data to account for differences in the stringency of selection in the lab and nature. Finally, phydms can identify sites that are evolving differently in nature than expected from experiments in the lab. Conclusions: The phydms software makes it easy to use phylogenetic substitution models informed by deep mutational scanning experiments. As data from such experiments become increasingly widespread, phydms will facilitate quantitative comparison of the experimental results to the actual selection pressures shaping evolution in nature.


Author(s):  
Brian J. Loasby

This chapter analyses Technological Innovation as an Evolutionary Process, a book that explores the analogy between technical innovation and biological evolution, and whether such an analogy could be developed from a ‘metaphor’ into a ‘model’. After discussing the explanatory power of ‘evolutionary reasoning’, the chapter describes an alternative approach to the analysis of technological innovation. It then presents an evolutionary argument for the growth of knowledge and explains how it differs from neo-Darwinism, and examines rational choice theory in relation to natural selection. It also looks at six elements of Adam Smith's psychological theory of the emergence and development of science: the motivation for generating new ideas; the generation of novelty and the ex-ante selection processes which guide its adoption or rejection; the role of aesthetic criteria both in guiding conjectures and in encouraging their acceptance; Smith's argument that connecting principles which seem to work well are widely diffused; the renewal of the evolutionary process; and the evolution of the evolutionary process itself. Finally, the chapter considers the implications of uncertainty for cognition and the growth of knowledge.

