Positive selection within the genomes of SARS-CoV-2 and other Coronaviruses independent of impact on protein function

PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e10234 ◽  
Author(s):  
Alejandro Berrio ◽  
Valerie Gartner ◽  
Gregory A. Wray

Background The emergence of a novel coronavirus (SARS-CoV-2) associated with severe acute respiratory disease (COVID-19) has prompted efforts to understand the genetic basis for its unique characteristics and its jump from non-primate hosts to humans. Tests for positive selection can identify apparently nonrandom patterns of mutation accumulation within genomes, highlighting regions where molecular function may have changed during the origin of a species. Several recent studies of the SARS-CoV-2 genome have identified signals of conservation and positive selection within the gene encoding Spike protein based on the ratio of synonymous to nonsynonymous substitution. Such tests cannot, however, detect changes in the function of RNA molecules. Methods Here we apply a test for branch-specific oversubstitution of mutations within narrow windows of the genome without reference to the genetic code. Results We recapitulate the finding that the gene encoding Spike protein has been a target of both purifying and positive selection. In addition, we find other likely targets of positive selection within the genome of SARS-CoV-2, specifically within the genes encoding Nsp4 and Nsp16. Homology-directed modeling indicates no change in either Nsp4 or Nsp16 protein structure relative to the most recent common ancestor. These SARS-CoV-2-specific mutations may affect molecular processes mediated by the positive or negative RNA molecules, including transcription, translation, RNA stability, and evasion of the host innate immune system. Our results highlight the importance of considering mutations in viral genomes not only from the perspective of their impact on protein structure, but also how they may impact other molecular processes critical to the viral life cycle.
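The idea of scanning narrow genomic windows for excess substitutions without reference to the genetic code can be illustrated with a minimal sketch. This is not the authors' statistical test: it only counts mismatches per window between two pre-aligned sequences and ignores branch models and significance testing. The function name, the window and step sizes, and the toy sequences are all hypothetical.

```python
from typing import List, Tuple


def windowed_substitutions(
    ref: str, query: str, window: int = 300, step: int = 150
) -> List[Tuple[int, int, float]]:
    """Return (start, substitutions, per-site proportion) for each window.

    Both sequences must already be aligned to the same length.
    Window and step sizes are illustrative assumptions.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    ref, query = ref.upper(), query.upper()
    results = []
    for start in range(0, max(len(ref) - window, 0) + 1, step):
        pairs = zip(ref[start:start + window], query[start:start + window])
        # Skip gaps and ambiguous bases; every remaining mismatch is counted,
        # whether or not it would change an amino acid.
        compared = [(a, b) for a, b in pairs if a in "ACGT" and b in "ACGT"]
        subs = sum(a != b for a, b in compared)
        proportion = subs / len(compared) if compared else 0.0
        results.append((start, subs, proportion))
    return results


# Toy example with made-up sequences (not real viral data).
for row in windowed_substitutions("ACGTACGTACGT", "ACGTTCGTACGA", window=4, step=4):
    print(row)
```

Windows with an unusually high proportion would only be candidates for follow-up; distinguishing selection from chance requires a proper test against a neutral expectation.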

2020 ◽  
Author(s):  
Alejandro Berrio ◽  
Valerie Gartner ◽  
Gregory A Wray

Abstract Background The emergence of a novel coronavirus (SARS-CoV-2) associated with severe acute respiratory disease (COVID-19) has prompted efforts to understand the genetic basis for its unique characteristics and its jump from non-primate hosts to humans. Tests for positive selection can identify apparently nonrandom patterns of mutation accumulation within genomes, highlighting regions where molecular function may have changed during the origin of a species. Several recent studies of the SARS-CoV-2 genome have identified signals of conservation and positive selection within the gene encoding Spike protein based on the ratio of synonymous to nonsynonymous substitution. Such tests cannot, however, detect changes in the function of RNA molecules. Methods Here we apply a test for branch-specific oversubstitution of mutations within narrow windows of the genome without reference to the genetic code. Results We recapitulate the finding that the gene encoding Spike protein has been a target of both purifying and positive selection. In addition, we find other likely targets of positive selection within the genome of SARS-CoV-2, specifically within the genes encoding Nsp4 and Nsp16. Homology-directed modeling indicates no change in either Nsp4 or Nsp16 protein structure relative to the most recent common ancestor. Thermodynamic modeling of RNA stability and structure, however, indicates that RNA secondary structure within both genes in the SARS-CoV-2 genome differs from those of RaTG13, the reconstructed common ancestor, and Pan-CoV-GD (Guangdong). These SARS-CoV-2-specific mutations may affect molecular processes mediated by the positive or negative RNA molecules, including transcription, translation, RNA stability, and evasion of the host innate immune system. Our results highlight the importance of considering mutations in viral genomes not only from the perspective of their impact on protein structure, but also how they may impact other molecular processes critical to the viral life cycle.


2021 ◽  
Vol 166 (4) ◽  
pp. 1103-1112
Author(s):  
Eun-Ha Hwang ◽  
Green Kim ◽  
Hoyin Chung ◽  
Hanseul Oh ◽  
Jong-Hwan Park ◽  
...  

Abstract Dengue virus (DV) is a mosquito-borne virus that is endemic to many tropical and subtropical areas. Recently, the annual incidence of DV infection has increased worldwide, including in Korea, due to global warming and increased global travel. We therefore sought to characterize the molecular and evolutionary features of DV-1 and DV-4 isolated from Korean overseas travelers. We used phylogenetic analysis based on the full coding region to classify isolates of DV-1 in Korea into genotype I (43251, KP406802), genotype IV (KP406803), and genotype V (KP406801). In addition, we found that strains of DV-4 belonged to genotype I (KP406806) and genotype II (43257). Evidence of positive selection in DV-1 strains was identified in the C, prM, NS2A, and NS5 proteins, whereas DV-4 showed positive selection only in the non-structural proteins NS2A, NS3, and NS5. The substitution rates per site per year were 5.58 × 10⁻⁴ and 6.72 × 10⁻⁴ for DV-1 and DV-4, respectively, and the time of the most recent common ancestor was determined using the Bayesian skyline coalescent method. In this study, the molecular, phylogenetic, and evolutionary characteristics of Korean DV-1 and DV-4 isolates were evaluated for the first time.


Author(s):  
Francisco Díez-Fuertes ◽  
María Iglesias-Caballero ◽  
Javier García Pérez ◽  
Sara Monzón ◽  
Pilar Jiménez ◽  
...  

SARS-CoV-2 whole-genome analysis has identified five large clades worldwide, which emerged in 2019 (19A and 19B) and in 2020 (20A, 20B and 20C). This study aims to analyze the diffusion of SARS-CoV-2 in Spain using maximum likelihood phylogenetic and Bayesian phylodynamic analyses. The most recent common ancestor (MRCA) of the SARS-CoV-2 pandemic was estimated in Wuhan, China, around November 24, 2019. Phylogenetic analyses of the first 12,511 SARS-CoV-2 whole genome sequences obtained worldwide, including 290 from 11 different regions of Spain, revealed 62 independent introductions of the virus in the country. Most sequences from Spain were distributed in clades characterized by D614G substitution in S gene (20A, 20B and 20C) and L84S substitution in ORF8 (19B) with 163 and 118 sequences, respectively, with the remaining sequences branching in 19A. A total of 110 (38%) sequences from Spain grouped in four different monophyletic clusters of 20A clade (20A-Sp1 and 20A-Sp2) and 19B clade (19B-Sp1 and 19B-Sp2) along with sequences from 29 countries worldwide. The MRCA of 19A-Sp1, 20A-Sp1, 19A-Sp2 and 20A-Sp2 clusters were estimated in Spain around January 21 and 29, and February 6 and 17, 2020, respectively. The prevalence of 19B clade in Spain (40%) was far higher than in any other European country during the first weeks of the epidemic, probably owing to a founder effect. However, this variant was replaced by G614-bearing viruses in April. In vitro assays showed an enhanced infectivity of pseudotyped virions displaying G614 substitution compared with D614, suggesting a fitness advantage of D614G. IMPORTANCE Multiple SARS-CoV-2 introductions have been detected in Spain and at least four resulted in the emergence of locally transmitted clusters that originated no later than mid-February, with further dissemination to many other countries around the world and a few weeks before the explosion of COVID-19 cases detected in Spain during the first week of March. 
The majority of the earliest variants detected in Spain branched in 19B clade (D614 viruses), which was the most prevalent clade during the first weeks of March, pointing to a founder effect. However, from mid-March to June, 2020, G614-bearing viruses (20A, 20B and 20C clades) overcame D614 variants in Spain, probably as a consequence of an evolutionary advantage of this substitution in the spike protein. A higher infectivity of G614-bearing viruses compared to D614 variants was detected, suggesting that this substitution in SARS-CoV-2 spike protein could be behind the variant shift observed in Spain.


2020 ◽  
Author(s):  
Babatunde Olarenwaju Motayo ◽  
Olukunle Oluwapamilerin Oluwasemowo ◽  
Paul Akiniyi Akinduti ◽  
Babatunde Adebiyi Olusola ◽  
Olumide T Aerege ◽  
...  

ABSTRACT The ongoing SARS-CoV-2 pandemic was introduced into Africa on 14th February 2020 and has rapidly spread across the continent, causing a severe public health crisis and mortality. We investigated the genetic diversity and evolution of this virus during the early outbreak months using whole genome sequences. We performed recombination analysis against closely related CoV and Bayesian time-scaled phylogeny, and investigated spike protein amino acid mutations. Results from our analysis showed recombination signals between the AfrSARSCoV-2 sequences and reference sequences within the N and S genes. The evolutionary rate of the AfrSARSCoV-2 was 4.133 × 10⁻⁴ substitutions/site/year (high posterior density [HPD]: 4.132 × 10⁻⁴ to 4.134 × 10⁻⁴). The time to most recent common ancestor (TMRCA) of the African strains was December 7th 2019. The AfrSARSCoV-2 sequences diversified into two lineages, A and B, with B being more diverse, with multiple sub-lineages confirmed by both the maximum clade credibility (MCC) tree and PANGOLIN software. There was a high prevalence of the D614G spike protein amino acid mutation (82.61%) among the African strains. Our study has revealed a rapidly diversifying viral population with the G614 spike protein variant dominating. We advocate for upscaling NGS sequencing platforms across Africa to enhance surveillance and aid control efforts for SARS-CoV-2 in Africa.


2017 ◽  
Author(s):  
Chandrasekhar Natarajan ◽  
Agnieszka Jendroszek ◽  
Amit Kumar ◽  
Roy E. Weber ◽  
Jeremy R. H. Tame ◽  
...  

Abstract During adaptive phenotypic evolution, some selectively fixed mutations may be directly causative and others may be purely compensatory. The relative contribution of these two classes of mutation depends on the form and prevalence of mutational pleiotropy. To investigate the nature of adaptive substitutions and their pleiotropic effects, we used a protein engineering approach to characterize the molecular basis of hemoglobin (Hb) adaptation in the bar-headed goose (Anser indicus), a hypoxia-tolerant species renowned for its trans-Himalayan migratory flights. We synthesized and tested all possible mutational intermediates in the line of descent connecting the wildtype bar-headed goose genotype with the most recent common ancestor of bar-headed goose and its lowland relatives. Site-directed mutagenesis experiments revealed effect-size distributions of causative mutations and biophysical mechanisms underlying changes in function. Trade-offs between alternative functional properties revealed the importance of compensating deleterious pleiotropic effects in the adaptive evolution of protein function.


mBio ◽  
2014 ◽  
Vol 5 (1) ◽  
Author(s):  
Matthew Cotten ◽  
Simon J. Watson ◽  
Alimuddin I. Zumla ◽  
Hatem Q. Makhdoom ◽  
Anne L. Palser ◽  
...  

ABSTRACT The Middle East respiratory syndrome coronavirus (MERS-CoV) was first documented in the Kingdom of Saudi Arabia (KSA) in 2012 and, to date, has been identified in 180 cases with 43% mortality. In this study, we have determined the MERS-CoV evolutionary rate, documented genetic variants of the virus and their distribution throughout the Arabian peninsula, and identified the genome positions under positive selection, important features for monitoring adaptation of MERS-CoV to human transmission and for identifying the source of infections. Respiratory samples from confirmed KSA MERS cases from May to September 2013 were subjected to whole-genome deep sequencing, and 32 complete or partial sequences (20 were ≥99% complete, 7 were 50 to 94% complete, and 5 were 27 to 50% complete) were obtained, bringing the total available MERS-CoV genomic sequences to 65. An evolutionary rate of 1.12 × 10⁻³ substitutions per site per year (95% credible interval [95% CI], 8.76 × 10⁻⁴; 1.37 × 10⁻³) was estimated, bringing the time to most recent common ancestor to March 2012 (95% CI, December 2011; June 2012). Only one MERS-CoV codon, spike 1020, located in a domain required for cell entry, is under strong positive selection. Four KSA MERS-CoV phylogenetic clades were found, with 3 clades apparently no longer contributing to current cases. The size of the population infected with MERS-CoV showed a gradual increase to June 2013, followed by a decline, possibly due to increased surveillance and infection control measures combined with a basic reproduction number (R0) for the virus that is less than 1. IMPORTANCE MERS-CoV adaptation toward higher rates of sustained human-to-human transmission appears not to have occurred yet. While MERS-CoV transmission currently appears weak, careful monitoring of changes in MERS-CoV genomes and of the MERS epidemic should be maintained. 
The observation of phylogenetically related MERS-CoV in geographically diverse locations must be taken into account in efforts to identify the animal source and transmission of the virus.
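To give a feel for what a per-site rate means at the genome scale, the reported rate and its credible interval can be multiplied by a genome length. The sketch below assumes a round genome length of 30,000 nucleotides; that figure is an illustrative assumption, not a value stated in the abstract, and the function name is hypothetical.

```python
def substitutions_per_genome_per_year(rate_per_site: float, genome_length: int) -> float:
    """Expected substitutions per genome per year under a constant per-site rate."""
    return rate_per_site * genome_length


# Assumption: round genome length for illustration only (not from the abstract).
GENOME_LENGTH = 30_000

# Rate and 95% credible interval as reported in the abstract above.
mean_rate, lower_rate, upper_rate = 1.12e-3, 8.76e-4, 1.37e-3

for label, rate in (("mean", mean_rate), ("lower", lower_rate), ("upper", upper_rate)):
    expected = substitutions_per_genome_per_year(rate, GENOME_LENGTH)
    print(f"{label}: {expected:.1f} substitutions per genome per year")
```

Under that assumed length, the mean rate corresponds to roughly 34 substitutions per genome per year; the result scales linearly with whatever genome length is substituted.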


Genetics ◽  
1998 ◽  
Vol 150 (3) ◽  
pp. 1187-1198 ◽  
Author(s):  
Mikkel H Schierup ◽  
Xavier Vekemans ◽  
Freddy B Christiansen

Abstract Expectations for the time scale and structure of allelic genealogies in finite populations are formed under three models of sporophytic self-incompatibility. The models differ in the dominance interactions among the alleles that determine the self-incompatibility phenotype: In the SSIcod model, alleles act codominantly in both pollen and style, in the SSIdom model, alleles form a dominance hierarchy, and in SSIdomcod, alleles are codominant in the style and show a dominance hierarchy in the pollen. Coalescence times of alleles rarely differ more than threefold from those under gametophytic self-incompatibility, and transspecific polymorphism is therefore expected to be equally common. The previously reported directional turnover process of alleles in the SSIdomcod model results in coalescence times lower and substitution rates higher than those in the other models. The SSIdom model assumes strong asymmetries in allelic action, and the most recessive extant allele is likely to be the most recent common ancestor. Despite these asymmetries, the expected shape of the allele genealogies does not deviate markedly from the shape of a neutral gene genealogy. The application of the results to sequence surveys of alleles, including interspecific comparisons, is discussed.


Author(s):  
Wenjun Cheng ◽  
Tianjiao Ji ◽  
Shuaifeng Zhou ◽  
Yong Shi ◽  
Lili Jiang ◽  
...  

Abstract Echovirus 6 (E6) is associated with various clinical diseases and is frequently detected in environmental sewage. Despite its high prevalence in humans and the environment, little is known about its molecular phylogeography in mainland China. In this study, 114 of 21,539 (0.53%) clinical specimens from hand, foot, and mouth disease (HFMD) cases collected between 2007 and 2018 were positive for E6. The complete VP1 sequences of 87 representative E6 strains, including 24 strains from this study, were used to investigate the evolutionary genetic characteristics and geographical spread of E6 strains. Phylogenetic analysis based on VP1 nucleotide sequence divergence showed that, globally, E6 strains can be grouped into six genotypes, designated A to F. Chinese E6 strains collected between 1988 and 2018 were found to belong to genotypes C, E, and F, with genotype F being predominant from 2007 to 2018. There was no significant difference in the geographical distribution of each genotype. The evolutionary rate of E6 was estimated to be 3.631 × 10⁻³ substitutions site⁻¹ year⁻¹ (95% highest posterior density [HPD]: 3.2406 × 10⁻³ to 4.031 × 10⁻³ substitutions site⁻¹ year⁻¹) by Bayesian MCMC analysis. The most recent common ancestor of the E6 genotypes was traced back to 1863, whereas their common ancestor in China was traced back to around 1962. A small genetic shift was detected in the Chinese E6 population size in 2009 according to Bayesian skyline analysis, which indicated that there might have been an epidemic around that year.


Author(s):  
Ya-Fang Hu ◽  
Li-Ping Jia ◽  
Fang-Yuan Yu ◽  
Li-Ying Liu ◽  
Qin-Wei Song ◽  
...  

Abstract Background Coxsackievirus A16 (CVA16) is one of the major etiological agents of hand, foot and mouth disease (HFMD). This study aimed to investigate the molecular epidemiology and evolutionary characteristics of CVA16. Methods Throat swabs were collected from children with HFMD and suspected HFMD during 2010–2019. Enteroviruses (EVs) were detected and typed by real-time reverse transcription-polymerase chain reaction (RT-PCR) and RT-PCR. The genotype, evolutionary rate, the most recent common ancestor, population dynamics and selection pressure of CVA16 were analyzed based on viral protein gene (VP1) by bioinformatics software. Results A total of 4709 throat swabs were screened. EVs were detected in 3180 samples and 814 were CVA16 positive. More than 81% of CVA16-positive children were under 5 years old. The prevalence of CVA16 showed obvious periodic fluctuations with a high level during 2010–2012 followed by an apparent decline during 2013–2017. However, the activity of CVA16 increased gradually during 2018–2019. All the Beijing CVA16 strains belonged to sub-genotype B1, and B1b was the dominant strain. One B1c strain was detected in Beijing for the first time in 2016. The estimated mean evolutionary rate of the VP1 gene was 4.49 × 10⁻³ substitutions/site/year. Methionine has gradually become fixed at site 23 of VP1 since 2012. Two sites were detected under episodic positive selection, one of which (site 223) is located in the neutralizing linear epitope PEP71. Conclusions The dominant strains of CVA16 belonged to clade B1b and evolved at a fast evolutionary rate during 2010–2019 in Beijing. To provide more favorable data for HFMD prevention and control, it is necessary to maintain attention to the molecular epidemiological and evolutionary characteristics of CVA16.

