Molecular Evolution of SARS-CoV-2 Structural Genes: Evidence of Positive Selection in Spike Glycoprotein

2020 ◽  
Author(s):  
Xiao-Yong Zhan ◽  
Ying Zhang ◽  
Xuefu Zhou ◽  
Ke Huang ◽  
Yichao Qian ◽  
...  

Abstract: SARS-CoV-2 caused a global pandemic in early 2020 and has so far resulted in more than 8,000,000 infections and 430,000 deaths worldwide. Four structural proteins, envelope (E), membrane (M), nucleocapsid (N) and spike (S) glycoprotein, play a key role in controlling entry into human cells and virion assembly of SARS-CoV-2. However, how these genes evolve during human-to-human transmission is largely unknown. In this study, we screened and analyzed roughly 3090 SARS-CoV-2 isolates from the GenBank database. The distribution of alleles of the four genes was determined: 16 for E, 40 for M, 131 for N and 173 for S. Phylogenetic analysis shows that global SARS-CoV-2 isolates can be clustered into three to four major clades based on the protein sequences of these genes. No intragenic recombination event was detected among different alleles. However, purifying selection has acted on the evolution of these genes. Analysis of the full genomic sequences of these alleles using codon-substitution models (M8, M3 and M2a) and likelihood ratio tests (LRTs) in the codeML package reveals that codon 614 of the S glycoprotein has been subjected to strong positive selection pressure, and a persistent D614G mutation is identified. The definitive positive selection of the D614G mutation is further confirmed by the internal fixed effects likelihood (IFEL) and Evolutionary Fingerprinting methods implemented in the HyPhy package. In addition, another potential positive selection site at codon 5 in the signal sequence of the S protein is identified. The allele containing the D614G mutation has undergone significant expansion during the SARS-CoV-2 global pandemic, implying better adaptability of isolates with the mutation. However, L5F allele expansion is relatively restricted. The D614G mutation is located in subdomain 2 (SD2) of the C-terminal portion (CTP) of the S1 subunit. Protein structural modeling shows that the D614G mutation may disrupt a salt bridge between S protein monomers and increase their flexibility, in turn promoting receptor binding domain (RBD) opening, virus attachment and entry into host cells. Located in the signal sequence of the S protein, the L5F mutation may facilitate protein folding, assembly, and secretion of the virus. This is the first evidence of positive Darwinian selection in the spike gene of SARS-CoV-2, which contributes to a better understanding of the adaptive mechanism of this virus and helps to provide insights for developing novel therapeutic approaches as well as effective vaccines by targeting mutation sites.
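
The likelihood ratio tests named in the abstract compare a null codon model against an alternative that allows a class of sites with ω > 1. The following is a minimal sketch of that comparison, not the authors' pipeline. The function names and the log-likelihood values are hypothetical, and the degrees of freedom reflect the conventional pairings (M1a vs M2a and M7 vs M8 with 2 degrees of freedom, M0 vs M3 with three site classes with 4), which the abstract does not spell out.

```python
import math

def chi2_sf_even_df(x, df):
    """Chi-square survival function P(X > x), closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form is only valid for positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (test statistic, p-value) for two nested models."""
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, chi2_sf_even_df(stat, df)

# Hypothetical log-likelihoods for a null (e.g. M7) and alternative (e.g. M8) fit.
stat, p_value = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-995.0, df=2)
print(f"2*delta_lnL = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value only indicates that the model with a positively selected site class fits better. Identifying which codons fall into that class is a separate step.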

2020 ◽  
Author(s):  
Xiao-Yong Zhan ◽  
Ying Zhang ◽  
Xuefu Zhou ◽  
Ke Huang ◽  
Yichao Qian ◽  
...  

Abstract Background: SARS-CoV-2 has caused a global pandemic since early 2020 and is still a serious public health issue worldwide. Four structural proteins, envelope (E), membrane (M), nucleocapsid (N) and spike (S) glycoprotein, play a key role in controlling entry into human cells and virion assembly of SARS-CoV-2. The evolution of these genes may determine the infectivity of SARS-CoV-2, but is largely unknown. Results: We analyzed roughly 3090 SARS-CoV-2 isolates from the GenBank database. The distribution of alleles of the four genes was determined: 16 for E, 40 for M, 131 for N and 173 for S. Phylogenetic analysis shows that global SARS-CoV-2 isolates can be clustered into three to four major clades based on protein sequence. Although no intragenic recombination event was detected among different alleles, purifying selection has acted on the evolution of these genes. Analysis of the full genomic sequences of these alleles reveals that codon 614 of the S glycoprotein has been subjected to strong positive selection pressure, and a consistent D614G mutation is identified. Additionally, another potential positive selection site at codon 5 in the signal sequence of the S protein is identified, with a consistent L5F mutation. The allele containing the D614G mutation has undergone significant expansion during SARS-CoV-2 transmission, implying better adaptability of isolates with the mutation. However, L5F allele expansion is relatively restricted. The D614G mutation is located in subdomain 2 (SD2) of the C-terminal portion (CTP) of the S1 subunit. Protein structural modeling shows that the D614G mutation may disrupt a salt bridge between S protein monomers and increase their flexibility, in turn promoting receptor binding domain (RBD) opening, virus attachment and entry into host cells. Located in the signal sequence of the S protein, the L5F mutation may facilitate protein folding, assembly, and secretion of the virus. Conclusions: This is the first evidence of positive Darwinian selection in the spike gene of SARS-CoV-2, which contributes to a better understanding of the adaptive mechanism of this virus and helps to provide insights for developing novel therapeutic approaches as well as effective vaccines by targeting mutation sites.


2020 ◽  
Author(s):  
Xiao-Yong Zhan ◽  
Ying Zhang ◽  
Xuefu Zhou ◽  
Ke Huang ◽  
Yichao Qian ◽  
...  

Abstract Background: SARS-CoV-2 has caused a global pandemic since early 2020 and remains a serious public health issue worldwide. Four structural genes, envelope (E), membrane (M), nucleocapsid (N) and spike (S), play a key role in controlling entry into human cells and virion assembly of SARS-CoV-2. The evolution of these genes may determine infectivity of SARS-CoV-2, but thus far, little is known about them. Methods: We analyzed 3090 SARS-CoV-2 isolates from the GenBank database to determine the evolutionary patterns of the four structural genes by employing various molecular evolution algorithms. Results: Phylogenetic analyses showed that global SARS-CoV-2 isolates can be clustered into three to four major clades based upon protein sequence. Although intragenic recombination was not detected among different alleles, purifying selection has affected the evolution of these genes. By analyzing full genomic sequences of these alleles, our results revealed that codon 614 of the S glycoprotein has been subjected to strong positive selection pressure, and a consistent D614G mutation was identified. Additionally, another potential positive selection site at codon 5 in the signal sequence of the S protein was identified, with a consistent L5F mutation. The allele containing the D614G mutation has undergone significant expansion during SARS-CoV-2 transmission, implying better adaptability of isolates with the mutation. Nevertheless, L5F allele expansion was found to be relatively restricted. The D614G mutation is located in subdomain 2 (SD2) of the C-terminal portion (CTP) of the S1 subunit. Protein structural modeling showed that the D614G mutation may disrupt a salt bridge between S protein monomers and increase their flexibility, consequently promoting receptor binding domain (RBD) opening, virus attachment, and ultimately entry into host cells. Located in the signal sequence of the S protein, the L5F mutation may facilitate protein folding, assembly, and secretion of the virus. Conclusions: This is the first reported evidence of positive Darwinian selection in the spike gene of SARS-CoV-2. This finding contributes to a broader understanding of the adaptive mechanisms of this virus, and provides insight for the development of novel therapeutic approaches, as well as the creation of effective vaccines, through targeting mutation sites.


Molecules ◽  
2021 ◽  
Vol 26 (9) ◽  
pp. 2622
Author(s):  
Romina Oliva ◽  
Abdul Rajjak Shaikh ◽  
Andrea Petta ◽  
Anna Vangone ◽  
Luigi Cavallo

The crown of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is formed by its spike (S) glycoprotein. The S protein mediates SARS-CoV-2 entry into host cells. The “fusion core” of the heptad repeat 1 (HR1) on S plays a crucial role in virus infectivity, as it is part of a key membrane fusion architecture. While SARS-CoV-2 was becoming a global threat, scientists have been accumulating data on the virus at an impressive pace, both in terms of genomic sequences and of three-dimensional structures. On 15 February 2021, from the SARS-CoV-2 genomic sequences in the GISAID resource, we collected 415,673 complete S protein sequences and identified all the mutations occurring in the HR1 fusion core. This is a 21-residue segment, which, in the post-fusion conformation of the protein, forms many strong interactions with the heptad repeat 2, bringing viral and cellular membranes into proximity for fusion. We investigated the frequency and structural effect of novel mutations accumulated over time in this region, which is crucial for virus infectivity. Three mutations were quite frequent, occurring in over 0.1% of the total sequences. These were S929T, D936Y, and S949F, all in the N-terminal half of the HR1 fusion core segment and particularly widespread in Europe and the USA. The most frequent of them, D936Y, was present in 17% of sequences from Finland and 12% of sequences from Sweden. In the post-fusion conformation of the unmutated S protein, D936 is involved in an inter-monomer salt bridge with R1185. We investigated the effect of the D936Y mutation on the pre-fusion and post-fusion states of the protein by using molecular dynamics, showing that it especially affects the latter.
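
The frequency screen described in this abstract (tallying substitutions against a reference and keeping those above 0.1% of sequences) can be illustrated with a short sketch. This is an assumption-laden toy, not the authors' workflow: the function name, the four-residue reference, the variant sequences and the residue numbering are all hypothetical, and the sequences are assumed to be pre-aligned and of equal length.

```python
from collections import Counter

def mutation_frequencies(reference, sequences, offset=1):
    """Fraction of sequences carrying each substitution relative to the reference.

    `offset` is the residue number assigned to the first reference position.
    Sequences are assumed to be aligned to the reference without gaps.
    """
    counts = Counter()
    for seq in sequences:
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the reference length")
        for i, (ref_aa, aa) in enumerate(zip(reference, seq)):
            if aa != ref_aa:
                counts[f"{ref_aa}{i + offset}{aa}"] += 1
    total = len(sequences)
    return {mutation: n / total for mutation, n in counts.items()}

# Toy data: a hypothetical 4-residue segment numbered from position 100.
toy_reference = "ASDK"
toy_sequences = ["ASDK", "ASYK", "ASYK", "TSDK"]
frequencies = mutation_frequencies(toy_reference, toy_sequences, offset=100)

# Keep substitutions seen in more than 0.1% of sequences, as in the abstract.
frequent = {m: f for m, f in frequencies.items() if f > 0.001}
print(frequent)
```

With real data the denominator would be the number of complete S protein sequences, and insertions or deletions would need an explicit alignment step that this sketch omits.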


Author(s):  
Luigi Cavallo ◽  
Romina Oliva

Abstract: The iconic “red crown” of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is made of its spike (S) glycoprotein. The S protein is the Trojan horse of coronaviruses, mediating their entry into host cells. While SARS-CoV-2 was becoming a global threat, scientists have been accumulating data on the virus at an impressive pace, both in terms of genomic sequences and of three-dimensional structures. On April 21st, the GISAID resource had collected 10,823 SARS-CoV-2 genomic sequences. We extracted from them all the complete S protein sequences and identified their point mutations. Six mutations were located on a 14-residue segment (929-943) in the “fusion core” of the heptad repeat 1 (HR1). Our modeling in the pre- and post-fusion S protein conformations revealed, for three of them, the loss of interactions stabilizing the post-fusion assembly. By May 29th, GISAID held 34,805 SARS-CoV-2 genomic sequences. An analysis of the occurrences of the HR1 mutations in this updated dataset revealed a significant increase for the S929I and S939F mutations and a dramatic increase for the D936Y mutation, which was particularly widespread in Sweden and Wales/England. We note that this is also the mutation causing the loss of a strong inter-monomer interaction, the D936-R1185 salt bridge, thus clearly weakening the post-fusion assembly.


Author(s):  
Devika Singh ◽  
Soojin V. Yi

Abstract: The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the ongoing global outbreak of a coronavirus disease (herein referred to as COVID-19). Other viruses in the same phylogenetic group have been responsible for previous regional outbreaks, including SARS and MERS. SARS-CoV-2 has a zoonotic origin, similar to the causative viruses of these previous outbreaks. The repetitive introduction of animal viruses into human populations resulting in disease outbreaks suggests that similar future epidemics are inevitable. Therefore, understanding the molecular origin and ongoing evolution of SARS-CoV-2 will provide critical insights for preparing for and preventing future outbreaks. A key feature of SARS-CoV-2 is its propensity for genetic recombination across host species boundaries. Consequently, the genome of SARS-CoV-2 harbors signatures of multiple recombination events, likely encompassing multiple species and broad geographic regions. Other regions of the SARS-CoV-2 genome show the impact of purifying selection. The spike (S) protein of SARS-CoV-2, which enables the virus to enter host cells, exhibits signatures of both purifying selection and ancestral recombination events, leading to an effective S protein capable of infecting human and many other mammalian cells. The global spread and explosive growth of the SARS-CoV-2 population (within human hosts) has contributed additional mutational variability into this genome, increasing opportunities for future recombination.


2021 ◽  
pp. 004728752110047
Author(s):  
Giray Gozgor ◽  
Marco Chi Keung Lau ◽  
Yan Zeng ◽  
Cheng Yan ◽  
Zhibin Lin

Capital investment is vital for sustainable tourism growth, particularly in times of geopolitical turmoil. This study examines how tourism investment was influenced by geopolitical risks considering social globalization as a moderating factor. Data were collected from 18 developing economies between 1995 and 2018. The results from the fixed effects and the least squares dummy variable–corrected methods show that the geopolitical risks negatively affect capital investment in tourism, with social globalization playing a moderating role in alleviating the adverse effect. The results were robust to different measures and analyses. The study advances our understanding of sustainable tourism growth amid geopolitical turmoil. Policymakers, especially those from developing economies, are suggested to be vigilant about the media atmosphere of geopolitics and enhancing social globalization as a countermeasure against politically turbulent times. The study also provides implications for alleviating the impact of the global pandemic on tourism investment.


Genetics ◽  
2000 ◽  
Vol 155 (1) ◽  
pp. 431-449 ◽  
Author(s):  
Ziheng Yang ◽  
Rasmus Nielsen ◽  
Nick Goldman ◽  
Anne-Mette Krabbe Pedersen

Abstract: Comparison of relative fixation rates of synonymous (silent) and nonsynonymous (amino acid-altering) mutations provides a means for understanding the mechanisms of molecular sequence evolution. The nonsynonymous/synonymous rate ratio (ω = dN/dS) is an important indicator of selective pressure at the protein level, with ω = 1 meaning neutral mutations, ω < 1 purifying selection, and ω > 1 diversifying positive selection. Amino acid sites in a protein are expected to be under different selective pressures and have different underlying ω ratios. We develop models that account for heterogeneous ω ratios among amino acid sites and apply them to phylogenetic analyses of protein-coding DNA sequences. These models are useful for testing for adaptive molecular evolution and identifying amino acid sites under diversifying selection. Ten data sets of genes from nuclear, mitochondrial, and viral genomes are analyzed to estimate the distributions of ω among sites. In all data sets analyzed, the selective pressure indicated by the ω ratio is found to be highly heterogeneous among sites. Previously unsuspected Darwinian selection is detected in several genes in which the average ω ratio across sites is <1, but in which some sites are clearly under diversifying selection with ω > 1. Genes undergoing positive selection include the β-globin gene from vertebrates, mitochondrial protein-coding genes from hominoids, the hemagglutinin (HA) gene from human influenza virus A, and HIV-1 env, vif, and pol genes. Tests for the presence of positively selected sites and their subsequent identification appear quite robust to the specific distributional form assumed for ω and can be achieved using any of several models we implement. However, we encountered difficulties in estimating the precise distribution of ω among sites from real data sets.
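
The interpretation of ω stated in this abstract can be written down as a small sketch. The function names, the tolerance and the per-site rate estimates are hypothetical, and comparing a point estimate to 1 is only an illustration: the models in the paper assess selection statistically across sites, which this toy does not attempt.

```python
def omega(dn, ds):
    """Nonsynonymous/synonymous rate ratio, omega = dN/dS."""
    if ds <= 0:
        raise ValueError("dS must be positive for omega to be defined")
    return dn / ds

def selection_regime(w, tol=1e-9):
    """Map an omega value to the regime named in the abstract."""
    if abs(w - 1.0) <= tol:
        return "neutral"
    return "purifying" if w < 1.0 else "positive"

# Hypothetical per-site (dN, dS) estimates illustrating heterogeneity among sites.
toy_sites = {"site_a": (0.2, 0.4), "site_b": (0.3, 0.3), "site_c": (0.9, 0.3)}
for name, (dn, ds) in toy_sites.items():
    w = omega(dn, ds)
    print(name, round(w, 2), selection_regime(w))
```

The toy sites also illustrate the abstract's point that an average ω below 1 across a gene can coexist with individual sites where ω exceeds 1.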


2005 ◽  
Vol 79 (6) ◽  
pp. 3289-3296 ◽  
Author(s):  
Choong-Tat Keng ◽  
Aihua Zhang ◽  
Shuo Shen ◽  
Kuo-Ming Lip ◽  
Burtram C. Fielding ◽  
...  

ABSTRACT The spike (S) protein of the severe acute respiratory syndrome coronavirus (SARS-CoV) interacts with cellular receptors to mediate membrane fusion, allowing viral entry into host cells; hence it is recognized as the primary target of neutralizing antibodies, and therefore knowledge of antigenic determinants that can elicit neutralizing antibodies could be beneficial for the development of a protective vaccine. Here, we expressed five different fragments of S, covering the entire ectodomain (amino acids 48 to 1192), as glutathione S-transferase fusion proteins in Escherichia coli and used the purified proteins to raise antibodies in rabbits. By Western blot analysis and immunoprecipitation experiments, we showed that all the antibodies are specific and highly sensitive to both the native and denatured forms of the full-length S protein expressed in virus-infected cells and transfected cells, respectively. Indirect immunofluorescence performed on fixed but unpermeabilized cells showed that these antibodies can recognize the mature form of S on the cell surface. All the antibodies were also able to detect the maturation of the 200-kDa form of S to the 210-kDa form by pulse-chase experiments. When the antibodies were tested for their ability to inhibit SARS-CoV propagation in Vero E6 culture, it was found that the anti-SΔ10 antibody, which was targeted to amino acid residues 1029 to 1192 of S, which include heptad repeat 2, has strong neutralizing activities, suggesting that this region of S carries neutralizing epitopes and is very important for virus entry into cells.


2020 ◽  
Vol 6 (27) ◽  
pp. eabb9153 ◽  
Author(s):  
Xiaojun Li ◽  
Elena E. Giorgi ◽  
Manukumar Honnayakanahalli Marichannegowda ◽  
Brian Foley ◽  
Chuan Xiao ◽  
...  

COVID-19 has become a global pandemic caused by the novel coronavirus SARS-CoV-2. Understanding the origins of SARS-CoV-2 is critical for deterring future zoonosis, discovering new drugs, and developing a vaccine. We show evidence of strong purifying selection around the receptor binding motif (RBM) in the spike and other genes among bat, pangolin, and human coronaviruses, suggesting similar evolutionary constraints in different host species. We also demonstrate that SARS-CoV-2’s entire RBM was introduced through recombination with coronaviruses from pangolins, possibly a critical step in the evolution of SARS-CoV-2’s ability to infect humans. Similar purifying selection in different host species, together with frequent recombination among coronaviruses, suggests a common evolutionary mechanism that could lead to new emerging human coronaviruses.


eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Palur V Raghuvamsi ◽  
Nikhil Kumar Tulsian ◽  
Firdaus Samsudin ◽  
Xinlei Qian ◽  
Kiren Purushotorman ◽  
...  

The Spike (S) protein is the main handle for SARS-CoV-2 to enter host cells via surface ACE2 receptors. How ACE2 binding activates proteolysis of S protein is unknown. Here, using amide hydrogen-deuterium exchange mass spectrometry and molecular dynamics simulations, we have mapped the S:ACE2 interaction interface and uncovered long-range allosteric propagation of ACE2 binding to sites necessary for host-mediated proteolysis of S protein, critical for viral host entry. Unexpectedly, ACE2 binding enhances dynamics at a distal S1/S2 cleavage site and flanking protease docking site ~27 Å away while dampening dynamics of the stalk hinge (central helix and heptad repeat) regions ~130 Å away. This highlights that the stalk and proteolysis sites of the S protein are dynamic hotspots in the pre-fusion state. Our findings provide a dynamics map of the S:ACE2 interface in solution and also offer mechanistic insights into how ACE2 binding is allosterically coupled to distal proteolytic processing sites and viral-host membrane fusion. Our findings highlight protease docking sites flanking the S1/S2 cleavage site, fusion peptide and heptad repeat 1 (HR1) as alternate allosteric hotspot targets for potential therapeutic development.

