Host Adaptation of Codon Usage in SARS-CoV-2 From Mammals Indicate Natural Selection

Author(s):  
Yanan Fu ◽  
Yanping Huang ◽  
Jingjing Rao ◽  
Feng Zeng ◽  
Ruiping Yang ◽  
...  

Abstract The outbreak of COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, has spread across hosts from humans to animals, transmitting particularly effectively in mink. How SARS-CoV-2 is selected and evolves in its hosts, and how its evolution differs among animals, remain unclear. We analyzed the mutations and codon usage bias of SARS-CoV-2 in infected humans and animals. The SARS-CoV-2 sequence in mink (Mink-SARS2) was compared with that in humans, and its binding energy with the receptor was calculated. The relative synonymous codon usage of the virus-encoded genes was analyzed to characterize the differences and the evolutionary characteristics. A synonymous codon usage analysis showed that SARS-CoV-2 is optimized to adapt in the animals in which it is currently reported, and all of the animals showed decreased adaptability relative to that of humans, except for mink. The neutrality plot showed that the effect of natural selection on different SARS-CoV-2 sequences is stronger than that of mutation pressure. A binding affinity analysis indicated that the spike protein of the SARS-CoV-2 variant in mink showed a greater preference for binding with the mink receptor ACE2 than with the human receptor, especially as the mutations Y453F and N501T in Mink-SARS2 improve binding affinity for the mink receptor. In summary, mutations Y453F and N501T in Mink-SARS2 improve binding affinity with the mink receptor, indicating possible natural selection and current host adaptation. Monitoring the variation and codon bias of SARS-CoV-2 provides a theoretical basis for tracing the epidemic, evolution and cross-species spread of SARS-CoV-2.
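For readers unfamiliar with the relative synonymous codon usage (RSCU) statistic named in this abstract, the following is a minimal Python sketch of how it is commonly computed: each codon's count is divided by the count expected if all synonymous codons for that amino acid were used equally. This is an illustration only, not the authors' pipeline; the function name `rscu`, the standard-genetic-code table built here, and the toy sequence are assumptions.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence.

    RSCU = observed count * (number of synonymous codons) / (total count
    for that amino acid). Amino acids absent from the sequence are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


# Toy example (hypothetical sequence): three GCT and one GCC, all alanine.
toy = rscu("GCTGCTGCTGCC")
print(toy["GCT"], toy["GCC"])
```

An RSCU above 1 marks a codon used more often than expected under equal usage, and a value below 1 marks one used less often.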

2021 ◽  
Author(s):  
Zhixiong Lei ◽  
Dan Zhang ◽  
Long Liu

The outbreak of COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, rapidly spread to create a global pandemic and has continued to spread across hosts from humans to animals, transmitting particularly effectively in mink. How SARS-CoV-2 evolves in animals and humans and the differences in the separate evolutionary processes remain unclear. We analyzed the composition and codon usage bias of SARS-CoV-2 in infected humans and animals. Compared with other animals, SARS-CoV-2 in mink had the most substitutions. The substitutions of cytidine in SARS-CoV-2 in mink account for nearly 50% of the substitutions, while those in other animals represent only 30% of the substitutions. The incidence of adenine transversion in SARS-CoV-2 in other animals is threefold higher than that in mink-CoV (the SARS-CoV-2 virus in mink). A synonymous codon usage analysis showed that SARS-CoV-2 is optimized to adapt in the animals in which it is currently reported, and all of the animals showed decreased adaptability relative to that of humans, except for mink. A binding affinity analysis indicated that the spike protein of the SARS-CoV-2 variant in mink showed a greater preference for binding with the mink receptor ACE2 than with the human receptor, especially as the mutations Y453F and F486L in mink-CoV improve binding affinity for the mink receptor. Our study focuses on the divergence of SARS-CoV-2 genome composition and codon usage in humans and animals, indicating possible natural selection and current host adaptation.


2021 ◽  
Author(s):  
Alexander L Cope ◽  
Premal Shah

Patterns of non-uniform usage of synonymous codons (codon bias) vary across genes in an organism and across species from all domains of life. The bias in codon usage is due to a combination of both non-adaptive (e.g. mutation biases) and adaptive (e.g. natural selection for translation efficiency/accuracy) evolutionary forces. Most population genetics models quantify the effects of mutation bias and selection on shaping codon usage patterns assuming a uniform mutation bias across the genome. However, mutation biases can vary both along and across chromosomes due to processes such as biased gene conversion, potentially obfuscating signals of translational selection. Moreover, estimates of variation in genomic mutation biases are often lacking for non-model organisms. Here, we combine an unsupervised learning method with a population genetics model of synonymous codon bias evolution to assess the impact of intragenomic variation in mutation bias on the strength and direction of natural selection on synonymous codon usage across 49 Saccharomycotina budding yeasts. We find that in the absence of a priori information, unsupervised learning approaches can be used to identify regions evolving under different mutation biases. We find that the impact of intragenomic variation in mutation bias varies widely, even among closely-related species. We show that the overall strength and direction of selection on codon usage can be underestimated by failing to account for intragenomic variation in mutation biases. Interestingly, genes falling into clusters identified by machine learning are also often physically clustered across chromosomes, consistent with processes such as biased gene conversion. Our results indicate the need for more nuanced models of sequence evolution that systematically incorporate the effects of variable mutation biases on codon frequencies.


Viruses ◽  
2020 ◽  
Vol 12 (9) ◽  
pp. 991
Author(s):  
Huiguang Wu ◽  
Zhengyu Bao ◽  
Chunxiao Mou ◽  
Zhenhai Chen ◽  
Jingwen Zhao

Porcine astrovirus (PAstV), associated with mild diarrhea and neurological disease, is transmitted in pig farms worldwide. The purpose of this study is to elucidate the main factors affecting codon usage in PAstVs. Phylogenetic analysis showed that the subtype PAstV-5 sat at the bottom of the phylogenetic tree, followed by PAstV-3, PAstV-1, PAstV-2, and PAstV-4, indicating that the five existing subtypes (PAstV1-PAstV5) may have been formed by multiple differentiations of PAstV ancestors. A codon usage bias was found in the PAstVs-2,3,4,5 from the analyses of effective number of codons (ENC) and relative synonymous codon usage (RSCU). Nucleotides A/U are more frequently used than nucleotides C/G in the genome CDSs of the PAstVs-3,4,5. Codon usage patterns of PAstV-5 are dominated by mutation pressure and natural selection, while natural selection is the main evolutionary force that affects the codon usage pattern of PAstVs-2,3,4. The analyses of codon adaptation index (CAI), relative codon deoptimization index (RCDI), and similarity index (SiD) showed that the codon usage similarities between the PAstV and animals might contribute to the broad host range and the cross-species transmission of astrovirus. Our results provide insight into PAstV evolution and codon usage patterns.
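The effective number of codons (ENC) used in this abstract is usually computed with Wright's (1990) formula, which ranges from 20 (one codon per amino acid) to 61 (equal use of all synonymous codons). The sketch below is one plain-Python reading of that formula, not the code used in the study; the function name `enc`, the handling of missing amino acids, and the toy sequences are assumptions.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of amino acids in each degeneracy class of the standard code.
CLASS_SIZES = {2: 9, 3: 1, 4: 5, 6: 3}


def enc(cds):
    """Wright's effective number of codons; None if a class has no data."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    # Homozygosity F = (n * sum(p^2) - 1) / (n - 1) per amino acid.
    f_by_class = defaultdict(list)
    for aa, codons in families.items():
        k = len(codons)
        if aa == "*" or k == 1:
            continue  # stop codons, Met and Trp carry no synonymous choice
        n = sum(counts[c] for c in codons)
        if n < 2:
            continue
        f = (n * sum((counts[c] / n) ** 2 for c in codons) - 1) / (n - 1)
        if f > 0:
            f_by_class[k].append(f)

    mean_f = {k: sum(v) / len(v) for k, v in f_by_class.items()}
    # If isoleucine is absent, Wright averages the 2- and 4-fold classes.
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2
    if any(k not in mean_f for k in CLASS_SIZES):
        return None

    nc = 2 + sum(size / mean_f[k] for k, size in CLASS_SIZES.items())
    return min(nc, 61.0)  # values above 61 are conventionally capped
```

Lower ENC values indicate stronger codon usage bias, which is how the abstract's statement about bias in PAstV-2 to PAstV-5 would typically be read.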


Genetics ◽  
2001 ◽  
Vol 159 (3) ◽  
pp. 1191-1199
Author(s):  
Araxi O Urrutia ◽  
Laurence D Hurst

Abstract In numerous species, from bacteria to Drosophila, evidence suggests that selection acts even on synonymous codon usage: codon bias is greater in more abundantly expressed genes, the rate of synonymous evolution is lower in genes with greater codon bias, and there is consistency between genes in the same species in which codons are preferred. In contrast, in mammals, while nonequal use of alternative codons is observed, the bias is attributed to the background variance in nucleotide concentrations, reflected in the similar nucleotide composition of flanking noncoding and exonic third sites. However, a systematic examination of the covariants of codon usage controlling for background nucleotide content has yet to be performed. Here we present a new method to measure codon bias that corrects for background nucleotide content and apply this to 2396 human genes. Nearly all (99%) exhibit a higher amount of codon bias than expected by chance. The patterns associated with selectively driven codon bias are weakly recovered: Broadly expressed genes have a higher level of bias than do tissue-specific genes, the bias is higher for genes with lower rates of synonymous substitutions, and certain codons are repeatedly preferred. However, while these patterns are suggestive, the first two patterns appear to be methodological artifacts. The last pattern reflects in part biases in usage of nucleotide pairs. We conclude that we find no evidence for selection on codon usage in humans.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Giovanni Franzo ◽  
Claudia Maria Tucciarone ◽  
Matteo Legnardi ◽  
Mattia Cecchinato

Abstract Background Infectious bronchitis virus (IBV) is one of the most relevant viruses affecting the poultry industry, and several studies have investigated the factors involved in its biological cycle and evolution. However, very few of those studies focused on the effect of genome composition and the codon bias of different IBV proteins, despite the remarkable increase in available complete genomes. In the present study, all IBV complete genomes were downloaded (n = 383), and several statistics representative of genome composition and codon bias were calculated for each protein-coding sequence, including but not limited to, the nucleotide odds ratio, relative synonymous codon usage and effective number of codons. Additionally, viral codon usage was compared to host codon usage based on a collection of highly expressed genes in IBV target and nontarget tissues. Results The results obtained demonstrated a significant difference among structural, non-structural and accessory proteins, especially regarding dinucleotide composition, which appears under strong selective forces. In particular, some dinucleotide pairs, such as CpG, a probable target of the host innate immune response, are underrepresented in genes coding for pp1a, pp1ab, S and N. Although genome composition and dinucleotide bias appear to affect codon usage, additional selective forces may act directly on codon bias. Variability in relative synonymous codon usage and effective number of codons was found for different proteins, with structural proteins and polyproteins being more adapted to the codon bias of host target tissues. In contrast, accessory proteins had a more biased codon usage (i.e., lower number of preferred codons), which might contribute to the regulation of their expression level and timing throughout the cell cycle. Conclusions The present study confirms the existence of selective forces acting directly on the genome and not only indirectly through phenotype selection. 
This evidence might help in understanding IBV biology and in developing attenuated strains without affecting the protein phenotype and therefore immunogenicity.
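The dinucleotide under-representation described in this abstract (for example CpG) is typically quantified with an odds ratio: the observed dinucleotide frequency divided by the product of the two single-nucleotide frequencies. The following is a minimal sketch of that ratio under this common definition; it may differ in detail from the statistic the authors computed, and the function name and toy sequences are assumptions.

```python
def dinucleotide_odds_ratio(seq, pair="CG"):
    """Observed/expected ratio for one dinucleotide on a single strand.

    rho_xy = f_xy / (f_x * f_y), with f_xy taken over all overlapping
    dinucleotide positions. Returns NaN when either base is absent.
    """
    seq = seq.upper().replace("U", "T")
    n = len(seq)
    if n < 2:
        return float("nan")
    f_x = seq.count(pair[0]) / n
    f_y = seq.count(pair[1]) / n
    if f_x == 0 or f_y == 0:
        return float("nan")
    # Count overlapping occurrences explicitly (str.count skips overlaps).
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == pair)
    f_xy = observed / (n - 1)
    return f_xy / (f_x * f_y)


# Toy example (hypothetical sequence, not IBV data).
print(dinucleotide_odds_ratio("ACGTACGTTTAA", "CG"))
```

A ratio well below 1 means the dinucleotide occurs less often than its base composition alone would predict.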


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Youhua Chen

Synonymous codon usage patterns of the neuraminidase (NA) gene of 64 subtypes (one is a mixed subtype) of influenza A virus (IAV) found in Canada were analyzed. In total, 1422 NA sequences were analyzed. Among the subtypes, H1N1 is the prevailing one with 516 NCBI accession records, followed by H3N2, H3N8, and H4N6. The year 2009 had the most reported NA sequences in Canada, corresponding to the 2009 pandemic. Correspondence analysis of the RSCU values of the four major subtypes showed that they had distinct clustering patterns in the two-dimensional scatter plot, indicating that different subtypes of IAV utilized different preferential codons. This subtype clustering pattern implied an important influence of natural selection, which was further evidenced by an extremely flat regression line in the neutrality plot (GC12 versus GC3s plot) and a significant phylogenetic signal in the distribution of different subtypes among the clades of the phylogenetic tree (λ statistic). In conclusion, different subtypes of IAV showed evolutionary differentiation in their choice of optimal codons. Natural selection played a deterministic role in structuring IAV codon usage patterns in Canada.
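The neutrality plot mentioned in this abstract regresses GC content at the first two codon positions (GC12) on GC content at the third position (GC3). A slope near 0 is read as selection dominating, and a slope near 1 as mutation pressure dominating. The sketch below illustrates the calculation with ordinary least squares; it uses plain GC3 over all codons rather than GC3s (which excludes Met, Trp and stop codons), and the function names and toy sequences are assumptions.

```python
def gc_by_position(cds):
    """Return (GC12, GC3) for one in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    gc = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            gc[i % 3] += 1
    n_codons = usable // 3
    gc12 = (gc[0] + gc[1]) / (2 * n_codons)
    gc3 = gc[2] / n_codons
    return gc12, gc3


def neutrality_slope(sequences):
    """Least-squares slope of GC12 (y) regressed on GC3 (x)."""
    points = [gc_by_position(s) for s in sequences]
    ys = [p[0] for p in points]
    xs = [p[1] for p in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy sequences (hypothetical): GC3 varies while GC12 stays fixed,
# which gives the flat regression line described in the abstract.
print(neutrality_slope(["GCAATA", "GCAATG", "GCGATG"]))
```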


Viruses ◽  
2018 ◽  
Vol 10 (11) ◽  
pp. 604 ◽  
Author(s):  
Naveen Kumar ◽  
Diwakar Kulkarni ◽  
Benhur Lee ◽  
Rahul Kaushik ◽  
Sandeep Bhatia ◽  
...  

Hendra virus (HeV) and Nipah virus (NiV) are among a group of emerging bat-borne paramyxoviruses that have crossed the species barrier several times, infecting several hosts with a high fatality rate in human beings. Despite the fatal nature of their infection, a comprehensive study to explore their evolution and adaptation in different hosts is lacking. A study of codon usage patterns in henipaviruses may provide some fruitful insight into their evolutionary processes of synonymous codon usage and host-adapted evolution. Here, we performed a systematic evolutionary and codon usage bias analysis of henipaviruses. We found a low codon usage bias in the coding sequences of henipaviruses and that natural selection, mutation pressure, and nucleotide composition shape the codon usage patterns of henipaviruses, with natural selection being more important than the others. Also, henipaviruses showed the highest level of adaptation to bats of the genus Pteropus in the codon adaptation index (CAI), relative codon deoptimization index (RCDI), and similarity index (SiD) analyses. Furthermore, a comparison to recently identified henipa-like viruses indicated a high tRNA adaptation index of henipaviruses for human beings, mainly due to F, G and L proteins. Consequently, the study concedes the substantial emergence of henipaviruses in human beings, particularly when paired with frequent exposure to direct/indirect bat excretions.
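The codon adaptation index (CAI) used in this abstract scores a viral gene against a host reference set: each codon gets a weight equal to its host count divided by the count of the most-used synonymous codon, and CAI is the geometric mean of those weights over the gene. The following sketch illustrates that definition only; the function names, the 0.5 pseudocount for codons unseen in the reference (used here to avoid log of zero), and the toy data are assumptions, not the study's settings.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_of(cds):
    """Split an in-frame sequence into codons, dropping a partial tail."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def relative_adaptiveness(reference_cds):
    """Weights w = count / max synonymous count, from host reference genes."""
    counts = Counter()
    for cds in reference_cds:
        counts.update(codons_of(cds))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    weights = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:
            continue  # stop codons, Met and Trp are excluded from CAI
        best = max(counts[c] for c in codons)
        if best == 0:
            continue  # amino acid absent from the reference set
        for c in codons:
            weights[c] = max(counts[c], 0.5) / best  # assumed pseudocount
    return weights


def cai(cds, weights):
    """Geometric mean of the weights of the scored codons in cds."""
    logs = [math.log(weights[c]) for c in codons_of(cds) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy host reference (hypothetical): GCT is the preferred alanine codon.
w = relative_adaptiveness(["GCTGCTGCTGCC"])
print(cai("GCTGCT", w), cai("GCTGCC", w))
```

Values closer to 1 indicate that the gene uses the codons the reference host uses most, which is the sense in which the abstract speaks of adaptation to Pteropus bats.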


2018 ◽  
Vol 19 (12) ◽  
pp. 4010
Author(s):  
Zhaocai Li ◽  
Wen Hu ◽  
Xiaoan Cao ◽  
Ping Liu ◽  
Youjun Shang ◽  
...  

The family of Chlamydiaceae contains a group of obligate intracellular bacteria that can infect a wide range of hosts. The evolutionary trend of members in this family is a hot topic, which benefits our understanding of the cross-infection of these pathogens. In this study, 14 whole genomes of 12 Chlamydia species were used to investigate the nucleotide, codon, and amino acid usage bias by synonymous codon usage value and information entropy method. The results showed that all the studied Chlamydia spp. had A/T rich genes with over-represented A or T at the third positions and G or C under-represented at these positions, suggesting that nucleotide usage influenced synonymous codon usage. The overall codon usage trend from synonymous codon usage variations divides the Chlamydia spp. into four separate clusters, while amino acid usage divides the Chlamydia spp. into two clusters with some exceptions, which reflects the genetic diversity of the Chlamydiaceae family members. The overall codon usage pattern represented by the effective number of codons (ENC) was significantly positively correlated with gene GC3 content. A negative correlation exists between ENC and the codon adaptation index for some Chlamydia species. These results suggest that mutation pressure caused by nucleotide composition constraint played an important role in shaping synonymous codon usage patterns. Furthermore, codon usage of T3ss and Pmps gene families adapted to that of the corresponding genome. Taken together, these analyses help our understanding of evolutionary interactions between nucleotide, synonymous codon, and amino acid usage in genes of Chlamydiaceae family members.
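The ENC and GC3 relationship reported in this abstract is often judged against Wright's expected curve, which gives the ENC a gene would have if its codon usage were shaped by GC3 composition alone. The sketch below evaluates that standard curve; the function name is an assumption, and the abstract does not say whether the authors used this exact comparison.

```python
def expected_enc(gc3):
    """Wright's expected ENC under GC3 composition alone.

    ENC_exp = 2 + s + 29 / (s^2 + (1 - s)^2), where s is the GC3
    fraction (between 0 and 1).
    """
    s = gc3
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)


# Genes whose observed ENC falls well below this curve are usually taken
# to be influenced by forces other than nucleotide composition.
for s in (0.2, 0.35, 0.5):
    print(s, expected_enc(s))
```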


2005 ◽  
Vol 03 (01) ◽  
pp. 157-168 ◽  
Author(s):  
PETER L. MEINTJES ◽  
ALLEN G. RODRIGO

Mutation in Human Immunodeficiency Virus type-1 (HIV-1) is extremely rapid, a consequence of a low-fidelity viral reverse transcription process. The envelope gene has been shown to accumulate substitutions at a rate of approximately 1% per year and can frequently spend a long time in the host (approximately 10 years). The relative synonymous codon usage (RSCU) in HIV-1 is known to be different from that of the human host. However, by reengineering the protein coding sequences of HIV-1 to reflect the RSCU patterns observed in humans, a large increase in protein expression is observed. It is reasonable to suggest that within a host there may be a selective drive for change in the RSCU of HIV-1 towards human RSCU. To test this hypothesis we analyzed HIV-1 partial envelope sequences from eight patients sampled serially in time. For each sequence, an RSCU table was constructed. Sequences were labelled as "early" or "late" depending on whether they were sampled before or after the mid-point of the study. Using the RSCU values as descriptor variables, a Principal Components Analysis (PCA) was performed. The first three components clearly discriminated between early and late sequences. We also constructed pooled groupwise RSCU tables for early and late sequences. The viral RSCU values of each of the groups were correlated with human RSCU. If there is selection for host-adaptation in RSCU, we expect that "late" viral RSCUs would tend to be more highly correlated with human RSCU than "early" viral RSCUs. In fact, tests of significance suggest that this is the case. However, closer examination of the data revealed that the apparent trend towards human RSCU can be attributed to the homogenization of the codon usage by mutation pressure rather than host adaptation.
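The early-versus-late comparison described in this abstract amounts to correlating each pooled viral RSCU table with the human RSCU table. The sketch below shows that step with a Pearson correlation over the codons the two tables share. The abstract does not state which correlation measure was used, so Pearson is an assumption here, as are the function name and the toy RSCU values, which are invented for illustration and are not data from the study.

```python
import math


def rscu_correlation(viral_rscu, host_rscu):
    """Pearson correlation between two {codon: RSCU} tables."""
    shared = sorted(set(viral_rscu) & set(host_rscu))
    xs = [viral_rscu[c] for c in shared]
    ys = [host_rscu[c] for c in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical alanine-only tables, for illustration.
human = {"GCT": 1.1, "GCC": 1.6, "GCA": 0.9, "GCG": 0.4}
early = {"GCT": 1.2, "GCC": 0.6, "GCA": 1.9, "GCG": 0.3}
late = {"GCT": 1.1, "GCC": 1.2, "GCA": 1.3, "GCG": 0.4}

print(rscu_correlation(early, human), rscu_correlation(late, human))
```

As the abstract cautions, a higher correlation for the late group is not by itself evidence of host adaptation, since mutation pressure can produce the same trend.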

