Functional alterations caused by mutations reflect evolutionary trends of SARS-CoV-2

Author(s):  
Liang Cheng ◽  
Xudong Han ◽  
Zijun Zhu ◽  
Changlu Qi ◽  
Ping Wang ◽  
...  

Abstract Since the first report of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in December 2019, the COVID-19 pandemic has spread rapidly worldwide. Because few virus strains were available, early studies observed few of the key mutations that matter for the evolutionary trends of the virus genome. Here, we downloaded sequence data for 1809 SARS-CoV-2 strains from GISAID before April 2020 to identify mutations and the functional alterations they cause. In total, we identified 1017 nonsynonymous and 512 synonymous mutations by alignment to the reference genome NC_045512, none of which were observed in the receptor-binding domain (RBD) of the spike protein. On average, each strain acquired about 1.75 new mutations per month. The current mutations may have little impact on antibodies. Although the whole genome shows purifying selection, ORF3a, ORF8 and ORF10 were under positive selection. Only the 36 mutations that occurred in 1% or more of the virus strains were further analyzed to reveal linkage disequilibrium (LD) variants and dominant mutations. As a result, we observed five dominant mutations: three nonsynonymous mutations (C28144T, C14408T and A23403G) and two synonymous mutations (T8782C and C3037T). These five mutations occurred in almost all strains in April 2020. We also observed two potential dominant nonsynonymous mutations, C1059T and G25563T, which occurred in most of the strains in April 2020. Further functional analysis shows that these mutations substantially decreased protein stability, which could lead to a significant reduction of virus virulence. In addition, the A23403G mutation increases the spike-ACE2 interaction and thereby enhances infectivity. Taken together, these results indicate that SARS-CoV-2 is evolving toward enhanced infectivity and reduced virulence.
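As a minimal illustration of the synonymous/nonsynonymous distinction used in the abstract above, the following sketch classifies a single-base substitution within one codon using the standard genetic code. The function name `classify_substitution` and the example codons are assumptions made for illustration; they are not taken from the study's analysis pipeline, which would also need to map genome coordinates to reading frames.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, offset, alt_base):
    """Classify a single-base change inside one codon (hypothetical helper).

    codon    -- reference codon, e.g. "GAT"
    offset   -- 0-based position within the codon (0, 1 or 2)
    alt_base -- the substituted base
    """
    codon = codon.upper()
    alt_base = alt_base.upper()
    if codon not in CODON_TABLE or offset not in (0, 1, 2) or alt_base not in BASES:
        raise ValueError("expected a DNA codon, an offset in 0..2 and a base in TCAG")
    if codon[offset] == alt_base:
        raise ValueError("alternative base equals the reference base")
    mutated = codon[:offset] + alt_base + codon[offset + 1:]
    same_residue = CODON_TABLE[codon] == CODON_TABLE[mutated]
    return "synonymous" if same_residue else "nonsynonymous"


if __name__ == "__main__":
    # Illustrative codons only: GAT (Asp) -> GGT (Gly) changes the residue,
    # while TTT (Phe) -> TTC (Phe) does not.
    print(classify_substitution("GAT", 1, "G"))
    print(classify_substitution("TTT", 2, "C"))
```

The lookup table is built from the compact 64-character encoding of the standard code, so no external library is needed; a real analysis would more likely rely on an established annotation tool.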

PLoS ONE ◽  
2022 ◽  
Vol 17 (1) ◽  
pp. e0261014
Author(s):  
Carlos Arana ◽  
Chaoying Liang ◽  
Matthew Brock ◽  
Bo Zhang ◽  
Jinchun Zhou ◽  
...  

High viral transmission in the COVID-19 pandemic has enabled SARS‐CoV‐2 to acquire new mutations that may impact genome sequencing methods. The ARTIC.v3 primer pool that amplifies short amplicons in a multiplex-PCR reaction is one of the most widely used methods for sequencing the SARS-CoV-2 genome. We observed that some genomic intervals are poorly captured with ARTIC primers. To improve the genomic coverage and variant detection across these intervals, we designed long amplicon primers and evaluated the performance of a short (ARTIC) plus long amplicon (MRL) sequencing approach. Sequencing assays were optimized on VR-1986D-ATCC RNA followed by sequencing of nasopharyngeal swab specimens from fifteen COVID-19 positive patients. ARTIC data covered 94.47% of the virus genome fraction in the positive control and patient samples. Variant analysis in the ARTIC data detected 217 mutations, including 209 single nucleotide variants (SNVs) and eight insertions & deletions. On the other hand, long-amplicon data detected 156 mutations, of which 80% were concordant with ARTIC data. Combined analysis of ARTIC + MRL data improved the genomic coverage to 97.03% and identified 214 high confidence mutations. The combined final set of 214 mutations included 203 SNVs, 8 deletions and 3 insertions. Analysis showed 26 SARS-CoV-2 lineage defining mutations including 4 known variants of concern K417N, E484K, N501Y, P618H in spike gene. Hybrid analysis identified 7 nonsynonymous and 5 synonymous mutations across the genome that were either ambiguous or not called in ARTIC data. For example, G172V mutation in the ORF3a protein and A2A mutation in Membrane protein were missed by the ARTIC assay. Thus, we show that while the short amplicon (ARTIC) assay provides good genomic coverage with high throughput, complementation of poorly captured intervals with long amplicon data can significantly improve SARS-CoV-2 genomic coverage and variant detection.
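The two summary figures reported above, genome fraction covered and concordance between call sets, can be sketched as follows. This is a simplified illustration under stated assumptions: intervals are half-open `(start, end)` pairs, variants are `(position, ref, alt)` tuples, and the helper names `covered_fraction` and `concordance` are hypothetical rather than part of the authors' workflow.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of the genome covered by a list of half-open (start, end) intervals."""
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Disjoint from the running interval: bank it and start a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the running interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


def concordance(calls, reference_calls):
    """Share of variants in `calls` that are also present in `reference_calls`."""
    calls = set(calls)
    if not calls:
        return 0.0
    return len(calls & set(reference_calls)) / len(calls)


if __name__ == "__main__":
    # Toy numbers, not data from the study.
    short_amplicons = [(0, 50), (40, 80)]
    long_amplicons = [(90, 100)]
    print(covered_fraction(short_amplicons, 100))
    print(covered_fraction(short_amplicons + long_amplicons, 100))

    short_calls = {(10, "A", "G")}
    long_calls = {(10, "A", "G"), (95, "C", "T")}
    print(concordance(long_calls, short_calls))
```

Merging intervals before summing avoids double-counting overlapping amplicons, which is why adding a complementary amplicon set can only keep or raise the covered fraction.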


2021 ◽  
Vol 17 (1) ◽  
pp. e1008678
Author(s):  
Carlos Valiente-Mullor ◽  
Beatriz Beamud ◽  
Iván Ansari ◽  
Carlos Francés-Cuesta ◽  
Neris García-González ◽  
...  

Mapping of high-throughput sequencing (HTS) reads to a single arbitrary reference genome is a frequently used approach in microbial genomics. However, the choice of a reference may represent a source of errors that may affect subsequent analyses such as the detection of single nucleotide polymorphisms (SNPs) and phylogenetic inference. In this work, we evaluated the effect of reference choice on short-read sequence data from five clinically and epidemiologically relevant bacteria (Klebsiella pneumoniae, Legionella pneumophila, Neisseria gonorrhoeae, Pseudomonas aeruginosa and Serratia marcescens). Publicly available whole-genome assemblies encompassing the genomic diversity of these species were selected as reference sequences, and read alignment statistics, SNP calling, recombination rates, dN/dS ratios, and phylogenetic trees were evaluated depending on the mapping reference. The choice of different reference genomes proved to have an impact on almost all the parameters considered in the five species. In addition, these biases had potential epidemiological implications such as including/excluding isolates of particular clades and the estimation of genetic distances. These findings suggest that the single reference approach might introduce systematic errors during mapping that affect subsequent analyses, particularly for data sets with isolates from genetically diverse backgrounds. In any case, exploring the effects of different references on the final conclusions is highly recommended.


Author(s):  
Eddie M. Wampande ◽  
Peter Waiswa ◽  
David J. Allen ◽  
Roger Hewson ◽  
Simon D.W. Frost ◽  
...  

Crimean-Congo haemorrhagic fever virus (CCHFV) is the most geographically widespread tick-borne virus. However, African strains are poorly represented in sequence databases. In addition, almost all sequence data have been obtained from cases of human disease, while information regarding circulation of the virus in tick and animal reservoirs is severely lacking. Here, we characterise the complete coding region of a novel CCHFV strain, detected in African blue ticks (Rhipicephalus (Boophilus) decoloratus) feeding on cattle in an abattoir in Kampala, Uganda. These cattle originated from a farm in Mbarara, a major cattle-trading hub for much of Uganda. Phylogenetic analysis indicates that the newly sequenced strain belongs to the African genotype II clade, which predominantly contains the sequences of strains isolated from West Africa in the 1950s and South Africa in the 1980s. Whilst the viral S (nucleoprotein) and L (RNA polymerase) genome segments shared >90% nucleotide similarity with previously reported genotype II strains, the glycoprotein-coding M segment shared only 80% nucleotide similarity with the next most closely related strains, which were from India and China. This segment also displayed a large number of non-synonymous mutations previously unreported in genotype II strains. Characterisation of this novel strain adds to our limited understanding of the natural diversity of CCHFV circulating both in ticks and in Africa. Such data can be used to inform the design of vaccines and diagnostics, as well as studies exploring the epidemiology and evolution of the virus for the establishment of future CCHFV control strategies.


2016 ◽  
Author(s):  
Abayomi S Olabode ◽  
Derek Gatherer ◽  
Xiaowei Jiang ◽  
David Matthews ◽  
Julian A Hiscox ◽  
...  

The phylogenetic relationships of Zaire ebolavirus have been intensively analysed over the course of the 2013-2016 outbreak. However, there has been limited consideration of the functional impact of this variation. Here we describe an analysis of the available sequence data in the context of protein structure and phylogenetic history. Amino acid replacements are rare and predicted to have minor effects on protein stability. Synonymous mutations greatly outnumber nonsynonymous mutations, and most of the latter fall into unstructured intrinsically disordered regions, indicating that purifying selection is the dominant mode of selective pressure. However, one replacement, occurring early in the outbreak in Gueckedou in Guinea on 31st March 2014 (alanine to valine at position 82 in the GP protein), is close to the site where the virus binds to the host receptor NPC1 and is located in the phylogenetic tree at the origin of the major B lineage of the outbreak. The functional and evolutionary evidence indicates this A82V change likely has consequences for EBOV's host specificity and hence adaptation to humans.


Author(s):  
Rami Obeid ◽  
Elias Wehbe ◽  
Mohamad Rima ◽  
Mohammad Kabara ◽  
Romeo Al Bersaoui ◽  
...  

Background: Tobacco mosaic virus (TMV) is the best-known virus in the plant mosaic virus family and is able to infect a wide range of crops, in particular tobacco, causing production losses. Objectives: Herein, and for the first time in Lebanon, we investigated the presence of TMV infection in crops by analyzing 88 samples of tobacco, tomato, cucumber and pepper collected from different regions in North Lebanon. Methods: A double-antibody sandwich enzyme-linked immunosorbent assay (DAS-ELISA) revealed a potential TMV infection in four tobacco samples out of the 88 crop samples collected. However, no tomato, cucumber or pepper samples were infected. The TMV+ tobacco samples were then extensively analyzed by RT-PCR to detect viral RNA using different primers covering the entire viral genome. Results and Discussion: PCR results confirmed those of DAS-ELISA, showing TMV infection in four tobacco samples collected from three crop fields in North Lebanon. In only one of the four TMV+ samples were we able to amplify almost all regions of the viral genome, suggesting possible mutations in the virus genome or an infection with a new, not yet identified, TMV strain. Conclusion: Our study is the first in Lebanon to reveal TMV infection in crop fields, and it highlights a danger that may affect the future of agriculture.


Genetics ◽  
1999 ◽  
Vol 153 (1) ◽  
pp. 497-506 ◽  
Author(s):  
Rasmus Nielsen ◽  
Daniel M Weinreich

Abstract McDonald/Kreitman tests performed on animal mtDNA consistently reveal significant deviations from strict neutrality in the direction of an excess number of polymorphic nonsynonymous sites, which is consistent with purifying selection acting on nonsynonymous sites. We show that under models of recurrent neutral and deleterious mutations, the mean age of segregating neutral mutations is greater than the mean age of segregating selected mutations, even in the absence of recombination. We develop a test of the hypothesis that the mean age of segregating synonymous mutations equals the mean age of segregating nonsynonymous mutations in a sample of DNA sequences. The power of this age-of-mutation test and the power of the McDonald/Kreitman test are explored by computer simulations. We apply the new test to 25 previously published mitochondrial data sets and find weak evidence for selection against nonsynonymous mutations.
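For readers unfamiliar with the McDonald/Kreitman test mentioned above, the sketch below shows its usual 2x2 form: fixed versus polymorphic differences, each split into nonsynonymous and synonymous counts, compared with a two-sided Fisher exact test and summarized by the neutrality index NI = (Pn/Ps)/(Dn/Ds). The counts and function names are hypothetical, and this illustrates only the classical contingency-table test, not the age-of-mutation test developed in the paper.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme (no more probable) than the observed one.
    return sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-9)
    )


def neutrality_index(pn, ps, dn, ds):
    """NI = (Pn/Ps) / (Dn/Ds); returns None when a denominator is zero."""
    if ps == 0 or dn == 0 or ds == 0:
        return None
    return (pn / ps) / (dn / ds)


if __name__ == "__main__":
    # Hypothetical counts: polymorphic (Pn, Ps) and fixed (Dn, Ds) differences.
    pn, ps, dn, ds = 3, 1, 1, 3
    print(neutrality_index(pn, ps, dn, ds))
    print(fisher_exact_two_sided(pn, ps, dn, ds))
```

An NI above 1 corresponds to the excess of polymorphic nonsynonymous sites described in the abstract; with counts as small as these toy values, the exact test has little power.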


eLife ◽  
2014 ◽  
Vol 3 ◽  
Author(s):  
Colin A Russell ◽  
Peter M Kasson ◽  
Ruben O Donis ◽  
Steven Riley ◽  
John Dunbar ◽  
...  

Assessing the pandemic risk posed by specific non-human influenza A viruses is an important goal in public health research. As influenza virus genome sequencing becomes cheaper, faster, and more readily available, the ability to predict pandemic potential from sequence data could transform pandemic influenza risk assessment capabilities. However, the complexities of the relationships between virus genotype and phenotype make such predictions extremely difficult. The integration of experimental work, computational tool development, and analysis of evolutionary pathways, together with refinements to influenza surveillance, has the potential to transform our ability to assess the risks posed to humans by non-human influenza viruses and lead to improved pandemic preparedness and response.


2020 ◽  
Author(s):  
Brendan N. Reid ◽  
Rachel L. Moran ◽  
Christopher J. Kopack ◽  
Sarah W. Fitzpatrick

Abstract Researchers studying non-model organisms have an increasing number of methods available for generating genomic data. However, the applicability of different methods across species, as well as the effect of reference genome choice on population genomic inference, are still difficult to predict in many cases. We evaluated the impact of data type (whole-genome vs. reduced representation) and reference genome choice on data quality and on population genomic and phylogenomic inference across several species of darters (subfamily Etheostomatinae), a highly diverse radiation of freshwater fish. We generated a high-quality reference genome and developed a hybrid RADseq/sequence capture (Rapture) protocol for the Arkansas darter (Etheostoma cragini). Rapture data from 1900 individuals spanning four darter species showed recovery of most loci across darter species at high depth and consistent estimates of heterozygosity regardless of reference genome choice. Loci with baits spanning both sides of the restriction enzyme cut site performed especially well across species. For low-coverage whole-genome data, choice of reference genome affected read depth and inferred heterozygosity. For similar amounts of sequence data, Rapture performed better at identifying fine-scale genetic structure compared to whole-genome sequencing. Rapture loci also recovered an accurate phylogeny for the study species and demonstrated high phylogenetic informativeness across the evolutionary history of the genus Etheostoma. Low cost and high cross-species effectiveness regardless of reference genome suggest that Rapture and similar sequence capture methods may be worthwhile choices for studies of diverse species radiations.


2019 ◽  
Vol 20 (21) ◽  
pp. 5311 ◽  
Author(s):  
Muhammad Imran ◽  
Sarfraz Shafiq ◽  
Muhammad Ansar Farooq ◽  
Muhammad Kashif Naeem ◽  
Emilie Widemann ◽  
...  

Post-translational modifications are involved in regulating diverse developmental processes. Histone acetyltransferases (HATs) play vital roles in the regulation of chromatin structure and activate the gene transcription implicated in various cellular processes. However, HATs in cotton, as well as their regulation in response to developmental and environmental cues, remain unidentified. In this study, 9 HATs were identified from Gossypium raimondii and Gossypium arboreum, while 18 HATs were identified from Gossypium hirsutum. Based on their amino acid sequences, Gossypium HATs were divided into three groups: CBP, GNAT, and TAFII250. Almost all the HATs within each subgroup share similar gene structure and conserved motifs. Gossypium HATs are unevenly distributed on the chromosomes, and duplication analysis suggests that Gossypium HATs are under strong purifying selection. Gene expression analysis showed that Gossypium HATs were differentially expressed in various vegetative tissues and at different stages of fiber development. Furthermore, all the HATs were differentially regulated in response to various stresses (salt, drought, cold, heavy metal and DNA damage) and hormones (abscisic acid (ABA) and auxin (NAA)). Finally, co-localization of HAT genes with reported quantitative trait loci (QTL) of fiber development was reported. Altogether, these results highlight the functional diversification of HATs in cotton growth and fiber development, as well as in response to different environmental cues. This study enhances our understanding of the function of histone acetylation in cotton growth, fiber development, and stress adaptation, which will eventually lead to the long-term improvement of stress tolerance and fiber quality in cotton.

