Genomic Variation and Diversification in Begomovirus Genome in Implication to Host and Vector Adaptation

Plants ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 1706 ◽  
Author(s):  
Deepti Nigam

Begomoviruses (family Geminiviridae, genus Begomovirus) are DNA viruses transmitted in a circulative, persistent manner by the whitefly Bemisia tabaci (Gennadius). As revealed by their wide host range (more than 420 plant species), worldwide distribution, and effective vector transmission, begomoviruses are highly adaptive. Still, the genetic factors that facilitate their adaptation to a diverse array of hosts and vectors remain poorly understood. Mutations in the virus genome may confer a selective advantage for essential functions, such as transmission, replication, evasion of host responses, and movement within the host. Genetic variation is therefore vital to virus evolution and, under selection pressure, manifests as the emergence of new strains and species adapted to diverse hosts or with unique pathogenicity. The combination of variation and selection leaves a genetic imprint on the genome. This review focuses on factors that contribute to the evolution of begomoviruses and their global spread, for which an unforeseen diversity and dispersal has been recognized and continues to expand.

2021 ◽  
Author(s):  
Rajan Saha Raju ◽  
Abdullah Al Nahid ◽  
Preonath Shuvo ◽  
Rashedul Islam

Abstract Taxonomic classification of viruses is a multi-class hierarchical classification problem, as the taxonomic ranks of viruses (e.g., order, family and genus) are hierarchically structured and each rank contains multiple classes. Classifying biological sequences that are hierarchically structured with multiple classes is challenging. Here we developed a machine learning architecture, VirusTaxo, that performs multi-class hierarchical classification by k-mer enrichment. VirusTaxo classifies DNA and RNA viruses to their taxonomic ranks using the genome sequence. To assign taxonomic ranks, VirusTaxo extracts k-mers from the genome sequence and creates a bag of k-mers for each class in a rank. VirusTaxo uses a top-down hierarchical classification approach and accurately assigns the order, family and genus of a virus from the genome sequence. The average accuracies of VirusTaxo for DNA viruses are 99% (order), 98% (family) and 95% (genus), and for RNA viruses 97% (order), 96% (family) and 82% (genus). VirusTaxo can be used to detect the taxonomy of novel viruses using full-length genome or contig sequences.

Availability: An online version of VirusTaxo is available at https://omics-lab.com/virustaxo/.
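The bag-of-k-mers and top-down ideas described in this abstract can be illustrated with a small sketch. This is not VirusTaxo's implementation: the scoring rule (count of shared k-mers), the nested `toy_tree` layout, the class names and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_bags(labelled_seqs, k):
    """Map each class label to the union of k-mers from its training sequences."""
    bags = defaultdict(set)
    for label, seq in labelled_seqs:
        bags[label] |= kmers(seq, k)
    return dict(bags)


def classify(seq, bags, k):
    """Pick the class whose bag shares the most k-mers with the query.

    Assumed scoring rule; ties go to the alphabetically first label.
    """
    query = kmers(seq, k)
    return max(sorted(bags), key=lambda label: len(query & bags[label]))


def classify_top_down(seq, tree, k):
    """Classify rank by rank, descending only into the chosen class.

    tree is a hypothetical {label: (training_seqs, children)} mapping.
    """
    path = []
    level = tree
    while level:
        pairs = [(lab, s) for lab, (seqs, _) in level.items() for s in seqs]
        best = classify(seq, build_bags(pairs, k), k)
        path.append(best)
        level = level[best][1]
    return path


# Toy two-rank hierarchy (family, then genus) with made-up sequences.
toy_tree = {
    "FamilyA": (["ACGTACGTAC", "ACGTTCGTAC"], {
        "GenusA1": (["ACGTACGTAC"], {}),
        "GenusA2": (["ACGTTCGTAC"], {}),
    }),
    "FamilyB": (["GGGCCCGGGC"], {
        "GenusB1": (["GGGCCCGGGC"], {}),
    }),
}

print(classify_top_down("CGTTCGTA", toy_tree, k=4))
```

Restricting each step to the children of the previously chosen class is what makes the approach top-down: a wrong call at a higher rank cannot be corrected lower down, which may partly explain why accuracy tends to fall from order to genus.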


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sayaka Nagamoto ◽  
Miyuki Agawa ◽  
Emi Tsuchitani ◽  
Kazunori Akimoto ◽  
Saki Kondo Matsushima ◽  
...  

Abstract Genome editing techniques such as CRISPR/Cas9 have become common gene engineering technologies and have been applied to gene therapy. However, increasing the efficiency of genome editing and reducing off-target effects, which induce double-stranded breaks at unexpected sites in the genome, remain open problems. In this study, we developed a novel Cas9 transduction system, Exci-Cas9, using an adenovirus vector (AdV). Cas9 was expressed on a circular molecule excised by the site-specific recombinase Cre, which shortened the expression period compared with AdV, which expresses the gene of interest for at least 6 months. As an example, we chose hepatitis B, which currently has more than 200 million carriers in the world and frequently progresses to liver cirrhosis or hepatocellular carcinoma. The efficiencies of hepatitis B virus genome disruption by Exci-Cas9 and by Cas9 expressed directly from AdV (Avec) were the same, about 80–90%. Furthermore, Exci-Cas9 enabled cell- or tissue-specific genome editing by expressing Cre from a cell- or tissue-specific promoter. We believe that Exci-Cas9 developed in this study is useful not only for resolving the persistent expression of Cas9, which has been a problem in genome editing, but also for eliminating long-term DNA viruses such as human papilloma virus.


2019 ◽  
Vol 5 (Supplement_1) ◽  
Author(s):  
F Ferron ◽  
B Canard

Abstract Large-genome Nidoviruses and Nidovirus-like viruses reside at the current boundary of largest RNA genome sizes. They code for an unusually large number of gene products matching that of small DNA viruses (e.g. DNA bacteriophages). The order of appearance and distribution of enzyme genes along various virus families (e.g. helicase and ExoN) may be seen as an evolutionary marker in these large RNA genomes lying at the genome size boundary. A positive correlation exists between (+)RNA virus genome sizes and the presence of the RNA helicase and the ExoN domains. Although the mechanistic basis of the presence of the helicase is still unclear, the role of the ExoN activity has been linked to the existence of an RNA synthesis proofreading system. In large Nidovirales, ExoN is bound to a processive replicative RNA-dependent RNA polymerase (RdRp) and corrects mismatched bases during viral RNA synthesis. Over the last decade, a view of the overall process has been refined in Coronaviruses, and in particular in our lab (Ferron et al., PNAS, 2018). We have identified genetic markers of large RNA genomes that we wish to use to data-mine currently existing metagenomic datasets. We have also initiated a collaboration to sequence and explore new viromes that will be searched according to these criteria. Likewise, we have a collection of purified viral RdRps that are currently being used to generate RNA synthesis products that will be compared to existing NGS datasets of cognate viruses. We will be able to have an idea about how much genetic diversity is possibly achievable by viral RdRp (‘tunable fidelity’) versus the detectable diversity (i.e. after selection in the infected cell) that is actually produced.


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Timothy E Schlub ◽  
Edward C Holmes

Abstract Overlapping genes are commonplace in viruses and play an important role in their function and evolution. However, aside from studies on specific groups of viruses, relatively little is known about the extent and nature of gene overlap and its determinants in viruses as a whole. Here, we present an extensive characterisation of gene overlap in viruses through an analysis of reference genomes present in the NCBI virus genome database. We find that over half the instances of gene overlap are very small, covering <10 nt, and 84 per cent are <50 nt in length. Despite this, 53 per cent of all viruses still contained a gene overlap of 50 nt or larger. We also investigate several predictors of gene overlap such as genome structure (single- and double-stranded RNA and DNA), virus family, genome length, and genome segmentation. This revealed that gene overlap occurs more frequently in DNA viruses than in RNA viruses, and more frequently in single-stranded viruses than in double-stranded viruses. Genome segmentation is also associated with gene overlap, particularly in single-stranded DNA viruses. Notably, we observed a large range of overlap frequencies across families of all genome types, suggesting that it is a common evolutionary trait that provides flexible genome structures in all virus families.
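As a rough illustration of how overlap lengths like those reported in this abstract can be measured, the sketch below computes pairwise overlaps from gene coordinates. It is not the authors' pipeline: the 1-based inclusive coordinates, the gene names and the `toy_genome` values are hypothetical, and strand and circular genomes are ignored.

```python
from itertools import combinations


def overlap_length(a, b):
    """Nucleotides shared by two genes given as 1-based inclusive (start, end)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def gene_overlaps(genes):
    """Return {(name1, name2): overlap_nt} for every overlapping pair of genes."""
    result = {}
    for (n1, g1), (n2, g2) in combinations(sorted(genes.items()), 2):
        nt = overlap_length(g1, g2)
        if nt > 0:
            result[(n1, n2)] = nt
    return result


# Hypothetical annotation: one very small overlap and one larger overlap.
toy_genome = {
    "geneA": (1, 300),
    "geneB": (297, 800),
    "geneC": (750, 1200),
    "geneD": (1300, 1500),
}

overlaps = gene_overlaps(toy_genome)
small = [pair for pair, nt in overlaps.items() if nt < 10]
has_large = any(nt >= 50 for nt in overlaps.values())
print(overlaps, small, has_large)
```

The two summaries at the end mirror the abstract's framing: the share of overlaps under a size threshold is counted per overlap, whereas "contains an overlap of 50 nt or larger" is counted per genome.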


2017 ◽  
Author(s):  
Devang Mehta ◽  
Matthias Hirsch-Hoffmann ◽  
Mariam Were ◽  
Andrea Patrignani ◽  
Hassan Were ◽  
...  

Abstract Deep-sequencing of virus isolates using short-read sequencing technologies is problematic since viruses are often present in complexes sharing a high degree of sequence identity. The full-length genomes of such highly similar viruses cannot be assembled accurately from short sequencing reads. We present a new method, CIDER-Seq (Circular DNA Enrichment Sequencing), which successfully generates accurate full-length virus genomes from individual sequencing reads with no sequence assembly required. CIDER-Seq operates by combining a PCR-free, circular DNA enrichment protocol with Single Molecule Real Time sequencing and a new sequence deconcatenation algorithm. We apply our technique to produce more than 1,200 full-length, highly accurate geminivirus genomes from RNAi-transgenic and control plants in a field trial in Kenya. Using CIDER-Seq, we demonstrate for the first time that the expression of antiviral double-stranded RNA (dsRNA) in transgenic plants causes a consistent shift in virus populations towards species sharing low homology to the transgene-derived dsRNA. Our results show that CIDER-Seq is a powerful, cost-effective tool for accurately sequencing circular DNA viruses, with future applications in deep-sequencing other forms of circular DNA such as transposons and plasmids.
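The abstract does not describe how its deconcatenation algorithm works, so the following is only a toy illustration of the general problem: recovering a single repeat unit from a read that contains several tandem copies of a circular genome. It assumes an error-free read and exact matching, which real long reads would not satisfy; the function names and the example read are hypothetical.

```python
def smallest_period(read):
    """Return the smallest p such that read[i] == read[i - p] for all i >= p."""
    n = len(read)
    for p in range(1, n + 1):
        if all(read[i] == read[i - p] for i in range(p, n)):
            return p
    return n  # only reached for an empty read, where n == 0


def deconcatenate(read):
    """Return one copy of the repeat unit from an exactly periodic read."""
    return read[:smallest_period(read)]


# Toy read: a 5-nt "genome" copied 2.6 times.
print(deconcatenate("ACGTTACGTTACG"))
```

A read with no internal repeat is returned unchanged, because its smallest period equals its own length.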


2017 ◽  
Author(s):  
Brook G. Milligan

Landscape genetics combines population genetics, landscape ecology, and spatial analysis to identify landscape and genetic factors that influence genetic and genomic variation. Progress in the field depends on a strong conceptual foundation and the means of identifying mechanistic connections between environmental factors, landscape features, and genetic or genomic variation. Many existing approaches and much of the software commonly in use were developed for population genetics or statistics and are not entirely appropriate for landscape genetics. Probabilistic graph models provide a statistically rigorous and flexible means of constructing models directly applicable to landscape genetics. Probabilistic graph models also allow construction of mechanistic models, which are crucial elements in testing hypotheses. Sophisticated software exists for the analysis of graph models; however, much of it does not handle the types of data used for landscape genetics, model structures involving autoregressive spatial interaction between variables, or the scale of landscape genetics problems. Thus, an important priority for the field is to develop suitably flexible software tools for graph models that overcome these problems and allow landscape geneticists to explore meaningfully mechanistic and flexible models. We are developing such a library and applying it to examples in landscape genetics.


2021 ◽  
Author(s):  
Ali Rahnavard ◽  
Rebecca Clement ◽  
Nathaniel Stearrett ◽  
Marcos Pérez-Losada ◽  
Keith A. Crandall ◽  
...  

Abstract The 2019 novel coronavirus (SARS-CoV-2) is the etiological agent of the COVID-19 pandemic and evolves to evade both host immune systems and intervention strategies. To diminish the short-term and long-term impacts of coronavirus (CoV), we investigated CoV differences at the nucleotide and protein level and CoV genomic variation associated with epidemiological variation and geography. We divided the CoV genome into 29 constituent regions for this analysis. Our results highlight the variation among CoV lineages and show that nonstructural protein 3 (nsp3) and Spike protein (S) have the highest variation and the greatest correlation with viral whole-genome variation, which makes these two proteins potential targets for treatments. S protein variation is highly correlated with nsp3, nsp6, and 3'-to-5' exonuclease. Country of origin and time since the start of the pandemic were the most influential metadata in these differences. Host sex and age explain the least of the virus genome variation. We quantified variation explained by regions of the CoV genome across different CoV viruses, including SARS-CoV-2, Middle East respiratory syndrome coronavirus (MERS-CoV), other severe acute respiratory syndrome coronavirus SARS-CoV (SARS-related), and bat-derived severe acute respiratory syndrome (SARS)-like coronaviruses (Bat-SL-CoV). We found that Spike protein and nsp3 explain most of the variation among these viruses; they are also among the genomic regions with the highest number of sites under natural selection. Our results provide a direction to prioritize genes associated with outcome predictors, including health, therapeutic, and vaccine outcomes, and to inform improved DNA tests for predicting disease status.


2021 ◽  
Author(s):  
Dieke Boezen ◽  
Ghulam Ali ◽  
Manli Wang ◽  
Xi Wang ◽  
Wopke van der Werf ◽  
...  

Abstract Mutation rates are of key importance for understanding evolutionary processes and predicting their outcomes. Empirical estimates of mutation rate are available for a number of RNA viruses, but few are available for DNA viruses, which tend to have larger genomes. Whilst some viruses have very high mutation rates, lower mutation rates are expected for viruses with large genomes to ensure genome integrity. Alphabaculoviruses are insect viruses with large genomes and often have high levels of polymorphism, suggesting high mutation rates despite evidence of proofreading activity by the replication machinery. Here, we report an empirical estimate of the mutation rate per base per strand copying (s/n/r) of Autographa californica multiple nucleopolyhedrovirus (AcMNPV). To avoid biases due to selection, we analyzed mutations that occurred in a stable, non-functional genomic insert after five serial passages in Spodoptera exigua larvae. Population bottlenecks, viral mode of replication and thresholds for mutation detection likely affect mutation rate estimates, and we therefore used population genetic models that account for these processes to infer the mutation rate. We estimated a mutation rate of 1×10⁻⁷ s/n/r. This estimate was not sensitive to different model assumptions or to including whole genome data. The rates at which different classes of mutations accumulate provide good evidence for neutrality of mutations occurring within the inserted region. We therefore present a robust approach for mutation rate estimation for viruses with stable genomes, and strong evidence of a much lower alphabaculovirus mutation rate than supposed based on the high levels of polymorphism observed.

Author Summary
Virus populations can evolve rapidly, driven by the large number of mutations that occur during virus replication. It is challenging to measure mutation rates because selection will affect which mutations are observed: beneficial mutations are overrepresented in virus populations, while deleterious mutations are selected against and therefore underrepresented. Few mutation rates have been estimated for viruses with large DNA genomes, and there are no estimates for any insect virus. Here, we estimate the mutation rate for an alphabaculovirus, a virus that infects caterpillars and has a large, 134 kilobase pair DNA genome. To ensure that selection did not bias our estimate of mutation rate, we studied which mutations occurred in a large artificial region inserted into the virus genome, where mutations did not affect viral fitness. We deep sequenced evolved virus populations, and compared the distribution of observed mutants to predictions from a simulation model to estimate mutation rate. We found evidence for a relatively low mutation rate, of one mutation in every 10 million bases replicated. This estimate is in line with expectations for a virus with self-correcting replication machinery and a large genome.
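To put the reported figures in perspective, the per-base rate can be scaled to the whole genome. The short calculation below uses only the two numbers given in the text (a rate of 1×10⁻⁷ per base per strand copying and a 134 kilobase pair genome). Applying the insert-derived rate uniformly across the genome is a simplifying assumption, and the variable names are illustrative.

```python
# Figures taken from the abstract and author summary.
rate_per_base = 1e-7   # mutations per base per strand copying
genome_bp = 134_000    # genome length in base pairs

# Assumption: the rate measured in the insert applies uniformly genome-wide.
per_genome = rate_per_base * genome_bp   # expected mutations per genome copy
bases_per_mutation = 1 / rate_per_base   # "one mutation in every 10 million bases"
copies_per_mutation = 1 / per_genome     # genome copies per expected mutation

print(f"{per_genome:.4f} mutations per genome per strand copying")
print(f"about 1 mutation per {bases_per_mutation:,.0f} bases replicated")
print(f"about 1 mutation per {copies_per_mutation:.0f} genome copies")
```

Under that assumption the expected number of new mutations per genome copy is about 0.013, or roughly one mutation for every 75 genomes copied.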



