The ability of single genes vs full genomes to resolve time and space in outbreak analysis

2019 ◽  
Vol 19 (1) ◽  
Author(s):  
Gytis Dudas ◽  
Trevor Bedford

Abstract Background: Inexpensive pathogen genome sequencing has had a transformative effect on the field of phylodynamics, where ever-increasing volumes of data have promised real-time insight into outbreaks of infectious disease. As well as the sheer volume of pathogen isolates being sequenced, the sequencing of whole pathogen genomes, rather than select loci, has allowed phylogenetic analyses to be carried out at finer time scales, often approaching serial intervals for infections caused by rapidly evolving RNA viruses. Despite its utility, whole genome sequencing of pathogens has not been adopted universally and targeted sequencing of loci is common in some pathogen-specific fields. Results: In this study we highlighted the utility of sequencing whole genomes of pathogens by re-analysing a well-characterised collection of Ebola virus sequences in the form of complete viral genomes (≈19 kb long) or the rapidly evolving glycoprotein (GP, ≈2 kb long) gene. We have quantified changes in phylogenetic, temporal, and spatial inference resolution as a result of this reduction in data and compared these to theoretical expectations. Conclusions: We propose a simple, intuitive metric for quantifying temporal resolution, i.e. the time scale over which sequence data might be informative of various processes, as a quick back-of-the-envelope calculation of the statistical power available to molecular clock analyses.
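As a rough illustration of the kind of back-of-the-envelope calculation the abstract describes, the sketch below estimates the mean waiting time between substitutions along a single lineage as 1 / (rate × alignment length). This formula and the rate of 1.2e-3 substitutions per site per year are assumptions made for illustration and are not stated in the abstract; only the approximate lengths (19 kb genome, 2 kb GP) come from the text. Applying the same rate to both alignments is a simplification, since the abstract describes GP as rapidly evolving.

```python
def mean_waiting_time_days(rate_per_site_per_year: float, length_sites: int) -> float:
    """Expected time in days between successive substitutions along one lineage.

    Assumes substitutions arrive at a constant rate of
    rate_per_site_per_year * length_sites per year.
    """
    substitutions_per_year = rate_per_site_per_year * length_sites
    return 365.25 / substitutions_per_year


# Hypothetical illustrative rate; NOT taken from the abstract.
ASSUMED_RATE = 1.2e-3  # substitutions per site per year

# Approximate alignment lengths quoted in the abstract.
GENOME_LENGTH = 19_000  # complete viral genome, about 19 kb
GP_LENGTH = 2_000       # glycoprotein gene, about 2 kb

genome_days = mean_waiting_time_days(ASSUMED_RATE, GENOME_LENGTH)
gp_days = mean_waiting_time_days(ASSUMED_RATE, GP_LENGTH)

print(f"Complete genome: one substitution every {genome_days:.1f} days")
print(f"GP gene only:    one substitution every {gp_days:.1f} days")
print(f"Loss of temporal resolution: {gp_days / genome_days:.1f}-fold")
```

Under these assumptions the complete genome accumulates a substitution roughly every 16 days, against roughly every 152 days for GP alone, a 9.5-fold difference that follows directly from the ratio of alignment lengths.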

2019 ◽  
Author(s):  
Gytis Dudas ◽  
Trevor Bedford

Abstract Inexpensive pathogen genome sequencing has had a transformative effect on the field of phylodynamics, where ever-increasing volumes of data have promised real-time insight into outbreaks of infectious disease. As well as the sheer volume of pathogen isolates being sequenced, the sequencing of whole pathogen genomes, rather than select loci, has allowed phylogenetic analyses to be carried out at finer time scales, often approaching serial intervals for infections caused by rapidly evolving RNA viruses. Despite its utility, whole genome sequencing of pathogens has not been adopted universally and targeted sequencing of loci is common in some pathogen-specific fields. In this study we aim to highlight the utility of sequencing whole genomes of pathogens by re-analysing a well-characterised collection of Ebola virus sequences in the form of complete viral genomes (~19 kb long) or the rapidly evolving glycoprotein (GP, ~2 kb long) gene. We quantify changes in phylogenetic, temporal, and spatial inference resolution as a result of this reduction in data and compare these to theoretical expectations. We propose a simple intuitive metric for quantifying temporal resolution, i.e. the time scale over which sequence data might be informative of various processes as a quick back-of-the-envelope calculation of statistical power available to molecular clock analyses.


2021 ◽  
Vol 26 (43) ◽  
Author(s):  
Maximilian Muenchhoff ◽  
Alexander Graf ◽  
Stefan Krebs ◽  
Caroline Quartucci ◽  
Sandra Hasmann ◽  
...  

Background: In the SARS-CoV-2 pandemic, viral genomes are available at unprecedented speed, but spatio-temporal bias in genome sequence sampling precludes phylogeographical inference without additional contextual data. Aim: We applied genomic epidemiology to trace SARS-CoV-2 spread on an international, national and local level, to illustrate how transmission chains can be resolved to the level of a single event and single person using integrated sequence data and spatio-temporal metadata. Methods: We investigated 289 COVID-19 cases at a university hospital in Munich, Germany, between 29 February and 27 May 2020. Using the ARTIC protocol, we obtained near full-length viral genomes from 174 SARS-CoV-2-positive respiratory samples. Phylogenetic analyses using the Auspice software were employed in combination with anamnestic reporting of travel history, interpersonal interactions and perceived high-risk exposures among patients and healthcare workers to characterise cluster outbreaks and establish likely scenarios and timelines of transmission. Results: We identified multiple independent introductions in the Munich Metropolitan Region during the first weeks of the first pandemic wave, mainly by travellers returning from popular skiing areas in the Alps. In these early weeks, the rate of presumed hospital-acquired infections among patients and in particular healthcare workers was high (9.6% and 54%, respectively) and we illustrated how transmission chains can be dissected at high resolution by combining virus sequences and spatio-temporal networks of human interactions. Conclusions: Early spread of SARS-CoV-2 in Europe was catalysed by superspreading events and regional hotspots during the winter holiday season. Genomic epidemiology can be employed to trace viral spread and inform effective containment strategies.


2021 ◽  
Author(s):  
Rebecca I. Johnson ◽  
Beata Boczkowska ◽  
Kendra Alfson ◽  
Taylor Weary ◽  
Heather Menzie ◽  
...  

Ebola virus (EBOV), of the family Filoviridae, is an RNA virus that can cause hemorrhagic fever with a high mortality rate. Defective viral genomes (DVGs) are truncated genomes that have been observed during multiple RNA virus infections, including in vitro EBOV infection, and have previously been associated with viral persistence and immunostimulatory activity. As DVGs have been detected in cells persistently infected with EBOV, we hypothesized that DVGs may also accumulate during viral replication in filovirus-infected hosts. Therefore, we interrogated sequence data from serum and tissues using a bioinformatics tool in order to identify the presence of DVGs in nonhuman primates (NHPs) infected with EBOV, Sudan virus (SUDV) or Marburg virus (MARV). Multiple 5’ copy-back DVGs (cbDVGs) were detected in NHP serum during the acute phase of filovirus infection. While the relative abundance of total DVGs in most animals was low, serum collected during acute EBOV and SUDV infections, but not MARV infection, contained a higher proportion of short trailer sequence cbDVGs than the challenge stock. This indicated an accumulation of these DVGs throughout infection, potentially due to the preferential replication of short DVGs over the longer viral genome. Using RT-PCR and deep sequencing, we also confirmed the presence of 5’ cbDVGs in EBOV-infected NHP testes, which is of interest due to EBOV persistence in semen of male survivors of infection. This work suggests that DVGs play a role in EBOV infection in vivo and further study will lead to a better understanding of EBOV pathogenesis. Importance: The study of filovirus pathogenesis is critical for understanding the consequences of infection and the development of strategies to ameliorate future outbreaks. Defective viral genomes (DVGs) have been detected during EBOV infections in vitro; however, their presence in in vivo infections remains unknown. In this study, DVGs were detected in samples collected from EBOV- and SUDV-infected nonhuman primates (NHPs). The accumulation of these DVGs in the trailer region of the genome during infection indicates a potential role in EBOV and SUDV pathogenesis. In particular, the presence of DVGs in the testes of infected NHPs requires further investigation as it may be linked to the establishment of persistence.


Author(s):  
Tobias Andermann ◽  
Maria Fernanda Torres Jimenez ◽  
Pável Matos-Maraví ◽  
Romina Batista ◽  
José L Blanco-Pastor ◽  
...  

High-throughput DNA sequencing techniques enable time- and cost-effective sequencing of large portions of the genome. Instead of sequencing and annotating whole genomes, many phylogenetic studies focus sequencing efforts on large sets of pre-selected loci, which further reduces costs and bioinformatic challenges while increasing sequencing depth. One common approach that enriches loci before sequencing is often referred to as target sequence capture. This technique has been shown to be applicable to phylogenetic studies of greatly varying evolutionary depth and has proven to produce powerful, large multi-locus DNA sequence datasets of selected loci, suitable for phylogenetic analyses. However, target capture requires careful theoretical and practical considerations, which will greatly affect the success of the experiment. Here we provide an easy-to-follow flowchart for adequately designing phylogenomic target capture experiments, and we discuss necessary considerations and decisions from the first steps in the lab to the final bioinformatic processing of the sequence data. We particularly discuss issues and challenges related to the taxonomic scope, sample quality, and available genomic resources of target capture projects and how these issues affect all steps from bait design to the bioinformatic processing of the data. Altogether, this review outlines a roadmap for future target capture experiments and is intended to assist researchers with making informed decisions for designing and carrying out successful phylogenetic target capture studies.


Author(s):  
C. Lam ◽  
K. Gray ◽  
M. Gall ◽  
R. Sadsad ◽  
A. Arnott ◽  
...  

SARS-CoV-2 genomic surveillance has been vital in understanding the spread of COVID-19, the emergence of viral escape mutants and variants of concern. However, low viral loads in clinical specimens affect variant calling for phylogenetic analyses and detection of low frequency variants, important in uncovering infection transmission chains. We systematically evaluated three widely adopted SARS-CoV-2 whole genome sequencing methods for their sensitivity, specificity, and ability to reliably detect low frequency variants. Our analyses highlight that the ARTIC v3 protocol consistently displays high sensitivity for generating complete genomes at low viral loads compared with the probe-based Illumina respiratory viral oligo panel, and a pooled long-amplicon method. We show substantial variability in the number and location of low-frequency variants detected using the three methods, highlighting the importance of selecting appropriate methods to obtain high quality sequence data from low viral load samples for public health and genomic surveillance purposes.


2020 ◽  
Vol 2 (4) ◽  
Author(s):  
Gargi Dayama ◽  
Weichen Zhou ◽  
Javier Prado-Martinez ◽  
Tomas Marques-Bonet ◽  
Ryan E Mills

Abstract The transfer and integration of whole and partial mitochondrial genomes into the nuclear genomes of eukaryotes is an ongoing process that has facilitated the transfer of genes and contributed to the evolution of various cellular pathways. Many previous studies have explored the impact of these insertions, referred to as NumtS, but have focused primarily on older events that have become fixed and are therefore present in all individual genomes for a given species. We previously developed an approach to identify novel Numt polymorphisms from next-generation sequence data and applied it to thousands of human genomes. Here, we extend this analysis to 79 individuals of other great ape species including chimpanzee, bonobo, gorilla, orang-utan and also an old world monkey, macaque. We show that recent Numt insertions are prevalent in each species though at different apparent rates, with chimpanzees exhibiting a significant increase in both polymorphic and fixed Numt sequences as compared to other great apes. We further assessed positional effects in each species in terms of evolutionary time and rate of insertion and identified putative hotspots on chromosome 5 for Numt integration, providing insight into both recent polymorphic and older fixed reference NumtS in great apes in comparison to human events.


2020 ◽  
Author(s):  
Gargi Dayama ◽  
Weichen Zhou ◽  
Javier Prado-Martinez ◽  
Tomas Marques-Bonet ◽  
Ryan E. Mills

Abstract The transfer and integration of whole and partial mitochondrial genomes into the nuclear genomes of eukaryotes is an ongoing process that has facilitated the transfer of genes and contributed to the evolution of various cellular pathways. Many previous studies have explored the impact of these insertions, referred to as NumtS, but have focused primarily on older events that have become fixed and are therefore present in all individual genomes for a given species. We previously developed an approach to identify novel Numt polymorphisms from next-generation sequence data and applied it to thousands of human genomes. Here, we extend this analysis to 79 individuals of other great ape species including chimpanzee, bonobo, gorilla, orang-utan and also an old world monkey, macaque. We show that recent Numt insertions are prevalent in each species though at different apparent rates, with chimpanzees exhibiting a significant increase in both polymorphic and fixed Numt sequences as compared to other great apes. We further assessed positional effects in each species in terms of evolutionary time and rate of insertion and identified putative hotspots on chromosome 5 for Numt integration, providing insight into both recent polymorphic and older fixed reference NumtS in great apes in comparison to human events.


2019 ◽  
Author(s):  
Tobias Andermann ◽  
Maria Fernanda Torres Jimenez ◽  
Pável Matos-Maraví ◽  
Romina Batista ◽  
José L Blanco-Pastor ◽  
...  

High-throughput DNA sequencing techniques enable time- and cost-effective sequencing of large portions of the genome. Instead of sequencing and annotating whole genomes, many phylogenetic studies focus sequencing efforts on large sets of pre-selected loci, which further reduces costs and bioinformatic challenges while increasing sequencing depth. One common approach that enriches loci before sequencing is often referred to as target sequence capture. This technique has been shown to be applicable to phylogenetic studies of greatly varying evolutionary depth and has proven to produce powerful, large multi-locus DNA sequence datasets of selected loci, suitable for phylogenetic analyses. However, target capture requires careful theoretical and practical considerations, which will greatly affect the success of the experiment. Here we provide an easy-to-follow flowchart for adequately designing phylogenomic target capture experiments, and we discuss necessary considerations and decisions from the first steps in the lab to the final bioinformatic processing of the sequence data. We particularly discuss issues and challenges related to the taxonomic scope, sample quality, and available genomic resources of target capture projects and how these issues affect all steps from bait design to the bioinformatic processing of the data. Altogether, this review outlines a roadmap for future target capture experiments and is intended to assist researchers with making informed decisions for designing and carrying out successful phylogenetic target capture studies.


2021 ◽  
Author(s):  
Connie Lam ◽  
Karen-Ann Gray ◽  
Mailie Gall ◽  
Rosemarie Sadsad ◽  
Alicia Arnott ◽  
...  

SARS-CoV-2 genomic surveillance has been vital in understanding the spread of COVID-19, the emergence of viral escape mutants and variants of concern. However, low viral loads in clinical specimens affect variant calling for phylogenetic analyses and detection of low frequency variants, important in uncovering infection transmission chains. We systematically evaluated three widely adopted SARS-CoV-2 whole genome sequencing methods for their sensitivity, specificity, and ability to reliably detect low frequency variants. Our analyses highlight that the ARTIC v3 protocol consistently displays high sensitivity for generating complete genomes at low viral loads compared with the probe-based Illumina respiratory viral oligo panel, and a pooled long-amplicon method. We show substantial variability in the number and location of low-frequency variants detected using the three methods, highlighting the importance of selecting appropriate methods to obtain high quality sequence data from low viral load samples for public health and genomic surveillance purposes.


2019 ◽  
Vol 44 (4) ◽  
pp. 930-942
Author(s):  
Geraldine A. Allen ◽  
Luc Brouillet ◽  
John C. Semple ◽  
Heidi J. Guest ◽  
Robert Underhill

Abstract—Doellingeria and Eucephalus form the earliest-diverging clade of the North American Astereae lineage. Phylogenetic analyses of both nuclear and plastid sequence data show that the Doellingeria-Eucephalus clade consists of two main subclades that differ from current circumscriptions of the two genera. Doellingeria is the sister group to E. elegans, and the Doellingeria + E. elegans subclade in turn is sister to the subclade containing all remaining species of Eucephalus. In the plastid phylogeny, the two subclades are deeply divergent, a pattern that is consistent with an ancient hybridization event involving ancestral species of the Doellingeria-Eucephalus clade and an ancestral taxon of a related North American or South American group. Divergence of the two Doellingeria-Eucephalus subclades may have occurred in association with northward migration from South American ancestors. We combine these two genera under the older of the two names, Doellingeria, and propose 12 new combinations (10 species and two varieties) for all species of Eucephalus.

