Characterization of Viral Populations by Using Circular Sequencing

2016 ◽  
Vol 90 (20) ◽  
pp. 8950-8953 ◽  
Author(s):  
Zachary J. Whitfield ◽  
Raul Andino

With the enormous sizes viral populations reach, many variants are at too low a frequency to be detected by conventional next-generation sequencing (NGS) methods. Circular sequencing (CirSeq) is a method by which the error rate of next-generation sequencing is decreased so that even low-frequency viral variants can be accurately detected. The ability to visualize almost the entire genetic makeup of a viral swarm has implications for epidemiology, viral evolution, and vaccine design. Here we discuss experimental planning, analysis, and recent insights using CirSeq.

2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Santosh Anand ◽  
Eleonora Mangano ◽  
Nadia Barizzone ◽  
Roberta Bordoni ◽  
Melissa Sorosina ◽  
...  

Abstract Sequencing a large number of individuals, which is often needed for population genetics studies, is still economically challenging despite the falling costs of next-generation sequencing (NGS). Pool-seq is an alternative cost- and time-effective option in which DNA from several individuals is pooled for sequencing. However, pooling of DNA creates new problems and challenges for accurate variant calling and allele frequency (AF) estimation. In particular, sequencing errors are confounded with the alleles present at low frequency in the pools, possibly giving rise to false-positive variants. We sequenced 996 individuals in 83 pools (12 individuals/pool) in a targeted re-sequencing experiment. We show that Pool-seq AFs are robust and reliable by comparing them with public variant databases and in-house SNP-genotyping data of individual subjects of pools. Furthermore, we propose a simple filtering guideline for the removal of spurious variants based on the Kolmogorov-Smirnov statistical test. We experimentally validated our filters by comparing Pool-seq to individual sequencing data, showing that the filters remove most of the false variants while retaining the majority of true variants. The proposed guideline is fairly generic in nature and could be easily applied in other Pool-seq experiments.
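The pool design in this abstract implies a floor on the allele frequency a pool can contain, which helps explain why sequencing errors are easily mistaken for rare alleles. The sketch below works through that arithmetic. It assumes diploid individuals and equal DNA contribution per individual, neither of which is stated in the abstract, and the function name is illustrative only.

```python
# Worked arithmetic for the pool design described above.
# Assumptions (not stated in the abstract): individuals are diploid and
# contribute equal amounts of DNA to their pool.

def pool_min_allele_frequency(individuals_per_pool, ploidy=2):
    """Frequency of an allele carried on exactly one chromosome in the pool."""
    chromosomes = individuals_per_pool * ploidy
    return 1.0 / chromosomes

pools = 83
individuals_per_pool = 12

total_individuals = pools * individuals_per_pool
min_af = pool_min_allele_frequency(individuals_per_pool)

print(f"Total individuals: {total_individuals}")
print(f"Chromosomes per pool: {individuals_per_pool * 2}")
print(f"Lowest non-zero allele frequency per pool: {min_af:.4f}")
```

Under these assumptions a singleton allele appears at roughly 4% of reads in its pool, so any error process that produces alternate bases at a comparable rate at a given site can mimic a true variant.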


2019 ◽  
Vol 93 (13) ◽  
Author(s):  
Pragya D. Yadav ◽  
Shannon L. M. Whitmer ◽  
Prasad Sarkale ◽  
Terry Fei Fan Ng ◽  
Cynthia S. Goldsmith ◽  
...  

ABSTRACT In 2011, ticks were collected from livestock following an outbreak of Crimean Congo hemorrhagic fever (CCHF) in Gujarat state, India. CCHF-negative Hyalomma anatolicum tick pools were passaged for virus isolation, and two virus isolates were obtained, designated Karyana virus (KARYV) and Kundal virus (KUNDV), respectively. Traditional reverse transcription-PCR (RT-PCR) identification of known viruses was unsuccessful, but a next-generation sequencing (NGS) approach identified KARYV and KUNDV as viruses in the Reoviridae family, Orbivirus and Coltivirus genera, respectively. Viral genomes were de novo assembled, yielding 10 complete segments of KARYV and 12 nearly complete segments of KUNDV. The VP1 gene of KARYV shared a most recent common ancestor with Wad Medani virus (WMV), strain Ar495, and based on nucleotide identity we demonstrate that it is a novel WMV strain. The VP1 segment of KUNDV shares a common ancestor with Colorado tick fever virus, Eyach virus, Tai Forest reovirus, and Tarumizu tick virus from the Coltivirus genus. Based on VP1, VP6, VP7, and VP12 nucleotide and amino acid identities, KUNDV is proposed to be a new species of Coltivirus. Electron microscopy supported the classification of KARYV and KUNDV as reoviruses and identified replication morphology consistent with other orbi- and coltiviruses. The identification of novel tick-borne viruses carried by the CCHF vector is an important step in the characterization of their potential role in human and animal pathogenesis.
IMPORTANCE Ticks and mosquitoes, as well as Culicoides, can transmit viruses in the Reoviridae family. With the help of next-generation sequencing (NGS), previously unreported reoviruses such as equine encephalosis virus, Wad Medani virus (WMV), Kammavanpettai virus (KVPTV), and, with this report, KARYV and KUNDV have been discovered and characterized in India. The isolation of KUNDV and KARYV from Hyalomma anatolicum, which is a known vector for zoonotic pathogens, such as Crimean Congo hemorrhagic fever virus, Babesia, Theileria, and Anaplasma species, identifies arboviruses with the potential to transmit to humans. Characterization of KUNDV and KARYV isolated from Hyalomma ticks is critical for the development of specific serological and molecular assays that can be used to determine the association of these viruses with disease in humans and livestock.


2015 ◽  
Vol 76 ◽  
pp. 70
Author(s):  
Deborah Ferriola ◽  
Jamie Duke ◽  
Anh Huynh ◽  
Alison Gasiewski ◽  
Marianne Rogers ◽  
...  

2014 ◽  
Vol 8 (04) ◽  
pp. 498-509 ◽  
Author(s):  
Zhen Lin ◽  
Amber Farooqui ◽  
Guishuang Li ◽  
Gane KS Wong ◽  
Andrew L Mason ◽  
...  

Introduction: Conventional methods used to detect and characterize influenza viruses in biological samples face multiple challenges due to the diversity of subtypes and high dissimilarity of emerging strains. Next-generation sequencing (NGS) is a powerful technique that can facilitate the detection and characterization of influenza; however, both the sequencing strategy and the data analysis procedures require careful consideration. Methodology: The RNA from the lungs of ferrets infected with influenza A/California/07/2009 was analyzed by NGS without using specific PCR amplification of the viral sequences. Several bioinformatic approaches were used to resolve the viral genes and detect viral quasispecies. Results: The genomic sequences of influenza virus were characterized to a high level of detail when analyzing the short reads with either the fast aligner Bowtie2, the general-purpose aligner BLASTn, or de novo assembly with Abyss. Moreover, when using distant viral sequences as reference, these methods were still able to resolve the viral sequences of a biological sample. Finally, direct sequencing of RNA samples did not provide sufficient coverage of the viral genome to study viral quasispecies, and, therefore, prior amplification of the viral segments by PCR would be required to perform this type of analysis. Conclusions: The introduction of NGS for virus research allows routine full characterization of viral isolates; however, careful design of the sequencing strategy and the procedures for data analysis are still of critical importance.


2015 ◽  
Vol 89 (16) ◽  
pp. 8540-8555 ◽  
Author(s):  
Shuntai Zhou ◽  
Corbin Jones ◽  
Piotr Mieczkowski ◽  
Ronald Swanstrom

ABSTRACT Validating the sampling depth and reducing sequencing errors are critical for studies of viral populations using next-generation sequencing (NGS). We previously described the use of Primer ID to tag each viral RNA template with a block of degenerate nucleotides in the cDNA primer. We now show that low-abundance Primer IDs (offspring Primer IDs) are generated due to PCR/sequencing errors. These artifactual Primer IDs can be removed using a cutoff model for the number of reads required to make a template consensus sequence. We have modeled the fraction of sequences lost due to Primer ID resampling. For a typical sequencing run, less than 10% of the raw reads are lost to offspring Primer ID filtering and resampling. The remaining raw reads are used to correct for PCR resampling and sequencing errors. We also demonstrate that Primer ID reveals bias intrinsic to PCR, especially at low template input or utilization. cDNA synthesis and PCR convert ca. 20% of RNA templates into recoverable sequences, and 30-fold sequence coverage recovers most of these template sequences. We have directly measured the residual error rate to be around 1 in 10,000 nucleotides. We use this error rate and the Poisson distribution to define the cutoff to identify preexisting drug resistance mutations at low abundance in an HIV-infected subject. Collectively, these studies show that >90% of the raw sequence reads can be used to validate template sampling depth and to dramatically reduce the error rate in assessing a genetically diverse viral population using NGS.
IMPORTANCE Although next-generation sequencing (NGS) has revolutionized sequencing strategies, it suffers from serious limitations in defining sequence heterogeneity in a genetically diverse population, such as HIV-1, due to PCR resampling and PCR/sequencing errors. The Primer ID approach reveals the true sampling depth and greatly reduces errors. Knowing the sampling depth allows the construction of a model of how to maximize the recovery of sequences from input templates and to reduce resampling of the Primer ID so that appropriate multiplexing can be included in the experimental design. With the defined sampling depth and measured error rate, we are able to assign cutoffs for the accurate detection of minority variants in viral populations. This approach allows the power of NGS to be realized without having to guess about sampling depth or to ignore the problem of PCR resampling, while also being able to correct most of the errors in the data set.
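The abstract above mentions combining the measured residual error rate (about 1 in 10,000 nucleotides) with the Poisson distribution to set a detection cutoff, but it does not give the exact procedure. The following is a minimal sketch of one way such a cutoff could be computed, not the authors' method: the function names, the significance level `alpha`, and the absence of any multiple-testing correction are all assumptions made here for illustration.

```python
import math

def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def min_variant_count(n_templates, error_rate=1e-4, alpha=0.01):
    """Smallest count of variant-bearing template consensus sequences at one
    position that residual error alone would reach with probability < alpha.

    error_rate: residual error per nucleotide (the abstract reports ~1e-4).
    alpha: hypothetical per-position significance level (an assumption).
    """
    lam = n_templates * error_rate  # expected number of erroneous calls
    k = 1
    while poisson_upper_tail(k, lam) >= alpha:
        k += 1
    return k

for n in (1_000, 10_000):
    print(n, "templates -> cutoff of", min_variant_count(n), "observations")
```

With these assumed settings, a mutation would need to be seen in at least 2 of 1,000 template consensus sequences, or at least 5 of 10,000, before it is distinguished from residual error at a single position. Screening many positions at once would call for a stricter `alpha`.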

