Optimization and Evaluation of Viral Metagenomic Amplification and Sequencing Methods Toward a Genome-Level Resolution of the Human Fecal DNA Virome

Author(s):  
Guangyang Wang ◽  
Shenghui Li ◽  
Qiulong Yan ◽  
Ruochun Guo ◽  
Yue Zhang ◽  
...  

Abstract Background: Viruses in the human gut have been linked to health and disease. Deciphering the gut virome depends on metagenomic sequencing of the virus-like particles purified from fecal specimens. A major limitation of conventional viral metagenomic sequencing is the low recoverability of viral genomes from the metagenomic dataset. Results: Herein, we developed an optimal method for viral amplification and metagenomic sequencing to maximize the recovery of viral genomes. Using 5 fecal specimens with multiple repetitions, we revealed the optimal number of PCR cycles of high-fidelity enzyme-based amplification and the reliability of multiple displacement amplification in virome DNA preparation, verified the reproducibility of the whole optimized viral metagenomic experimental process, and tested the capability of long-read sequencing for improving viral metagenomic assembly. Based on our optimized results, we generated 151 high-quality viruses using the data combined from short-read (15 cycles for PCR amplification) and long-read sequencing. Genomic analysis of these viruses found that most (60.3%) of them were previously unknown and showed a remarkable diversity of viral functions, especially the existence of 206 viral auxiliary metabolic genes. Finally, we compared the viral metagenomic and bulk metagenomic sequencing approaches and revealed significant differences in the efficiency and coverage of viral identification between them. Conclusions: Our study demonstrates the potential of optimized experiment and sequencing strategies in uncovering viral genomes from fecal specimens, which will facilitate future research about genome-level characterization of complex viral communities.

Viruses ◽  
2020 ◽  
Vol 12 (5) ◽  
pp. 562
Author(s):  
Joyce Odeke Akello ◽  
Stephen L. Leib ◽  
Olivier Engler ◽  
Christian Beuret

Identification and characterization of viral genomes in vectors, including ticks and mosquitoes positive for pathogens of great public health concern, using metagenomic next generation sequencing (mNGS) presents challenges. One such challenge is the ability to efficiently recover viral RNA, which typically depends on sample processing. We evaluated the quantitative effect of six different extraction methods in recovering viral RNA in vectors using negative tick homogenates spiked with serial dilutions of tick-borne encephalitis virus (TBEV) and surrogate Langat virus (LGTV). Evaluation was performed using qPCR and mNGS. Sensitivity and proof of concept of the optimal method were tested using naturally positive TBEV tick homogenates and positive dengue, chikungunya, and Zika virus mosquito homogenates. The amount of observed viral genome copies, percentage of mapped reads, and genome coverage varied among the different extraction methods. The developed Method 5 gave a 120.8-, 46-, 2.5-, 22.4-, and 9.9-fold increase in the number of viral reads mapping to the expected pathogen in comparison to Methods 1, 2, 3, 4, and 6, respectively. Our developed Method 5, termed ROVIV (Recovery of Viruses in Vectors), greatly improved viral RNA recovery and identification in vectors using mNGS. Therefore, it may be a more sensitive method for use in arbovirus surveillance.


2019 ◽  
Author(s):  
Dhaivat Joshi ◽  
Shunfu Mao ◽  
Sreeram Kannan ◽  
Suhas Diggavi

Abstract Motivation Efficient and accurate alignment of DNA/RNA sequence reads to each other or to a reference genome/transcriptome is an important problem in genomic analysis. Nanopore sequencing has emerged as a major sequencing technology and many long-read aligners have been designed for aligning nanopore reads. However, the high error rate makes accurate and efficient alignment difficult. Utilizing the noise and error characteristics inherent in the sequencing process properly can play a vital role in constructing a robust aligner. In this paper, we design QAlign, a pre-processor that can be used with any long-read aligner for aligning long reads to a genome/transcriptome or to other long reads. The key idea in QAlign is to convert the nucleotide reads into discretized current levels that capture the error modes of the nanopore sequencer before running it through a sequence aligner. Results We show that QAlign is able to improve alignment rates from around 80% up to 90% with nanopore reads when aligning to the genome. We also show that QAlign improves the average overlap quality by 9.2%, 2.5% and 10.8% in three real datasets for read-to-read alignment. Read-to-transcriptome alignment rates are improved from 51.6% to 75.4% and 82.6% to 90% in two real datasets. Availability https://github.com/joshidhaivat/QAlign.git
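
The conversion step named in the abstract above (nucleotide reads mapped to discretized current levels) can be illustrated with a toy sketch. This is not the QAlign implementation: the per-base values in `TOY_BASE_LEVEL`, the helper `toy_current`, the k-mer length, and the thresholds are all hypothetical stand-ins chosen for illustration. A real pipeline would presumably look up measured currents for each k-mer in a nanopore pore model rather than average made-up base values.

```python
from bisect import bisect_right

# Hypothetical per-base contributions (assumption). A real pore model is a
# lookup table of measured currents per k-mer, which is not reproduced here.
TOY_BASE_LEVEL = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}


def toy_current(kmer: str) -> float:
    """Return a made-up 'current' for a k-mer: the mean of its base values."""
    return sum(TOY_BASE_LEVEL[base] for base in kmer) / len(kmer)


def quantize_read(read: str, k: int = 3, thresholds=(1.0, 2.0)) -> str:
    """Slide a window of length k over `read` and emit one discrete level per k-mer.

    The level is the number of thresholds that are <= the toy current, so two
    thresholds give the three levels '0', '1' and '2'.
    """
    levels = []
    for i in range(len(read) - k + 1):
        current = toy_current(read[i:i + k])
        levels.append(str(bisect_right(thresholds, current)))
    return "".join(levels)


if __name__ == "__main__":
    print(quantize_read("ACGT"))
```

In a workflow of this kind, the quantized string would be handed to an ordinary sequence aligner, with the reference converted the same way so that both sides share one alphabet. That pairing is an assumption drawn from the abstract's description of QAlign as a pre-processor.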


2021 ◽  
Author(s):  
Slawomir Michniewski ◽  
Branko Rihtman ◽  
Ryan Cook ◽  
Michael Jones ◽  
William Wilson ◽  
...  

Megaphages - bacteriophages harbouring extremely large genomes - have recently been found to be ubiquitous, being described from a variety of microbiomes ranging from the animal gut to soil and freshwater systems. However, no complete marine megaphage has been identified to date. Here, using both short and long read sequencing, we assembled >900 high-quality draft viral genomes from water in the English Channel. One of these genomes included a novel megaphage, Mar_Mega_1 at >650 Kb, making it one of the largest phage genomes assembled to date. Utilising phylogenetic and network approaches, we found this phage represents a new family of bacteriophages. Genomic analysis showed Mar_Mega_1 shares relatively few homologues with its closest relatives, but, as with other megaphages, Mar_Mega_1 contained a variety of auxiliary metabolic genes responsible for carbon metabolism and nucleotide biosynthesis, including isocitrate dehydrogenase [NADP] and nicotinamide-nucleotide amidohydrolase [PncC], which have not previously been identified in megaphages. The results of this study indicate that phages containing extremely large genomes can be found in abundance in the marine environment and augment host metabolism by mechanisms not previously described.


Author(s):  
Dhaivat Joshi ◽  
Shunfu Mao ◽  
Sreeram Kannan ◽  
Suhas Diggavi

Abstract Motivation Efficient and accurate alignment of DNA/RNA sequence reads to each other or to a reference genome/transcriptome is an important problem in genomic analysis. Nanopore sequencing has emerged as a major sequencing technology and many long-read aligners have been designed for aligning nanopore reads. However, the high error rate makes accurate and efficient alignment difficult. Utilizing the noise and error characteristics inherent in the sequencing process properly can play a vital role in constructing a robust aligner. In this article, we design QAlign, a pre-processor that can be used with any long-read aligner for aligning long reads to a genome/transcriptome or to other long reads. The key idea in QAlign is to convert the nucleotide reads into discretized current levels that capture the error modes of the nanopore sequencer before running it through a sequence aligner. Results We show that QAlign is able to improve alignment rates from around 80% up to 90% with nanopore reads when aligning to the genome. We also show that QAlign improves the average overlap quality by 9.2, 2.5 and 10.8% in three real datasets for read-to-read alignment. Read-to-transcriptome alignment rates are improved from 51.6% to 75.4% and 82.6% to 90% in two real datasets. Availability and implementation https://github.com/joshidhaivat/QAlign.git. Supplementary information Supplementary data are available at Bioinformatics online.


2021 ◽  
Author(s):  
Gherman Uritskiy ◽  
Maximillian Press ◽  
Christine Sun ◽  
Guillermo Dominguez Huerta ◽  
Ahmed A. Zayed ◽  
...  

Viruses play crucial roles in the ecology of microbial communities, yet they remain relatively understudied in their native environments. Despite many advancements in high-throughput whole-genome sequencing (WGS), sequence assembly, and annotation of viruses, the reconstruction of full-length viral genomes directly from metagenomic sequencing is possible only for the most abundant phages and requires long-read sequencing technologies. Additionally, the prediction of their cellular hosts remains difficult from conventional metagenomic sequencing alone. To address these gaps in the field and to accelerate the study of viruses directly in their native microbiomes, we developed an end-to-end bioinformatics platform for viral genome reconstruction and host attribution from metagenomic data using proximity-ligation sequencing (i.e., Hi-C). We demonstrate the capabilities of the platform by recovering and characterizing the metavirome of a variety of metagenomes, including a fecal microbiome that has also been sequenced with accurate long reads, allowing for the assessment and benchmarking of the new methods. The platform can accurately extract numerous near-complete viral genomes even from highly fragmented short-read assemblies and can reliably predict their cellular hosts with minimal false positives. To our knowledge, this is the first software for performing these tasks. Being significantly cheaper than long-read sequencing of comparable depth, the incorporation of proximity-ligation sequencing in microbiome research shows promise to greatly accelerate future advancements in the field.


2021 ◽  
Vol 1 (1) ◽  
Author(s):  
Slawomir Michniewski ◽  
Branko Rihtman ◽  
Ryan Cook ◽  
Michael A. Jones ◽  
William H. Wilson ◽  
...  

Abstract Megaphages, bacteriophages harbouring extremely large genomes, have recently been found to be ubiquitous, being described from a variety of microbiomes ranging from the animal gut to soil and freshwater systems. However, no complete marine megaphage has been identified to date. Here, using both short and long read sequencing, we assembled >900 high-quality draft viral genomes from water in the English Channel. One of these genomes included a novel megaphage, Mar_Mega_1 at >650 Kb, making it one of the largest phage genomes assembled to date. Utilising phylogenetic and network approaches, we found this phage represents a new family of megaphages. Genomic analysis showed Mar_Mega_1 shares relatively few homologues with its closest relatives, but, as with other megaphages, Mar_Mega_1 contained a variety of auxiliary metabolic genes responsible for carbon metabolism and nucleotide biosynthesis, including a NADP-dependent isocitrate dehydrogenase [Idh] and nicotinamide-nucleotide amidohydrolase [PncC], which have not previously been identified in megaphages. Mar_Mega_1 was abundant in a marine virome sample and related phages are widely prevalent in the oceans.


2003 ◽  
Vol 185 (11) ◽  
pp. 3352-3360 ◽  
Author(s):  
Michael B. Howard ◽  
Nathan A. Ekborg ◽  
Larry E. Taylor ◽  
Ronald M. Weiner ◽  
Steven W. Hutcheson

ABSTRACT The marine bacterium Microbulbifer degradans strain 2-40 produces at least 10 enzyme systems for degrading insoluble complex polysaccharides (ICP). The draft sequence of the 2-40 genome allowed a genome-wide analysis of the chitinolytic system of strain 2-40. The chitinolytic system includes three secreted chitin depolymerases (ChiA, ChiB, and ChiC), a secreted chitin-binding protein (CbpA), periplasmic chitooligosaccharide-modifying enzymes, putative sugar transporters, and a cluster of genes encoding cytoplasmic proteins involved in N-acetyl-d-glucosamine (GlcNAc) metabolism. Each chitin depolymerase was detected in culture supernatants of chitin-grown strain 2-40 and was active against chitin and glycol chitin. The chitin depolymerases also had a specific pattern of activity toward the chitin analogs 4-methylumbelliferyl-β-d-N,N′-diacetylchitobioside (MUF-diNAG) and 4-methylumbelliferyl-β-d-N,N′,N″-triacetylchitotrioside (MUF-triNAG). The depolymerases were modular in nature and contained glycosyl hydrolase family 18 domains, chitin-binding domains, and polycystic kidney disease domains. ChiA and ChiB each possessed polyserine linkers of up to 32 consecutive serine residues. In addition, ChiB and CbpA contained glutamic acid-rich domains. At 1,271 amino acids, ChiB is the largest bacterial chitinase reported to date. A chitodextrinase (CdxA) with activity against chitooligosaccharides (degree of polymerization of 5 to 7) was identified. The activities of two apparent periplasmic (HexA and HexB) N-acetyl-β-d-glucosaminidases and one cytoplasmic (HexC) N-acetyl-β-d-glucosaminidase were demonstrated. Genes involved in GlcNAc metabolism, similar to those of the Escherichia coli K-12 NAG utilization operon, were identified. NagA from strain 2-40, a GlcNAc deacetylase, was shown to complement a nagA mutation in E. coli K-12. Except for the GlcNAc utilization cluster, genes for all other components of the chitinolytic system were dispersed throughout the genome. Further examination of this system may provide additional insight into the mechanisms by which marine bacteria degrade chitin and provide a basis for future research on the ICP-degrading systems of strain 2-40.


2021 ◽  
Vol 9 (2) ◽  
pp. 348
Author(s):  
Florian Tagini ◽  
Trestan Pillonel ◽  
Claire Bertelli ◽  
Katia Jaton ◽  
Gilbert Greub

The Mycobacterium kansasii species comprises six subtypes that were recently classified into six closely related species: Mycobacterium kansasii (formerly M. kansasii subtype 1), Mycobacterium persicum (subtype 2), Mycobacterium pseudokansasii (subtype 3), Mycobacterium ostraviense (subtype 4), Mycobacterium innocens (subtype 5) and Mycobacterium attenuatum (subtype 6). Together with Mycobacterium gastri, they form the M. kansasii complex. M. kansasii is the most frequent and most pathogenic species of the complex. M. persicum is classically associated with diseases in immunosuppressed patients, and the other species are mostly colonizers, and are only very rarely reported in ill patients. Comparative genomics was used to assess the genetic determinants leading to the pathogenicity of members of the M. kansasii complex. The genomes of 51 isolates collected from patients with and without disease were sequenced and compared with 24 publicly available genomes. The pathogenicity of each isolate was determined based on the clinical records or public metadata. A comparative genomic analysis showed that all M. persicum, M. ostraviense, M. innocens and M. gastri isolates lacked the ESX-1-associated EspACD locus that is thought to play a crucial role in the pathogenicity of M. tuberculosis and other non-tuberculous mycobacteria. Furthermore, M. kansasii was the only species exhibiting a 25-Kb-large genomic island encoding for 17 type-VII secretion system-associated proteins. Finally, a genome-wide association analysis revealed that two consecutive genes encoding a hemerythrin-like protein and a nitroreductase-like protein were significantly associated with pathogenicity. These two genes may be involved in the resistance to reactive oxygen and nitrogen species, a required mechanism for the intracellular survival of bacteria. Three non-pathogenic M. kansasii lacked these genes likely due to two distinct distributive conjugal transfers (DCTs) between M. attenuatum and M. kansasii, and one DCT between M. persicum and M. kansasii. To our knowledge, this is the first study linking DCT to reduced pathogenicity.


2021 ◽  
Author(s):  
Valentin Waschulin ◽  
Chiara Borsetto ◽  
Robert James ◽  
Kevin K. Newsham ◽  
Stefano Donadio ◽  
...  

Abstract The growing problem of antibiotic resistance has led to the exploration of uncultured bacteria as potential sources of new antimicrobials. PCR amplicon analyses and short-read sequencing studies of samples from different environments have reported evidence of high biosynthetic gene cluster (BGC) diversity in metagenomes, indicating their potential for producing novel and useful compounds. However, recovering full-length BGC sequences from uncultivated bacteria remains a challenge due to the technological restraints of short-read sequencing, thus making assessment of BGC diversity difficult. Here, long-read sequencing and genome mining were used to recover >1400 mostly full-length BGCs that demonstrate the rich diversity of BGCs from uncultivated lineages present in soil from Mars Oasis, Antarctica. A large number of highly divergent BGCs were not only found in the phyla Acidobacteriota, Verrucomicrobiota and Gemmatimonadota but also in the actinobacterial classes Acidimicrobiia and Thermoleophilia and the gammaproteobacterial order UBA7966. The latter furthermore contained a potential novel family of RiPPs. Our findings underline the biosynthetic potential of underexplored phyla as well as unexplored lineages within seemingly well-studied producer phyla. They also showcase long-read metagenomic sequencing as a promising way to access the untapped genetic reservoir of specialised metabolite gene clusters of the uncultured majority of microbes.


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Yusuke Okazaki ◽  
Shohei Fujinaga ◽  
Michaela M. Salcher ◽  
Cristiana Callieri ◽  
Atsushi Tanaka ◽  
...  

Abstract Background Freshwater ecosystems are inhabited by members of cosmopolitan bacterioplankton lineages despite the disconnected nature of these habitats. The lineages are delineated based on > 97% 16S rRNA gene sequence similarity, but their intra-lineage microdiversity and phylogeography, which are key to understanding the eco-evolutional processes behind their ubiquity, remain unresolved. Here, we applied long-read amplicon sequencing targeting nearly full-length 16S rRNA genes and the adjacent ribosomal internal transcribed spacer sequences to reveal the intra-lineage diversities of pelagic bacterioplankton assemblages in 11 deep freshwater lakes in Japan and Europe. Results Our single nucleotide-resolved analysis, which was validated using shotgun metagenomic sequencing, uncovered 7–101 amplicon sequence variants for each of the 11 predominant bacterial lineages and demonstrated sympatric, allopatric, and temporal microdiversities that could not be resolved through conventional approaches. Clusters of samples with similar intra-lineage population compositions were identified, which consistently supported genetic isolation between Japan and Europe. At a regional scale (up to hundreds of kilometers), dispersal between lakes was unlikely to be a limiting factor, and environmental factors or genetic drift were potential determinants of population composition. The extent of microdiversification varied among lineages, suggesting that highly diversified lineages (e.g., Iluma-A2 and acI-A1) achieve their ubiquity by containing a consortium of genotypes specific to each habitat, while less diversified lineages (e.g., CL500-11) may be ubiquitous due to a small number of widespread genotypes. The lowest extent of intra-lineage diversification was observed among the dominant hypolimnion-specific lineage (CL500-11), suggesting that their dispersal among lakes is not limited despite the hypolimnion being a more isolated habitat than the epilimnion. 
Conclusions Our novel approach complemented the limited resolution of short-read amplicon sequencing and limited sensitivity of the metagenome assembly-based approach, and highlighted the complex ecological processes underlying the ubiquity of freshwater bacterioplankton lineages. To fully exploit the performance of the method, its relatively low read throughput is the major bottleneck to be overcome in the future.

