Rapid screening and identification of viral pathogens in metagenomic data

2021 ◽  
Vol 14 (S6) ◽  
Author(s):  
Shiyang Song ◽  
Liangxiao Ma ◽  
Xintian Xu ◽  
Han Shi ◽  
Xuan Li ◽  
...  

Abstract. Background: Virus screening and viral genome reconstruction are urgent and crucial for the rapid identification of viral pathogens, for example when tracing the source and understanding the pathogenesis of a viral outbreak. Next-generation sequencing (NGS) provides an efficient and unbiased way to identify viral pathogens in host-associated and environmental samples without prior knowledge. Despite the availability of software, data analysis still requires manual intervention. A mature pipeline is urgently needed when thousands of samples must be rapidly screened for viral pathogens and used for viral genome reconstruction. Results: In this paper, we present a rapid and accurate workflow that screens metagenomic sequencing data for viral pathogens and other components, and that uses a reference-based assembler to reconstruct viral genomes. We tested the workflow on several metagenomic datasets, including NGS data from a SARS-CoV-2 patient sample, pangolin tissues, and Middle East Respiratory Syndrome (MERS)-infected cells. The workflow demonstrated high accuracy and efficiency when identifying target viruses from large-scale NGS metagenomic data. It handled a broad range of NGS datasets, from small (kb) to large (100 Gb), taking from a few minutes to a few hours to complete each task. The workflow also automatically generates reports that incorporate visualized feedback (e.g., metagenomic data quality statistics, host and viral sequence compositions, details about each of the identified viral pathogens and their coverages, and reassembled viral pathogen sequences based on their closest references). Conclusions: Overall, our system enabled the rapid screening and identification of viral pathogens from metagenomic data, providing an important tool to support viral pathogen research during a pandemic.
The visualized report contains information from raw sequence quality to a reconstructed viral sequence, which allows non-professional people to screen their samples for viruses by themselves (Additional file 1).
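The abstract mentions that the report lists coverage for each identified virus but does not say how coverage is computed. As a minimal sketch only, assuming read alignments are already available as 0-based half-open `(start, end)` intervals on a reference, breadth and mean depth of coverage could be derived as follows. The function name `coverage_stats` and the interval representation are illustrative assumptions, not part of the published workflow.

```python
def coverage_stats(ref_length, alignments):
    """Return (breadth, mean_depth) for a reference of `ref_length` bases.

    `alignments` is an iterable of (start, end) intervals, 0-based and
    half-open. This input format is an assumption made for illustration.
    """
    if ref_length <= 0:
        raise ValueError("ref_length must be positive")

    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (ref_length + 1)
    for start, end in alignments:
        start = max(0, start)
        end = min(ref_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    covered = 0      # positions with depth >= 1
    total_depth = 0  # sum of per-base depth
    running = 0      # depth at the current position
    for i in range(ref_length):
        running += diff[i]
        if running > 0:
            covered += 1
        total_depth += running

    return covered / ref_length, total_depth / ref_length


if __name__ == "__main__":
    breadth, depth = coverage_stats(10, [(0, 5), (3, 8)])
    print(f"breadth={breadth:.2f} mean_depth={depth:.2f}")
```

Breadth is the fraction of reference positions covered by at least one read, and mean depth is the average number of reads per position; a real pipeline would usually obtain the intervals from an alignment file rather than a hand-written list.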

2021 ◽  
Vol 12 ◽  
Author(s):  
Kai Song

Metagenomes can be considered as mixtures of viral, bacterial, and other eukaryotic DNA sequences. Mining viral sequences from metagenomes could shed light on virus–host relationships and expand viral databases. Current alignment-based methods are unsuitable for identifying viral sequences from metagenome sequences because most assembled metagenomic contigs are short and possess few or no predicted genes, and most metagenomic viral genes are dissimilar to known viral genes. In this study, I developed a Markov model-based method, VirMC, to identify viral sequences from metagenomic data. VirMC uses Markov chains to model sequence signatures and constructs a scoring model using a likelihood test to distinguish viral and bacterial sequences. Compared with two other state-of-the-art viral sequence-prediction methods, VirFinder and PPR-Meta, the proposed method outperformed VirFinder and performed similarly to PPR-Meta for short contigs less than 400 bp in length. VirMC outperformed VirFinder and PPR-Meta in identifying viral sequences in metagenomic samples contaminated with eukaryotic sequences. VirMC also showed better performance in assembling viral genome sequences from metagenomic data (based on filtering potential bacterial reads). Applying VirMC to human gut metagenomes from healthy subjects and patients with type-2 diabetes (T2D) revealed that viral contigs could help classify healthy and diseased statuses. This alignment-free method complements gene-based alignment approaches and will significantly improve the precision of viral sequence identification.
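The abstract describes scoring sequences with Markov chains and a likelihood test. The following is a generic sketch of that idea, not VirMC's actual implementation: two fixed-order Markov chains are trained, one on viral and one on bacterial sequences, and a contig is scored by the difference of its log-likelihoods under the two models. The chain order, the pseudocount, and all function names here are assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from itertools import product

ALPHABET = "ACGT"


def train_markov(seqs, order=2, pseudocount=1.0):
    """Estimate log P(next base | previous `order` bases) with smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    valid = set(ALPHABET)
    for seq in seqs:
        seq = seq.upper()
        for i in range(order, len(seq)):
            ctx, nxt = seq[i - order:i], seq[i]
            # Skip windows containing ambiguous bases such as N.
            if set(ctx) <= valid and nxt in valid:
                counts[ctx][nxt] += 1
    model = {}
    for ctx in map("".join, product(ALPHABET, repeat=order)):
        total = sum(counts[ctx].values()) + pseudocount * len(ALPHABET)
        model[ctx] = {
            b: math.log((counts[ctx][b] + pseudocount) / total)
            for b in ALPHABET
        }
    return model


def log_likelihood(seq, model, order=2):
    """Sum log transition probabilities along `seq`; ambiguous windows are skipped."""
    seq = seq.upper()
    ll = 0.0
    for i in range(order, len(seq)):
        ctx, nxt = seq[i - order:i], seq[i]
        if ctx in model and nxt in model[ctx]:
            ll += model[ctx][nxt]
    return ll


def llr_score(seq, viral_model, bacterial_model, order=2):
    """Positive scores favor the viral model, negative the bacterial one."""
    return (log_likelihood(seq, viral_model, order)
            - log_likelihood(seq, bacterial_model, order))


if __name__ == "__main__":
    # Toy training data, purely illustrative.
    viral = train_markov(["AT" * 10])
    bacterial = train_markov(["GC" * 10])
    print(llr_score("ATATATAT", viral, bacterial) > 0)
```

In practice the decision threshold on the score would be tuned on held-out sequences, and the chain order would be chosen according to contig length and the amount of training data.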


2021 ◽  
Author(s):  
Gherman Uritskiy ◽  
Maximillian Press ◽  
Christine Sun ◽  
Guillermo Dominguez Huerta ◽  
Ahmed A. Zayed ◽  
...  

Viruses play crucial roles in the ecology of microbial communities, yet they remain relatively understudied in their native environments. Despite many advancements in high-throughput whole-genome sequencing (WGS), sequence assembly, and annotation of viruses, the reconstruction of full-length viral genomes directly from metagenomic sequencing is possible only for the most abundant phages and requires long-read sequencing technologies. Additionally, the prediction of their cellular hosts remains difficult from conventional metagenomic sequencing alone. To address these gaps in the field and to accelerate the study of viruses directly in their native microbiomes, we developed an end-to-end bioinformatics platform for viral genome reconstruction and host attribution from metagenomic data using proximity-ligation sequencing (i.e., Hi-C). We demonstrate the capabilities of the platform by recovering and characterizing the metavirome of a variety of metagenomes, including a fecal microbiome that has also been sequenced with accurate long reads, allowing for the assessment and benchmarking of the new methods. The platform can accurately extract numerous near-complete viral genomes even from highly fragmented short-read assemblies and can reliably predict their cellular hosts with minimal false positives. To our knowledge, this is the first software for performing these tasks. Being significantly cheaper than long-read sequencing of comparable depth, the incorporation of proximity-ligation sequencing in microbiome research shows promise to greatly accelerate future advancements in the field.


2021 ◽  
Author(s):  
Agnes S Montgomery ◽  
Michael B Lustik ◽  
Susan A Reichert-Scrivner ◽  
Ronald L Woodbury ◽  
Milissa U Jones ◽  
...  

Abstract. Introduction: Acute respiratory diseases account for a substantial number of outpatient visits and hospitalizations among U.S. military personnel, significantly affecting mission readiness and military operations. We conducted a retrospective analysis of respiratory viral pathogen (RVP) samples collected from U.S. military personnel stationed in Hawaii and tested at Tripler Army Medical Center from January 2014 to May 2019 in order to describe the etiology, distribution, and seasonality of RVP exposure in a military population. Materials and Methods: Samples were analyzed by viral culture or multiplex PCR. The distribution of respiratory viruses over time was analyzed, as were subject demographic and encounter data. Presenting signs and symptoms were evaluated for each RVP. Results: A total of 2,576 military personnel were tested, of whom 726 (28.2%) were positive for one or more RVP. Among positive tests, the three most common viral pathogens detected were influenza A (43.0%), rhinovirus (24.5%), and parainfluenza (7.6%). Symptoms were generally mild and most frequently included cough, fever, and body aches. Conclusion: Our study evaluated respiratory virus prevalence, seasonality, and association with clinical symptoms for military personnel in an urban tropical setting in Oahu, HI, over a 5-year period. We show that viral prevalence and seasonality in Hawaii are distinct from those of the continental United States (CONUS). Results contribute to the broader understanding of seasonality, clinical manifestation, and demographics of RVP among active duty military personnel stationed in Hawaii.


2014 ◽  
Vol 104 (10) ◽  
pp. 1125-1129 ◽  
Author(s):  
A. H. Stobbe ◽  
W. L. Schneider ◽  
P. R. Hoyt ◽  
U. Melcher

Next generation sequencing (NGS) is not used commonly in diagnostics, in part due to the large amount of time and computational power needed to identify the taxonomic origin of each sequence in an NGS data set. By using the unassembled NGS data sets as the target for searches, pathogen-specific sequences, termed e-probes, could be used as queries to enable detection of specific viruses or organisms in plant sample metagenomes. This method, designated e-probe diagnostic nucleic acid assay, was first tested with mock sequence databases and then with NGS data sets generated from plants infected with a DNA virus (Bean golden yellow mosaic virus, BGYMV) or an RNA virus (Plum pox virus, PPV). In addition, the ability to detect and differentiate among strains of a single virus species, PPV, was examined by using probe sets that were specific to strains. The use of probe sets for multiple viruses determined that one sample was dually infected with BGYMV and Bean golden mosaic virus.
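The query direction described above, with short pathogen-specific probes searched against unassembled reads, can be illustrated with a toy example. This sketch substitutes exact substring matching for the similarity search a real assay would use, and the probe names, the `min_hits` threshold, and the function names are all hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_probe_hits(reads, probes):
    """Count reads containing each probe on either strand.

    `probes` maps a probe-set name to a list of probe sequences.
    Exact matching stands in for an alignment-based search here.
    """
    hits = {name: 0 for name in probes}
    for read in reads:
        read = read.upper()
        for name, probe_seqs in probes.items():
            for probe in probe_seqs:
                probe = probe.upper()
                if probe in read or reverse_complement(probe) in read:
                    hits[name] += 1
                    break  # count each read once per probe set
    return hits


def call_detections(hits, min_hits=2):
    """Report probe sets supported by at least `min_hits` reads."""
    return {name for name, n in hits.items() if n >= min_hits}


if __name__ == "__main__":
    # Toy sequences, not real viral probes.
    probes = {"virus_A": ["ACGTTGCA"], "virus_B": ["GGGCCCAT"]}
    reads = ["TTACGTTGCATT", "CCTGCAACGTGG", "ATATATATATAT"]
    hits = count_probe_hits(reads, probes)
    print(hits, call_detections(hits))
```

Because the probes are the queries and the reads are the search target, the cost grows with the number of probes rather than requiring every read to be classified taxonomically, which is the saving the abstract points to.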


Viruses ◽  
2021 ◽  
Vol 13 (10) ◽  
pp. 2006
Author(s):  
Anna Y Budkina ◽  
Elena V Korneenko ◽  
Ivan A Kotov ◽  
Daniil A Kiselev ◽  
Ilya V Artyushin ◽  
...  

According to various estimates, only a small percentage of existing viruses have been discovered, and even fewer are represented in genomic databases. High-throughput sequencing technologies are developing rapidly, enabling large-scale screening of various biological samples for the presence of pathogen-associated nucleotide sequences, but many organisms have yet to be assigned specific loci for identification. This problem particularly impedes viral screening, due to vast heterogeneity in viral genomes. In this paper, we present a new bioinformatic pipeline, VirIdAl, for detecting and identifying viral pathogens in sequencing data. We also demonstrate the utility of the new software by applying it to viral screening of the feces of bats collected in the Moscow region, which revealed a significant variety of viruses associated with bats, insects, plants, and protozoa. The presence of alpha and beta coronavirus reads, including the MERS-like bat virus, deserves special mention, as it once again indicates that bats are reservoirs for many viral pathogens. In addition, alignment-based methods were unable to identify the taxon for a large proportion of reads, so we applied other approaches as well, showing that they can further reveal the presence of viral agents in sequencing data. However, the incompleteness of viral databases remains a significant problem in studies of viral diversity and therefore necessitates the use of combined approaches, including those based on machine learning methods.


2019 ◽  
Author(s):  
Bernardo Gutierrez ◽  
Emma Wise ◽  
Steven Pullan ◽  
Christopher Logue ◽  
Thomas A. Bowden ◽  
...  

Abstract. The Amazon basin is host to numerous arthropod-borne viral pathogens that cause febrile disease in humans. Among these, Oropouche orthobunyavirus (OROV) is a relatively understudied member of the Peribunyaviridae that causes periodic outbreaks in human populations in Brazil and other South American countries. Although several studies have described the genetic diversity of the virus, the evolutionary processes that shape the viral genome remain poorly understood. Here we present a comprehensive study of the genomic dynamics of OROV that encompasses phylogenetic analysis, evolutionary rate estimates, inference of natural selective pressures, recombination and reassortment, and structural analysis of OROV variants. Our study includes all available published sequences, as well as a set of new OROV genome sequences obtained from patients in Ecuador, representing the first set of viral genomes from this country. Our results show that differing evolutionary processes on the three segments that make up the viral genome lead to variable evolutionary rates and TMRCAs that could be explained by cryptic reassortment. We also present the discovery of previously unobserved putative N-linked glycosylation sites and of codons that evolve under positive selection on the viral surface proteins, and discuss the potential role of these features in the evolution of the virus through a combined phylogenetic and structural approach.


Viruses ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 1432
Author(s):  
Xavier Fernandez-Cassi ◽  
Sandra Martínez-Puchol ◽  
Marcelle Silva-Sales ◽  
Thais Cornejo ◽  
Rosa Bartolome ◽  
...  

Acute infectious gastroenteritis is an important illness worldwide, especially in children, with viruses accounting for approximately 70% of acute cases. A high number of these cases have an unknown etiological agent, and the rise of next generation sequencing technologies has opened new opportunities for viral pathogen detection and discovery. Viral metagenomics in routine clinical settings has the potential to identify unexpected or novel variants of viral pathogens that cause gastroenteritis. In this study, 124 samples from acute gastroenteritis patients from 2012–2014 that had previously tested negative for common gastroenteritis pathogens were pooled by age and analyzed by next generation sequencing (NGS) to elucidate unidentified viral infections. The most abundant sequences detected that were potentially associated with acute gastroenteritis were from the Astroviridae and Caliciviridae families, with the detection of norovirus GIV and sapoviruses. A lower number of contigs associated with rotaviruses was detected. As expected, other viruses that may be associated with gastroenteritis but also produce persistent infections in the gut were identified, including several Picornaviridae members (EV, parechoviruses, cardioviruses) and adenoviruses. According to the sequencing data, astroviruses, sapoviruses and NoV GIV should be added to the list of viral pathogens screened in routine clinical analysis.


RSC Advances ◽  
2015 ◽  
Vol 5 (30) ◽  
pp. 23431-23442 ◽  
Author(s):  
Sheraz A. K. Tanoli ◽  
Nazish U. Tanoli ◽  
Tatiani M. Bondancia ◽  
Saman Usmani ◽  
Zaheer Ul-Haq ◽  
...  

Over the last two decades, new and more advanced strategies that help in the rapid screening and identification of new ligands for a specific macromolecule have become an important domain.

