metagenomic data
Recently Published Documents


TOTAL DOCUMENTS

867
(FIVE YEARS 493)

H-INDEX

50
(FIVE YEARS 12)

2022 ◽  
Author(s):  
Emily F Wissel ◽  
Brooke M Talbot ◽  
Bjorn A Johnson ◽  
Robert A Petit ◽  
Vicki Hertzberg ◽  
...  

Detecting antimicrobial resistance (AMR) in clinical genomic data is an important epidemiological task, yet a complex bioinformatic process. The use of shotgun metagenomics for AMR detection is appealing because data can be generated from clinical samples with minimal processing. Many software tools exist to detect AMR genes, but they have mostly been tested on their detection of genotypic resistance in individual bacterial strains. It is important to understand how well these bioinformatic tools detect AMR genes in shotgun metagenomic data. We developed a software pipeline, hAMRoaster (https://github.com/ewissel/hAMRoaster), for assessing the accuracy of predictions of antibiotic resistance phenotypes. For evaluation purposes, we simulated a short-read (Illumina) shotgun metagenomic community of eight bacterial pathogens with extensive antibiotic susceptibility testing profiles. We benchmarked nine open-source bioinformatics tools for detecting AMR genes that 1) were conda or Docker installable, 2) had been actively maintained, 3) had an open-source license, and 4) took FASTA or FASTQ files as input. Several metrics were calculated for each tool, including sensitivity, specificity, and F1, at three coverage levels. This study revealed that tools were highly variable in sensitivity (0.25 to 0.99) and specificity (0.2 to 1) in detecting resistance in our synthetic FASTQ files, despite implementing similar databases and methods. Tools performed similarly at all coverage levels (5x, 50x, 100x). Cohen's kappa revealed low agreement across tools.
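The metrics named in this abstract (sensitivity, specificity, F1, and Cohen's kappa) can be illustrated with a minimal Python sketch. It is not taken from hAMRoaster: the function names and the toy resistance calls below are assumptions made up for illustration, with `True` meaning "resistant" for one drug in one organism.

```python
def confusion_counts(truth, predicted):
    """Count (TP, FP, TN, FN) for paired binary labels (True = resistant)."""
    pairs = list(zip(truth, predicted))
    tp = sum(t and p for t, p in pairs)
    fp = sum((not t) and p for t, p in pairs)
    tn = sum((not t) and (not p) for t, p in pairs)
    fn = sum(t and (not p) for t, p in pairs)
    return tp, fp, tn, fn


def sensitivity(tp, fp, tn, fn):
    """Fraction of truly resistant cases that the tool called resistant."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tp, fp, tn, fn):
    """Fraction of truly susceptible cases that the tool called susceptible."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def f1_score(tp, fp, tn, fn):
    """Harmonic mean of precision and sensitivity."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohens_kappa(calls_a, calls_b):
    """Chance-corrected agreement between two tools' binary calls."""
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    p_a = sum(calls_a) / n  # share of "resistant" calls by tool A
    p_b = sum(calls_b) / n  # share of "resistant" calls by tool B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:  # both tools always give the same constant answer
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical data: known phenotypes and the calls of two made-up tools.
truth = [True, True, True, False, False, False, True, False]
tool_a = [True, True, False, False, False, True, True, False]
tool_b = [True, False, False, False, False, False, True, False]

counts_a = confusion_counts(truth, tool_a)
print("tool A sensitivity:", sensitivity(*counts_a))
print("tool A specificity:", specificity(*counts_a))
print("tool A F1:", f1_score(*counts_a))
print("kappa A vs B:", cohens_kappa(tool_a, tool_b))
```

Sensitivity, specificity, and F1 compare one tool against the known phenotype, whereas kappa compares two tools against each other, which is why a benchmark can report both.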


2022 ◽  
Author(s):  
Wanxin Li ◽  
Lila Kari ◽  
Yaoliang Yu ◽  
Laura A Hug

We propose MT-MAG, a novel machine learning-based taxonomic assignment tool for hierarchically structured local classification of metagenome-assembled genomes (MAGs). MT-MAG is capable of classifying large and diverse real metagenomic datasets, having analyzed for this study a total of 240 Gbp of data in the training set, and 7 Gbp of data in the test set. MT-MAG is, to the best of our knowledge, the first machine learning method for taxonomic assignment of metagenomic data that offers a "partial classification" option. MT-MAG outputs complete or partial classification paths, and interpretable numerical confidences for its classifications, at all taxonomic ranks. MT-MAG is able to completely classify 48% more sequences than DeepMicrobes to the Species level (the only comparable taxonomic rank for DeepMicrobes), and it outperforms DeepMicrobes by an average of 33% in weighted accuracy, and by 89% in constrained accuracy.
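The idea of a "partial classification" path can be sketched as follows. This is a simplified illustration and not MT-MAG's actual algorithm: the function name, the fixed threshold of 0.8, and the confidence values are all assumptions, and the per-rank scores are taken as given rather than produced by trained classifiers.

```python
def classify_path(confidences_by_rank, threshold=0.8):
    """Walk taxonomic ranks from highest to lowest.

    confidences_by_rank: list of (rank_name, {taxon: confidence}) pairs,
        ordered from the highest rank to the lowest (hypothetical input).
    Returns (path, complete): the path lists (rank, taxon, confidence) for
    every rank assigned; complete is False if the walk stopped early because
    the best confidence at some rank fell below the threshold.
    """
    path = []
    for rank, scores in confidences_by_rank:
        taxon, confidence = max(scores.items(), key=lambda kv: kv[1])
        if confidence < threshold:
            return path, False  # partial classification: stop at parent rank
        path.append((rank, taxon, confidence))
    return path, True


# Made-up confidences for a single genome.
example = [
    ("Phylum", {"Firmicutes": 0.97, "Bacteroidota": 0.03}),
    ("Class", {"Clostridia": 0.91, "Bacilli": 0.09}),
    ("Genus", {"Ruminococcus": 0.55, "Blautia": 0.45}),
]

path, complete = classify_path(example)
for rank, taxon, confidence in path:
    print(f"{rank}: {taxon} ({confidence:.2f})")
print("complete" if complete else "partial: stopped before the lowest rank")
```

Stopping at the last confident rank trades resolution for reliability: the genome above is reported to Class level instead of being forced into an uncertain Genus call.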


2022 ◽  
Author(s):  
Hannah-Marie Martiny ◽  
Patrick Munk ◽  
Christian Brinch ◽  
Judit Szarvas ◽  
Frank Aarestrup ◽  
...  

Since the initial discovery of a mobilized colistin resistance gene (mcr-1), several other variants have been reported, some of which might have circulated for some time before being discovered. Metagenomic data provide an opportunity to re-analyze older available data to understand the evolutionary history of recently discovered antimicrobial resistance genes (ARGs). Here, we present a large-scale metagenomic study of 442 Tbp of sequencing reads from 214,095 samples to identify the host and geographical distribution and genomic context of nine mcr gene variants (mcr-1 to mcr-9). Our results show that the dissemination of each variant is not uniform. Instead, the source and location play a role in the spread. Despite the very diverse distribution, the genomic background of the mcr genes remains unchanged, as the same mobile genetic elements and plasmid replicons occur. This work emphasizes the importance of sharing genomic data for surveillance of ARGs in our fight against antimicrobial resistance.


2022 ◽  
Author(s):  
Albane Ruaud ◽  
Niklas A Pfister ◽  
Ruth E Ley ◽  
Nicholas D Youngblut

Background: Tree ensemble machine learning models are increasingly used in microbiome science as they are compatible with the compositional, high-dimensional, and sparse structure of sequence-based microbiome data. While such models are often good at predicting phenotypes based on microbiome data, they yield only limited insights into how microbial taxa or genomic content may be associated. Results: We developed endoR, a method to interpret a fitted tree ensemble model. First, endoR simplifies the fitted model into a decision ensemble, from which it then extracts information on the importance of individual features and their pairwise interactions, and visualizes these data as an interpretable network. Both the network and the importance scores derived from endoR provide insights into how features, and interactions between them, contribute to the predictive performance of the fitted model. Adjustable regularization and bootstrapping help reduce the complexity and ensure that only essential parts of the model are retained. We assessed the performance of endoR on both simulated and real metagenomic data. We found that endoR infers true associations with accuracy greater than or comparable to that of other commonly used approaches, while easing and enhancing model interpretation. Using endoR, we also confirmed published results on gut microbiome differences between cirrhotic and healthy individuals. Finally, we used endoR to gain insights into components of the microbiome that predict the presence of human gut methanogens, as these hydrogen consumers are expected to interact with fermenting bacteria in a complex syntrophic network. Specifically, we analyzed a global metagenome dataset of 2203 individuals and confirmed the previously reported association between Methanobacteriaceae and Christensenellales. Additionally, we observed that Methanobacteriaceae are associated with a network of hydrogen-producing bacteria.
Conclusion: Our method accurately captures how tree ensembles use features and interactions between them to predict a response. As demonstrated by our applications, the resultant visualizations and summary outputs facilitate model interpretation and enable the generation of novel hypotheses about complex systems. An implementation of endoR is available as an open-source R-package on GitHub (https://github.com/leylabmpi/endoR).
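As a rough illustration of summarizing a decision ensemble into feature importances and pairwise interactions, the Python sketch below aggregates a handful of weighted rules. It is not endoR's implementation (endoR is an R package with its own metrics): the rule format, the weights, and the simple "sum of rule weights" scoring are assumptions chosen only to show the shape of the output.

```python
from collections import defaultdict
from itertools import combinations


def summarize_rules(rules):
    """Aggregate weighted decision rules into node and edge scores.

    rules: list of (features, weight) pairs, where features is the set of
        feature names tested by one rule and weight is that rule's
        (hypothetical) contribution to the model's predictions.
    Returns (importance, interactions): importance maps a feature to the
    summed weight of rules using it; interactions maps a sorted feature
    pair to the summed weight of rules using both features.
    """
    importance = defaultdict(float)
    interactions = defaultdict(float)
    for features, weight in rules:
        for feature in features:
            importance[feature] += weight
        for pair in combinations(sorted(features), 2):
            interactions[pair] += weight
    return dict(importance), dict(interactions)


# Made-up rules; the taxon names are borrowed from the abstract as labels only.
rules = [
    ({"Methanobacteriaceae", "Christensenellales"}, 0.5),
    ({"Methanobacteriaceae", "Ruminococcaceae"}, 0.3),
    ({"Christensenellales"}, 0.2),
]

importance, interactions = summarize_rules(rules)
for feature, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{feature}: {score:.2f}")
for (a, b), score in interactions.items():
    print(f"{a} -- {b}: {score:.2f}")
```

The two dictionaries correspond to the nodes and edges of a network view: features are nodes sized by importance, and feature pairs that co-occur in rules are edges weighted by interaction score.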


2022 ◽  
Author(s):  
Bo Dong ◽  
Jing Liu ◽  
Bing Chen ◽  
Yuqi Huang ◽  
Peng Ai ◽  
...  

Purpose: The adaptability of the blue-spotted mudskipper (Boleophthalmus pectinirostris; BP) and the giant-fin mudskipper (Periophthalmus magnuspinnatus; PM) has previously been reported at the genome level to explain their amphibious life. However, the roles of gastrointestinal (GI) microbiota in their adaptation to terrestrial life are worth exploring. Methods: In this study, we mainly used metagenomic data from these two representative mudskippers and from typical aquicolous fish species to obtain the microbial composition, diversity, abundance and potential functions of GI microbiota for comparison between amphibious and aquicolous fishes. We also summarized, by literature mining, the GI microbiota results of representative seawater fishes, freshwater fishes, amphibians, and terrestrial animals for comparison with those of the mudskippers. Results: The content of each dominant phylum differed strikingly among BP, PM and aquicolous fishes. We also observed that the GI microbiota profile of mudskippers simultaneously contained bacterial families typical of terrestrial animals, (freshwater and seawater) fishes, and amphibians, which is consistent with their lifestyle of water-to-land and freshwater-to-seawater transition. Notably, certain bacteria such as S24-7, previously thought to be specific to terrestrial animals, were also identified in both BP and PM. Conclusion: The varied composition and diversity of mudskipper GI microflora are therefore considered to contribute to terrestrial adaptation in these amphibious fishes.


Author(s):  
Felix Teufel ◽  
José Juan Almagro Armenteros ◽  
Alexander Rosenberg Johansen ◽  
Magnús Halldór Gíslason ◽  
Silas Irby Pihl ◽  
...  

Signal peptides (SPs) are short amino acid sequences that control protein secretion and translocation in all living organisms. SPs can be predicted from sequence data, but existing algorithms are unable to detect all known types of SPs. We introduce SignalP 6.0, a machine learning model that detects all five SP types and is applicable to metagenomic data.


Author(s):  
Songyi Ning ◽  
Ziyuan Dai ◽  
Chunyan Zhao ◽  
Zhanghao Feng ◽  
Kexin Jin ◽  
...  

Author(s):  
Diana Y. Lee ◽  
Caitlin Bartels ◽  
Katelyn McNair ◽  
Robert A. Edwards ◽  
Manal A. Swairjo ◽  
...  

Author(s):  
Qingzhen Hou ◽  
Fabrizio Pucci ◽  
Fengming Pan ◽  
Fuzhong Xue ◽  
Marianne Rooman ◽  
...  

2021 ◽  
Author(s):  
Seth Commichaux ◽  
Kiran Javkar ◽  
Harihara Subrahmaniam Muralidharan ◽  
Padmini Ramachandran ◽  
Andrea Ottesen ◽  
...  

Background: Microbial eukaryotes are nearly ubiquitous in microbiomes on Earth and contribute to many integral ecological functions. Metagenomics is a proven tool for studying the microbial diversity, functions, and ecology of microbiomes, but has been underutilized for microeukaryotes due to the computational challenges they present. For taxonomic classification, the use of a eukaryotic marker gene database can improve the computational efficiency, precision and sensitivity. However, state-of-the-art tools which use marker gene databases implement universal thresholds for classification rather than dynamically learning the thresholds from the database structure, impacting the accuracy of the classification process. Results: Here we introduce taxaTarget, a method for the taxonomic classification of microeukaryotes in metagenomic data. Using a database of eukaryotic marker genes and a supervised learning approach for training, we learned the discriminatory power and classification thresholds for each 20 amino acid region of each marker gene in our database. This approach provided improved sensitivity and precision compared to other state-of-the-art approaches, with rapid runtimes and low memory usage. Additionally, taxaTarget was better able to detect the presence of multiple closely related species as well as species with no representative sequences in the database. One of the greatest challenges faced during the development of taxaTarget was the general sparsity of available sequences for microeukaryotes. Several algorithms were implemented, including threshold padding, which effectively handled the missing training data and reduced classification errors. Using taxaTarget on metagenomes from human fecal microbiomes, a broader range of genera were detected, including multiple parasites that the other tested tools missed. Conclusion: Data-driven methods for learning classification thresholds from the structure of an input database can provide granular information about the discriminatory power of the sequences and improve the sensitivity and precision of classification. These methods will help facilitate a more comprehensive analysis of metagenomic data and expand our knowledge about the diverse eukaryotes in microbial communities.
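The contrast between a universal classification threshold and thresholds learned per region can be sketched in Python as below. This is not taxaTarget's algorithm: the midpoint rule, the region identifiers, and the alignment scores are assumptions for illustration, and the sketch does not attempt to reproduce threshold padding.

```python
def learn_region_thresholds(training_scores):
    """Pick one score cutoff per marker-gene region from labelled scores.

    training_scores: {region_id: {"same_taxon": [...], "other_taxon": [...]}}
        holding hypothetical alignment scores of database sequences from the
        target taxon and from other taxa against each region.
    The cutoff is the midpoint between the lowest same-taxon score and the
    highest other-taxon score (a deliberately simple rule).
    """
    thresholds = {}
    for region, scores in training_scores.items():
        lowest_positive = min(scores["same_taxon"])
        highest_negative = max(scores["other_taxon"])
        thresholds[region] = (lowest_positive + highest_negative) / 2
    return thresholds


def accept_hit(region, score, thresholds, universal=None):
    """Accept a read's hit using a universal cutoff if given, else per region."""
    cutoff = universal if universal is not None else thresholds[region]
    return score >= cutoff


# Made-up training scores for two 20-residue regions.
training_scores = {
    "geneA:1-20": {"same_taxon": [90, 85], "other_taxon": [40, 50]},
    "geneB:21-40": {"same_taxon": [60, 55], "other_taxon": [20, 30]},
}

thresholds = learn_region_thresholds(training_scores)
print(thresholds)
print("per-region:", accept_hit("geneB:21-40", 50, thresholds))
print("universal 70:", accept_hit("geneB:21-40", 50, thresholds, universal=70))
```

In this toy case a universal cutoff of 70 rejects a hit that the region-specific cutoff accepts, because the second region is less conserved and its true positives score lower overall.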

