microbial genomes
Recently Published Documents



2022 ◽  
Vol 12 ◽  
Author(s):  
Varada Khot ◽  
Jackie Zorz ◽  
Daniel A. Gittins ◽  
Anirban Chakraborty ◽  
Emma Bell ◽  
...  

Many pathways for hydrocarbon degradation have been discovered, yet there are no dedicated tools to identify and predict the hydrocarbon degradation potential of microbial genomes and metagenomes. Here we present the Calgary approach to ANnoTating HYDrocarbon degradation genes (CANT-HYD), a database of 37 HMMs of marker genes involved in anaerobic and aerobic degradation pathways of aliphatic and aromatic hydrocarbons. Using this database, we identify understudied or overlooked hydrocarbon degradation potential in many phyla. We also demonstrate its application in analyzing high-throughput sequence data by predicting hydrocarbon utilization in large metagenomic datasets from diverse environments. CANT-HYD is available at https://github.com/dgittins/CANT-HYD-HydrocarbonBiodegradation.


2022 ◽  
Vol 12 ◽  
Author(s):  
Alejandro Rodríguez-Gijón ◽  
Julia K. Nuy ◽  
Maliheh Mehrshad ◽  
Moritz Buck ◽  
Frederik Schulz ◽  
...  

Our view of genome size in Archaea and Bacteria has remained skewed as the data has been dominated by genomes of microorganisms that have been cultivated under laboratory settings. However, the continuous effort to catalog Earth’s microbiomes, specifically propelled by recent extensive work on uncultivated microorganisms, provides an opportunity to revise our perspective on genome size distribution. We present a meta-analysis that includes 26,101 representative genomes from 3 published genomic databases: metagenome-assembled genomes (MAGs) from GEMs and stratfreshDB, and isolates from GTDB. Aquatic and host-associated microbial genomes present on average the smallest estimated genome sizes (3.1 and 3.0 Mbp, respectively). These are followed by terrestrial microbial genomes (average 3.7 Mbp), and genomes from isolated microorganisms (average 4.3 Mbp). On the one hand, aquatic and host-associated ecosystems present smaller genome sizes in genera of phyla with genome sizes above 3 Mbp. On the other hand, estimated genome size in phyla with genomes under 3 Mbp showed no difference between ecosystems. Moreover, we observed that when using 95% average nucleotide identity (ANI) as an estimator for genetic units, only 3% of MAGs cluster together with genomes from isolated microorganisms. Although there are potential methodological limitations when assembling and binning MAGs, we found that in genome clusters containing both environmental MAGs and isolate genomes, MAGs were estimated to be on average only 3.7% smaller than isolate genomes. Even when assembly and binning methods introduce biases, the estimated genome sizes of MAGs and isolates are very similar. Finally, to better understand the ecological drivers of genome size, we discuss the known and the overlooked factors that influence genome size in different ecosystems, phylogenetic groups, and trophic strategies.


2021 ◽  
Vol 12 ◽  
Author(s):  
Runbiao Wu ◽  
Luyu Wang ◽  
Jianping Xie ◽  
Zhisheng Zhang

Wolf spiders (Lycosidae) are a crucial component of integrated pest management programs, and the characteristics of their gut microbiota are known to play important roles in improving fitness and survival of the host. However, there are only a few studies of the gut microbiota among closely related species of wolf spider. Whether wolf spider gut microbiota vary with habitat remains unknown. Here, we used shotgun metagenomic sequencing to compare the gut microbiota of two wolf spider species, Pardosa agraria and P. laura, from farmland and woodland ecosystems, respectively. The results show that the gut microbiota of Pardosa spiders is similar in richness and abundance. Approximately 27.3% of the gut microbiota of P. agraria comprises Proteobacteria, and approximately 34.5% of the gut microbiota of P. laura comprises Firmicutes. We assembled microbial genomes and found that the gut microbiota of P. laura are enriched in genes for carbohydrate metabolism. In contrast, those of P. agraria showed a higher proportion of genes encoding acetyltransferase, an enzyme involved in resistance to antibiotics. We reconstructed three high-quality and species-level microbial genomes: Vulcaniibacterium thermophilum, Anoxybacillus flavithermus and an unknown bacterium belonging to the family Simkaniaceae. Our results contribute to an understanding of the diversity and function of gut microbiota in closely related spiders.


2021 ◽  
Author(s):  
Adelme Bazin ◽  
Claudine Medigue ◽  
David Vallenet ◽  
Alexandra Calteau

Recent years have seen the rise of pangenomes as comparative genomic tools to better understand the evolution of gene content among microbial genomes in close phylogenetic groups such as species. While the core or persistent genome is often well known, as it includes essential or ubiquitous genes, the variable genome is usually less characterized and includes many genes with unknown functions even among the most studied organisms. It gathers important genes for strain adaptation that are acquired by horizontal gene transfer. Here, we introduce panModule, an original method to identify conserved modules in pangenome graphs built from thousands of microbial genomes. These modules correspond to synteny blocks composed of consecutive genes that are conserved in a subset of the compared strains. Identifying conserved modules can provide insights into genes involved in the same functional processes, and as such is a very helpful tool to facilitate the understanding of genomic regions with complex evolutionary histories. The panModule method was benchmarked on a curated dataset of conserved modules in Escherichia coli genomes. Its use was illustrated through a study of a high pathogenicity island in Klebsiella pneumoniae that allowed a better understanding of this region. panModule is freely available and accessible through the PPanGGOLiN software suite (https://github.com/labgem/PPanGGOLiN).


2021 ◽  
Vol 1 ◽  
Author(s):  
Steven L. Salzberg ◽  
Derrick E. Wood

Ten years ago, the dramatic rise in the number of microbial genomes led to an inflection point, when the approach of finding short, exact matches in a comprehensive database became just as accurate as older, slower approaches. The new idea led to a method that was hundreds of times faster than those that came before. Today, exact k-mer matching is a standard technique at the heart of many microbiome analysis tools.
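
The idea of exact k-mer matching can be illustrated with a minimal sketch. This is not the method described in the abstract above, only a toy version of the general technique: the reference sequences, the labels, and the helper names (`kmers`, `build_index`, `classify`) are all made up for illustration, and production tools rely on compact data structures and taxonomy-aware scoring that are not shown here.

```python
from collections import Counter

def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def build_index(references, k):
    """Map each k-mer to the set of reference labels that contain it."""
    index = {}
    for label, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(label)
    return index

def classify(read, index, k):
    """Return the label with the most exact k-mer hits, or None if there are no hits."""
    hits = Counter()
    for kmer in kmers(read, k):
        for label in index.get(kmer, ()):
            hits[label] += 1
    if not hits:
        return None
    return hits.most_common(1)[0][0]

# Hypothetical toy sequences, not real genomes.
references = {
    "genome_A": "ACGTACGTTAGCCGATAGGCTA",
    "genome_B": "TTGACCGGTATTCCGGAATGCA",
}
index = build_index(references, k=5)
print(classify("CGTTAGCCGATA", index, k=5))
```

Because each lookup is a dictionary access rather than an alignment, the cost per read grows with the read length, not with the size of the reference collection, which is the intuition behind the speed-up the abstract describes.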


2021 ◽  
Vol 18 (1) ◽  
Author(s):  
Jian Zeng ◽  
Yan Wang ◽  
Ju Zhang ◽  
Shixing Yang ◽  
Wen Zhang

Members of the family Inoviridae (inoviruses) are characterized by their unique filamentous morphology and infection cycle. The viral genome of an inovirus is able to integrate into the host genome and continuously release virions without lysing the host, establishing chronic infection. A large number of inoviruses have been obtained from microbial genomes and metagenomes recently, but putative novel inoviruses remain to be identified. Here, using viral metagenomics, we identified four novel inoviruses from cloacal swab samples of wild and breeding birds. The circular genomes of these four inoviruses are 6732 to 7709 nt in length with 51.4% to 56.5% GC content and encode 9 to 13 open reading frames. The zonula occludens toxin gene, implicated in the virulence of pathogenic host bacteria, was identified in all four inoviruses and shared the highest amino acid sequence identity (< 37.3%) to other reference strains belonging to different genera of the family Inoviridae and among themselves. Phylogenetic analysis indicated that all four inoviruses were genetically distant from other strains belonging to the family Inoviridae and formed an independent clade. According to the genetic distance-based criteria, the four inoviruses identified in the present study belong to four separate novel putative genera in the family Inoviridae.


Author(s):  
Xinyue Mei ◽  
Ying Wang ◽  
Zuran Li ◽  
Marie Larousse ◽  
Arthur Pere ◽  
...  

Intercropping or assistant endophytes promote phytoremediation capacities of hyperaccumulators and enhance their tolerance to heavy metal (HM) stress. Findings from a previous study showed that intercropping the hyperaccumulator Sonchus asper (L.) Hill grown in HM-contaminated soils with maize improved the remediating properties and indicated an excluder-to-hyperaccumulator switched mode of action towards lead. In the current study, RNA-Seq analysis was conducted on Sonchus roots grown under intercropping or monoculture systems to explore the molecular events underlying this shift in lead sequestering strategy. The findings showed that intercropping only slightly affects S. asper transcriptome but significantly affects expression of root-associated microbial genomes. Further, intercropping triggers significant reshaping of endophytic communities associated with a ‘root-to-shoot’ transition of lead sequestration and improved phytoremediation capacities of S. asper. These findings indicate that accumulator activities of a weed are partially attributed to the root-associated microbiota, and a complex network of plant–microbe–plant interactions shapes the phytoremediation potential of S. asper. Analysis showed that intercropping may significantly change the structure of root-associated communities resulting in novel remediation properties, thus providing a basis for improving phytoremediation practices to restore contaminated soils.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Yuqing Feng ◽  
Yanan Wang ◽  
Baoli Zhu ◽  
George Fu Gao ◽  
Yuming Guo ◽  
...  

Gut microbial reference genomes and gene catalogs are necessary for understanding the chicken gut microbiome. Here, we assembled 12,339 microbial genomes and constructed a gene catalog consisting of ~16.6 million genes by integrating 799 public chicken gut microbiome samples from ten countries. We found that 893 and 38 metagenome-assembled genomes (MAGs) in our dataset were putative novel species and genera, respectively. In the chicken gut, Lactobacillus aviarius and Lactobacillus crispatus were the most common lactic acid bacteria, and glycoside hydrolases were the most abundant carbohydrate-active enzymes (CAZymes). Antibiotic resistome profiling results indicated that Chinese chicken samples harbored a higher relative abundance but less diversity of antimicrobial resistance genes (ARGs) than European samples. We also proposed the effects of geography and host species on the gut resistome. Our study provides the largest integrated metagenomic dataset from the chicken gut to date and demonstrates its value in exploring chicken gut microbial genes.


2021 ◽  
Vol 7 (11) ◽  
Author(s):  
Arnoud H. M. van Vliet ◽  
Oliver J. Charity ◽  
Mark Reuter

Microbial genomes are highly adaptable, with mobile genetic elements (MGEs) such as integrative conjugative elements (ICEs) mediating the dissemination of new genetic information throughout bacterial populations. This is countered by defence mechanisms such as CRISPR-Cas systems, which limit invading MGEs by sequence-specific targeting. Here we report the distribution of the pVir, pTet and pCC42 plasmids and a new 70–129 kb ICE (CampyICE1) in the foodborne bacterial pathogens Campylobacter jejuni and Campylobacter coli. CampyICE1 contains a degenerated Type II-C CRISPR system consisting of a sole Cas9 protein, which is distinct from the previously described Cas9 proteins from C. jejuni and C. coli. CampyICE1 is conserved in structure and gene order, containing blocks of genes predicted to be involved in recombination, regulation and conjugation. CampyICE1 was detected in 134/5829 (2.3 %) C. jejuni genomes and 92/1347 (6.8 %) C. coli genomes. Similar ICEs were detected in a number of non-jejuni/coli Campylobacter species, although these lacked a CRISPR-Cas system. CampyICE1 carries three separate short CRISPR spacer arrays containing a combination of 108 unique spacers and 16 spacer-variant families. A total of 69 spacers and 10 spacer-variant families (63.7 %) were predicted to target Campylobacter plasmids. The presence of a functional CampyICE1 Cas9 protein and matching anti-plasmid spacers was associated with the absence of the pVir, pTet and pCC42 plasmids (188/214 genomes, 87.9 %), suggesting that the CampyICE1-encoded CRISPR-Cas has contributed to the exclusion of competing plasmids. In conclusion, the characteristics of the CRISPR-Cas9 system on CampyICE1 suggest a history of plasmid warfare in Campylobacter.
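
The percentages quoted in this abstract can be re-derived from the counts it gives. The short check below uses a hypothetical helper, `pct`, and makes one assumption that the abstract does not state explicitly: that the 63.7 % figure pools spacers and spacer-variant families, i.e. (69 + 10) out of (108 + 16). That reading reproduces the reported value.

```python
def pct(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

# Counts taken from the abstract above.
jejuni = pct(134, 5829)          # C. jejuni genomes carrying CampyICE1
coli = pct(92, 1347)             # C. coli genomes carrying CampyICE1
no_plasmid = pct(188, 214)       # genomes lacking pVir, pTet and pCC42

# Assumption: spacers and spacer-variant families are pooled.
plasmid_targeting = pct(69 + 10, 108 + 16)

print(jejuni, coli, plasmid_targeting, no_plasmid)
```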


2021 ◽  
Author(s):  
Arun Das ◽  
Michael C Schatz

In modern sequencing experiments, identifying the sources of the reads is a crucial need. In metagenomics, where each read comes from one of potentially many members of a community, it can be important to identify the exact species the read is from. In other settings, it is important to distinguish which reads are from the targeted sample and which are from potential contaminants. In both cases, identification of the correct source of a read enables further investigation of relevant reads, while minimizing wasted work. This task is particularly challenging for long reads, which can have a substantial error rate that obscures the origins of each read. Existing tools for the read classification problem are often alignment or index-based, but such methods can have large time and/or space overheads. In this work, we investigate the effectiveness of several sampling and sketching-based approaches for read classification. In these approaches, a chosen sampling or sketching algorithm is used to generate a reduced representation (a "screen") of potential source genomes for a query readset before reads are streamed in and compared against this screen. Using a query read's similarity to the elements of the screen, the methods predict the source of the read. Such an approach requires limited pre-processing, stores and works with only a subset of the input data, and is able to perform classification with a high degree of accuracy. The sampling and sketching approaches investigated include uniform sampling, methods based on MinHash and its weighted and order variants, a minimizer-based technique, and a novel clustering-based sketching approach. We demonstrate the effectiveness of these techniques both in identifying the source microbial genomes for reads from a metagenomic long read sequencing experiment, and in distinguishing between long reads from organisms of interest and potential contaminant reads. 
We then compare these approaches to existing alignment, index and sketching-based tools for read classification, and demonstrate how such a method is a viable alternative for determining the source of query reads. Finally, we present a reference implementation of these approaches at https://github.com/arun96/sketching.
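
The screening workflow described above (sketch each candidate genome, then compare streamed reads against the sketches) can be outlined with a bottom-s MinHash sketch. This is a simplified illustration under stated assumptions, not the reference implementation linked in the abstract: the function names, the SHA-1 based hash, the toy sequences, and the hit-count scoring rule are all hypothetical choices, and reverse complements and sequencing errors are ignored.

```python
import hashlib

def kmer_hashes(seq, k):
    """Return the set of 64-bit hashes of all k-mers in seq."""
    hashes = set()
    for i in range(len(seq) - k + 1):
        digest = hashlib.sha1(seq[i:i + k].encode()).digest()
        hashes.add(int.from_bytes(digest[:8], "big"))
    return hashes

def bottom_sketch(seq, k, sketch_size):
    """Keep only the sketch_size smallest k-mer hashes (a bottom-s MinHash sketch)."""
    return set(sorted(kmer_hashes(seq, k))[:sketch_size])

def build_screen(genomes, k, sketch_size):
    """Build the reduced representation (the "screen"): one sketch per genome."""
    return {name: bottom_sketch(seq, k, sketch_size) for name, seq in genomes.items()}

def classify_read(read, screen, k):
    """Return the genome whose sketch shares the most hashes with the read, or None."""
    read_hashes = kmer_hashes(read, k)
    best_name, best_hits = None, 0
    for name, sketch in screen.items():
        hits = len(read_hashes & sketch)
        if hits > best_hits:
            best_name, best_hits = name, hits
    return best_name

# Hypothetical toy sequences, not real genomes.
genomes = {
    "genome_A": "ACGTACGTTAGCCGATAGGCTA",
    "genome_B": "TTGACCGGTATTCCGGAATGCA",
}
screen = build_screen(genomes, k=5, sketch_size=8)
print(classify_read("CGTTAGCCGATA", screen, k=5))
```

With a small `sketch_size`, a read can miss every retained hash and come back as `None`; raising `sketch_size` trades memory for sensitivity, which is the space versus accuracy trade-off the abstract investigates.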

