Insights into the Human Virome Using CRISPR Spacers from Microbiomes

Viruses ◽  
2018 ◽  
Vol 10 (9) ◽  
pp. 479 ◽  
Author(s):  
Claudio Hidalgo-Cantabrana ◽  
Rosemary Sanozky-Dawes ◽  
Rodolphe Barrangou

Owing to advances in next-generation sequencing over the past decade, our understanding of the human microbiome and its relationship to health and disease has increased dramatically. Yet our insights into the human virome, and its interplay with important microbes that impact human health, are relatively limited. Prokaryotic and eukaryotic viruses are present throughout the human body, comprising a large and diverse population which influences several niches and impacts our health at various body sites. The presence of prokaryotic viruses, such as phages, has been documented at many different body sites, with the human gut being the richest ecological niche. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and associated proteins constitute the adaptive immune system of bacteria, which prevents attack by invasive nucleic acid. CRISPR-Cas systems function by uptake and integration of foreign genetic element sequences into the CRISPR array, which constitutes a genomic archive of iterative vaccination events. Consequently, CRISPR spacers can be investigated to reconstruct interplay between viruses and bacteria, and metagenomic sequencing data can be exploited to provide insights into host-phage interactions within a niche. Here, we show how the CRISPR spacer content of commensal and pathogenic bacteria can be used to find evidence of their phage exposure. This framework opens new opportunities for investigating host-virus dynamics in metagenomic data, and highlights the need to dedicate more effort to virome sampling and sequencing.
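
The idea of reading a CRISPR array as an archive of past phage encounters can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, and the use of exact string matching are all assumptions made for illustration (real analyses typically rely on dedicated CRISPR-detection tools and tolerate mismatches between spacers and their targets).

```python
# Hypothetical helpers; names and matching strategy are illustrative assumptions.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Split an array on exact copies of its repeat; pieces between repeats are spacers."""
    parts = array_seq.upper().split(repeat.upper())
    # The first and last pieces are flanking sequence (leader/trailer), not spacers.
    return [piece for piece in parts[1:-1] if piece]


def match_spacers(spacers: list[str], phage_genomes: dict[str, str]) -> dict[str, list[str]]:
    """Map each spacer to the phage genomes containing it on either strand (exact match)."""
    hits: dict[str, list[str]] = {}
    for spacer in spacers:
        rc = reverse_complement(spacer)
        hits[spacer] = [
            name
            for name, genome in phage_genomes.items()
            if spacer in genome.upper() or rc in genome.upper()
        ]
    return hits
```

A spacer with at least one hit is read here as evidence that the host lineage was once exposed to that phage (or a close relative); a spacer with no hit may simply reflect how sparsely viromes have been sequenced.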

2020 ◽  
Vol 8 (5) ◽  
pp. 684
Author(s):  
Nathanael J. Bangayan ◽  
Baochen Shi ◽  
Jerry Trinh ◽  
Emma Barnard ◽  
Gabriela Kasimatis ◽  
...  

The microbiome plays an important role in human physiology. The composition of the human microbiome has been described at the phylum, class, genus, and species levels; however, it is largely unknown at the strain level. The importance of strain-level differences in microbial communities has been increasingly recognized in understanding disease associations. Current methods for identifying strain populations often require deep metagenomic sequencing and a comprehensive set of reference genomes. In this study, we developed a method, metagenomic multi-locus sequence typing (MG-MLST), to determine strain-level composition in a microbial community by combining high-throughput sequencing with multi-locus sequence typing (MLST). We used a commensal bacterium, Propionibacterium acnes, as an example to test the ability of MG-MLST to identify the strain composition. Using simulated communities, MG-MLST accurately predicted the strain populations in all samples. We further validated the method using MLST gene amplicon libraries and metagenomic shotgun sequencing data of clinical skin samples. MG-MLST yielded strain composition results consistent with those obtained from nearly full-length 16S rRNA clone libraries and metagenomic shotgun sequencing analysis. When comparing strain-level differences between acne and healthy skin microbiomes, we demonstrated that strains of RT2/6 were highly associated with healthy skin, consistent with previous findings. In summary, MG-MLST provides a quantitative analysis of the strain populations in the microbiome with diversity and richness. It can be applied to microbiome studies to reveal strain-level differences between groups, which are critical in many microorganism-related diseases.


2018 ◽  
Vol 57 (2) ◽  
Author(s):  
Qun Yan ◽  
Yu Mi Wi ◽  
Matthew J. Thoendel ◽  
Yash S. Raval ◽  
Kerryl E. Greenwood-Quaintance ◽  
...  

ABSTRACT We previously demonstrated that shotgun metagenomic sequencing can detect bacteria in sonicate fluid, providing a diagnosis of prosthetic joint infection (PJI). A limitation of the approach that we used is that data analysis was time-consuming and specialized bioinformatics expertise was required, both of which are barriers to routine clinical use. Fortunately, automated commercial analytic platforms that can interpret shotgun metagenomic data are emerging. In this study, we evaluated the CosmosID bioinformatics platform using shotgun metagenomic sequencing data derived from 408 sonicate fluid samples from our prior study with the goal of evaluating the platform vis-à-vis bacterial detection and antibiotic resistance gene detection for predicting staphylococcal antibacterial susceptibility. Samples were divided into a derivation set and a validation set, each consisting of 204 samples; results from the derivation set were used to establish cutoffs, which were then tested in the validation set for identifying pathogens and predicting staphylococcal antibacterial resistance. Metagenomic analysis detected bacteria in 94.8% (109/115) of sonicate fluid culture-positive PJIs and 37.8% (37/98) of sonicate fluid culture-negative PJIs. Metagenomic analysis showed sensitivities ranging from 65.7 to 85.0% for predicting staphylococcal antibacterial resistance. In conclusion, the CosmosID platform has the potential to provide fast, reliable bacterial detection and identification from metagenomic shotgun sequencing data derived from sonicate fluid for the diagnosis of PJI. Strategies for metagenomic detection of antibiotic resistance genes for predicting staphylococcal antibacterial resistance need further development.
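
As a quick check of the detection rates quoted above, the short Python sketch below recomputes the two percentages from the counts given in the abstract. The helper name `percent` is an assumption introduced here for illustration.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Counts taken from the abstract: detections / PJIs in each culture group.
culture_positive_rate = percent(109, 115)  # culture-positive PJIs
culture_negative_rate = percent(37, 98)    # culture-negative PJIs
```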


2020 ◽  
Author(s):  
Maxence Queyrel ◽  
Edi Prifti ◽  
Jean-Daniel Zucker

Abstract Analysis of the human microbiome using metagenomic sequencing data has demonstrated high ability in discriminating various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from the fragmented DNA sequences and are stored as fastq files. Conventional processing pipelines consist of multiple steps, including quality control, filtering, and alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time-consuming, and rely on a large number of parameters that often introduce variability and impact the estimation of the microbiome elements. Recent studies have demonstrated that training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings that create a meaningful numerical representation of DNA sequences, while extracting features and reducing the dimensionality of the data. In this paper we present an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads: metagenome2vec. This approach is composed of four steps: (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come; and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning each genome a weight that reflects its influence on the prediction.
Using two public real-life datasets as well as a simulated one, we demonstrated that this original approach achieved very high performance, comparable with the state-of-the-art methods applied directly to data processed through mainstream bioinformatics workflows. These results are encouraging for this proof-of-concept work. We believe that with further dedication, the DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.
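
The first half of step (i), turning reads into a k-mer vocabulary, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the metagenome2vec code: the function names, the `min_count` filter, and the choice to drop k-mers containing the ambiguous base N are all hypothetical. Learning the embeddings themselves (the second half of step (i)) is not shown.

```python
from collections import Counter


def kmers(read: str, k: int) -> list[str]:
    """Return all overlapping k-mers of a read, in order (empty if the read is shorter than k)."""
    read = read.upper()
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_vocabulary(reads: list[str], k: int, min_count: int = 1) -> dict[str, int]:
    """Assign an integer id to every k-mer seen at least min_count times.

    k-mers containing the ambiguous base N are skipped (an assumption of this sketch).
    """
    counts = Counter(km for read in reads for km in kmers(read, k) if "N" not in km)
    kept = sorted(km for km, count in counts.items() if count >= min_count)
    return {km: idx for idx, km in enumerate(kept)}


def encode(read: str, vocab: dict[str, int], k: int) -> list[int]:
    """Convert a read into a sequence of k-mer ids, ignoring k-mers outside the vocabulary."""
    return [vocab[km] for km in kmers(read, k) if km in vocab]
```

The integer sequences returned by `encode` play the role of tokenized sentences, which is the input form that word-embedding methods generally expect.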


Author(s):  
Wei Zhang ◽  
Longlong Wang ◽  
Ke Liu ◽  
Xiaofeng Wei ◽  
Kai Yang ◽  
...  

Abstract Motivation: T and B cell receptors (TCRs and BCRs) play a pivotal role in the adaptive immune system by recognizing an enormous variety of external and internal antigens. Understanding these receptors is critical for exploring the process of immunoreaction and exploiting potential applications in immunotherapy and antibody drug design. Although a large number of samples have had their TCR and BCR repertoires sequenced using high-throughput sequencing in recent years, very few databases have been constructed to store these kinds of data. To resolve this issue, we developed a database. Results: We developed a database, the Pan Immune Repertoire Database (PIRD), located in China National GeneBank (CNGBdb), to collect and store annotated TCR and BCR sequencing data, including from Homo sapiens and other species. In addition to data storage, PIRD also provides functions of data visualization and interactive online analysis. Additionally, a manually curated database of TCRs and BCRs targeting known antigens (TBAdb) was also deposited in PIRD. Availability and implementation: PIRD can be freely accessed at https://db.cngb.org/pird.


2020 ◽  
Vol 2 (3) ◽  
Author(s):  
Joshua D Podlevsky ◽  
Corey M Hudson ◽  
Jerilyn A Timlin ◽  
Kelly P Williams

Abstract CRISPR arrays and CRISPR-associated (Cas) proteins comprise a widespread adaptive immune system in bacteria and archaea. These systems function as a defense against exogenous parasitic mobile genetic elements that include bacteriophages, plasmids and foreign nucleic acids. With the continuous spread of antibiotic resistance, knowledge of pathogen susceptibility to bacteriophage therapy is becoming more critical. Additionally, gene-editing applications would benefit from the discovery of new cas genes with favorable properties. While next-generation sequencing has produced staggering quantities of data, transitioning from raw sequencing reads to the identification of CRISPR/Cas systems has remained challenging. This is especially true for metagenomic data, which has the highest potential for identifying novel cas genes. We report a comprehensive computational pipeline, CasCollect, for the targeted assembly and annotation of cas genes and CRISPR arrays (even isolated arrays) from raw sequencing reads. Benchmarking our targeted assembly pipeline demonstrates a speed-up of almost two orders of magnitude compared with conventional assembly and annotation, while retaining the ability to detect CRISPR arrays and cas genes. CasCollect is a highly versatile pipeline and can be used for targeted assembly of any specialty gene set, reconfigurable for user-provided Hidden Markov Models and/or reference nucleotide sequences.


Biocelebes ◽  
2022 ◽  
Vol 15 (2) ◽  
pp. 113-124
Author(s):  
Musjaya, M Guli

The immune system is the body's defense system, protecting the host from invasion by external pathogens. Based on how it responds to disease, it is divided into two systems: the innate and the adaptive immune system. Because they can pass through the stomach, these pathogenic bacteria reach the small intestine, which is the site of infection. In the intestine, V. cholerae bacteria adhere to, colonize, and invade intestinal epithelial cells. Protection mechanisms against V. cholerae include a natural defense: the thick mucosa on the surface of epithelial cells can inhibit the pathogen from adhering to intestinal epithelial cells. Further defenses are the innate immune system, in which phagocytic cells attack the pathogen, and the adaptive immune system, which involves IgA for opsonization and can thereby strengthen the intestinal mucosal immune system.


2021 ◽  
Author(s):  
Kayla Sprenger ◽  
Simone Conti ◽  
Victor Ovchinnikov ◽  
Arup K Chakraborty ◽  
Martin Karplus

The design of vaccines against highly mutable pathogens, such as HIV and influenza, requires a detailed understanding of how the adaptive immune system responds to encountering multiple variant antigens (Ags). Here, we describe a multiscale model of B cell receptor (BCR) affinity maturation that employs actual BCR nucleotide sequences and treats BCR/Ag interactions in atomistic detail. We apply the model to simulate the maturation of a broadly neutralizing Ab (bnAb) against HIV. Starting from a germline precursor sequence of the VRC01 anti-HIV Ab, we simulate BCR evolution in response to different vaccination protocols and different Ags, which were previously designed by us. The simulation results provide qualitative guidelines for future vaccine design and reveal unique insights into bnAb evolution against the CD4 binding site of HIV. Our model makes possible direct comparisons of simulated BCR populations with results of deep sequencing data, which will be explored in future applications.


2021 ◽  
Vol 17 (3) ◽  
pp. e1008814
Author(s):  
Emmi Jokinen ◽  
Jani Huuhtanen ◽  
Satu Mustjoki ◽  
Markus Heinonen ◽  
Harri Lähdesmäki

The adaptive immune system uses T cell receptors (TCRs) to recognize pathogens and consequently to initiate immune responses. TCRs can be sequenced from individuals, and methods analyzing the specificity of the TCRs can help us better understand individuals’ immune status in different disorders. For this task, we have developed TCRGP, a novel Gaussian process method that predicts if TCRs recognize specified epitopes. TCRGP can utilize the amino acid sequences of the complementarity determining regions (CDRs) from TCRα and TCRβ chains and learn which CDRs are important in recognizing different epitopes. Our comprehensive evaluation with epitope-specific TCR sequencing data shows that TCRGP achieves on average higher prediction accuracy in terms of AUROC score than existing state-of-the-art methods in epitope-specificity predictions. We also propose a novel analysis approach for combined single-cell RNA and TCRαβ (scRNA+TCRαβ) sequencing data by quantifying epitope-specific TCRs with TCRGP, and we identify HBV-epitope-specific T cells and their transcriptomic states in hepatocellular carcinoma patients.
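
For readers unfamiliar with the AUROC score used as the evaluation metric above, the following Python sketch computes it from its pairwise definition: the probability that a randomly chosen positive example (here, a TCR that recognizes the epitope) receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name and the toy data are assumptions; this is not part of TCRGP.

```python
def auroc(labels: list[int], scores: list[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positive and negative scores.

    labels: 1 for positive examples, 0 for negative examples.
    scores: predicted scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    # Each positive/negative pair contributes 1 if ranked correctly, 0.5 if tied, 0 otherwise.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

This quadratic-time form is meant for clarity; for large datasets a rank-based implementation from an established statistics library would be the usual choice.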


2020 ◽  
Author(s):  
Jiakuo Yan ◽  
Xiaoyang Wu ◽  
Jun Chen ◽  
Yao Chen ◽  
Honghai Zhang

Abstract Sable (Martes zibellina), a member of family Mustelidae, order Carnivora, is primarily distributed in the cold northern zone of Eurasia. The purpose of this study was to explore the intestinal flora of the sable by metagenomic library-based techniques. Libraries were sequenced on an Illumina HiSeq 4000 instrument. The effective sequencing data of each sample was above 6,000 M, and the ratio of clean reads to raw reads was over 98%. The total ORF length was approximately 603,031, equivalent to 347.36 Mbp. We investigated gene functions with the KEGG database and identified 7,140 KEGG ortholog (KO) groups comprising 129,788 genes across all of the samples. We selected a subset of genes with the highest abundances to construct cluster heat maps. From the results of the KEGG metabolic pathway annotations, we acquired information on gene functions, as represented by the categories of metabolism, environmental information processing, genetic information processing, cellular processes and organismal systems. We then investigated gene function with the CAZy database and identified functional carbohydrate hydrolases corresponding to genes in the intestinal microorganisms of sable. This finding is consistent with the fact that the sable is adapted to cold environments and requires a large amount of energy to maintain its metabolic activity. We also investigated gene functions with the eggNOG database; the main functions of genes included gene duplication, recombination and repair, transport and metabolism of amino acids, and transport and metabolism of carbohydrates. In this study, we attempted to identify the complex structure of the microbial population of sable based on metagenomic sequencing methods, which use whole metagenomic data, and to map the obtained sequences to known genes or pathways in existing databases, such as CAZy, KEGG, and eggNOG. 
We then explored the genetic composition and functional diversity of the microbial community based on the mapped functional categories.


Author(s):  
Gülendam Bozdayı ◽  
Işıl Fidan

The viral component of the human microbiome is referred to as the ‘virobiota’. The virobiota is the sum of all viruses found in or on humans. The set of all genes of the virobiota is referred to as the ‘virome’. The human virome comprises viruses that infect eukaryotic cells, bacteriophages that infect prokaryotic cells, and virus-derived genetic elements found in the human genome, such as endogenous retroviruses. The development of new sequencing technologies, such as high-throughput sequencing techniques, has allowed the analysis of the human virome. Many new viruses have been discovered lately using next-generation sequencing technology. In recent years, studies of the human virome have increased, as changes in the virome have been observed in diseases. Alterations in the human virome may be associated with infectious and inflammatory diseases, cancer, and autoimmunity. Understanding how the virome affects human health and disease can enable the development of potential therapeutic approaches that target members of the virome.

