Single cell genome sequencing of laboratory mouse microbiota improves taxonomic and functional resolution of this model microbial community

2021 ◽  
Author(s):  
Svetlana Lyalina ◽  
Ramunas Stepanauskas ◽  
Frank Wu ◽  
Shomyseh Sanjabi ◽  
Katherine S Pollard

Laboratory mice are widely studied as models of mammalian biology, including the microbiota. However, much of the taxonomic and functional diversity of the mouse gut microbiome is missed in current metagenomic studies, because genome databases have not achieved a balanced representation of the diverse members of this ecosystem. Towards solving this problem, we used flow cytometry and low-coverage sequencing to capture the genomes of 764 single cells from the stool of three laboratory mice. From these, we generated 298 high-coverage microbial genome assemblies, which we annotated for open reading frames and phylogenetic placement. These genomes increase the gene catalog and phylogenetic breadth of the mouse microbiota, adding 135 novel species with the greatest increase in diversity to the Muribaculaceae and Bacteroidaceae families. This new diversity also improves the read mapping rate, taxonomic classifier performance, and gene detection rate of mouse stool metagenomes. The novel microbial functions revealed through our single-cell genomes highlight previously invisible pathways that may be important for life in the murine gastrointestinal tract.

2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Vivekananda Sarangi ◽  
Alexandre Jourdon ◽  
Taejeong Bae ◽  
Arijit Panda ◽  
Flora Vaccarino ◽  
...  

Abstract. Background: The study of mosaic mutation is important since it has been linked to cancer and various disorders. Single cell sequencing has become a powerful tool to study the genome of individual cells for the detection of mosaic mutations. The DNA in a single cell needs to be amplified before sequencing, and multiple displacement amplification (MDA) is widely used owing to its low error rate and the long fragment length of the amplified DNA. However, the phi29 polymerase used in MDA is sensitive to template fragmentation and to the presence of sites with DNA damage, which can lead to biases such as allelic imbalance, uneven coverage and overrepresentation of C to T mutations. It is therefore important to select cells with uniform amplification to decrease false positives and increase sensitivity for mosaic mutation detection. Results: We propose a method, Scellector (single cell selector), which uses haplotype information to detect amplification quality in shallow coverage sequencing data. We tested Scellector on single human neuronal cells, obtained in vitro and amplified by MDA. Qualities were estimated from shallow sequencing with coverage as low as 0.3× per cell and then confirmed using 30× deep coverage sequencing. The high concordance between shallow and high coverage data validated the method. Conclusion: Scellector can potentially be used to rank amplifications obtained from single cell platforms relying on an MDA-like amplification step, such as Chromium Single Cell profiling solution.


2020 ◽  
Vol 36 (Supplement_1) ◽  
pp. i251-i257
Author(s):  
Kerem Wainer-Katsir ◽  
Michal Linial

Abstract. Summary: Current technologies for single-cell transcriptomics allow thousands of cells to be analyzed in a single experiment. The increased scale of these methods raises the risk of contamination by cell doublets. Available tools and algorithms for identifying doublets and estimating their occurrence in single-cell experimental data focus on doublets of different species, cell types or individuals. In this study, we analyze transcriptomic data from single cells having an identical genetic background. We claim that the ratio of monoallelic to biallelic expression provides discriminating power for identifying doublets. We present a pipeline called BIallelic Ratio for Doublets (BIRD) that relies on heterologous genetic variations from single-cell RNA sequencing. For each dataset, doublets were artificially created from the actual data and used to train a predictive model. BIRD was applied to Smart-seq data from 163 primary fibroblast single cells. The model achieved 100% accuracy in annotating the randomly simulated doublets. Bona fide doublets were verified based on a biallelic expression signal on the X chromosome of female fibroblasts. Data from 10X Genomics microfluidics of human peripheral blood cells achieved on average 83% (±3.7%) accuracy, and an area under the curve of 0.88 (±0.04), for a collection of ∼13 300 single cells. BIRD addresses instances of doublets that were formed from cell mixtures of identical genetic background and cell identity. Maximal performance is achieved for high-coverage data from Smart-seq. Success in identifying doublets is data specific and varies according to the experimental methodology, genomic diversity between haplotypes, sequence coverage and depth. Supplementary information: Supplementary data are available at Bioinformatics online.
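
The allelic signal described in this abstract can be illustrated with a minimal sketch. It is not the BIRD pipeline: the abstract says a predictive model is trained on artificially created doublets, and that model is not reproduced here. The sketch only shows how a biallelic fraction could be computed per cell and how a doublet might be simulated by pooling the reads of two cells. The data layout (a mapping from site name to reference and alternative read counts), the function names `biallelic_ratio` and `simulate_doublet`, and the `min_reads` threshold are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical layout: heterozygous site -> (reference reads, alternative reads)
Counts = Dict[str, Tuple[int, int]]


def biallelic_ratio(cell: Counts, min_reads: int = 1) -> float:
    """Fraction of covered sites at which both alleles are observed."""
    covered = 0
    biallelic = 0
    for ref, alt in cell.values():
        if ref + alt < min_reads:
            continue  # site not covered well enough to be counted
        covered += 1
        if ref > 0 and alt > 0:
            biallelic += 1
    return biallelic / covered if covered else 0.0


def simulate_doublet(a: Counts, b: Counts) -> Counts:
    """Pool the read counts of two cells to mimic an artificial doublet."""
    merged = dict(a)
    for site, (ref, alt) in b.items():
        ref0, alt0 = merged.get(site, (0, 0))
        merged[site] = (ref0 + ref, alt0 + alt)
    return merged


# Toy cells: each one expresses a single allele at every site.
cell_a = {"snp1": (5, 0), "snp2": (0, 4), "snp3": (3, 0)}
cell_b = {"snp1": (0, 6), "snp2": (0, 2), "snp3": (0, 1)}
doublet = simulate_doublet(cell_a, cell_b)

print(biallelic_ratio(cell_a))   # 0.0
print(biallelic_ratio(doublet))  # 2 of 3 sites show both alleles
```

In this toy case each single cell is fully monoallelic, while the pooled doublet shows both alleles at two of three sites. Simulated doublets of this kind could then serve as labeled training examples, as the abstract describes.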


Author(s):  
Xiangqi Bai ◽  
Zhana Duren ◽  
Lin Wan ◽  
Li C. Xia

Abstract: Latest developments in high-throughput single-cell genome (scDNA-seq) and transcriptome (scRNA-seq) sequencing technologies have enabled cell-resolved investigation of tissue clones. However, it remains challenging to cluster single cells of the same tissue origin across scRNA- and scDNA-seq platforms. In this work, we present a computational framework, CCNMF, which uses a novel Coupled-Clone Non-negative Matrix Factorization technique to jointly infer clonal structure for paired scDNA- and scRNA-seq data of the same specimen. CCNMF clusters single cells by statistically modeling their shared clonal structure and coupling copy number and gene expression profiles through their global correlation. We validated CCNMF using both simulated and real cell mixture benchmarks and demonstrated its robustness and accuracy. As real-world applications of CCNMF, we analyzed data from a gastric cancer cell line, an ovarian cancer cell mixture, and a triple-negative breast cancer xenograft. We resolved the underlying clonal structures and identified dosage-sensitive genes between co-existing clones. In summary, CCNMF is a coherent computational framework that simultaneously resolves genome and transcriptome clonal structures, facilitating understanding of how cellular gene expression changes along with clonal genome alterations. Availability: The R package of CCNMF is available at https://github.com/XQBai/CCNMF.


2021 ◽  
Author(s):  
Tori Tonn ◽  
Hakan Ozadam ◽  
Crystal Han ◽  
Alia Segura ◽  
Duc Tran ◽  
...  

Technological limitations precluded transcriptome-wide analyses of translation at single cell resolution. To solve this challenge, we developed a novel microfluidic isotachophoresis approach, named RIBOsome profiling via IsoTachoPhoresis (Ribo-ITP), and characterized translation in single oocytes and embryos during early mouse development. We identified differential translation efficiency as a key regulatory mechanism of genes involved in centrosome organization and N6-methyladenosine modification of RNAs. Our high coverage measurements enabled the first analysis of allele-specific ribosome engagement in early development and led to the discovery of stage-specific differential engagement of zygotic RNAs with ribosomes. Finally, by integrating our measurements with proteomics data, we discovered that ribosome occupancy in germinal vesicle stage oocytes is the predominant determinant of protein abundance in the zygote. Taken together, these findings resolve the long-standing paradox of low correlation between RNA expression and protein abundance in early embryonic development. The novel Ribo-ITP approach will enable numerous applications by providing high coverage and high resolution ribosome occupancy measurements from ultra-low input samples including single cells.


2016 ◽  
Author(s):  
Sarah A. Vitak ◽  
Kristof A. Torkenczy ◽  
Jimi L. Rosenkrantz ◽  
Andrew J. Fields ◽  
Lena Christiansen ◽  
...  

Abstract: Single cell genome sequencing has proven to be a valuable tool for the detection of somatic variation, particularly in the context of tumor evolution and neuronal heterogeneity. Current technologies suffer from high per-cell library construction costs which restrict the number of cells that can be assessed, thus imposing limitations on the ability to quantitatively measure genomic heterogeneity within a tissue. Here, we present Single cell Combinatorial Indexed Sequencing (SCI-seq) as a means of simultaneously generating thousands of low-pass single cell libraries for the purpose of somatic copy number variant detection. In total, we constructed libraries for 16,698 single cells from a combination of cultured cell lines, frontal cortex tissue from Macaca mulatta, and two human adenocarcinomas. This novel technology provides the opportunity for low-cost, deep characterization of somatic copy number variation in single cells, providing a foundational knowledge across both healthy and diseased tissues.


2019 ◽  
Author(s):  
Fiona K. Hamey ◽  
Winnie W.Y. Lau ◽  
Iwo Kucinski ◽  
Xiaonan Wang ◽  
Evangelia Diamanti ◽  
...  

Abstract: Differentiation of hematopoietic stem and progenitor cells ensures a continuous supply of mature blood cells. Recent models of differentiation are represented as a landscape, in which individual progenitors traverse a continuum of multipotent cell states before reaching an entry point that marks lineage commitment. Basophils and mast cells have received little attention in these models and their differentiation trajectories are yet to be explored. Here, we have performed multicolor flow cytometry and high-coverage single-cell RNA sequencing analyses to chart the differentiation of hematopoietic progenitors into basophils and mast cells in mouse. Analysis of flow cytometry data reconstructed a detailed map of the differentiation, including a bifurcation of progenitors into two specific trajectories. Molecular profiling and pseudotime ordering of the single cells revealed gene expression changes during differentiation, with temporally separated regulation of mast cell protease genes. We validate that basophil and mast cell signature genes increased along the trajectories into their respective lineage, and we demonstrate how genes critical for each respective lineage are upregulated during the formation of the mature cells. Cell fate assays showed that multicolor flow cytometry and transcriptional profiling successfully predict the bipotent phenotype of a previously uncharacterized population of basophil-mast cell progenitor-like cells in mouse peritoneum. Taken together, we provide a detailed roadmap of basophil and mast cell development through a combination of molecular and functional profiling.


2018 ◽  
Author(s):  
Jie Liu ◽  
Galip Gürkan Yardımcı ◽  
Dejun Lin ◽  
William Stafford Noble

Abstract: Single-cell Hi-C (scHi-C) data promises to enable scientists to interrogate the 3D architecture of DNA in the nucleus of the cell, studying how this structure varies stochastically or along developmental or cell cycle axes. However, Hi-C data analysis requires methods that take into account the unique characteristics of this type of data. In this work, we explore whether methods that have been developed previously for the analysis of bulk Hi-C data can be applied to scHi-C data, in conjunction with unsupervised embedding. We find that one of these methods, HiCRep, when used in conjunction with multidimensional scaling (MDS), strongly outperforms three other methods, including a technique that has been used previously for scHi-C analysis. We also provide evidence that the HiCRep/MDS method is robust to extremely low per-cell sequencing depth, that this robustness is improved even further when high-coverage and low-coverage cells are projected together, and that the method can be used to jointly embed cells from multiple published datasets.
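
The embedding step named in this abstract, multidimensional scaling, can be sketched in a few lines. This is an illustration of classical (Torgerson) MDS in one dimension, not the authors' implementation: the abstract does not say which MDS variant was used or how HiCRep similarity scores were turned into distances, so the sketch assumes a symmetric cell-by-cell distance matrix is already available. The function name `classical_mds_1d` and the toy distances are hypothetical.

```python
import math
from typing import List


def classical_mds_1d(dist: List[List[float]], iters: int = 200) -> List[float]:
    """Embed items on a line from a symmetric pairwise distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J (column means equal row means
    # because the matrix is symmetric).
    b = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]
    # Power iteration for the leading eigenpair of B, from a fixed start vector.
    v = [float(i + 1) for i in range(n)]
    eigval = 0.0
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigval = norm
    # Coordinates are the eigenvector scaled by the square root of the eigenvalue.
    scale = math.sqrt(max(eigval, 0.0))
    return [scale * x for x in v]


# Toy example: three "cells" whose distances come from positions 0, 1 and 3.
points = [0.0, 1.0, 3.0]
dist = [[abs(p - q) for q in points] for p in points]
coords = classical_mds_1d(dist)
print(coords)  # positions recovered up to shift and reflection
```

The recovered coordinates reproduce the input distances; only the origin and orientation are arbitrary. A real analysis would keep two or more eigenvectors, which would normally be done with a linear algebra library rather than hand-written power iteration.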


2021 ◽  
Author(s):  
Jie Liang ◽  
Alan Perez-Rathke

Computational modeling of 3D chromatin plays an important role in understanding the principles of genome organization. We discuss methods for modeling 3D chromatin structures, with focus on a minimalistic polymer model which inverts population Hi-C into high-resolution, high-coverage single-cell chromatin conformations. Utilizing only basic physical properties such as nuclear volume and no adjustable parameters, this model uncovers a few specific Hi-C interactions (15-35 for enhancer-rich loci in human cells) that can fold chromatin into individual conformations consistent with single-cell imaging, Dip-C, and FISH-measured genomic distance distributions. Aggregating an ensemble of conformations also reproduces population Hi-C interaction frequencies. Furthermore, this single-cell modeling approach allows quantification of structural heterogeneity and discovery of specific many-body units of chromatin interactions. This minimalistic 3D chromatin polymer model has revealed a number of insights: 1) chromatin scaling rules are a result of volume-confined polymers; 2) TADs form as a byproduct of 3D chromatin folding driven by specific interactions; 3) chromatin folding at many loci is driven by a small number of specific interactions; 4) cell subpopulations equipped with different chromatin structural scaffolds are developmental stage-dependent; and 5) characterization of the functional landscape and epigenetic marks of many-body units which are simultaneously spatially co-interacting within enhancer-rich, euchromatic regions. The implications of these findings in understanding the genome structure-function relationship are also discussed.


2019 ◽  
Author(s):  
Kerem Wainer-Katsir ◽  
Michal Linial

Abstract. Motivation: Current technologies for single-cell transcriptomics allow thousands of cells to be analyzed in a single experiment. The increased scale of these methods led to a higher risk of contamination by cell doublets. Available tools and algorithms for identifying doublets and estimating their occurrence in single-cell expression data focus on cell doublets from different species, cell types or individuals. Results: In this study, we analyze transcriptomic data from single cells having an identical genetic background. We claim that the ratio of monoallelic to biallelic expression provides discriminating power for identifying doublets. We present a pipeline called BIRD (BIallelic Ratio for Doublets) that relies on heterologous genetic variations extracted from single-cell RNA-seq (scRNA-seq). For each dataset, doublets were artificially created from the actual data and used to train a predictive model. BIRD was applied to Smart-Seq data from 163 primary fibroblasts. The model achieved 100% accuracy in annotating the randomly simulated doublets. Bona fide doublets from female-origin fibroblasts were verified by the unexpected biallelic expression from the X chromosome. Data from 10X Genomics microfluidics of peripheral blood cells analyzed by BIRD achieved on average 83% (± 3.7%) accuracy with an area under the curve of 0.88 (± 0.04) for a collection of ∼13,300 single cells. Conclusions: BIRD addresses instances of doublets that were formed from cell mixtures of identical genetic background and cell identity. Maximal performance is achieved with high coverage data. Success in identifying doublets is data specific and varies according to the experimental methodology, genomic diversity between haplotypes, sequence coverage, and depth.


2014 ◽  
Author(s):  
Narjes S. Movahedi ◽  
Zeinab Taghavi ◽  
Mallory Embree ◽  
Harish Nagarajan ◽  
Karsten Zengler ◽  
...  

As the vast majority of all microbes are unculturable, single-cell sequencing has become a significant method to gain insight into microbial physiology. Single-cell sequencing methods, currently powered by multiple displacement genome amplification (MDA), have passed important milestones such as finishing and closing the genome of a prokaryote. However, the quality and reliability of genome assemblies from single cells are still unsatisfactory due to uneven coverage depth and the absence of scattered chunks of the genome in the final collection of reads caused by MDA bias. In this work, our new algorithm Hybrid De novo Assembler (HyDA) demonstrates the power of co-assembly of multiple single-cell genomic data sets through significant improvement of the assembly quality in terms of predicted functional elements and length statistics. Co-assemblies contain significantly more base pairs and protein coding genes, cover more subsystems, and consist of longer contigs compared to individual assemblies by the same algorithm as well as state-of-the-art single-cell assemblers SPAdes and IDBA-UD. Hybrid De novo Assembler (HyDA) is also able to avoid chimeric assemblies by detecting and separating shared and exclusive pieces of sequence for input data sets. By replacing one deep single-cell sequencing experiment with a few single-cell sequencing experiments of lower depth, the co-assembly method can hedge against the risk of failure and loss of the sample, without significantly increasing sequencing cost. Application of the single-cell co-assembler HyDA to the study of three uncultured members of an alkane-degrading methanogenic community validated the usefulness of the co-assembly concept.

