Nuclei multiplexing with barcoded antibodies for single-nucleus genomics

2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Jellert T. Gaublomme ◽  
Bo Li ◽  
Cristin McCabe ◽  
Abigail Knecht ◽  
Yiming Yang ◽  
...  

Abstract Single-nucleus RNA-seq (snRNA-seq) enables the interrogation of cellular states in complex tissues that are challenging to dissociate or are frozen, and opens the way to human genetics studies, clinical trials, and precise cell atlases of large organs. However, such applications are currently limited by batch effects, processing, and costs. Here, we present an approach for multiplexing snRNA-seq, using sample-barcoded antibodies to uniquely label nuclei from distinct samples. Comparing human brain cortex samples profiled with or without hashing antibodies, we demonstrate that nucleus hashing does not significantly alter recovered profiles. We develop DemuxEM, a computational tool that detects inter-sample multiplets and assigns singlets to their sample of origin, and validate its accuracy using sex-specific gene expression, species-mixing and natural genetic variation. Our approach will facilitate tissue atlases of isogenic model organisms or from multiple biopsies or longitudinal samples of one donor, and large-scale perturbation screens.

2018 ◽  
Author(s):  
Jellert T. Gaublomme ◽  
Bo Li ◽  
Cristin McCabe ◽  
Abigail Knecht ◽  
Eugene Drokhlyansky ◽  
...  

Abstract Single-nucleus RNA-Seq (snRNA-seq) enables the interrogation of cellular states in complex tissues that are challenging to dissociate, including frozen clinical samples. This opens the way, in principle, to large studies, such as those required for human genetics, clinical trials, or precise cell atlases of large organs. However, such applications are currently limited by batch effects, sequential processing, and costs. To address these challenges, we present an approach for multiplexing snRNA-seq, using sample-barcoded antibodies against the nuclear pore complex to uniquely label nuclei from distinct samples. Comparing human brain cortex samples profiled in multiplex with or without hashing antibodies, we demonstrate that nucleus hashing does not significantly alter the recovered transcriptome profiles. We further developed demuxEM, a novel computational tool that robustly detects inter-sample nucleus multiplets and assigns singlets to their samples of origin by antibody barcodes, and validated its accuracy using gender-specific gene expression, species-mixing and natural genetic variation. Nucleus hashing significantly reduces cost per nucleus, recovering up to about 5 times as many single nuclei per microfluidic channel. Our approach provides a robust technique for diverse studies including tissue atlases of isogenic model organisms or from a single larger human organ, multiple biopsies or longitudinal samples of one donor, and large-scale perturbation screens.
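
To make the demultiplexing task concrete, the following is a minimal sketch of how a nucleus might be called a singlet or a multiplet from its per-sample antibody barcode counts. It is not demuxEM's algorithm, only a naive fraction-threshold stand-in; the function name `classify_nucleus`, the thresholds `min_total` and `singlet_fraction`, and the example counts are all hypothetical.

```python
def classify_nucleus(hashtag_counts, min_total=10, singlet_fraction=0.8):
    """Naively call one nucleus from its antibody barcode (hashtag) counts.

    hashtag_counts maps a sample name to the barcode count seen for this nucleus.
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    total = sum(hashtag_counts.values())
    if total < min_total:
        # Too little barcode signal to say anything about the sample of origin.
        return ("unknown", None)
    top_sample, top_count = max(hashtag_counts.items(), key=lambda kv: kv[1])
    if top_count / total >= singlet_fraction:
        # One sample barcode dominates: treat as a singlet from that sample.
        return ("singlet", top_sample)
    # Signal is shared across sample barcodes: treat as an inter-sample multiplet.
    return ("multiplet", None)


singlet_call = classify_nucleus({"A": 95, "B": 3, "C": 2})
multiplet_call = classify_nucleus({"A": 50, "B": 48, "C": 2})
unknown_call = classify_nucleus({"A": 2, "B": 1})
```

A fixed threshold like this ignores ambient barcode background, which is one reason a model-based tool is preferable in practice.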


2016 ◽  
Author(s):  
Alan Medlar ◽  
Laura Laakso ◽  
Andreia Miraldo ◽  
Ari Löytynoja

Abstract High-throughput RNA-seq data has become ubiquitous in the study of non-model organisms, but its use in comparative analysis remains a challenge. Without a reference genome for mapping, sequence data has to be de novo assembled, producing large numbers of short, highly redundant contigs. Preparing these assemblies for comparative analyses requires the removal of redundant isoforms, assignment of orthologs and converting fragmented transcripts into gene alignments. In this article we present Glutton, a novel tool to process transcriptome assemblies for downstream evolutionary analyses. Glutton takes as input a set of fragmented, possibly erroneous transcriptome assemblies. Utilising phylogeny-aware alignment and reference data from a closely related species, it reconstructs one transcript per gene, finds orthologous sequences and produces accurate multiple alignments of coding sequences. We present a comprehensive analysis of Glutton’s performance across a wide range of divergence times between study and reference species. We demonstrate the impact choice of assembler has on both the number of alignments and the correctness of ortholog assignment and show substantial improvements over heuristic methods, without sacrificing correctness. Finally, using inference of Darwinian selection as an example of downstream analysis, we show that Glutton-processed RNA-seq data give results comparable to those obtained from full length gene sequences even with distantly related reference species. Glutton is available from http://wasabiapp.org/software/glutton/ and is licensed under the GPLv3.


2021 ◽  
Vol 22 (S11) ◽  
Author(s):  
Jooseong Oh ◽  
Sung-Gwon Lee ◽  
Chungoo Park

Abstract Background Paralogs formed through gene duplication and isoforms formed through alternative splicing have been important processes for increasing protein diversity and maintaining cellular homeostasis. Despite their recognized importance and the advent of large-scale genomic and transcriptomic analyses, paradoxically, accurate annotations of all gene loci to allow the identification of paralogs and isoforms remain surprisingly incomplete. In particular, the global analysis of the transcriptome of a non-model organism for which there is no reference genome is especially challenging. Results To reliably discriminate between the paralogs and isoforms in RNA-seq data, we redefined the pre-existing sequence features (sequence similarity, inverse count of consecutive identical or non-identical blocks, and match-mismatch fraction) previously derived from full-length cDNAs and EST sequences and described newly discovered genomic and transcriptomic features (twilight zone of protein sequence alignment and expression level difference). In addition, the effectiveness and relevance of the proposed features were verified with two widely used support vector machine (SVM) and random forest (RF) models. From nine RNA-seq datasets, all AUC (area under the curve) scores of ROC (receiver operating characteristic) curves were over 0.9 in the RF model and significantly higher than those in the SVM model. Conclusions In this study, using an RF model with five proposed RNA-seq features, we implemented our method called Paralogs and Isoforms Classifier based on Machine-learning approaches (PIC-Me) and showed that it outperformed an existing method. Finally, we envision that our tool will be a valuable computational resource for the genomics community to help with gene annotation and will aid in comparative transcriptomics and evolutionary genomics studies, especially those on non-model organisms.
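
For readers unfamiliar with the evaluation metric above, the AUC of an ROC curve equals the probability that a randomly chosen positive example receives a higher classifier score than a randomly chosen negative one. The sketch below computes it directly from that definition. It is not part of PIC-Me; the function name `roc_auc` and the toy labels and scores are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy data: 1 and 0 are the two classes; scores are classifier outputs.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)  # 8 of 9 positive/negative pairs are ranked correctly
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; rank-based formulations are the usual choice for large datasets.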


2020 ◽  
Vol 17 (8) ◽  
pp. 793-798 ◽  
Author(s):  
Bo Li ◽  
Joshua Gould ◽  
Yiming Yang ◽  
Siranush Sarkizova ◽  
Marcin Tabaka ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Momoko Hamano ◽  
Seitaro Nomura ◽  
Midori Iida ◽  
Issei Komuro ◽  
Yoshihiro Yamanishi

Abstract Heart failure is a heterogeneous disease with multiple risk factors and various pathophysiological types, which makes it difficult to understand the molecular mechanisms involved. In this study, we proposed a trans-omics approach for predicting molecular pathological mechanisms of heart failure and identifying marker genes to distinguish heterogeneous phenotypes, by integrating multiple omics data including single-cell RNA-seq, ChIP-seq, and gene interactome data. We detected a significant increase in the expression level of natriuretic peptide A (Nppa) after stress loading with transverse aortic constriction (TAC), and showed that cardiomyocytes with high Nppa expression displayed specific gene expression patterns. Multiple members of the NADH ubiquinone complex family, which are associated with the mitochondrial electron transport system, were negatively correlated with Nppa expression during the early stages of cardiac hypertrophy. Large-scale ChIP-seq data analysis showed that Nkx2-5 and Gtf2b were transcription factors characteristic of high-Nppa-expressing cardiomyocytes. Nppa expression levels may, therefore, represent a useful diagnostic marker for heart failure.


2019 ◽  
Author(s):  
Bo Li ◽  
Joshua Gould ◽  
Yiming Yang ◽  
Siranush Sarkizova ◽  
Marcin Tabaka ◽  
...  

Abstract Massively parallel single-cell and single-nucleus RNA-seq (sc/snRNA-seq) have opened the way to systematic tissue atlases in health and disease, but as the scale of data generation grows, so does the need for computational pipelines for scaled analysis. Here, we developed Cumulus, a cloud-based framework for analyzing large-scale sc/snRNA-seq datasets. Cumulus combines the power of cloud computing with improvements in algorithm implementations to achieve high scalability, low cost, user-friendliness, and integrated support for a comprehensive set of features. We benchmark Cumulus on the Human Cell Atlas Census of Immune Cells dataset of bone marrow cells and show that it substantially improves efficiency over conventional frameworks, while maintaining or improving the quality of results, enabling large-scale studies.


2021 ◽  
Author(s):  
Kathleen M Chen ◽  
Aaron K Wong ◽  
Olga G Troyanskaya ◽  
Jian Zhou

Sequence is at the basis of how the genome shapes chromatin organization, regulates gene expression, and impacts traits and diseases. Epigenomic profiling efforts have enabled large-scale identification of regulatory elements, yet we still lack a sequence-based map to systematically identify regulatory activities from any sequence, which is necessary for predicting the effects of any variant on these activities. We address this challenge with Sei, a new framework for integrating human genetics data with sequence information to discover the regulatory basis of traits and diseases. Our framework systematically learns a vocabulary for the regulatory activities of sequences, which we call sequence classes, using a new deep learning model that predicts a compendium of 21,907 chromatin profiles across >1,300 cell lines and tissues, the most comprehensive to date. Sequence classes allow for a global view of sequence and variant effects by quantifying diverse regulatory activities, such as loss or gain of cell-type-specific enhancer function. We show that sequence class predictions are supported by experimental data, including tissue-specific gene expression, expression QTLs, and evolutionary constraints based on population allele frequencies. Finally, we applied our framework to human genetics data. Sequence classes uniquely provide a non-overlapping partitioning of GWAS heritability by tissue-specific regulatory activity categories, which we use to characterize the regulatory architecture of 47 traits and diseases from UK Biobank. Furthermore, the predicted loss or gain of sequence class activities suggests specific mechanistic hypotheses for individual regulatory pathogenic mutations. We provide this framework as a resource to further elucidate the sequence basis of human health and disease.


Author(s):  
Zhixu Qiu ◽  
Siyuan Chen ◽  
Yuhong Qi ◽  
Chunni Liu ◽  
Jingjing Zhai ◽  
...  

Abstract Transcriptional switch (TS) is a widely observed phenomenon caused by changes in the relative expression of transcripts from the same gene, in spatial, temporal or other dimensions. TS has been associated with human diseases, plant development and stress responses. Its investigation is often hampered by a lack of suitable tools allowing comprehensive and flexible TS analysis for high-throughput RNA sequencing (RNA-Seq) data. Here, we present deepTS, a user-friendly web-based implementation that enables a fully interactive, multifunctional identification, visualization and analysis of TS events for large-scale RNA-Seq datasets from pairwise, temporal and population experiments. deepTS offers rich functionality to streamline RNA-Seq-based TS analysis for both model and non-model organisms and for those with or without reference transcriptome. The presented case studies highlight the capabilities of deepTS and demonstrate its potential for the transcriptome-wide TS analysis of pairwise, temporal and population RNA-Seq data. We believe deepTS will help research groups, regardless of their informatics expertise, perform accessible, reproducible and collaborative TS analyses of large-scale RNA-Seq data.


2021 ◽  
Vol 22 (11) ◽  
pp. 5902
Author(s):  
Stefan Nagel ◽  
Claudia Pommerenke ◽  
Corinna Meyer ◽  
Hans G. Drexler

Recently, we documented a hematopoietic NKL-code mapping physiological expression patterns of NKL homeobox genes in human myelopoiesis including monocytes and their derived dendritic cells (DCs). Here, we enlarge this map to include normal NKL homeobox gene expressions in progenitor-derived DCs. Analysis of public gene expression profiling and RNA-seq datasets containing plasmacytoid and conventional dendritic cells (pDC and cDC) demonstrated HHEX activity in both entities while cDCs additionally expressed VENTX. The consequent aim of our study was to examine regulation and function of VENTX in DCs. We compared profiling data of VENTX-positive cDC and monocytes with VENTX-negative pDC and common myeloid progenitor entities and revealed several differentially expressed genes encoding transcription factors and pathway components, representing potential VENTX regulators. Screening of RNA-seq data for 100 leukemia/lymphoma cell lines identified prominent VENTX expression in an acute myelomonocytic leukemia cell line, MUTZ-3 containing inv(3)(q21q26) and t(12;22)(p13;q11) and representing a model for DC differentiation studies. Furthermore, extended gene analyses indicated that MUTZ-3 is associated with the subtype cDC2. In addition to analysis of public chromatin immune-precipitation data, subsequent knockdown experiments and modulations of signaling pathways in MUTZ-3 and control cell lines confirmed identified candidate transcription factors CEBPB, ETV6, EVI1, GATA2, IRF2, MN1, SPIB, and SPI1 and the CSF-, NOTCH-, and TNFa-pathways as VENTX regulators. Live-cell imaging analyses of MUTZ-3 cells treated for VENTX knockdown excluded impacts on apoptosis or induced alteration of differentiation-associated cell morphology. In contrast, target gene analysis performed by expression profiling of knockdown-treated MUTZ-3 cells revealed VENTX-mediated activation of several cDC-specific genes including CSFR1, EGR2, and MIR10A and inhibition of pDC-specific genes like RUNX2. 
Taken together, we added NKL homeobox gene activities for progenitor-derived DCs to the NKL-code, showing that VENTX is expressed in cDCs but not in pDCs and forms part of a cDC-specific gene regulatory network operating in DC differentiation and function.

