Time- and cost-efficient high-throughput transcriptomics enabled by Bulk RNA Barcoding and sequencing

Abstract: Genome-wide gene expression analyses by RNA sequencing (RNA-seq) have quickly become a standard in molecular biology because of the widespread availability of high throughput sequencing technologies. While powerful, RNA-seq still has several limitations, including the time and cost of library preparation, which makes it difficult to profile many samples simultaneously. To deal with these constraints, the single-cell transcriptomics field has implemented the early multiplexing principle, making the library preparation of hundreds of samples (cells) markedly more affordable. However, the current standard methods for bulk transcriptomics (such as TruSeq Stranded mRNA) remain expensive, and relatively little effort has been invested to develop cheaper, but equally robust methods. Here, we present a novel approach, Bulk RNA Barcoding and sequencing (BRB-seq), that combines the multiplexing-driven cost-effectiveness of a single-cell RNA-seq workflow with the performance of a bulk RNA-seq procedure. BRB-seq produces 3’ enriched cDNA libraries that exhibit similar gene expression quantification to TruSeq and that maintain this quality, also in terms of number of detected differentially expressed genes, even with low quality RNA samples. We show that BRB-seq is about 25 times less expensive than TruSeq, enabling the generation of ready to sequence libraries for up to 192 samples in a day with only 2 hours of hands-on time. We conclude that BRB-seq constitutes a powerful alternative to TruSeq as a standard bulk RNA-seq approach. Moreover, we anticipate that this novel method will eventually replace RT-qPCR-based gene expression screens given its capacity to generate genome-wide transcriptomic data at a cost that is comparable to profiling 4 genes using RT-qPCR.

Software: We developed a suite of open source tools (BRB-seqTools) to aid with processing BRB-seq data and generating count matrices that are used for further analyses. This suite can perform demultiplexing, generate count/UMI matrices and trim BRB-seq constructs and is freely available at http://github.com/DeplanckeLab/BRB-seqTools

Highlights:
Rapid (~2h hands on time) and low-cost approach to perform transcriptomics on hundreds of RNA samples
Strand specificity preserved
Performance: number of detected genes is equal to Illumina TruSeq Stranded mRNA at same sequencing depth
High capacity: low cost allows increasing the number of biological replicates
Produces reliable data even with low quality RNA samples (down to RIN value = 2)
Complete user-friendly sequencing data pre-processing and analysis pipeline allowing result acquisition in a day
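The demultiplexing and UMI counting steps mentioned above can be illustrated with a minimal sketch. This is not the BRB-seqTools implementation or its interface. The function name `count_umis`, the read layout (sample barcode followed by UMI at the start of read 1), the barcode length of 6 and the UMI length of 10 are all assumptions made for illustration, and gene assignment is assumed to have been done beforehand by an aligner.

```python
from collections import defaultdict


def count_umis(reads, barcode_to_sample, barcode_len=6, umi_len=10):
    """Build a sample x gene UMI count table from barcoded reads.

    reads: iterable of (read1_sequence, gene) pairs. read1 is assumed to start
    with the sample barcode followed by the UMI (hypothetical layout); gene is
    the gene the paired read was assigned to, or None if unassigned.
    """
    umis = defaultdict(set)
    for read1, gene in reads:
        barcode = read1[:barcode_len]
        umi = read1[barcode_len:barcode_len + umi_len]
        sample = barcode_to_sample.get(barcode)
        # Skip reads with an unknown barcode, no gene, or a truncated UMI.
        if sample is None or gene is None or len(umi) < umi_len:
            continue
        # A set collapses PCR duplicates that share the same UMI.
        umis[(sample, gene)].add(umi)

    counts = defaultdict(dict)
    for (sample, gene), seen in umis.items():
        counts[sample][gene] = len(seen)
    return dict(counts)


# Toy input: two known samples and one read with an unknown barcode.
barcodes = {"AAAAAA": "sample_1", "CCCCCC": "sample_2"}
reads = [
    ("AAAAAA" + "ACGTACGTAC", "GeneA"),
    ("AAAAAA" + "ACGTACGTAC", "GeneA"),  # same UMI: counted once
    ("AAAAAA" + "TTTTACGTAC", "GeneA"),
    ("CCCCCC" + "ACGTACGTAC", "GeneB"),
    ("GGGGGG" + "ACGTACGTAC", "GeneA"),  # unknown barcode: dropped
]
matrix = count_umis(reads, barcodes)
print(matrix)
```

Because samples are identified by barcode only after pooling, a single library can carry many samples, which is the early multiplexing principle the abstract refers to.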


Exploring the Changing Landscape of Cell-to-Cell Variation After CTCF Knockdown via Single Cell RNA-seq

10.21203/rs.2.15870/v2 ◽

2019 ◽

Author(s):

Wei Wang ◽

Gang Ren ◽

Ni Hong ◽

Wenfei Jin

Keyword(s):

Gene Expression ◽

Transcription Factors ◽

Single Cell ◽

Zinc Finger ◽

Ctcf Binding ◽

Rna Seq ◽

Expression Noise ◽

Genome Wide ◽

Cell Variation ◽

Variable Genes

Abstract Background: CCCTC-Binding Factor (CTCF), also known as 11-zinc finger protein, participates in many cellular processes, including insulator activity, transcriptional regulation and organization of chromatin architecture. Based on single cell flow cytometry and single cell RNA-FISH analyses, our previous study showed that deletion of a CTCF binding site led to a significant increase in the cellular variation of its target gene. However, the effect of CTCF on the genome-wide landscape of cell-to-cell variation is unclear. Results: We knocked down CTCF in EL4 cells using shRNA, and conducted single cell RNA-seq on both wild type (WT) cells and CTCF-Knockdown (CTCF-KD) cells using the Fluidigm C1 system. Principal component analysis of single cell RNA-seq data showed that WT and CTCF-KD cells concentrated in two different clusters on PC1, indicating that gene expression profiles of WT and CTCF-KD cells were systematically different. Interestingly, GO terms including regulation of transcription, DNA binding, zinc finger and transcription factor binding were significantly enriched in CTCF-KD-specific highly variable genes, indicating that tissue-specific genes such as transcription factors were highly sensitive to CTCF level. The dysregulation of transcription factors potentially explains why knockdown of CTCF leads to a systematic change of gene expression. In contrast, housekeeping genes, such as those involved in rRNA processing, DNA repair and tRNA processing, were significantly enriched in WT-specific highly variable genes, potentially due to a higher cellular variation of cell activity in WT cells compared to CTCF-KD cells. We further found that cellular variation-increased genes were significantly enriched in down-regulated genes, indicating that CTCF knockdown simultaneously reduced the expression levels and increased the expression noise of its regulated genes. Conclusions: To our knowledge, this is the first attempt to explore the genome-wide landscape of cellular variation after CTCF knockdown. Our study not only advances our understanding of CTCF function in maintaining gene expression and reducing expression noise, but also provides a framework for examining gene function.
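The idea of a gene that is both down-regulated and noisier after knockdown can be made concrete with a small sketch. The abstract does not state which variability measure the authors used, so the coefficient of variation (standard deviation divided by mean) is an assumption here, as are the function names and the toy expression values.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by mean; 0.0 when the mean is zero."""
    m = mean(values)
    if m == 0:
        return 0.0
    return pstdev(values) / m


def compare_conditions(wt, kd):
    """Per-gene mean and CV in WT and KD cells, for genes present in both.

    wt, kd: dict mapping gene name -> list of per-cell expression values.
    """
    summary = {}
    for gene in sorted(wt.keys() & kd.keys()):
        summary[gene] = {
            "mean_wt": mean(wt[gene]),
            "mean_kd": mean(kd[gene]),
            "cv_wt": coefficient_of_variation(wt[gene]),
            "cv_kd": coefficient_of_variation(kd[gene]),
        }
    return summary


# Hypothetical values: GeneA drops in mean and becomes more variable in KD.
wt_cells = {"GeneA": [10, 10, 10, 10], "GeneB": [5, 5, 5, 5]}
kd_cells = {"GeneA": [0, 8, 0, 8], "GeneB": [5, 5, 5, 5]}
summary = compare_conditions(wt_cells, kd_cells)

for gene, stats in summary.items():
    down = stats["mean_kd"] < stats["mean_wt"]
    noisier = stats["cv_kd"] > stats["cv_wt"]
    print(gene, "down-regulated:", down, "noisier:", noisier)
```

Dividing by the mean matters because a gene with lower expression would otherwise tend to show a smaller standard deviation simply because its values are smaller.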


CHARTS: A web application for characterizing and comparing tumor subpopulations in publicly available single-cell RNA-seq datasets

10.1101/2020.09.23.310441 ◽

2020 ◽

Author(s):

Matthew N. Bernstein ◽

Zijian Ni ◽

Michael Collins ◽

Mark E. Burkard ◽

Christina Kendziorski ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Web Application ◽

Cellular Heterogeneity ◽

Rna Seq ◽

Individual Gene ◽

Genome Wide ◽

Progression Of Disease ◽

Cell Subpopulations ◽

Available Information

Abstract Background: Single-cell RNA-seq (scRNA-seq) enables the profiling of genome-wide gene expression at the single-cell level and in so doing facilitates insight into and information about cellular heterogeneity within a tissue. Perhaps nowhere is this more important than in cancer, where tumor and tumor microenvironment heterogeneity directly impact development, maintenance, and progression of disease. While publicly available scRNA-seq cancer datasets offer unprecedented opportunity to better understand the mechanisms underlying tumor progression, metastasis, drug resistance, and immune evasion, much of the available information has been underutilized, in part, due to the lack of tools available for aggregating and analysing these data. Results: We present CHARacterizing Tumor Subpopulations (CHARTS), a computational pipeline and web application for analyzing, characterizing, and integrating publicly available scRNA-seq cancer datasets. CHARTS enables the exploration of individual gene expression, cell type, malignancy-status, differentially expressed genes, and gene set enrichment results in subpopulations of cells across multiple tumors and datasets. Conclusion: CHARTS is an easy to use, comprehensive platform for exploring single-cell subpopulations within tumors across the ever-growing collection of public scRNA-seq cancer datasets. CHARTS is freely available at charts.morgridge.org.


Comparative analysis of sequencing technologies platforms for single-cell transcriptomics

10.1101/463117 ◽

2018 ◽

Cited By ~ 1

Author(s):

Kedar Nath Natarajan ◽

Zhichao Miao ◽

Miaomiao Jiang ◽

Xiaoyun Huang ◽

Hongpo Zhou ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Single Cells ◽

K562 Cells ◽

Library Preparation ◽

Rna Seq ◽

Illumina Hiseq ◽

Technical Variability ◽

Sequencing Technologies ◽

Sequencing Platforms

Abstract: All single-cell RNA-seq protocols and technologies require library preparation prior to sequencing on a platform such as Illumina. Here, we present the first report to utilize the BGISEQ-500 platform for scRNA-seq, and compare the sensitivity and accuracy to Illumina sequencing. We generate a scRNA-seq resource of 468 unique single-cells and 1,297 matched single cDNA samples, performing SMARTer and Smart-seq2 protocols on mESCs and K562 cells with RNA spike-ins. We sequence these libraries on both BGISEQ-500 and Illumina HiSeq platforms using single- and paired-end reads. The two platforms have comparable sensitivity and accuracy in terms of quantification of gene expression, and low technical variability. Our study provides a standardised scRNA-seq resource to benchmark new scRNA-seq library preparation protocols and sequencing platforms.


Exploratory bioinformatics analysis reveals importance of “junk” DNA in early embryo development

10.1101/079921 ◽

2016 ◽

Author(s):

Steven Xijin Ge

Keyword(s):

Gene Expression ◽

Stem Cells ◽

Transposable Elements ◽

Single Cell ◽

Exploratory Analysis ◽

List Type ◽

Dependent Manner ◽

Rna Seq ◽

Long Terminal Repeats ◽

Terminal Repeats

Abstract Background: Instead of testing predefined hypotheses, the goal of exploratory data analysis (EDA) is to find what data can tell us. Following this strategy, we re-analyzed a large body of genomic data to investigate how the early mouse embryos develop from fertilized eggs through a complex, poorly understood process. Results: Starting with a single-cell RNA-seq dataset of 259 mouse embryonic cells from zygote to blastocyst stages, we reconstructed the temporal and spatial dynamics of gene expression. Our analyses revealed similarities in the expression patterns of regular genes and those of retrotransposons, and the enrichment of transposable elements in the promoters of corresponding genes. Long Terminal Repeats (LTRs) are associated with transient, strong induction of many nearby genes at the 2-4 cell stages, probably by providing binding sites for Obox and other homeobox factors. The presence of B1 and B2 SINEs (Short Interspersed Nuclear Elements) in promoters is highly correlated with broad upregulation of intracellular genes in a dosage- and distance-dependent manner. Such enhancer-like effects are also found for human Alu and bovine tRNA SINEs. Promoters for genes specifically expressed in embryonic stem cells (ESCs) are rich in B1 and B2 SINEs, but low in CpG islands. Conclusions: Our results provide evidence that transposable elements may play a significant role in establishing the expression landscape in early embryos and stem cells. This study also demonstrates that open-ended, exploratory analysis aimed at a broad understanding of a complex process can pinpoint specific mechanisms for further study.

Major finding:
Single-cell RNA-seq data enables estimation of retrotransposon expression during PD
Similar expression dynamics of retrotransposons and regular genes during PD
Long terminal repeats may be essential for the 1st wave of gene expression
Obox homeobox factors are possible regulators of PD, upstream of Zscan4
SINE repeats predict expression of nearby genes in murine, human and bovine embryos
Exploratory analysis of large single-cell data pinpoints developmental pathways


TM3’seq: A Tagmentation-Mediated 3’ Sequencing Approach for Improving Scalability of RNAseq Experiments

G3 Genes|Genome|Genetics ◽

10.1534/g3.119.400821 ◽

2019 ◽

Vol 10 (1) ◽

pp. 143-150 ◽

Cited By ~ 4

Author(s):

Luisa F. Pallares ◽

Serge Picard ◽

Julien F. Ayroles

Keyword(s):

Single Cell ◽

Large Scale ◽

Enriched Library ◽

Library Preparation ◽

Rna Seq ◽

Medical Genomics ◽

Genome Wide ◽

Standard Tool ◽

Commercial Kits ◽

The Cost

RNA-seq has become the standard tool for collecting genome-wide expression data in diverse fields, from quantitative genetics and medical genomics to ecology and developmental biology. However, RNA-seq library preparation is still prohibitively expensive for many laboratories. Recently, the field of single-cell transcriptomics has reduced costs and increased throughput by adopting early barcoding and pooling of individual samples, producing a single final library containing all samples. In contrast, RNA-seq protocols where each sample is processed individually are significantly more expensive and lower throughput than single-cell approaches. Yet, many projects depend on individual library generation to preserve important samples or for follow-up re-sequencing experiments. Improving on currently available RNA-seq methods, we have developed TM3′seq, a 3′-enriched library preparation protocol that uses Tn5 transposase and preserves sample identity at each step. TM3′seq is designed for high-throughput processing of individual samples (96 samples in 6h, with only 3h hands-on time) at a fraction of the cost of commercial kits ($1.5 per sample). The protocol was tested in a range of human and Drosophila melanogaster RNA samples, recovering transcriptomes of the same quality and reliability as those obtained with the commercial NEBNext kit. We expect that the cost- and time-efficient features of TM3′seq will make large-scale RNA-seq experiments more accessible to the entire scientific community.


Exploring the changing landscape of cell-to-cell variation after CTCF knockdown via single cell RNA-seq

BMC Genomics ◽

10.1186/s12864-019-6379-5 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 3

Author(s):

Wei Wang ◽

Gang Ren ◽

Ni Hong ◽

Wenfei Jin

Keyword(s):

Gene Expression ◽

Transcription Factors ◽

Single Cell ◽

Zinc Finger ◽

Ctcf Binding ◽

Rna Seq ◽

Expression Noise ◽

Genome Wide ◽

Cell Variation ◽

Variable Genes

Abstract Background: CCCTC-Binding Factor (CTCF), also known as 11-zinc finger protein, participates in many cellular processes, including insulator activity, transcriptional regulation and organization of chromatin architecture. Based on single cell flow cytometry and single cell RNA-FISH analyses, our previous study showed that deletion of a CTCF binding site led to a significant increase in the cellular variation of its target gene. However, the effect of CTCF on the genome-wide landscape of cell-to-cell variation remains unclear. Results: We knocked down CTCF in EL4 cells using shRNA, and conducted single cell RNA-seq on both wild type (WT) cells and CTCF-Knockdown (CTCF-KD) cells using the Fluidigm C1 system. Principal component analysis of single cell RNA-seq data showed that WT and CTCF-KD cells concentrated in two different clusters on PC1, indicating that gene expression profiles of WT and CTCF-KD cells were systematically different. Interestingly, GO terms including regulation of transcription, DNA binding, zinc finger and transcription factor binding were significantly enriched in CTCF-KD-specific highly variable genes, implying that tissue-specific genes such as transcription factors were highly sensitive to CTCF level. The dysregulation of transcription factors potentially explains why knockdown of CTCF leads to a systematic change of gene expression. In contrast, housekeeping genes, such as those involved in rRNA processing, DNA repair and tRNA processing, were significantly enriched in WT-specific highly variable genes, potentially due to a higher cellular variation of cell activity in WT cells compared to CTCF-KD cells. We further found that cellular variation-increased genes were significantly enriched in down-regulated genes, indicating that CTCF knockdown simultaneously reduced the expression levels and increased the expression noise of its regulated genes. Conclusions: To our knowledge, this is the first attempt to explore the genome-wide landscape of cellular variation after CTCF knockdown. Our study not only advances our understanding of CTCF function in maintaining gene expression and reducing expression noise, but also provides a framework for examining gene function.


Global Prediction of Chromatin Accessibility Using RNA-seq from Small Number of Cells

10.1101/035816 ◽

2016 ◽

Cited By ~ 1

Author(s):

Weiqiang Zhou ◽

Zhicheng Ji ◽

Hongkai Ji

Keyword(s):

Single Cell ◽

High Throughput ◽

Regulatory Element ◽

Cell Number ◽

Chromatin Accessibility ◽

Small Cell ◽

Rna Seq ◽

Genome Wide ◽

Gene Regulatory ◽

Number Of Cells

Conventional high-throughput technologies for mapping regulatory element activities such as ChIP-seq, DNase-seq and FAIRE-seq cannot analyze samples with a small number of cells. The recently developed ATAC-seq allows regulome mapping in small-cell-number samples, but its signal in single cells or samples with ≤500 cells remains discrete or noisy. Compared to these technologies, measuring the transcriptome by RNA-seq in single-cell and small-cell-number samples is more mature. Here we show that one can globally predict chromatin accessibility and infer the regulome using RNA-seq. Genome-wide chromatin accessibility predicted by RNA-seq from 30 cells is comparable with ATAC-seq from 500 cells. Predictions based on single-cell RNA-seq can more accurately reconstruct bulk chromatin accessibility than using single-cell ATAC-seq by pooling the same number of cells. Integrating ATAC-seq with predictions from RNA-seq increases the power of both methods. Thus, transcriptome-based prediction can provide a new tool for decoding gene regulatory programs in small-cell-number samples.


Low-cost and High-throughput RNA-seq Library Preparation for Illumina Sequencing from Plant Tissue

BIO-PROTOCOL ◽

10.21769/bioprotoc.3799 ◽

2020 ◽

Vol 10 (20) ◽

Author(s):

Marta Bjornson ◽

Kaisa Kajala ◽

Cyril Zipfel ◽

Pingtao Ding

Keyword(s):

High Throughput ◽

Illumina Sequencing ◽

Plant Tissue ◽

Low Cost ◽

Library Preparation ◽

Rna Seq


Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning

10.1101/052225 ◽

2016 ◽

Cited By ~ 7

Author(s):

Bo Wang ◽

Junjie Zhu ◽

Emma Pierson ◽

Daniele Ramazzotti ◽

Serafim Batzoglou

Keyword(s):

Gene Expression ◽

Single Cell ◽

High Throughput ◽

Cell Populations ◽

Gene Expression Measurement ◽

Data Sets ◽

Similarity Learning ◽

Rna Seq ◽

High Level ◽

Cell Data

Abstract: Single-cell RNA-seq technologies enable high throughput gene expression measurement of individual cells, and allow the discovery of heterogeneity within cell populations. Measurement of cell-to-cell gene expression similarity is critical to identification, visualization and analysis of cell populations. However, single-cell data introduce challenges to conventional measures of gene expression similarity because of the high level of noise, outliers and dropouts. Here, we propose a novel similarity-learning framework, SIMLR (single-cell interpretation via multi-kernel learning), which learns an appropriate distance metric from the data for dimension reduction, clustering and visualization applications. Benchmarking against state-of-the-art methods for these applications, we used SIMLR to re-analyse seven representative single-cell data sets, including high-throughput droplet-based data sets with tens of thousands of cells. We show that SIMLR greatly improves clustering sensitivity and accuracy, as well as the visualization and interpretability of the data.
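As a rough illustration of what a multi-kernel similarity looks like, the sketch below averages Gaussian kernels of several bandwidths into one cell-to-cell similarity matrix. It is not the SIMLR algorithm: per its name, SIMLR learns from the data how to combine kernels, whereas this sketch uses fixed, uniform weights. The function names, the bandwidth values and the toy expression vectors are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def multi_kernel_similarity(cells, bandwidths=(0.5, 1.0, 2.0)):
    """Average several Gaussian kernels into one similarity matrix.

    cells: list of expression vectors (one per cell).
    bandwidths: kernel widths; uniform weighting is an assumption of this
    sketch, not something taken from the abstract.
    """
    n = len(cells)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = euclidean(cells[i], cells[j])
            kernels = [math.exp(-(d * d) / (2 * s * s)) for s in bandwidths]
            sim[i][j] = sum(kernels) / len(kernels)
    return sim


# Toy data: the first two cells are close together, the third is far away.
cells = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
sim = multi_kernel_similarity(cells)
for row in sim:
    print([round(value, 3) for value in row])
```

Using several bandwidths at once avoids committing to a single scale, which is one motivation for multi-kernel approaches when noise and dropouts make any one distance scale unreliable.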


Exploring the Changing Landscape of Cell-to-Cell Variation After CTCF Knockdown via Single Cell RNA-seq

10.1101/862847 ◽

2019 ◽

Author(s):

Wei Wang ◽

Gang Ren ◽

Ni Hong ◽

Wenfei Jin

Keyword(s):

Gene Expression ◽

Transcription Factors ◽

Single Cell ◽

Zinc Finger ◽

Ctcf Binding ◽

Rna Seq ◽

Expression Noise ◽

Genome Wide ◽

Cell Variation ◽

Variable Genes

Abstract Background: CCCTC-Binding Factor (CTCF), also known as 11-zinc finger protein, participates in many cellular processes, including insulator activity, transcriptional regulation and organization of chromatin architecture. Based on single cell flow cytometry and single cell RNA-FISH analyses, our previous study showed that deletion of a CTCF binding site led to a significant increase in the cellular variation of its target gene. However, the effect of CTCF on the genome-wide landscape of cell-to-cell variation is unclear. Results: We knocked down CTCF in EL4 cells using shRNA, and conducted single cell RNA-seq on both wild type (WT) cells and CTCF-Knockdown (CTCF-KD) cells using the Fluidigm C1 system. Principal component analysis of single cell RNA-seq data showed that WT and CTCF-KD cells concentrated in two different clusters on PC1, indicating that gene expression profiles of WT and CTCF-KD cells were systematically different. Interestingly, GO terms including regulation of transcription, DNA binding, zinc finger and transcription factor binding were significantly enriched in CTCF-KD-specific highly variable genes, indicating that tissue-specific genes such as transcription factors were highly sensitive to CTCF level. The dysregulation of transcription factors potentially explains why knockdown of CTCF leads to a systematic change of gene expression. In contrast, housekeeping genes, such as those involved in rRNA processing, DNA repair and tRNA processing, were significantly enriched in WT-specific highly variable genes, potentially due to a higher cellular variation of cell activity in WT cells compared to CTCF-KD cells. We further found that cellular variation-increased genes were significantly enriched in down-regulated genes, indicating that CTCF knockdown simultaneously reduced the expression levels and increased the expression noise of its regulated genes. Conclusions: To our knowledge, this is the first attempt to explore the genome-wide landscape of cellular variation after CTCF knockdown. Our study not only advances our understanding of CTCF function in maintaining gene expression and reducing expression noise, but also provides a framework for examining gene function.
