Scalable and cost-effective ribonuclease-based rRNA depletion for transcriptomics

Abstract Bacterial RNA sequencing (RNA-seq) is a powerful approach for quantitatively delineating the global transcriptional profiles of microbes in order to gain deeper understanding of their physiology and function. Cost-effective bacterial RNA-seq requires efficient physical removal of ribosomal RNA (rRNA), which otherwise dominates transcriptomic reads. However, current methods to effectively deplete rRNA of diverse non-model bacterial species are lacking. Here, we describe a probe- and ribonuclease-based strategy for bacterial rRNA removal. We implemented the method using either chemically synthesized oligonucleotides or amplicon-based single-stranded DNA probes and validated the technique on three novel gut microbiota isolates from three distinct phyla. We further showed that different probe sets can be used on closely related species. We provide a detailed methods protocol, probe sets for >5,000 common microbes from RefSeq, and an online tool to generate custom probe libraries. This approach lays the groundwork for large-scale and cost-effective bacterial transcriptomics studies.
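
As a rough illustration of what a probe set for ribonuclease-based depletion can look like, the following Python sketch tiles antisense DNA oligonucleotides end to end across an rRNA sequence. It is not the authors' design tool: the function names, the 50-nt default probe length, and the gap-free tiling are all assumptions made for illustration.

```python
# Map each RNA (or DNA) base to its DNA complement.
COMPLEMENT = str.maketrans("ACGUT", "TGCAA")


def antisense_dna(rna: str) -> str:
    """Return the DNA reverse complement (5'->3') of an RNA sequence."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def tile_probes(rrna: str, probe_len: int = 50) -> list[str]:
    """Tile antisense DNA probes end to end across an rRNA sequence.

    probe_len is a hypothetical default, not a value taken from the paper.
    The final probe is shorter when the sequence length is not a multiple
    of probe_len.
    """
    if probe_len <= 0:
        raise ValueError("probe_len must be positive")
    return [
        antisense_dna(rrna[start:start + probe_len])
        for start in range(0, len(rrna), probe_len)
    ]


if __name__ == "__main__":
    # Toy 10-nt sequence, tiled with 4-nt probes purely for readability.
    for probe in tile_probes("AUGGCUAGCU", probe_len=4):
        print(probe)
```

A real design would also need to consider melting temperature, off-target hybridization to mRNA, and sequence variation between species, none of which this sketch attempts.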

Nucleic Acids Research ◽

10.1093/nar/gkz1169 ◽

2019 ◽

Vol 48 (4) ◽

pp. e20-e20 ◽

Cited By ~ 13

Author(s):

Yiming Huang ◽

Ravi U Sheth ◽

Andrew Kaufman ◽

Harris H Wang

Keyword(s):

Large Scale ◽

Bacterial Species ◽

Cost Effective ◽

Rna Seq ◽

Closely Related Species ◽

Online Tool ◽

Bacterial Rna ◽

Powerful Approach ◽

Rrna Depletion ◽

And Function

A simple, cost-effective, and robust method for rRNA depletion in RNA-sequencing studies

10.1101/2020.01.06.896837 ◽

2020 ◽

Cited By ~ 4

Author(s):

Peter H. Culviner ◽

Chantal K. Guegler ◽

Michael T. Laub

Keyword(s):

Gene Expression ◽

Rna Sequencing ◽

Ribosomal Rna ◽

Bacterial Species ◽

Cost Effective ◽

Rna Seq ◽

Robust Method ◽

Total Rna ◽

Metagenomic Sample ◽

Rrna Depletion

Abstract The profiling of gene expression by RNA-sequencing (RNA-seq) has enabled powerful studies of global transcriptional patterns in all organisms, including bacteria. Because the vast majority of RNA in bacteria is ribosomal RNA (rRNA), it is standard practice to deplete the rRNA from a total RNA sample such that the reads in an RNA-seq experiment derive predominantly from mRNA. One of the most commonly used commercial kits for rRNA depletion, the Ribo-Zero kit from Illumina, was recently discontinued. Here, we report the development of a simple, cost-effective, and robust method for depleting rRNA that can be easily implemented by any lab or facility. We first developed an algorithm for designing biotinylated oligonucleotides that will hybridize tightly and specifically to the 23S, 16S, and 5S rRNAs from any species of interest. Precipitation of these oligonucleotides bound to rRNA by magnetic streptavidin beads then depletes rRNA from a complex, total RNA sample such that ~75-80% of reads in a typical RNA-seq experiment derive from mRNA. Importantly, we demonstrate a high correlation of RNA abundance or fold-change measurements in RNA-seq experiments between our method and the previously available Ribo-Zero kit. Complete details on the methodology are provided, including open-source software for designing oligonucleotides optimized for any bacterial species or metagenomic sample of interest. Importance The ability to examine global patterns of gene expression in microbes through RNA-sequencing has fundamentally transformed microbiology. However, RNA-seq depends critically on the removal of ribosomal RNA from total RNA samples. Otherwise, rRNA would comprise upwards of 90% of the reads in a typical RNA-seq experiment, limiting the reads coming from messenger RNA or requiring high total read depth. A commonly used kit for rRNA subtraction from Illumina was recently discontinued.
Here, we report the development of a ‘do-it-yourself’ kit for rapid, cost-effective, and robust depletion of rRNA from total RNA. We present an algorithm for designing biotinylated oligonucleotides that will hybridize to the rRNAs from a target set of species. We then demonstrate that the designed oligos enable sufficient rRNA depletion to produce RNA-seq data with 75-80% of reads coming from mRNA. The methodology presented should enable RNA-seq studies on any species or metagenomic sample of interest.

A Simple, Cost-Effective, and Robust Method for rRNA Depletion in RNA-Sequencing Studies

mBio ◽

10.1128/mbio.00010-20 ◽

2020 ◽

Vol 11 (2) ◽

Cited By ~ 9

Author(s):

Peter H. Culviner ◽

Chantal K. Guegler ◽

Michael T. Laub

Keyword(s):

Gene Expression ◽

Rna Sequencing ◽

Bacterial Species ◽

Extended Period ◽

Cost Effective ◽

Rna Seq ◽

Robust Method ◽

Total Rna ◽

Metagenomic Sample ◽

Rrna Depletion

ABSTRACT The profiling of gene expression by RNA sequencing (RNA-seq) has enabled powerful studies of global transcriptional patterns in all organisms, including bacteria. Because the vast majority of RNA in bacteria is rRNA, it is standard practice to deplete the rRNA from a total RNA sample such that the reads in an RNA-seq experiment derive predominantly from mRNA. One of the most commonly used commercial kits for rRNA depletion, the Ribo-Zero kit from Illumina, was recently discontinued abruptly and for an extended period of time. Here, we report the development of a simple, cost-effective, and robust method for depleting rRNA that can be easily implemented by any lab or facility. We first developed an algorithm for designing biotinylated oligonucleotides that will hybridize tightly and specifically to the 23S, 16S, and 5S rRNAs from any species of interest. Precipitation of these oligonucleotides bound to rRNA by magnetic streptavidin-coated beads then depletes rRNA from a complex, total RNA sample such that ∼75 to 80% of reads in a typical RNA-seq experiment derive from mRNA. Importantly, we demonstrate a high correlation of RNA abundance or fold change measurements in RNA-seq experiments between our method and the Ribo-Zero kit. Complete details on the methodology are provided, including open-source software for designing oligonucleotides optimized for any bacterial species or community of interest. IMPORTANCE The ability to examine global patterns of gene expression in microbes through RNA sequencing has fundamentally transformed microbiology. However, RNA-seq depends critically on the removal of rRNA from total RNA samples. Otherwise, rRNA would comprise upward of 90% of the reads in a typical RNA-seq experiment, limiting the reads coming from mRNA or requiring high total read depth. A commonly used kit for rRNA subtraction from Illumina was recently unavailable for an extended period of time, disrupting routine rRNA depletion. 
Here, we report the development of a “do-it-yourself” kit for rapid, cost-effective, and robust depletion of rRNA from total RNA. We present an algorithm for designing biotinylated oligonucleotides that will hybridize to the rRNAs from a target set of species. We then demonstrate that the designed oligonucleotides enable sufficient rRNA depletion to produce RNA-seq data with 75 to 80% of reads coming from mRNA. The methodology presented should enable RNA-seq studies on any species or metagenomic sample of interest.
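
The read-depth argument in the Importance statement can be made concrete with a little arithmetic. The Python sketch below compares the total reads needed when only 10% of reads are mRNA (rRNA at 90%) with the 75% mRNA fraction reported after depletion. The target of 5 million mRNA reads and the helper name are hypothetical, chosen only for illustration.

```python
def total_reads_needed(target_mrna_reads: int, mrna_percent: int) -> int:
    """Smallest total read count whose expected mRNA share meets the target.

    Uses integer arithmetic (ceiling division) to avoid floating-point
    rounding. mrna_percent is the percentage of reads that derive from mRNA.
    """
    if not 0 < mrna_percent <= 100:
        raise ValueError("mrna_percent must be in (0, 100]")
    return (target_mrna_reads * 100 + mrna_percent - 1) // mrna_percent


if __name__ == "__main__":
    target = 5_000_000  # hypothetical target, not a figure from the abstract

    undepleted = total_reads_needed(target, 10)  # rRNA at 90% of reads
    depleted = total_reads_needed(target, 75)    # lower bound of the 75 to 80% range

    print(f"without depletion: {undepleted:,} reads")
    print(f"with depletion:    {depleted:,} reads")
```

Under these assumptions the undepleted library needs 50,000,000 reads and the depleted one 6,666,667, which is the sense in which skipping depletion requires high total read depth.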

Gene Expression Imputation with Generative Adversarial Imputation Nets

10.1101/2020.06.09.141689 ◽

2020 ◽

Author(s):

Ramon Viñas ◽

Tiago Azevedo ◽

Eric R. Gamazon ◽

Pietro Liò

Keyword(s):

Gene Expression ◽

Large Scale ◽

Biological Significance ◽

Predictive Performance ◽

Cost Effective ◽

Rna Seq ◽

Comprehensive Collection ◽

Genomic Studies ◽

Biological Discovery ◽

Cancer Types

Abstract A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we present GAIN-GTEx, a method for gene expression imputation based on Generative Adversarial Imputation Networks. In order to increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We compare our model to several standard and state-of-the-art imputation methods and show that GAIN-GTEx is significantly superior in terms of predictive performance and runtime. Furthermore, our results indicate strong generalisation on RNA-Seq data from 3 cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.

Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies

ISRN Bioinformatics ◽

10.1155/2013/481545 ◽

2013 ◽

Vol 2013 ◽

pp. 1-8 ◽

Cited By ~ 9

Author(s):

Shanrong Zhao ◽

Kurt Prenger ◽

Lance Smith

Keyword(s):

Data Analysis ◽

Large Scale ◽

Scale Up ◽

Local Environment ◽

Transcriptome Profiling ◽

Cost Effective ◽

Rna Seq ◽

Practical Challenge ◽

Amazon Web Services ◽

Computational Resources

RNA-Seq is becoming a promising replacement for microarrays in transcriptome profiling and differential gene expression studies. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost-effective, and open-source tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of the box to process Illumina RNA-Seq datasets.
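
The figures quoted above imply a total for the 178-sample test, which the following Python sketch works out. It assumes the reported $3.50 average and the 6 to 8 hour range apply uniformly to every sample; the constant and function names are hypothetical, and the abstract itself does not report these totals.

```python
SAMPLES = 178
COST_PER_SAMPLE_CENTS = 350   # $3.50 average, held in cents to stay exact
HOURS_PER_SAMPLE = (6, 8)     # reported range for a 100-million-read sample


def batch_cost_cents(samples: int, cents_per_sample: int) -> int:
    """Total cost in cents, assuming the average cost applies to each sample."""
    return samples * cents_per_sample


def batch_compute_hours(samples: int, hours_range: tuple[int, int]) -> tuple[int, int]:
    """Summed per-sample processing hours (low, high) across the batch."""
    low, high = hours_range
    return samples * low, samples * high


if __name__ == "__main__":
    cents = batch_cost_cents(SAMPLES, COST_PER_SAMPLE_CENTS)
    low, high = batch_compute_hours(SAMPLES, HOURS_PER_SAMPLE)
    print(f"estimated total cost: ${cents // 100}.{cents % 100:02d}")
    print(f"summed processing time: {low} to {high} hours")
```

Because the samples are processed in parallel, the summed hours describe aggregate compute time rather than wall-clock time.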

Download Full-text

CAFU: a Galaxy framework for exploring unmapped RNA-Seq data

Briefings in Bioinformatics ◽

10.1093/bib/bbz018 ◽

2019 ◽

Vol 21 (2) ◽

pp. 676-686 ◽

Cited By ~ 5

Author(s):

Siyuan Chen ◽

Chengzhi Ren ◽

Jingjing Zhai ◽

Jiantao Yu ◽

Xuyang Zhao ◽

...

Keyword(s):

Large Scale ◽

Biological Information ◽

Machine Learning Techniques ◽

Data Sets ◽

Rna Seq ◽

Mixed Species ◽

Short Reads ◽

Comprehensive Collection ◽

Expression Characterization ◽

And Function

Abstract A widely used approach in transcriptome analysis is the alignment of short reads to a reference genome. However, owing to the deficiencies of specially designed analytical systems, short reads unmapped to the genome sequence are usually ignored, resulting in the loss of significant biological information and insights. To fill this gap, we present Comprehensive Assembly and Functional annotation of Unmapped RNA-Seq data (CAFU), a Galaxy-based framework that can facilitate the large-scale analysis of unmapped RNA sequencing (RNA-Seq) reads from single- and mixed-species samples. By taking advantage of machine learning techniques, CAFU addresses the issue of accurately identifying the species origin of transcripts assembled using unmapped reads from mixed-species samples. CAFU also represents an innovation in that it provides a comprehensive collection of functions required for transcript confidence evaluation, coding potential calculation, sequence and expression characterization and function annotation. These functions and their dependencies have been integrated into a Galaxy framework that provides access to CAFU via a user-friendly interface, dramatically simplifying complex exploration tasks involving unmapped RNA-Seq reads. CAFU has been validated with RNA-Seq data sets from wheat and Zea mays (maize) samples. CAFU is freely available via GitHub: https://github.com/cma2015/CAFU.

A streamlined, cost-effective, and specific method to deplete transcripts for RNA-seq

10.1101/2020.05.21.109033 ◽

2020 ◽

Cited By ~ 1

Author(s):

Amber Baldwin ◽

Adam R Morris ◽

Neelanjan Mukherjee

Keyword(s):

Cost Effective ◽

Circular Rnas ◽

Specific Method ◽

Rna Seq ◽

Gold Standard Method ◽

Wide Range ◽

Rrna Depletion ◽

Commercial Kits ◽

The Cost

RNA-sequencing is a powerful and increasingly prevalent method to answer biological questions. Depletion of ribosomal RNA (rRNA), which accounts for 80% of total RNA, is an extremely important step to increase the power of RNA-seq. Selection for polyadenylated RNA is a commonly used approach that excludes rRNA as well as important non-polyadenylated RNAs, such as histones, circular RNAs, and many long noncoding RNAs. Commercial methods to deplete rRNA are cost-prohibitive and the gold standard method is no longer available as a standalone kit. Alternative non-commercial methods suffer from inconsistent depletion. Through careful characterization of all reaction parameters, we developed an optimized RNaseH-based depletion of human rRNA. Our method exhibited comparable or better rRNA depletion compared to commercial kits at a fraction of the cost and across a wide range of input RNA amounts.

RiboRid: A low cost, advanced, and ultra-efficient method to remove ribosomal RNA for bacterial transcriptomics

PLoS Genetics ◽

10.1371/journal.pgen.1009821 ◽

2021 ◽

Vol 17 (9) ◽

pp. e1009821

Author(s):

Donghui Choe ◽

Richard Szubin ◽

Saugat Poudel ◽

Anand Sastry ◽

Yoseb Song ◽

...

Keyword(s):

Large Scale ◽

Low Cost ◽

Rnase H ◽

Ribosome Profiling ◽

Rna Seq ◽

Transcription Start Sites ◽

Main Challenge ◽

Rrna Depletion ◽

A Cell ◽

Transcriptomic Studies

RNA sequencing techniques have enabled the systematic elucidation of gene expression (RNA-Seq), transcription start sites (differential RNA-Seq), transcript 3′ ends (Term-Seq), and post-transcriptional processes (ribosome profiling). The main challenge of transcriptomic studies is to remove ribosomal RNAs (rRNAs), which comprise more than 90% of the total RNA in a cell. Here, we report a low-cost and robust bacterial rRNA depletion method, RiboRid, based on the enzymatic degradation of rRNA by thermostable RNase H. This method implemented experimental considerations to minimize nonspecific degradation of mRNA and is capable of depleting pre-rRNAs that often comprise a large portion of RNA, even after rRNA depletion. We demonstrated the highly efficient removal of rRNA up to a removal efficiency of 99.99% for various transcriptome studies, including RNA-Seq, Term-Seq, and ribosome profiling, with a cost of approximately $10 per sample. This method is expected to be a robust method for large-scale high-throughput bacterial transcriptomic studies.
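
To give a sense of scale for the figures above, the Python sketch below estimates the share of rRNA remaining in a sample after depletion. It assumes that "removal efficiency" means the fraction of the starting rRNA that is degraded and that non-rRNA is left untouched; both are simplifying assumptions for illustration, not definitions taken from the paper, and the function name is hypothetical.

```python
from fractions import Fraction


def residual_rrna_fraction(rrna_share: Fraction, removal_efficiency: Fraction) -> Fraction:
    """Share of rRNA in the sample after depletion.

    rrna_share: fraction of total RNA that is rRNA before depletion.
    removal_efficiency: fraction of that rRNA removed (assumed definition).
    Non-rRNA is assumed to be fully retained.
    """
    remaining_rrna = rrna_share * (1 - removal_efficiency)
    non_rrna = 1 - rrna_share
    return remaining_rrna / (remaining_rrna + non_rrna)


if __name__ == "__main__":
    # 90% rRNA before depletion, 99.99% removal efficiency.
    share = residual_rrna_fraction(Fraction(90, 100), Fraction(9999, 10000))
    print(f"residual rRNA share: {float(share):.4%}")
```

With these inputs the residual share is 9/10009 of the remaining RNA, that is, under 0.1%.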

Lasy-Seq: a high-throughput library preparation method for RNA-Seq and its application in the analysis of plant responses to fluctuating temperatures

10.1101/463596 ◽

2018 ◽

Author(s):

Mari Kamitani ◽

Makoto Kashima ◽

Ayumi Tezuka ◽

Atsushi J. Nagano

Keyword(s):

High Throughput ◽

Large Scale ◽

Preparation Method ◽

Low Cost ◽

Cost Effective ◽

Plant Responses ◽

Library Preparation ◽

Rna Seq ◽

Temperature Responses ◽

Library Preparation Method

Abstract RNA-Seq is a whole-transcriptome analysis method used to research biological mechanisms and functions; its use in large-scale experiments is limited by costs and labour. In this study, we established a high-throughput and cost-effective RNA-Seq library preparation method that did not require mRNA enrichment. The method adds unique index sequences to samples during reverse transcription (RT) that is conducted at a higher temperature (≥62°C) to suppress RT of A-rich sequences in rRNA, and then pools all samples into a single tube. Both single-read and paired-end sequencing of libraries are enabled. We found that the pooled RT products contained large amounts of RNA, mainly rRNA, and caused over-estimations of the quantity of DNA, resulting in unstable tagmentation results. Degradation of RNA before tagmentation was necessary for the stable preparation of libraries. We named this protocol low-cost and easy RNA-Seq (Lasy-Seq), and used it to investigate temperature responses in Arabidopsis thaliana. We analysed how sub-ambient temperatures (10–30°C) affected the plant transcriptomes, using time-courses of RNA-Seq from plants grown in randomly fluctuating temperature conditions. Our results suggest that there are diverse mechanisms behind plant temperature responses at different time scales.

A portable and cost-effective microfluidic system for massively parallel single-cell transcriptome profiling

10.1101/818450 ◽

2019 ◽

Cited By ~ 4

Author(s):

Chuanyu Liu ◽

Tao Wu ◽

Fei Fan ◽

Ya Liu ◽

Liang Wu ◽

...

Keyword(s):

Single Cell ◽

Large Scale ◽

Transcriptional Profiling ◽

Transcriptome Profiling ◽

Cost Effective ◽

Microfluidic System ◽

Resource Limited ◽

Cell Transcriptome ◽

Species Specific ◽

And Function

Abstract Single-cell technologies are becoming increasingly widespread and have been revolutionizing our understanding of cell identity, state, diversity and function. However, current platforms can be slow to apply to large-scale studies and resource-limited clinical arenas for a variety of reasons, including cost, infrastructure, sample quality and requirements. Here we report DNBelab C4 (C4), a negative pressure orchestrated, portable and cost-effective device that enables high-throughput single-cell transcriptional profiling. The C4 system can efficiently discriminate species-specific cells at high resolution and dissect tissue heterogeneity in different organs, such as murine lung and cerebral cortex. Finally, we show that the C4 system is comparable to existing platforms but has huge benefits in cost and portability and, as such, it will be of great interest for the wider scientific community.

Download Full-text