RiboRid: A low cost, advanced, and ultra-efficient method to remove ribosomal RNA for bacterial transcriptomics

PLoS Genetics ◽  
2021 ◽  
Vol 17 (9) ◽  
pp. e1009821
Author(s):  
Donghui Choe ◽  
Richard Szubin ◽  
Saugat Poudel ◽  
Anand Sastry ◽  
Yoseb Song ◽  
...  

RNA sequencing techniques have enabled the systematic elucidation of gene expression (RNA-Seq), transcription start sites (differential RNA-Seq), transcript 3′ ends (Term-Seq), and post-transcriptional processes (ribosome profiling). The main challenge of transcriptomic studies is to remove ribosomal RNAs (rRNAs), which comprise more than 90% of the total RNA in a cell. Here, we report a low-cost and robust bacterial rRNA depletion method, RiboRid, based on the enzymatic degradation of rRNA by thermostable RNase H. This method implemented experimental considerations to minimize nonspecific degradation of mRNA and is capable of depleting pre-rRNAs that often comprise a large portion of RNA, even after rRNA depletion. We demonstrated the highly efficient removal of rRNA up to a removal efficiency of 99.99% for various transcriptome studies, including RNA-Seq, Term-Seq, and ribosome profiling, with a cost of approximately $10 per sample. This method is expected to be a robust method for large-scale high-throughput bacterial transcriptomic studies.

2021 ◽  
Vol 2115 (1) ◽  
pp. 012026
Author(s):  
Sonam Solanki ◽  
Gunendra Mahore

Abstract In the current process of producing vermicompost on a large scale, the main challenge is to keep the worms alive. This is achieved by maintaining the temperature and moisture of their living medium, which is difficult to do throughout the process. Currently, this is achieved by building infrastructure, but that approach requires a large initial investment and long-run maintenance, and it is limited to small-scale production. For large-scale production, a unit was developed that utilises natural airflow with water and automation. The main aim of this unit is to provide favourable conditions for worms in large-scale production with very low investment and minimal long-term maintenance. The key innovation of this research is that the technology used in the unit is intended to be practical and easy for small farmers to adopt. To ease long-term maintenance, the unit uses fewer parts.


Author(s):  
Calla L Telzrow ◽  
Paul J Zwack ◽  
Shannon Esher Righi ◽  
Fred S Dietrich ◽  
Cliburn Chan ◽  
...  

Abstract RNA sequencing (RNA-Seq) experiments focused on gene expression involve removal of ribosomal RNA (rRNA) because it is the major RNA constituent of cells. This process, called RNA enrichment, is done primarily to reduce cost: without rRNA removal, deeper sequencing must be performed to compensate for the sequencing reads wasted on rRNA. The ideal RNA enrichment method removes all rRNA without affecting other RNA in the sample. We tested the performance of three RNA enrichment methods on RNA isolated from Cryptococcus neoformans, a fungal pathogen of humans. We find that the RNase H depletion method is more efficient in depleting rRNA and more specific in recapitulating non-rRNA levels present in unenriched controls than the commonly-used Poly(A) isolation method. The RNase H depletion method is also more effective than the Ribo-Zero depletion method as measured by rRNA depletion efficiency and recapitulation of protein-coding RNA levels present in unenriched controls, while the Ribo-Zero depletion method more closely recapitulates annotated non-coding RNA (ncRNA) levels. Finally, we leverage these data to accurately map the C. neoformans mitochondrial rRNA genes, and also demonstrate that RNA-Seq data generated with the RNase H and Ribo-Zero depletion methods can be used to explore novel C. neoformans long non-coding RNA genes.
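The cost argument in this abstract can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the function name `total_reads_needed` and the example rRNA fractions are hypothetical, chosen only to show how the required sequencing depth grows as the rRNA fraction of a library increases.

```python
def total_reads_needed(target_non_rrna_reads: int, rrna_fraction: float) -> float:
    """Total reads to sequence so that the expected number of
    non-rRNA reads reaches the target.

    rrna_fraction is the fraction of library reads that map to rRNA
    (an assumed input, not a value reported in the abstract).
    """
    if not 0.0 <= rrna_fraction < 1.0:
        raise ValueError("rrna_fraction must be in [0, 1)")
    # Only (1 - rrna_fraction) of the reads are informative.
    return target_non_rrna_reads / (1.0 - rrna_fraction)


if __name__ == "__main__":
    target = 10_000_000  # hypothetical number of non-rRNA reads wanted
    for fraction in (0.9, 0.5, 0.0):  # illustrative rRNA fractions
        needed = total_reads_needed(target, fraction)
        print(f"rRNA fraction {fraction:.2f}: about {needed:,.0f} total reads")
```

Under these assumptions, a library that is 90% rRNA needs roughly ten times the sequencing depth of a fully depleted one to yield the same number of non-rRNA reads.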


2017 ◽  
Author(s):  
Bo-Hyun You ◽  
Sang-Ho Yoon ◽  
Jin-Wu Nam

Abstract The advent of high-throughput RNA-sequencing (RNA-seq) has led to the discovery of unprecedentedly immense transcriptomes encoded by eukaryotic genomes. However, the transcriptome maps are still incomplete, partly because they were mostly reconstructed based on RNA-seq reads that lack their orientations (known as unstranded reads) and certain boundary information. Methods to expand the usability of unstranded RNA-seq data by predetermining the orientation of the reads and precisely determining the boundaries of assembled transcripts could significantly benefit the quality of the resulting transcriptome maps. Here, we present a high-performing transcriptome assembly pipeline, called CAFE, that significantly improves the original assemblies, respectively assembled with stranded and/or unstranded RNA-seq data, by orienting unstranded reads using maximum likelihood estimation and by integrating information about transcription start sites and cleavage and polyadenylation sites. Applying large-scale transcriptomic data comprising ninety-nine billion RNA-seq reads from the ENCODE, human BodyMap projects, The Cancer Genome Atlas, and GTEx, CAFE enabled us to predict the directions of about eighty-nine billion unstranded reads, which led to the construction of more accurate transcriptome maps, comparable to the manually curated map, and a comprehensive lncRNA catalogue that includes thousands of novel lncRNAs. Our pipeline should not only help to build comprehensive, precise transcriptome maps from complex genomes but also to expand the universe of non-coding genomes.


2017 ◽  
Author(s):  
Bo Wang ◽  
Daniele Ramazzotti ◽  
Luca De Sano ◽  
Junjie Zhu ◽  
Emma Pierson ◽  
...  

Abstract Motivation: We here present SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework to learn a cell-to-cell similarity measure from single-cell RNA-seq data. SIMLR can be effectively used to perform tasks such as dimension reduction, clustering, and visualization of heterogeneous populations of cells. SIMLR was benchmarked against state-of-the-art methods for these three tasks on several public datasets, showing it to be scalable and capable of greatly improving clustering performance, as well as providing valuable insights by making the data more interpretable via a better visualization. Availability and Implementation: SIMLR is available on GitHub in both R and MATLAB implementations. Furthermore, it is also available as an R package. Contact: [email protected] or [email protected] Supplementary Information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Bo Li ◽  
Joshua Gould ◽  
Yiming Yang ◽  
Siranush Sarkizova ◽  
Marcin Tabaka ◽  
...  

Abstract Massively parallel single-cell and single-nucleus RNA-seq (sc/snRNA-seq) have opened the way to systematic tissue atlases in health and disease, but as the scale of data generation is growing, so does the need for computational pipelines for scaled analysis. Here, we developed Cumulus, a cloud-based framework for analyzing large-scale sc/snRNA-seq datasets. Cumulus combines the power of cloud computing with improvements in algorithm implementations to achieve high scalability, low cost, user-friendliness, and integrated support for a comprehensive set of features. We benchmark Cumulus on the Human Cell Atlas Census of Immune Cells dataset of bone marrow cells and show that it substantially improves efficiency over conventional frameworks, while maintaining or improving the quality of results, enabling large-scale studies.


2014 ◽  
Author(s):  
Jimin Song ◽  
Kevin C Chen

Recently, a wealth of epigenomic data has been generated by biochemical assays and next-generation sequencing (NGS) technologies. In particular, histone modification data generated by the ENCODE project and other large-scale projects show specific patterns associated with regulatory elements in the human genome. It is important to build a unified statistical model to decipher the patterns of multiple histone modifications in a cell type to annotate chromatin states such as transcription start sites, enhancers and transcribed regions rather than to map histone modifications individually to regulatory elements. Several genome-wide statistical models have been developed based on hidden Markov models (HMMs). These methods typically use the Expectation-Maximization (EM) algorithm to estimate the parameters of the model. Here we used spectral learning, a state-of-the-art parameter estimation algorithm in machine learning. We found that spectral learning plus a few (up to five) iterations of local optimization of the likelihood outperforms the standard EM algorithm. We also evaluated our software implementation called Spectacle on independent biological datasets and found that Spectacle annotated experimentally defined functional elements such as enhancers significantly better than a previous state-of-the-art method. Spectacle can be downloaded from https://github.com/jiminsong/Spectacle.


2019 ◽  
Vol 48 (4) ◽  
pp. e20-e20 ◽  
Author(s):  
Yiming Huang ◽  
Ravi U Sheth ◽  
Andrew Kaufman ◽  
Harris H Wang

Abstract Bacterial RNA sequencing (RNA-seq) is a powerful approach for quantitatively delineating the global transcriptional profiles of microbes in order to gain deeper understanding of their physiology and function. Cost-effective bacterial RNA-seq requires efficient physical removal of ribosomal RNA (rRNA), which otherwise dominates transcriptomic reads. However, current methods to effectively deplete rRNA of diverse non-model bacterial species are lacking. Here, we describe a probe and ribonuclease based strategy for bacterial rRNA removal. We implemented the method using either chemically synthesized oligonucleotides or amplicon-based single-stranded DNA probes and validated the technique on three novel gut microbiota isolates from three distinct phyla. We further showed that different probe sets can be used on closely related species. We provide a detailed methods protocol, probe sets for >5000 common microbes from RefSeq, and an online tool to generate custom probe libraries. This approach lays the groundwork for large-scale and cost-effective bacterial transcriptomics studies.


2021 ◽  
Author(s):  
Parashar Dhapola ◽  
Johan Rodhe ◽  
Rasmus Olofzon ◽  
Thomas Bonald ◽  
Eva Erlandsson ◽  
...  

The increasing capacity to perform large-scale single-cell genomic experiments continues to outpace the ability to efficiently handle growing datasets. Herein we present Scarf, a modularly designed Python package that seamlessly interoperates with other single-cell toolkits and allows for memory efficient single-cell analysis of millions of cells on a laptop or low-cost devices like single board computers. We demonstrate Scarf's memory and compute-time efficiency by applying it to the largest existing single-cell RNA-Seq and ATAC-Seq datasets. Scarf wraps memory efficient implementations of a graph-based t-stochastic neighbour embedding and hierarchical clustering algorithm. Moreover, Scarf performs accurate reference-anchored mapping of datasets while maintaining memory efficiency. By implementing a novel data downsampling algorithm, Scarf additionally has the capacity to generate representative sampling of cells from a given dataset wherein rare cell populations and lineage differentiation trajectories are conserved. Together, Scarf provides a framework wherein any researcher can perform advanced processing, downsampling, reanalysis and integration of atlas-scale datasets on standard laptop computers.


2018 ◽  
Author(s):  
Mari Kamitani ◽  
Makoto Kashima ◽  
Ayumi Tezuka ◽  
Atsushi J. Nagano

Abstract RNA-Seq is a whole-transcriptome analysis method used to research biological mechanisms and functions; its use in large-scale experiments is limited by costs and labour. In this study, we established a high-throughput and cost-effective RNA-Seq library preparation method that did not require mRNA enrichment. The method adds unique index sequences to samples during reverse transcription (RT), which is conducted at a higher temperature (≥62°C) to suppress RT of A-rich sequences in rRNA, and then pools all samples into a single tube. Both single-read and paired-end sequencing of libraries are enabled. We found that the pooled RT products contained large amounts of RNA, mainly rRNA, and caused over-estimations of the quantity of DNA, resulting in unstable tagmentation results. Degradation of RNA before tagmentation was necessary for the stable preparation of libraries. We named this protocol low-cost and easy RNA-Seq (Lasy-Seq), and used it to investigate temperature responses in Arabidopsis thaliana. We analysed how sub-ambient temperatures (10–30°C) affected the plant transcriptomes, using time-courses of RNA-Seq from plants grown in randomly fluctuating temperature conditions. Our results suggest that there are diverse mechanisms behind plant temperature responses at different time scales.


2019 ◽  
Author(s):  
Yiming Huang ◽  
Ravi U Sheth ◽  
Andrew Kaufman ◽  
Harris H Wang

Abstract Bacterial RNA sequencing (RNA-seq) is a powerful approach for quantitatively delineating the global transcriptional profiles of microbes in order to gain deeper understanding of their physiology and function. Cost-effective bacterial RNA-seq requires efficient physical removal of ribosomal RNA (rRNA), which otherwise dominates transcriptomic reads. However, current methods to effectively deplete rRNA of diverse non-model bacterial species are lacking. Here, we describe a probe and ribonuclease based strategy for bacterial rRNA removal. We implemented the method using either chemically synthesized oligonucleotides or amplicon-based single-stranded DNA probes and validated the technique on three novel gut microbiota isolates from three distinct phyla. We further showed that different probe sets can be used on closely related species. We provide a detailed methods protocol, probe sets for >5,000 common microbes from RefSeq, and an online tool to generate custom probe libraries. This approach lays the groundwork for large-scale and cost-effective bacterial transcriptomics studies.

