Peer Review #1 of "BingleSeq: a user-friendly R package for bulk and single-cell RNA-Seq data analysis (v0.1)"

AbstractSummaryMolecular heterogeneities bring great challenges for cancer diagnosis and treatment. Recent advance in single cell RNA-sequencing (scRNA-seq) technology make it possible to study cancer transcriptomic heterogeneities at single cell level. Here, we develop an R package named scCancer which focuses on processing and analyzing scRNA-seq data for cancer research. Except basic data processing steps, this package takes several special considerations for cancer-specific features. Firstly, the package introduced comprehensive quality control metrics. Secondly, it used a data-driven machine learning algorithm to accurately identify major cancer microenvironment cell populations. Thirdly, it estimated a malignancy score to classify malignant (cancerous) and non-malignant cells. Then, it analyzed intra-tumor heterogeneities by key cellular phenotypes (such as cell cycle and stemness) and gene signatures. Finally, a user-friendly graphic report was generated for all the analyses.Availabilityhttp://lifeome.net/software/sccancer/[email protected]

Download Full-text

BingleSeq: A user-friendly R package for Bulk and Single-cell RNA-Seq Data Analysis

10.1101/2020.06.16.148239 ◽

2020 ◽

Author(s):

Daniel Dimitrov ◽

Quan Gu

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

High Throughput Sequencing ◽

Gene Annotation ◽

Differential Expression Analysis ◽

Transcriptome Profiling ◽

R Package ◽

Rna Seq ◽

The Individual ◽

User Friendly

AbstractRNA sequencing is a high-throughput sequencing technique considered as an indispensable research tool used in a broad range of transcriptome analysis studies. The most common application of RNA Sequencing is Differential Expression analysis and it is used to determine genetic loci with distinct expression across different conditions. On the other hand, an emerging field called single-cell RNA sequencing is used for transcriptome profiling at the individual cell level. The standard protocols for both these types of analyses include the processing of sequencing libraries and result in the generation of count matrices. An obstacle to these analyses and the acquisition of meaningful results is that both require programming expertise.BingleSeq was developed as an intuitive application that provides a user-friendly solution for the analysis of count matrices produced by both Bulk and Single-cell RNA-Seq experiments. This was achieved by building an interactive dashboard-like user interface and incorporating three state-of-the-art software packages for each type of the aforementioned analyses, alongside additional features such as key visualisation techniques, functional gene annotation analysis and rank-based consensus for differential gene analysis results, among others. As a result, BingleSeq puts the best and most widely used packages and tools for RNA-Seq analyses at the fingertips of biologists with no programming experience.

Download Full-text

BingleSeq: a user-friendly R package for bulk and single-cell RNA-Seq data analysis

PeerJ ◽

10.7717/peerj.10469 ◽

2020 ◽

Vol 8 ◽

pp. e10469

Author(s):

Daniel Dimitrov ◽

Quan Gu

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

Differential Expression Analysis ◽

Transcriptome Profiling ◽

R Package ◽

Rna Seq ◽

Single Cell Rna Sequencing ◽

The Individual ◽

Visualization Techniques ◽

User Friendly

Background RNA sequencing is an indispensable research tool used in a broad range of transcriptome analysis studies. The most common application of RNA Sequencing is differential expression analysis and it is used to determine genetic loci with distinct expression across different conditions. An emerging field called single-cell RNA sequencing is used for transcriptome profiling at the individual cell level. The standard protocols for both of these approaches include the processing of sequencing libraries and result in the generation of count matrices. An obstacle to these analyses and the acquisition of meaningful results is that they require programing expertise. Although some effort has been directed toward the development of user-friendly RNA-Seq analysis analysis tools, few have the flexibility to explore both Bulk and single-cell RNA sequencing. Implementation BingleSeq was developed as an intuitive application that provides a user-friendly solution for the analysis of count matrices produced by both Bulk and Single-cell RNA-Seq experiments. This was achieved by building an interactive dashboard-like user interface which incorporates three state-of-the-art software packages for each type of the aforementioned analyses. Furthermore, BingleSeq includes additional features such as visualization techniques, extensive functional annotation analysis and rank-based consensus for differential gene analysis results. As a result, BingleSeq puts some of the best reviewed and most widely used packages and tools for RNA-Seq analyses at the fingertips of biologists with no programing experience. Availability BingleSeq is as an easy-to-install R package available on GitHub at https://github.com/dbdimitrov/BingleSeq/.

Download Full-text

LTMG: a novel statistical modeling of transcriptional expression states in single-cell RNA-Seq data

Nucleic Acids Research ◽

10.1093/nar/gkz655 ◽

2019 ◽

Vol 47 (18) ◽

pp. e111-e111 ◽

Cited By ~ 12

Author(s):

Changlin Wan ◽

Wennan Chang ◽

Yu Zhang ◽

Fenil Shah ◽

Xiaoyu Lu ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Single Cells ◽

Cell Types ◽

R Package ◽

Data Sets ◽

Rna Seq ◽

Mrna Metabolism ◽

Transcriptional Regulatory ◽

User Friendly

Abstract A key challenge in modeling single-cell RNA-seq data is to capture the diversity of gene expression states regulated by different transcriptional regulatory inputs across individual cells, which is further complicated by largely observed zero and low expressions. We developed a left truncated mixture Gaussian (LTMG) model, from the kinetic relationships of the transcriptional regulatory inputs, mRNA metabolism and abundance in single cells. LTMG infers the expression multi-modalities across single cells, meanwhile, the dropouts and low expressions are treated as left truncated. We demonstrated that LTMG has significantly better goodness of fitting on an extensive number of scRNA-seq data, comparing to three other state-of-the-art models. Our biological assumption of the low non-zero expressions, rationality of the multimodality setting, and the capability of LTMG in extracting expression states specific to cell types or functions, are validated on independent experimental data sets. A differential gene expression test and a co-regulation module identification method are further developed. We experimentally validated that our differential expression test has higher sensitivity and specificity, compared with other five popular methods. The co-regulation analysis is capable of retrieving gene co-regulation modules corresponding to perturbed transcriptional regulations. A user-friendly R package with all the analysis power is available at https://github.com/zy26/LTMGSCA.

Download Full-text

Single-cell mapper (scMappR): using scRNA-seq to infer cell-type specificities of differentially expressed genes

10.1101/2020.08.24.265298 ◽

2020 ◽

Author(s):

Dustin J. Sokolowski ◽

Mariela Faykoo-Martinez ◽

Lauren Erdman ◽

Huayun Hou ◽

Cadia Chan ◽

...

Keyword(s):

Single Cell ◽

Differential Expression ◽

Differentially Expressed Genes ◽

Cell Types ◽

R Package ◽

Differentially Expressed ◽

Rna Seq ◽

Kidney Regeneration ◽

Cell Type ◽

User Friendly

AbstractRNA sequencing (RNA-seq) is widely used to identify differentially expressed genes (DEGs) and reveal biological mechanisms underlying complex biological processes. RNA-seq is often performed on heterogeneous samples and the resulting DEGs do not necessarily indicate the cell types where the differential expression occurred. While single-cell RNA-seq (scRNA-seq) methods solve this problem, technical and cost constraints currently limit its widespread use. Here we present single cell Mapper (scMappR), a method that assigns cell-type specificity scores to DEGs obtained from bulk RNA-seq by integrating cell-type expression data generated by scRNA-seq and existing deconvolution methods. After benchmarking scMappR using RNA-seq data obtained from sorted blood cells, we asked if scMappR could reveal known cell-type specific changes that occur during kidney regeneration. We found that scMappR appropriately assigned DEGs to cell-types involved in kidney regeneration, including a relatively small proportion of immune cells. While scMappR can work with any user supplied scRNA-seq data, we curated scRNA-seq expression matrices for ∼100 human and mouse tissues to facilitate its use with bulk RNA-seq data alone. Overall, scMappR is a user-friendly R package that complements traditional differential expression analysis available at CRAN.HighlightsscMappR integrates scRNA-seq and bulk RNA-seq to re-calibrate bulk differentially expressed genes (DEGs).scMappR correctly identified immune-cell expressed DEGs from a bulk RNA-seq analysis of mouse kidney regeneration.scMappR is deployed as a user-friendly R package available at CRAN.

Download Full-text

SC2MeNetDrug: A computational tool to uncover inter-cell signaling targets and identify relevant drugs based on single cell RNA-seq data

10.1101/2021.11.15.468755 ◽

2021 ◽

Author(s):

Jiarui Feng ◽

Simon Peter Goedegebuure ◽

Amanda Zeng ◽

Ye Bi ◽

Ting Wang ◽

...

Keyword(s):

Data Analysis ◽

Cell Signaling ◽

Single Cell ◽

Communication Networks ◽

Cell Types ◽

Rna Seq ◽

Computational Tool ◽

User Friendly ◽

Effective Drugs ◽

Cell Cell

Single cell RNA sequencing (scRNA-seq) is a powerful technology to investigate the transcriptional programs in stromal, immune and tumor cells or neuron cells within the tumor or Alzheimer's Disease (AD) brain microenvironment (ME) or niche. Cell-cell communications within ME play important roles in disease progression and immunotherapy response, and are novel and critical therapeutic targets. Though many tools of scRNA-seq analysis have been developed to investigate the heterogeneity and sub-populations of cells, few were designed for uncovering cell-cell communications of ME and predict the potentially effective drugs to inhibit the communications. Moreover, the data analysis processes of discovering signaling communication networks and effective drugs using scRNA-seq data are complex and involving a set of critical analysis processes and external supportive data resources, which are difficult for researchers who have no strong computational background and training in scRNA-seq data analysis. To address these challenges, in this study, we developed a novel computational tool, SC2MeNetDrug (https://fuhaililab.github.io/sc2MeNetDrug/). It was specifically designed using scRNA-seq data to identify cell types within MEs, uncover the dysfunctional signaling pathways within individual cell types, inter-cell signaling communications, and predict effective drugs that can potentially disrupt cell-cell signaling communications. SC2MeNetDrug provided a user-friendly graphical user interface to encapsulate the data analysis modules, which can facilitate the scRNA-seq data based-discovery of novel inter-cell signaling communications and novel therapeutic regimens.

Download Full-text

User-friendly, scalable tools and workflows for single-cell analysis

10.1101/2020.04.08.032698 ◽

2020 ◽

Cited By ~ 3

Author(s):

P. Moreno ◽

N. Huang ◽

J.R. Manning ◽

S. Mohammed ◽

A. Solovyev ◽

...

Keyword(s):

Data Analysis ◽

Single Cell ◽

Programming Languages ◽

Single Cell Analysis ◽

Command Line ◽

Cell Analysis ◽

Rna Seq ◽

Interactive Analysis ◽

User Friendly ◽

Analysis Environment

AbstractSingle-cell RNA-Seq (scRNA-Seq) data analysis requires expertise in command-line tools, programming languages and scaling on compute infrastructure. As scRNA-Seq becomes widespread, computational pipelines need to be more accessible, simpler and scalable. We introduce an interactive analysis environment for scRNA-Seq, based on Galaxy, with ~70 functions from major single-cell analysis tools, which can be run on compute clusters, cloud providers or single machines, to bring compute to the data in scRNA-Seq.

Download Full-text

OEFinder: A user interface to identify and visualize ordering effects in single-cell RNA-seq data

10.1101/025437 ◽

2015 ◽

Cited By ~ 1

Author(s):

Ning Leng ◽

Jeea Choi ◽

Li-Fang Chu ◽

James Thomson ◽

Christina Kendziorski ◽

...

Keyword(s):

Gene Expression ◽

User Interface ◽

Single Cell ◽

Statistical Method ◽

R Package ◽

Data Sets ◽

Graphical Interface ◽

Rna Seq ◽

Ordering Effects ◽

User Friendly

A recent paper identified an artifact in multiple single-cell RNA-seq (scRNA-seq) data sets generated by the Fluidigm C1 platform. Specifically, Leng* et al. showed significantly increased gene expression in cells captured from sites with small or large plate output IDs. We refer to this artifact as an ordering effect (OE). Including OE genes in downstream analyses could lead to biased results. To address this problem, we developed a statistical method and software called OEFinder to identify a sorted list of OE genes. OEFinder is available as an R package along with user-friendly graphical interface implementations that allows users to check for potential artifacts in scRNA-seq data generated by the Fluidigm C1 platform.

Download Full-text

ExperimentSubset: an R package to manage subsets of Bioconductor Experiment objects

Bioinformatics ◽

10.1093/bioinformatics/btab179 ◽

2021 ◽

Author(s):

Irzam Sarfraz ◽

Muhammad Asif ◽

Joshua D Campbell

Keyword(s):

Single Cell ◽

R Package ◽

Poor Quality ◽

Data Matrix ◽

Supplementary Information ◽

Data Provenance ◽

Rna Seq ◽

Efficient Management ◽

The Matrix ◽

The Relationship

Abstract Motivation R Experiment objects such as the SummarizedExperiment or SingleCellExperiment are data containers for storing one or more matrix-like assays along with associated row and column data. These objects have been used to facilitate the storage and analysis of high-throughput genomic data generated from technologies such as single-cell RNA sequencing. One common computational task in many genomics analysis workflows is to perform subsetting of the data matrix before applying down-stream analytical methods. For example, one may need to subset the columns of the assay matrix to exclude poor-quality samples or subset the rows of the matrix to select the most variable features. Traditionally, a second object is created that contains the desired subset of assay from the original object. However, this approach is inefficient as it requires the creation of an additional object containing a copy of the original assay and leads to challenges with data provenance. Results To overcome these challenges, we developed an R package called ExperimentSubset, which is a data container that implements classes for efficient storage and streamlined retrieval of assays that have been subsetted by rows and/or columns. These classes are able to inherently provide data provenance by maintaining the relationship between the subsetted and parent assays. We demonstrate the utility of this package on a single-cell RNA-seq dataset by storing and retrieving subsets at different stages of the analysis while maintaining a lower memory footprint. Overall, the ExperimentSubset is a flexible container for the efficient management of subsets. Availability and implementation ExperimentSubset package is available at Bioconductor: https://bioconductor.org/packages/ExperimentSubset/ and Github: https://github.com/campbio/ExperimentSubset. Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text

DIscBIO: A User-Friendly Pipeline for Biomarker Discovery in Single-Cell Transcriptomics

International Journal of Molecular Sciences ◽

10.3390/ijms22031399 ◽

2021 ◽

Vol 22 (3) ◽

pp. 1399

Author(s):

Salim Ghannoum ◽

Waldir Leoncio Netto ◽

Damiano Fantini ◽

Benjamin Ragan-Kelley ◽

Amirabbas Parizadeh ◽

...

Keyword(s):

Single Cell ◽

Biomarker Discovery ◽

Enrichment Analysis ◽

Myxoid Liposarcoma ◽

R Package ◽

Differential Analysis ◽

A Cell ◽

Reproducible Analysis ◽

Transcriptomic Level ◽

User Friendly

The growing attention toward the benefits of single-cell RNA sequencing (scRNA-seq) is leading to a myriad of computational packages for the analysis of different aspects of scRNA-seq data. For researchers without advanced programing skills, it is very challenging to combine several packages in order to perform the desired analysis in a simple and reproducible way. Here we present DIscBIO, an open-source, multi-algorithmic pipeline for easy, efficient and reproducible analysis of cellular sub-populations at the transcriptomic level. The pipeline integrates multiple scRNA-seq packages and allows biomarker discovery with decision trees and gene enrichment analysis in a network context using single-cell sequencing read counts through clustering and differential analysis. DIscBIO is freely available as an R package. It can be run either in command-line mode or through a user-friendly computational pipeline using Jupyter notebooks. We showcase all pipeline features using two scRNA-seq datasets. The first dataset consists of circulating tumor cells from patients with breast cancer. The second one is a cell cycle regulation dataset in myxoid liposarcoma. All analyses are available as notebooks that integrate in a sequential narrative R code with explanatory text and output data and images. R users can use the notebooks to understand the different steps of the pipeline and will guide them to explore their scRNA-seq data. We also provide a cloud version using Binder that allows the execution of the pipeline without the need of downloading R, Jupyter or any of the packages used by the pipeline. The cloud version can serve as a tutorial for training purposes, especially for those that are not R users or have limited programing skills. However, in order to do meaningful scRNA-seq analyses, all users will need to understand the implemented methods and their possible options and limitations.

Download Full-text