InterMineR: an R package for InterMine databases

Konstantinos A Kyritsis; Bing Wang; Julie Sullivan; Rachel Lyne; Gos Micklem

doi:10.1093/bioinformatics/btz039

BloodGen3Module: Blood transcriptional module repertoire analysis and visualization using R

Bioinformatics ◽

10.1093/bioinformatics/btab121 ◽

2021 ◽

Author(s):

Darawan Rinchai ◽

Jessica Roelands ◽

Mohammed Toufiq ◽

Wouter Hendrickx ◽

Matthew C Altman ◽

...

Keyword(s):

Transcript Abundance ◽

R Package ◽

Supplementary Information ◽

Illustrative Case ◽

Bioinformatic Tools ◽

Transcriptional Module ◽

Wide Range ◽

Downstream Analysis ◽

Computing Module ◽

Parallel Workflow

Abstract Motivation: We previously described the construction and characterization of generic and reusable blood transcriptional module repertoires. More recently we released a third iteration (“BloodGen3” module repertoire) that comprises 382 functionally annotated gene sets (modules) and encompasses 14,168 transcripts. Custom bioinformatic tools are needed to support downstream analysis, visualization and interpretation relying on such fixed module repertoires. Results: We have developed, and describe here, an R package, BloodGen3Module. The functions of our package permit group comparison analyses to be performed at the module level and the results to be displayed as annotated fingerprint grid plots. A parallel workflow for computing module repertoire changes for individual samples rather than groups of samples is also available; these results are displayed as fingerprint heatmaps. An illustrative case is used to demonstrate the steps involved in generating blood transcriptome repertoire fingerprints of septic patients. Taken together, this resource could facilitate the analysis and interpretation of changes in blood transcript abundance observed across a wide range of pathological and physiological states. Availability: The BloodGen3Module package and documentation are freely available from GitHub: https://github.com/Drinchai/BloodGen3Module Supplementary information: Supplementary data are available at Bioinformatics online.

RCy3: Network biology using Cytoscape from within R

F1000Research ◽

10.12688/f1000research.20887.3 ◽

2019 ◽

Vol 8 ◽

pp. 1774 ◽

Cited By ~ 1

Author(s):

Julia A. Gustavsen ◽

Shraddha Pai ◽

Ruth Isserlin ◽

Barry Demchak ◽

Alexander R. Pico

Keyword(s):

Shortest Path ◽

Future Development ◽

Enrichment Analysis ◽

Network Biology ◽

R Package ◽

Programming Environment ◽

R Packages ◽

R Programming ◽

Shortest Path Algorithms ◽

Rest Api

RCy3 is an R package in Bioconductor that communicates with Cytoscape via its REST API, providing access to the full feature set of Cytoscape from within the R programming environment. RCy3 has been redesigned to streamline its usage and future development as part of a broader Cytoscape Automation effort. Over 100 new functions have been added, including dozens of helper functions specifically for intuitive data overlay operations. Over 40 Cytoscape apps have implemented automation support so far, making hundreds of additional operations accessible via RCy3. Two-way conversion with networks from igraph and graph ensures interoperability with existing network biology workflows and dozens of other Bioconductor packages. These capabilities are demonstrated in a series of use cases involving public databases, enrichment analysis pipelines, shortest path algorithms and more. With RCy3, bioinformaticians will be able to quickly deliver reproducible network biology workflows as integrations of Cytoscape functions, complex custom analyses and other R packages.

primirTSS: an R package for identifying cell-specific microRNA transcription start sites

Bioinformatics ◽

10.1093/bioinformatics/btaa173 ◽

2020 ◽

Vol 36 (11) ◽

pp. 3605-3606

Author(s):

Pumin Li ◽

Qi Xu ◽

Xu Hua ◽

Zhongwei Xie ◽

Jie Li ◽

...

Keyword(s):

Conservation Score ◽

R Package ◽

Supplementary Information ◽

Transcription Start ◽

Pol Ii ◽

Transcription Start Sites ◽

Web Interfaces ◽

Multiple Datasets ◽

R Programming ◽

Programming Interfaces

Abstract Summary: The R/Bioconductor package primirTSS is a fast and convenient tool that implements an analytical method for identifying transcription start sites of microRNAs by integrating ChIP-seq data of H3K4me3 and Pol II. It further improves precision by employing the conservation score and sequence features. The tool showed good performance when using H3K4me3 or Pol II ChIP-seq data alone as input, which is convenient for applications where multiple datasets are hard to acquire. This flexible package provides both R programming interfaces and graphical web interfaces. Availability and implementation: primirTSS is available at: http://bioconductor.org/packages/primirTSS. The documentation of the package, including an accompanying tutorial, was deposited at: https://bioconductor.org/packages/release/bioc/vignettes/primirTSS/inst/doc/primirTSS.html. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

RCy3: Network biology using Cytoscape from within R

F1000Research ◽

10.12688/f1000research.20887.2 ◽

2019 ◽

Vol 8 ◽

pp. 1774 ◽

Cited By ~ 13

Author(s):

Julia A. Gustavsen ◽

Shraddha Pai ◽

Ruth Isserlin ◽

Barry Demchak ◽

Alexander R. Pico

Keyword(s):

Shortest Path ◽

Future Development ◽

Enrichment Analysis ◽

Network Biology ◽

R Package ◽

Programming Environment ◽

R Packages ◽

R Programming ◽

Shortest Path Algorithms ◽

Rest Api

RCy3 is an R package in Bioconductor that communicates with Cytoscape via its REST API, providing access to the full feature set of Cytoscape from within the R programming environment. RCy3 has been redesigned to streamline its usage and future development as part of a broader Cytoscape Automation effort. Over 100 new functions have been added, including dozens of helper functions specifically for intuitive data overlay operations. Over 40 Cytoscape apps have implemented automation support so far, making hundreds of additional operations accessible via RCy3. Two-way conversion with networks from igraph and graph ensures interoperability with existing network biology workflows and dozens of other Bioconductor packages. These capabilities are demonstrated in a series of use cases involving public databases, enrichment analysis pipelines, shortest path algorithms and more. With RCy3, bioinformaticians will be able to quickly deliver reproducible network biology workflows as integrations of Cytoscape functions, complex custom analyses and other R packages.

EnImpute: imputing dropout events in single-cell RNA-sequencing data via ensemble learning

Bioinformatics ◽

10.1093/bioinformatics/btz435 ◽

2019 ◽

Vol 35 (22) ◽

pp. 4827-4829 ◽

Cited By ~ 6

Author(s):

Xiao-Fei Zhang ◽

Le Ou-Yang ◽

Shuo Yang ◽

Xing-Ming Zhao ◽

Xiaohua Hu ◽

...

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

Ensemble Learning ◽

R Package ◽

Supplementary Information ◽

Sequencing Data ◽

Single Cell Rna Sequencing ◽

The Individual ◽

Downstream Analysis ◽

Shiny Application

Abstract Summary: Imputation of dropout events that may mislead downstream analyses is a key step in analyzing single-cell RNA-sequencing (scRNA-seq) data. We develop EnImpute, an R package that introduces an ensemble learning method for imputing dropout events in scRNA-seq data. EnImpute combines the results obtained from multiple imputation methods to generate a more accurate result. A Shiny application is developed to provide easier implementation and visualization. Experimental results show that EnImpute outperforms the individual state-of-the-art methods in almost all situations. EnImpute is useful for correcting noisy scRNA-seq data before performing downstream analysis. Availability and implementation: The R package and Shiny application are available through GitHub at https://github.com/Zhangxf-ccnu/EnImpute. Supplementary information: Supplementary data are available at Bioinformatics online.
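The idea of combining the outputs of several imputation methods can be illustrated with a small sketch. The abstract does not say how EnImpute merges the individual results, so the element-wise trimmed mean used here is an assumption, and the function names (`trimmed_mean`, `combine_imputations`) and the toy matrices are hypothetical. The sketch is written in Python for illustration and is not the package's R code.

```python
from typing import List

Matrix = List[List[float]]  # rows = genes, columns = cells


def trimmed_mean(values: List[float], trim: int) -> float:
    """Mean after dropping the `trim` smallest and `trim` largest values."""
    if 2 * trim >= len(values):
        raise ValueError("trim removes every value")
    kept = sorted(values)[trim:len(values) - trim]
    return sum(kept) / len(kept)


def combine_imputations(results: List[Matrix], trim: int = 0) -> Matrix:
    """Combine same-shaped imputed matrices entry by entry.

    Assumption: the consensus is an element-wise trimmed mean across methods.
    """
    n_rows, n_cols = len(results[0]), len(results[0][0])
    for m in results:
        if len(m) != n_rows or any(len(row) != n_cols for row in m):
            raise ValueError("all imputed matrices must share one shape")
    return [
        [trimmed_mean([m[i][j] for m in results], trim) for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Three hypothetical imputation outputs for a 2-gene x 2-cell matrix.
method_a = [[1.0, 0.0], [4.0, 2.0]]
method_b = [[2.0, 1.0], [4.0, 2.0]]
method_c = [[9.0, 2.0], [4.0, 5.0]]

# With three methods and trim=1, each entry becomes the median across methods,
# so the outlying 9.0 and 5.0 from method_c do not dominate the consensus.
consensus = combine_imputations([method_a, method_b, method_c], trim=1)
print(consensus)
```

Trimming is one simple way to keep a single badly behaved method from skewing the combined estimate; a plain mean (`trim=0`) would be the least robust choice.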

missMethyl: an R package for analyzing data from Illumina’s HumanMethylation450 platform

Bioinformatics ◽

10.1093/bioinformatics/btv560 ◽

2015 ◽

Vol 32 (2) ◽

pp. 286-288 ◽

Cited By ~ 190

Author(s):

Belinda Phipson ◽

Jovana Maksimovic ◽

Alicia Oshlack

Keyword(s):

Dna Methylation ◽

Cost Effective ◽

R Package ◽

Supplementary Information ◽

450K Array ◽

Illumina Humanmethylation450 ◽

Bioconductor Project ◽

Illumina Humanmethylation450 Beadchip ◽

Differential Variability ◽

Differential Methylation Analysis

Abstract Summary: DNA methylation is one of the most commonly studied epigenetic modifications due to its role in both disease and development. The Illumina HumanMethylation450 BeadChip is a cost-effective way to profile >450 000 CpGs across the human genome, making it a popular platform for profiling DNA methylation. Here we introduce missMethyl, an R package with a suite of tools for performing normalization, removal of unwanted variation in differential methylation analysis, differential variability testing and gene set analysis for the 450K array. Availability and implementation: missMethyl is an R package available from the Bioconductor project at www.bioconductor.org. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads

10.1101/377762 ◽

2018 ◽

Cited By ~ 7

Author(s):

Yang Liao ◽

Gordon K. Smyth ◽

Wei Shi

Keyword(s):

Rna Sequencing ◽

High Performance ◽

R Package ◽

Ease Of Use ◽

Rna Seq ◽

Counting Functions ◽

Unix Command ◽

R Programming ◽

Downstream Analysis ◽

True Values

Abstract: The first steps in the analysis of RNA sequencing (RNA-seq) data are usually to map the reads to a reference genome and then to count reads by gene, by exon or by exon-exon junction. These two steps are at once the most common and also typically the most expensive computational steps in an RNA-seq analysis. These steps are typically undertaken using Unix command-line or Python software tools, even when downstream analysis is to be undertaken using R. We present Rsubread, a Bioconductor software package that provides high-performance alignment and counting functions for RNA-seq reads. Rsubread provides the ease-of-use of the R programming environment, creating a matrix of read counts directly as an R object ready for downstream analysis. It has no software dependencies other than R itself. Using SEQC data and simulations, we compare Rsubread to the popular non-R tools TopHat2, STAR and HTSeq. We also compare to counting functions provided in the Bioconductor infrastructure packages. We show that Rsubread is faster, uses less memory and produces read count summaries that more accurately correlate with true values. The results show that users can adopt the R environment for alignment and quantification without suffering any loss of performance.
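The "count reads by gene" step that produces a matrix of read counts can be sketched in a few lines. This is a conceptual illustration in Python, not Rsubread's implementation or API: the interval representation, the rule of skipping reads that overlap zero or several genes, and the names `count_reads_by_gene` and `count_matrix` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end), 1-based inclusive coordinates (assumed convention)
Interval = Tuple[str, int, int]


def count_reads_by_gene(genes: Dict[str, Interval],
                        reads: List[Interval]) -> Dict[str, int]:
    """Count aligned reads per gene for one sample."""
    counts = {gene_id: 0 for gene_id in genes}
    for chrom, r_start, r_end in reads:
        hits = [
            gene_id
            for gene_id, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and r_start <= g_end and g_start <= r_end
        ]
        # Assumption: reads hitting no gene or several genes are not counted.
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts


def count_matrix(genes: Dict[str, Interval],
                 samples: Dict[str, List[Interval]]):
    """Return (gene ids, sample ids, genes x samples count matrix)."""
    gene_ids = sorted(genes)
    sample_ids = sorted(samples)
    per_sample = {s: count_reads_by_gene(genes, samples[s]) for s in sample_ids}
    matrix = [[per_sample[s][g] for s in sample_ids] for g in gene_ids]
    return gene_ids, sample_ids, matrix


# Hypothetical annotation and alignments.
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 180, 300)}
samples = {
    "s1": [("chr1", 110, 150), ("chr1", 190, 195), ("chr1", 250, 290)],
    "s2": [("chr1", 120, 160), ("chr2", 120, 160)],
}

gene_ids, sample_ids, matrix = count_matrix(genes, samples)
for gene_id, row in zip(gene_ids, matrix):
    print(gene_id, row)
```

The linear scan over genes for every read is fine for a toy example but far too slow for real data; production tools rely on indexed lookups, which is part of what the performance comparison in the abstract is about.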

RCy3: Network Biology using Cytoscape from within R

10.1101/793166 ◽

2019 ◽

Author(s):

Julia A. Gustavsen ◽

Shraddha Pai ◽

Ruth Isserlin ◽

Barry Demchak ◽

Alexander R. Pico

Keyword(s):

Shortest Path ◽

Future Development ◽

Enrichment Analysis ◽

Network Biology ◽

R Package ◽

Programming Environment ◽

R Packages ◽

R Programming ◽

Shortest Path Algorithms ◽

Rest Api

Abstract: RCy3 is an R package in Bioconductor that communicates with Cytoscape via its REST API, providing access to the full feature set of Cytoscape from within the R programming environment. RCy3 has been redesigned to streamline its usage and future development as part of a broader Cytoscape Automation effort. Over 100 new functions have been added, including dozens of helper functions specifically for intuitive data overlay operations. Over 40 Cytoscape apps have implemented automation support so far, making hundreds of additional operations accessible via RCy3. Two-way conversion with networks from igraph and graph ensures interoperability with existing network biology workflows and dozens of other Bioconductor packages. These capabilities are demonstrated in a series of use cases involving public databases, enrichment analysis pipelines, shortest path algorithms and more. With RCy3, bioinformaticians will be able to quickly deliver reproducible network biology workflows as integrations of Cytoscape functions, complex custom analyses and other R packages.

NewWave: a scalable R/Bioconductor package for the dimensionality reduction and batch effect removal of single-cell RNA-seq data

10.1101/2021.08.02.453487 ◽

2021 ◽

Author(s):

Federico Agostinis ◽

Chiara Romualdi ◽

Gabriele Sales ◽

Davide Risso

Keyword(s):

Dimensionality Reduction ◽

Single Cell ◽

R Package ◽

Batch Effect ◽

Supplementary Information ◽

Bioconductor Package ◽

Rna Seq ◽

Sequencing Data ◽

Bioconductor Project ◽

Single Cell Rna Sequencing

Summary: We present NewWave, a scalable R/Bioconductor package for the dimensionality reduction and batch effect removal of single-cell RNA sequencing data. To achieve scalability, NewWave uses mini-batch optimization and can work with out-of-memory data, enabling users to analyze datasets with millions of cells. Availability and implementation: NewWave is implemented as an open-source R package available through the Bioconductor project at https://bioconductor.org/packages/NewWave/ Supplementary information: Supplementary data are available at Bioinformatics online.

scRMD: imputation for single cell RNA-seq data via robust matrix decomposition

Bioinformatics ◽

10.1093/bioinformatics/btaa139 ◽

2020 ◽

Vol 36 (10) ◽

pp. 3156-3161 ◽

Cited By ~ 9

Author(s):

Chong Chen ◽

Changjing Wu ◽

Linjie Wu ◽

Xiaochen Wang ◽

Minghua Deng ◽

...

Keyword(s):

Data Analysis ◽

Single Cell ◽

Differential Expression ◽

Expression Analysis ◽

Differential Expression Analysis ◽

Matrix Decomposition ◽

Transcriptome Profiling ◽

R Package ◽

Supplementary Information ◽

Downstream Analysis

Abstract Motivation: Single-cell RNA-sequencing (scRNA-seq) technology enables whole-transcriptome profiling at single-cell resolution and holds great promise in many biological and medical applications. Nevertheless, scRNA-seq often fails to capture expressed genes, leading to the prominent dropout problem. These dropouts cause many problems in downstream analysis, such as a significant increase in noise, power loss in differential expression analysis and obscuring of gene-to-gene or cell-to-cell relationships. Imputation of these dropout values can be beneficial in scRNA-seq data analysis. Results: In this article, we model the dropout imputation problem as robust matrix decomposition. This model has minimal assumptions and allows us to develop a computationally efficient imputation method called scRMD. Extensive data analysis shows that scRMD can accurately recover the dropout values and help to improve downstream analysis such as differential expression analysis and clustering analysis. Availability and implementation: The R package scRMD is available at https://github.com/XiDsLab/scRMD. Supplementary information: Supplementary data are available at Bioinformatics online.
