BloodGen3Module: Blood transcriptional module repertoire analysis and visualization using R

Author(s):  
Darawan Rinchai ◽  
Jessica Roelands ◽  
Mohammed Toufiq ◽  
Wouter Hendrickx ◽  
Matthew C Altman ◽  
...  

Abstract Motivation We previously described the construction and characterization of generic and reusable blood transcriptional module repertoires. More recently we released a third iteration (“BloodGen3” module repertoire) that comprises 382 functionally annotated gene sets (modules) and encompasses 14,168 transcripts. Custom bioinformatic tools are needed to support downstream analysis, visualization and interpretation relying on such fixed module repertoires. Results We have developed and describe here an R package, BloodGen3Module. The functions of our package permit group comparison analyses to be performed at the module level and the results to be displayed as annotated fingerprint grid plots. A parallel workflow for computing module repertoire changes for individual samples rather than groups of samples is also available; these results are displayed as fingerprint heatmaps. An illustrative case is used to demonstrate the steps involved in generating blood transcriptome repertoire fingerprints of septic patients. Taken together, this resource could facilitate the analysis and interpretation of changes in blood transcript abundance observed across a wide range of pathological and physiological states. Availability The BloodGen3Module package and documentation are freely available from GitHub: https://github.com/Drinchai/BloodGen3Module Supplementary information Supplementary data are available at Bioinformatics online.

Author(s):  
Xiaofan Lu ◽  
Jialin Meng ◽  
Yujie Zhou ◽  
Liyun Jiang ◽  
Fangrong Yan

Abstract Summary Stratification of cancer patients into distinct molecular subgroups based on multi-omics data is an important issue in the context of precision medicine. Here, we present MOVICS, an R package for multi-omics integration and visualization in cancer subtyping. MOVICS provides a unified interface for 10 state-of-the-art multi-omics integrative clustering algorithms, and incorporates the most commonly used downstream analyses in cancer subtyping research, including characterization and comparison of identified subtypes from multiple perspectives, and verification of subtypes in an external cohort using two model-free approaches for multiclass prediction. MOVICS also creates feature-rich, customizable visualizations with minimal effort. By analysing two published breast cancer cohorts, we show that MOVICS can serve a wide range of users and assist cancer therapy by moving away from the ‘one-size-fits-all’ approach to patient care. Availability and implementation The MOVICS package and online tutorial are freely available at https://github.com/xlucpu/MOVICS. Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (22) ◽  
pp. 4827-4829 ◽  
Author(s):  
Xiao-Fei Zhang ◽  
Le Ou-Yang ◽  
Shuo Yang ◽  
Xing-Ming Zhao ◽  
Xiaohua Hu ◽  
...  

Abstract Summary Imputation of dropout events that may mislead downstream analyses is a key step in analyzing single-cell RNA-sequencing (scRNA-seq) data. We develop EnImpute, an R package that introduces an ensemble learning method for imputing dropout events in scRNA-seq data. EnImpute combines the results obtained from multiple imputation methods to generate a more accurate result. A Shiny application is developed to provide easier implementation and visualization. Experimental results show that EnImpute outperforms the individual state-of-the-art methods in almost all situations. EnImpute is useful for correcting noisy scRNA-seq data before performing downstream analysis. Availability and implementation The R package and Shiny application are available through GitHub at https://github.com/Zhangxf-ccnu/EnImpute. Supplementary information Supplementary data are available at Bioinformatics online.
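
To make the ensemble idea concrete, the sketch below combines several imputed expression matrices element by element using a trimmed mean. The trimmed mean is an assumption chosen for illustration and is not presented as EnImpute's actual combination rule; the toy matrices and the function names `trimmed_mean` and `combine_imputations` are hypothetical.

```python
def trimmed_mean(values, trim=1):
    """Mean after dropping the `trim` smallest and `trim` largest values."""
    if len(values) <= 2 * trim:
        raise ValueError("need more than 2 * trim values")
    kept = sorted(values)[trim:len(values) - trim]
    return sum(kept) / len(kept)


def combine_imputations(matrices, trim=1):
    """Element-wise trimmed mean across same-shaped gene-by-cell matrices."""
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    for m in matrices:
        if len(m) != n_rows or any(len(row) != n_cols for row in m):
            raise ValueError("all matrices must have the same shape")
    return [
        [trimmed_mean([m[i][j] for m in matrices], trim) for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical outputs of four imputation methods for 2 genes x 2 cells.
method_outputs = [
    [[1.0, 0.0], [4.0, 2.0]],
    [[1.2, 0.5], [4.0, 2.5]],
    [[0.8, 0.7], [4.0, 3.5]],
    [[9.0, 0.6], [4.0, 3.0]],  # one method produces an outlier at [0][0]
]

consensus = combine_imputations(method_outputs)
```

Trimming before averaging keeps a single badly behaved method (the 9.0 above) from dominating the consensus value, which is one plausible reason to prefer it over a plain mean.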


2020 ◽  
Author(s):  
Darawan Rinchai ◽  
Jessica Roelands ◽  
Wouter Hendrickx ◽  
Matthew C. Altman ◽  
Davide Bedognetti ◽  
...  

Abstract Transcriptional modules have been widely used for the analysis, visualization and interpretation of transcriptome data. We have previously described the construction and characterization of generic and reusable blood transcriptional module repertoires. The third and latest version that we have recently made available comprises 382 functionally annotated gene sets (modules) and encompasses 14,168 transcripts. We developed R scripts for performing module repertoire analyses and custom fingerprint visualization. These are made available here along with detailed descriptions. An illustrative public transcriptome dataset and corresponding intermediate output files are also included as supplementary material. Briefly, module repertoire analysis and visualization involve three steps. First, the gene expression data matrix is annotated with module membership information. Second, statistical tests are run to determine, for each module, the proportion of its constituent genes that are differentially expressed. Third, the results are expressed “at the module level” as the percentage of genes increased or decreased and plotted in a fingerprint grid format. A parallel workflow has been developed for computing module repertoire changes for individual samples rather than groups of samples. Such results are plotted in a heatmap format. The use case that is presented illustrates the steps involved in the generation of blood transcriptome repertoire fingerprints of septic patients at both group and individual levels.
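
The three steps described in this abstract can be sketched in a few lines. The following Python sketch is illustrative only and is not the authors' R code: the module IDs, gene names and fold-change values are made up, and a simple log2 fold-change cutoff stands in for the statistical tests of the second step.

```python
from collections import defaultdict

# Hypothetical module membership (step 1): gene -> module ID.
MODULE_OF = {
    "GENE_A": "M1", "GENE_B": "M1", "GENE_C": "M1", "GENE_D": "M1",
    "GENE_E": "M2", "GENE_F": "M2",
}

# Hypothetical per-gene log2 fold changes (cases vs. controls).
LOG2_FC = {
    "GENE_A": 1.8, "GENE_B": 1.2, "GENE_C": -0.1, "GENE_D": -1.5,
    "GENE_E": 0.2, "GENE_F": -2.0,
}


def call_direction(log2_fc, cutoff=1.0):
    """Step 2 stand-in: +1 if increased, -1 if decreased, 0 otherwise.

    A real analysis would use a statistical test with multiple-testing
    correction; the fixed cutoff here is an assumption for illustration.
    """
    if log2_fc >= cutoff:
        return 1
    if log2_fc <= -cutoff:
        return -1
    return 0


def module_response(module_of, log2_fc, cutoff=1.0):
    """Step 3: percent of genes increased and decreased in each module."""
    counts = defaultdict(lambda: {"up": 0, "down": 0, "total": 0})
    for gene, module in module_of.items():
        if gene not in log2_fc:
            continue  # gene not measured in this dataset
        direction = call_direction(log2_fc[gene], cutoff)
        counts[module]["total"] += 1
        if direction > 0:
            counts[module]["up"] += 1
        elif direction < 0:
            counts[module]["down"] += 1
    return {
        module: {
            "pct_up": 100.0 * c["up"] / c["total"],
            "pct_down": 100.0 * c["down"] / c["total"],
        }
        for module, c in counts.items()
    }


if __name__ == "__main__":
    for module, pct in sorted(module_response(MODULE_OF, LOG2_FC).items()):
        print(module, pct)
```

Each module ends up with one pair of percentages, which is the kind of per-module summary that a fingerprint grid plot would then display.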


2020 ◽  
Vol 36 (10) ◽  
pp. 3156-3161 ◽  
Author(s):  
Chong Chen ◽  
Changjing Wu ◽  
Linjie Wu ◽  
Xiaochen Wang ◽  
Minghua Deng ◽  
...  

Abstract Motivation Single-cell RNA-sequencing (scRNA-seq) technology enables whole-transcriptome profiling at single-cell resolution and holds great promise in many biological and medical applications. Nevertheless, scRNA-seq often fails to capture expressed genes, leading to the prominent dropout problem. These dropouts cause many problems in downstream analysis, such as a significant increase in noise, power loss in differential expression analysis and obscuring of gene-to-gene or cell-to-cell relationships. Imputation of these dropout values can be beneficial in scRNA-seq data analysis. Results In this article, we model the dropout imputation problem as robust matrix decomposition. This model has minimal assumptions and allows us to develop a computationally efficient imputation method called scRMD. Extensive data analysis shows that scRMD can accurately recover the dropout values and help to improve downstream analysis such as differential expression analysis and clustering analysis. Availability and implementation The R package scRMD is available at https://github.com/XiDsLab/scRMD. Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (19) ◽  
pp. 3567-3575 ◽  
Author(s):  
Anna M Plantinga ◽  
Jun Chen ◽  
Robert R Jenq ◽  
Michael C Wu

Abstract Motivation The human microbiome is notoriously variable across individuals, with a wide range of ‘healthy’ microbiomes. Paired and longitudinal studies of the microbiome have become increasingly popular as a way to reduce unmeasured confounding and to increase statistical power by reducing large inter-subject variability. Statistical methods for analyzing such datasets are scarce. Results We introduce a paired UniFrac dissimilarity that summarizes within-individual (or within-pair) shifts in microbiome composition and then compares these compositional shifts across individuals (or pairs). This dissimilarity depends on a novel transformation of relative abundances, which we then extend to more than two time points and incorporate into several phylogenetic and non-phylogenetic dissimilarities. The data transformation and resulting dissimilarities may be used in a wide variety of downstream analyses, including ordination analysis and distance-based hypothesis testing. Simulations demonstrate that tests based on these dissimilarities retain appropriate type 1 error and high power. We apply the method in two real datasets. Availability and implementation The R package pldist is available on GitHub at https://github.com/aplantin/pldist. Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (17) ◽  
pp. 3206-3207 ◽  
Author(s):  
Konstantinos A Kyritsis ◽  
Bing Wang ◽  
Julie Sullivan ◽  
Rachel Lyne ◽  
Gos Micklem

Abstract Summary InterMineR is a package designed to provide a flexible interface between the R programming environment and biological databases built using the InterMine platform. The package offers access to the flexible query builder and the library of term enrichment tools of the InterMine framework, as well as interoperability with other Bioconductor packages. This facilitates automation of data retrieval tasks as well as downstream analysis with existing statistical tools in the R environment. Availability and implementation InterMineR is free and open source, released under the LGPL licence and available from the Bioconductor project and Github (https://bioconductor.org/packages/release/bioc/html/InterMineR.html, https://github.com/intermine/interMineR). Supplementary information Supplementary data are available at Bioinformatics online.


2016 ◽  
Author(s):  
Gregory W. Vurture ◽  
Fritz J. Sedlazeck ◽  
Maria Nattestad ◽  
Charles J. Underwood ◽  
Han Fang ◽  
...  

Abstract Summary GenomeScope is an open-source web tool to rapidly estimate the overall characteristics of a genome, including genome size, heterozygosity rate, and repeat content from unprocessed short reads. These features are essential for studying genome evolution, and help to choose parameters for downstream analysis. We demonstrate its accuracy on 324 simulated and 16 real datasets with a wide range in genome sizes, heterozygosity levels, and error rates. Availability and Implementation http://genomescope.org, https://github.com/schatzlab/[email protected] Supplementary information Supplementary data are available at Bioinformatics online.
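
As a rough illustration of how a k-mer profile from short reads carries genome-size information, the sketch below uses a common back-of-the-envelope estimate: the total number of k-mers counted, divided by the coverage at the main histogram peak. This is deliberately much simpler than GenomeScope and is not its model; the histogram values, the error cutoff and the function name `estimate_genome_size` are assumptions made for the example.

```python
def estimate_genome_size(histogram, min_coverage=5):
    """Estimate genome size from a k-mer histogram.

    `histogram` maps coverage (how many times a k-mer was seen) to the
    number of distinct k-mers with that coverage. K-mers below
    `min_coverage` are treated as sequencing errors and ignored; this
    cutoff is an assumption and would need tuning on real data.
    """
    kept = {cov: n for cov, n in histogram.items() if cov >= min_coverage}
    if not kept:
        raise ValueError("no k-mers at or above min_coverage")
    # Coverage value shared by the largest number of distinct k-mers.
    peak_coverage = max(kept, key=kept.get)
    # Total k-mer occurrences among the retained (non-error) k-mers.
    total_kmers = sum(cov * n for cov, n in kept.items())
    return total_kmers / peak_coverage, peak_coverage


# Hypothetical histogram: error k-mers at low coverage, main peak at 20x.
toy_histogram = {1: 5000, 2: 800, 18: 100, 19: 300, 20: 500, 21: 300, 22: 100}

size, peak = estimate_genome_size(toy_histogram)
```

For heterozygous genomes the histogram typically has more than one peak, which this single-peak estimate ignores.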


Author(s):  
Nima Mousavi ◽  
Jonathan Margoliash ◽  
Neha Pusarla ◽  
Shubham Saini ◽  
Richard Yanicky ◽  
...  

Abstract Summary A rich set of tools have recently been developed for performing genome-wide genotyping of tandem repeats (TRs). However, standardized tools for downstream analysis of these results are lacking. To facilitate TR analysis applications, we present TRTools, a Python library and suite of command line tools for filtering, merging and quality control of TR genotype files. TRTools utilizes an internal harmonization module, making it compatible with outputs from a wide range of TR genotypers. Availability and implementation TRTools is freely available at https://github.com/gymreklab/TRTools. Detailed documentation is available at https://trtools.readthedocs.io. Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 36 (10) ◽  
pp. 3276-3278 ◽  
Author(s):  
Alemu Takele Assefa ◽  
Jo Vandesompele ◽  
Olivier Thas

Abstract Summary SPsimSeq is a semi-parametric simulation method to generate bulk and single-cell RNA-sequencing data. It is designed to simulate gene expression data with maximal retention of the characteristics of real data. It is reasonably flexible to accommodate a wide range of experimental scenarios, including different sample sizes, biological signals (differential expression) and confounding batch effects. Availability and implementation The R package and associated documentation are available from https://github.com/CenterForStatistics-UGent/SPsimSeq. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Nima Mousavi ◽  
Jonathan Margoliash ◽  
Neha Pusarla ◽  
Shubham Saini ◽  
Richard Yanicky ◽  
...  

Abstract Summary A rich set of tools have recently been developed for performing genome-wide genotyping of tandem repeats (TRs). However, standardized tools for downstream analysis of these results are lacking. To facilitate TR analysis applications, we present TRTools, a Python library and a suite of command-line tools for filtering, merging, and quality control of TR genotype files. TRTools utilizes an internal harmonization module making it compatible with outputs from a wide range of TR genotypers. Availability TRTools is freely available at https://github.com/gymreklab/[email protected] Supplementary information Supplementary data are available at bioRxiv.

