bioconductor project Latest Research Papers

NanoMethViz: An R/Bioconductor package for visualizing long-read methylation data

PLoS Computational Biology ◽

10.1371/journal.pcbi.1009524 ◽

2021 ◽

Vol 17 (10) ◽

pp. e1009524

Author(s):

Shian Su ◽

Quentin Gouil ◽

Marnie E. Blewitt ◽

Dianne Cook ◽

Peter F. Hickey ◽

...

Keyword(s):

Cpg Islands ◽

R Package ◽

Data Format ◽

Bioconductor Project ◽

Modified Dna ◽

Long Read ◽

Effective Visualization ◽

Genomic Regions ◽

Methylation Patterns ◽

Compressed Data

A key benefit of long-read nanopore sequencing technology is the ability to detect modified DNA bases, such as 5-methylcytosine. The lack of R/Bioconductor tools for the effective visualization of nanopore methylation profiles between samples from different experimental groups led us to develop the NanoMethViz R package. Our software can handle methylation output generated from a range of different methylation callers and manages large datasets using a compressed data format. To fully explore the methylation patterns in a dataset, NanoMethViz allows plotting of data at various resolutions. At the sample level, we use dimensionality reduction to look at the relationships between methylation profiles in an unsupervised way. We visualize methylation profiles of classes of features such as genes or CpG islands by scaling them to relative positions and aggregating their profiles. At the finest resolution, we visualize methylation patterns across individual reads along the genome using spaghetti plots and heatmaps, allowing users to explore particular genes or genomic regions of interest. In summary, our software makes the handling of methylation signal more convenient, expands upon the visualization options for nanopore data, and works seamlessly with existing methylation analysis tools available in the Bioconductor project. Our software is available at https://bioconductor.org/packages/NanoMethViz.
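The idea of "scaling features to relative positions and aggregating their profiles" can be illustrated with a minimal sketch. This is not NanoMethViz code or its API: the function name `aggregate_relative_profiles`, the tuple-based inputs, and the fixed number of bins are all assumptions made for illustration, and the sketch ignores chromosomes and strand.

```python
def aggregate_relative_profiles(features, calls, n_bins=10):
    """Average methylation over features after rescaling each to [0, 1).

    features: list of (start, end) genomic intervals (hypothetical input format).
    calls:    list of (position, methylation_probability) pairs.
    Returns a list of per-bin means; bins with no calls are None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for start, end in features:
        length = end - start
        if length <= 0:
            continue  # skip malformed or empty intervals
        for pos, prob in calls:
            if start <= pos < end:
                # Relative position within this feature, in [0, 1).
                rel = (pos - start) / length
                bin_index = min(int(rel * n_bins), n_bins - 1)
                sums[bin_index] += prob
                counts[bin_index] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


if __name__ == "__main__":
    genes = [(0, 100), (1000, 1200)]  # two features of different lengths
    meth = [(10, 1.0), (90, 0.5), (1020, 0.0), (1180, 0.5)]
    print(aggregate_relative_profiles(genes, meth, n_bins=2))
```

Because every feature is mapped onto the same relative axis, features of different lengths contribute to the same bins, which is what makes an aggregate profile across a class of features possible.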

GEOexplorer: an R/Bioconductor package for gene expression analysis and visualisation

10.1101/2021.10.06.459411 ◽

2021 ◽

Author(s):

Guy P Hunt ◽

Rafael Henkin ◽

Fabrizio Smeraldi ◽

Michael R Barnes

Keyword(s):

Gene Expression ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Exploratory Data Analysis ◽

Microarray Gene Expression ◽

Expression Studies ◽

Bioconductor Project ◽

Exploratory Data ◽

Differential Gene ◽

Gene Expression Studies

Background: Over the past three decades there have been numerous molecular biology developments that have led to an explosion in the number of gene expression studies being performed. Many of these gene expression studies publish their data to the public database GEO, making them freely available. By analysing gene expression datasets, researchers can identify genes that are differentially expressed between two groups. This can provide insights that lead to the development of new tests and treatments for diseases. Despite the wide availability of gene expression datasets, analysing them is difficult for several reasons, including the fact that most methods for performing gene expression analysis require programming proficiency. Results: We developed the GEOexplorer software package to overcome several of the difficulties in performing gene expression analysis. GEOexplorer was therefore developed as a web application that can perform interactive and reproducible microarray gene expression analysis, while producing a wealth of interactive visualisations to facilitate result exploration. GEOexplorer is implemented in R using the Shiny framework and is fully integrated with the existing core structures of the Bioconductor project. Users can perform the essential steps of exploratory data analysis and differential gene expression analysis intuitively and generate a broad spectrum of publication-ready outputs. Conclusion: GEOexplorer is distributed as an R package in the Bioconductor project (http://bioconductor.org/packages/GEOexplorer/). GEOexplorer provides a solution for performing interactive and reproducible analyses of microarray gene expression data, empowering life scientists to perform exploratory data analysis and differential gene expression analysis on GEO microarray datasets.

Memes: A motif analysis environment in R using tools from the MEME Suite

PLoS Computational Biology ◽

10.1371/journal.pcbi.1008991 ◽

2021 ◽

Vol 17 (9) ◽

pp. e1008991

Author(s):

Spencer L. Nystrom ◽

Daniel J. McKay

Keyword(s):

Data Access ◽

R Package ◽

Comprehensive Analysis ◽

Multidimensional Data ◽

Biological Sequences ◽

Bioconductor Package ◽

Motif Analysis ◽

Bioconductor Project ◽

Analysis Environment ◽

Selection Of

Identification of biopolymer motifs represents a key step in the analysis of biological sequences. The MEME Suite is a widely used toolkit for comprehensive analysis of biopolymer motifs; however, these tools are poorly integrated within popular analysis frameworks like the R/Bioconductor project, creating barriers to their use. Here we present memes, an R package that provides a seamless R interface to a selection of popular MEME Suite tools. memes provides a novel “data aware” interface to these tools, enabling rapid and complex discriminative motif analysis workflows. In addition to interfacing with popular MEME Suite tools, memes leverages existing R/Bioconductor data structures to store the multidimensional data returned by MEME Suite tools for rapid data access and manipulation. Finally, memes provides data visualization capabilities to facilitate communication of results. memes is available as a Bioconductor package at https://bioconductor.org/packages/memes, and the source code can be found at github.com/snystrom/memes.

Plotgardener: Cultivating precise multi-panel figures in R

10.1101/2021.09.08.459338 ◽

2021 ◽

Author(s):

Nicole E Kramer ◽

Eric S Davis ◽

Craig D Wenger ◽

Erika M Deoudes ◽

Sarah M Parker ◽

...

Keyword(s):

Programming Languages ◽

Genomic Data ◽

Data Access ◽

Manuscript Preparation ◽

Data Sets ◽

New Paradigm ◽

Link Type ◽

Bioconductor Project ◽

Invaluable Tool ◽

R Programming

The R programming language is one of the most widely used programming languages for transforming raw genomic data sets into meaningful biological conclusions through analysis and visualization, which has been largely facilitated by infrastructure and tools developed by the Bioconductor project. However, existing plotting packages rely on relative positioning and sizing of plots, which is often sufficient for exploratory analysis but is poorly suited for the creation of publication-quality multi-panel images inherent to scientific manuscript preparation. We present plotgardener, a coordinate-based genomic data visualization package that offers a new paradigm for multi-plot figure generation in R. Plotgardener allows precise, programmatic control over the placement, aesthetics, and arrangements of plots while maximizing user experience through fast and memory-efficient data access, support for a wide variety of data and file types, and tight integration with the Bioconductor environment. Plotgardener also allows precise placement and sizing of ggplot2 plots, making it an invaluable tool for R users and data scientists from virtually any discipline. Availability: Package: https://bioconductor.org/packages/plotgardener, Code: https://github.com/PhanstielLab/plotgardener, Documentation: https://phanstiellab.github.io/plotgardener/

NewWave: a scalable R/Bioconductor package for the dimensionality reduction and batch effect removal of single-cell RNA-seq data

10.1101/2021.08.02.453487 ◽

2021 ◽

Author(s):

Federico Agostinis ◽

Chiara Romualdi ◽

Gabriele Sales ◽

Davide Risso

Keyword(s):

Dimensionality Reduction ◽

Single Cell ◽

R Package ◽

Batch Effect ◽

Supplementary Information ◽

Bioconductor Package ◽

Rna Seq ◽

Sequencing Data ◽

Bioconductor Project ◽

Single Cell Rna Sequencing

Summary: We present NewWave, a scalable R/Bioconductor package for the dimensionality reduction and batch effect removal of single-cell RNA sequencing data. To achieve scalability, NewWave uses mini-batch optimization and can work with out-of-memory data, enabling users to analyze datasets with millions of cells. Availability and implementation: NewWave is implemented as an open-source R package available through the Bioconductor project at https://bioconductor.org/packages/NewWave/. Supplementary information: Supplementary data are available at Bioinformatics online.
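As a generic illustration of mini-batch optimization, the sketch below fits a single slope by gradient descent, touching only one small batch of observations per update. It is not NewWave's model or code (the abstract does not describe either in detail): the squared-error loss, the function name `fit_slope_minibatch`, and the learning rate and batch size are assumptions chosen for illustration.

```python
import random


def fit_slope_minibatch(xs, ys, batch_size=10, epochs=50, lr=0.5, seed=0):
    """Estimate w in y ~ w * x by mini-batch gradient descent on squared error.

    Each update uses only `batch_size` observations, so the full dataset
    never has to be processed in a single step.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    indices = list(range(len(xs)))
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the data in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the mean squared error over this batch only.
            grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= lr * grad
    return w


if __name__ == "__main__":
    xs = [i / 100 for i in range(1, 101)]  # inputs in (0, 1]
    ys = [3.0 * x for x in xs]             # noiseless targets, true slope 3
    print(fit_slope_minibatch(xs, ys))
```

The point of the design is that memory use per step depends on the batch size rather than on the dataset size; in an out-of-memory setting the batch would be read from disk instead of from an in-memory list.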

Memes: an R interface to the MEME Suite

10.1101/2021.04.23.441089 ◽

2021 ◽

Author(s):

Spencer L. Nystrom ◽

Daniel J. McKay

Keyword(s):

Data Structures ◽

Source Code ◽

Data Access ◽

R Package ◽

Comprehensive Analysis ◽

Multidimensional Data ◽

Biological Sequences ◽

Bioconductor Package ◽

Motif Analysis ◽

Bioconductor Project

Identification of biopolymer motifs represents a key step in the analysis of biological sequences. The MEME Suite is a widely used toolkit for comprehensive analysis of biopolymer motifs; however, these tools are poorly integrated within popular analysis frameworks like the R/Bioconductor project, creating barriers to their use. Here we present memes, an R package that provides a seamless R interface to the MEME Suite. memes provides a novel “data aware” interface to these tools, enabling rapid and complex discriminative motif analysis workflows. In addition to interfacing with popular MEME Suite tools, memes leverages existing R/Bioconductor data structures to store the complex, multidimensional data returned by MEME Suite tools for rapid data access and manipulation. Finally, memes provides data visualization capabilities to facilitate communication of results. memes is available as a Bioconductor package at https://bioconductor.org/packages/memes, and the source code can be found at github.com/snystrom/memes.

ASpli: integrative analysis of splicing landscapes through RNA-Seq assays

Bioinformatics ◽

10.1093/bioinformatics/btab141 ◽

2021 ◽

Author(s):

Mancini Estefania ◽

Rabinovich Andres ◽

Iserte Javier ◽

Yanovsky Marcelo ◽

Chernomoretz Ariel

Keyword(s):

Alternative Splicing ◽

Intron Retention ◽

Real Data ◽

Data Availability ◽

Supplementary Information ◽

Rna Seq ◽

Sequencing Technologies ◽

Reconstruction Methods ◽

Genome Wide ◽

Bioconductor Project

Motivation: Genome-wide analysis of alternative splicing has been a very active field of research since the early days of next-generation sequencing technologies. Since then, ever-growing data availability and the development of increasingly sophisticated analysis methods have uncovered the complexity of the general splicing repertoire. A large number of splicing analysis methodologies exist, each with its own strengths and weaknesses. For instance, methods relying exclusively on junction information do not take advantage of the large majority of reads produced in an RNA-seq assay, isoform reconstruction methods might not detect novel intron retention events, some solutions can only handle canonical splicing events, and many existing methods can only perform pairwise comparisons. Results: In this contribution, we present ASpli, a computational suite implemented in the R statistical language that allows the identification of changes in both annotated and novel alternative-splicing events and can deal with simple, multi-factor or paired experimental designs. Our integrative computational workflow, which applies the same GLM model to different sets of reads and junctions, allows computation of complementary splicing signals. Analyzing simulated and real data, we found that the consolidation of these signals resulted in a robust proxy of the occurrence of splicing alterations. While the analysis of junctions allowed us to uncover annotated as well as non-annotated events, read coverage signals notably increased recall capabilities at a very competitive performance when compared against other state-of-the-art splicing analysis algorithms. Availability and implementation: ASpli is freely available from the Bioconductor project site https://doi.org/doi:10.18129/B9.bioc.ASpli. Supplementary information: Supplementary data are available at Bioinformatics online.

NanoMethViz: an R/Bioconductor package for visualizing long-read methylation data

10.1101/2021.01.18.426757 ◽

2021 ◽

Author(s):

Shian Su ◽

Quentin Gouil ◽

Marnie E. Blewitt ◽

Dianne Cook ◽

Peter F. Hickey ◽

...

Keyword(s):

Cpg Islands ◽

Bioconductor Package ◽

Bioconductor Project ◽

Modified Dna ◽

Long Read ◽

Visualization Of Data ◽

Effective Visualization ◽

Genomic Regions ◽

Methylation Patterns ◽

Compressed Data

Motivation: A key benefit of long-read nanopore sequencing technology is the ability to detect modified DNA bases, such as 5-methylcytosine. Effective visualization of data generated by this platform to assess changes in methylation profiles between samples from different experimental groups remains a challenge. Results: To make visualization of methylation changes more straightforward, we developed the R/Bioconductor package NanoMethViz. Our software can handle methylation calls generated from a range of different methylation callers and manages large datasets using a compressed data format. To fully explore the methylation patterns in a dataset, NanoMethViz allows plotting of data at various resolutions. At the sample level, we use multidimensional scaling to look at the relationships between methylation profiles in an unsupervised way. We visualize methylation profiles of classes of features such as genes or CpG islands by scaling them to relative positions and aggregating their profiles. At the finest resolution, we visualize methylation patterns across individual reads along the genome using the spaghetti plot, allowing users to explore particular genes or genomic regions of interest. In summary, our software makes the handling of methylation signal more convenient, expands upon the visualization options for nanopore data, and works seamlessly with existing methylation analysis tools available in the Bioconductor project. Our software is available at https://bioconductor.org/packages/NanoMethViz.

ideal: an R/Bioconductor package for interactive differential expression analysis

BMC Bioinformatics ◽

10.1186/s12859-020-03819-5 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Federico Marini ◽

Jan Linke ◽

Harald Binder

Keyword(s):

Differential Expression ◽

Expression Analysis ◽

Web Application ◽

Differential Expression Analysis ◽

Transcriptome Profiling ◽

Data Interpretation ◽

R Package ◽

Rna Seq ◽

Bioconductor Project ◽

Analysis Workflow

Background: RNA sequencing (RNA-seq) is an increasingly popular tool for transcriptome profiling. A key point to make the best use of the available data is to provide software tools that are easy to use but still provide flexibility and transparency in the adopted methods. Despite the availability of many packages focused on detecting differential expression, a method to streamline this type of bioinformatics analysis in a comprehensive, accessible, and reproducible way is lacking. Results: We developed the ideal software package, which serves as a web application for interactive and reproducible RNA-seq analysis, while producing a wealth of visualizations to facilitate data interpretation. ideal is implemented in R using the Shiny framework, and is fully integrated with the existing core structures of the Bioconductor project. Users can perform the essential steps of the differential expression analysis workflow in an assisted way, and generate a broad spectrum of publication-ready outputs, including diagnostic and summary visualizations in each module, all the way down to functional analysis. ideal also offers the possibility to seamlessly generate a full HTML report for storing and sharing results together with code for reproducibility. Conclusion: ideal is distributed as an R package in the Bioconductor project (http://bioconductor.org/packages/ideal/), and provides a solution for performing interactive and reproducible analyses of summarized RNA-seq expression data, empowering researchers with many different profiles (life scientists, clinicians, but also experienced bioinformaticians) to make the ideal use of the data at hand.

rawR - Direct access to raw mass spectrometry data in R

10.1101/2020.10.30.362533 ◽

2020 ◽

Author(s):

Tobias Kockmann ◽

Christian Panse

Keyword(s):

Data Analysis ◽

Data Access ◽

Technical Note ◽

Mass Spectrometry Data ◽

Direct Access ◽

Robust Analysis ◽

Bioconductor Project ◽

Research Task ◽

Thermo Fisher Scientific ◽

Fisher Scientific

The Bioconductor project has shown that the R statistical environment is a highly valuable tool for genomics data analysis [1], but with respect to proteomics we are still missing low-level infrastructure to enable performant and robust analysis workflows in R. Fundamentally important are libraries that provide raw data access. Our R package rawDiag provided a proof of principle for how access to mass spectrometry raw files can be realized by wrapping vendor-provided APIs, but it focused on metadata analysis and visualization [2]. Our novel package rawR now provides complete, OS-independent access to all spectral data logged in Thermo Fisher Scientific raw files. In this technical note we present implementation details and describe the main functionality provided by the rawR package. In addition, we report two use cases inspired by real-world research tasks that demonstrate the application of the package. Availability: https://github.com/fgcz/rawR
