RiboA: a web application to identify ribosome A-site locations in ribosome profiling data

Abstract. Background: Translation is a fundamental process in gene expression. Ribosome profiling is a method that enables the study of transcriptome-wide translation. A fundamental technical challenge in analyzing Ribo-Seq data is identifying the A-site location on ribosome-protected mRNA fragments. Identification of the A-site is essential because it is at this location on the ribosome that a codon is translated into an amino acid. Incorrect assignment of a read to the A-site can lead to a lower signal-to-noise ratio and loss of the correlations necessary to understand the molecular factors influencing translation. Therefore, an easy-to-use analysis tool is needed to identify the A-site locations accurately. Results: We present RiboA, a web application that identifies the most accurate A-site location on a ribosome-protected mRNA fragment and generates the A-site read density profiles. It uses an Integer Programming method that reflects the biological fact that the A-site of actively translating ribosomes is generally located between the second codon and stop codon of a transcript, and utilizes a wide range of mRNA fragment sizes in and around the coding sequence (CDS). The web application is containerized with Docker, and it can be easily ported across platforms. Conclusions: The Integer Programming method that RiboA utilizes is the most accurate in identifying the A-site on Ribo-Seq mRNA fragments compared to other methods. RiboA makes it easier for the community to use this method via a user-friendly and portable web application. In addition, RiboA supports reproducible analyses by tracking all the input datasets and parameters, and it provides enhanced visualization to facilitate scientific exploration. RiboA is available as a web service at https://a-site.vmhost.psu.edu/. The code is publicly available at https://github.com/obrien-lab/aip_web_docker under the MIT license.

Translational termination efficiency in both bacteria and mammals is regulated by the base following the stop codon

Biochemistry and Cell Biology, 1995, Vol 73 (11-12), pp. 1095-1103. DOI: 10.1139/o95-118. Cited By ~ 51

Author(s): Warren P. Tate, Elizabeth S. Poole, Julie A. Horsfield, Sally A. Mannering, Chris M. Brown, ...

Keyword(s): Stop Codon, Stop Signal, Physical Contact, Release Factor, Wide Range, Highly Expressed Genes, Peptidyltransferase Center, A Site, Translational Termination

The translational stop signal and polypeptide release factor (RF) complexed with Escherichia coli ribosomes have been shown to be in close physical contact by site-directed photochemical cross-linking experiments. The RF has a protease-sensitive site in a highly conserved exposed loop that is proposed to interact with the peptidyltransferase center of the ribosome. Loss of peptidyl–tRNA hydrolysis activity and enhanced codon–ribosome binding by the cleaved RF is consistent with a model whereby the RF spans the decoding and peptidyltransferase centers of the ribosome with domains of the RF linked by conformational coupling. The cross-link between the stop signal and RF at the ribosomal decoding site is influenced by the base following the termination codon. This base determines the efficiency with which the stop signal is decoded by the RF in both mammalian and bacterial systems in vivo. The wide range of efficiencies correlates with the frequency with which the signals occur at natural termination sites, with rarely used weak signals often found at recoding sites and strong signals found in highly expressed genes. Stop signals are found at some recoding sites in viruses where −1 frame-shifting occurs, but the generally accepted mechanism of simultaneous slippage from the A and P sites does not explain their presence here. The HIV-1 gag-pol −1 frame-shifting site has been used to show that stop signals significantly influence frame-shifting efficiency on prokaryotic ribosomes by an RF-mediated mechanism. These data can be explained by an E/P site simultaneous slippage mechanism whereby the stop codon actually enters the ribosomal A site and can influence the event.

Key words: translational stop signal, decoding, release factor, frame-shifting.

PinAPL-Py: A comprehensive web-application for the analysis of CRISPR/Cas9 screens

2017. DOI: 10.1101/147462

Author(s): Philipp N. Spahn, Tyler Bath, Ryan J. Weiss, Jihoon Kim, Jeffrey D. Esko, ...

Keyword(s): Web Application, Large Scale, Sequencing Data, Bioinformatic Tools, Link Type, Screening Experiments, Independent Analysis, Wide Range, Set Up, Sequence Quality

Abstract. Background: Large-scale genetic screens using CRISPR/Cas9 technology have emerged as a major tool for functional genomics. With its increased popularity, experimental biologists frequently acquire large sequencing datasets for which they often do not have an easy analysis option. While a few bioinformatic tools have been developed for this purpose, their utility is still hindered either due to limited functionality or the requirement of bioinformatic expertise. Results: To make sequencing data analysis of CRISPR/Cas9 screens more accessible to a wide range of scientists, we developed a Platform-independent Analysis of Pooled Screens using Python (PinAPL-Py), which is operated as an intuitive web-service. PinAPL-Py implements state-of-the-art tools and statistical models, assembled in a comprehensive workflow covering sequence quality control, automated sgRNA sequence extraction, alignment, sgRNA enrichment/depletion analysis and gene ranking. The workflow is set up to use a variety of popular sgRNA libraries as well as custom libraries that can be easily uploaded. Various analysis options are offered, suitable to analyze a large variety of CRISPR/Cas9 screening experiments. Analysis output includes ranked lists of sgRNAs and genes, and publication-ready plots. Conclusions: PinAPL-Py helps to advance genome-wide screening efforts by combining comprehensive functionality with user-friendly implementation. PinAPL-Py is freely accessible at http://pinapl-py.ucsd.edu with instructions, documentation and test datasets. The source code is available at https://github.com/LewisLabUCSD/PinAPL-Py.

riboviz: analysis and visualization of ribosome profiling datasets

2017. DOI: 10.1101/100032. Cited By ~ 3

Author(s): Oana Carja, Tongji Xing, Joshua B. Plotkin, Premal Shah

Keyword(s): Web Application, High Throughput Sequencing, Ribosome Profiling, Analysis Pipeline, Processing Pipeline, Link Type, A Cell, Biological Discovery, Public Datasets

Abstract. Using high-throughput sequencing to monitor translation in vivo, ribosome profiling can provide critical insights into the dynamics and regulation of protein synthesis in a cell. Since its introduction in 2009, this technique has played a key role in driving biological discovery, and yet it requires a rigorous computational toolkit for widespread adoption. We developed a processing pipeline and browser-based visualization, riboviz, that allows convenient exploration and analysis of riboseq datasets. In implementation, riboviz consists of a comprehensive and flexible backend analysis pipeline that allows the user to analyze their private unpublished dataset, along with a web application for comparison with previously published public datasets. Availability and implementation: JavaScript and R source code and extra documentation are freely available from https://github.com/shahpr/RiboViz, while the web-application is live at www.riboviz.org.

TFEA.ChIP: A tool kit for transcription factor binding site enrichment analysis capitalizing on ChIP-seq datasets

2018. DOI: 10.1101/303651. Cited By ~ 3

Author(s): Laura Puente-Santamaria, Luis del Peso

Keyword(s): Transcription Factors, Web Application, Area Under The Curve, Enrichment Analysis, Cell Types, Gene Set Enrichment Analysis, Link Type, Gene Sets, Wide Range, Depth Analysis

Abstract. The identification of transcription factors (TFs) responsible for the co-regulation of specific sets of genes is a common problem in transcriptomics. Herein we describe TFEA.ChIP, a tool to estimate and visualize TF enrichment in gene lists representing transcriptional profiles. To generate the gene sets representing TF targets, we gathered ChIP-Seq experiments from the ENCODE Consortium and GEO datasets and used the correlation between DNase Hypersensitive Sites across cell lines to generate a database linking TFs with the genes they interact with in each ChIP-Seq experiment. In its current state, TFEA.ChIP covers 327 different transcription factors from 1075 ChIP-Seq experiments, with over 150 cell types being represented. TFEA.ChIP accepts gene sets as well as sorted lists of differentially expressed genes to compute enrichment scores for each of the datasets in its internal database using a Fisher’s exact association test or a Gene Set Enrichment Analysis. We validated TFEA.ChIP using a wide variety of gene sets representing signatures of genetic and chemical perturbations as input and found that the relevant TF was correctly identified in 103 of a total of 144 analyzed datasets with a median area under the curve (AUC) of 0.86. In-depth analysis of an RNA-seq dataset illustrates that the use of ChIP-Seq data instead of PWM-based predictions provides key biological context to interpret the results of the analysis. To facilitate its integration into transcriptome analysis pipelines and allow easy expansion and customization of the TF-gene database, we implemented TFEA.ChIP as an R package that can be downloaded from Bioconductor: https://www.bioconductor.org/packages/devel/bioc/html/TFEA.ChIP.html and GitHub: https://github.com/LauraPS1/TFEA-drafts. In addition, to make it available to a wide range of researchers, we have also developed a web application that runs the package from the server side and enables easy exploratory analysis through interactive graphs: https://www.iib.uam.es/TFEA.ChIP/
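The Fisher's exact association test mentioned in this abstract can be illustrated with a minimal Python sketch. It computes a one-sided (enrichment) p-value from the hypergeometric tail. The function name and arguments are hypothetical and are not part of the TFEA.ChIP R package, whose own implementation and choice of alternative hypothesis may differ.

```python
from math import comb


def fisher_enrichment_p(k, n_set, n_targets, n_total):
    """One-sided Fisher's exact p-value for over-representation.

    Hypothetical helper (not the TFEA.ChIP API). Arguments:
      k         - genes in the input set that are targets of the TF
      n_set     - size of the input gene set
      n_targets - number of TF target genes in the background
      n_total   - number of genes in the background
    Returns P(X >= k) for X ~ Hypergeometric(n_total, n_targets, n_set).
    """
    upper = min(n_set, n_targets)
    denom = comb(n_total, n_set)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible table configurations contribute nothing to the sum.
    tail = sum(
        comb(n_targets, i) * comb(n_total - n_targets, n_set - i)
        for i in range(k, upper + 1)
    )
    return tail / denom
```

A small p-value would suggest that the gene set contains more targets of the transcription factor than expected by chance; in practice the p-values across all ChIP-Seq datasets would also need multiple-testing correction.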

RiboFlow, RiboR and RiboPy: An ecosystem for analyzing ribosome profiling data at read length resolution

2019. DOI: 10.1101/855445

Author(s): Hakan Ozadam, Michael Geng, Can Cenik

Keyword(s): Ribosome Profiling, Read Length, Sequencing Data, Software Ecosystem, Binary File, Link Type, Wide Range, Quality Control Metrics, Multiple Metrics, And Storage

Abstract. Summary: Ribosome occupancy measurements enable protein abundance estimation and infer mechanisms of translation. Recent studies have revealed that sequence read lengths in ribosome profiling data are highly variable and carry critical information. Consequently, data analyses require the computation and storage of multiple metrics for a wide range of ribosome footprint lengths. We developed a software ecosystem including a new efficient binary file format named ‘ribo’. Ribo files store all essential data grouped by ribosome footprint lengths. Users can assemble ribo files using our RiboFlow pipeline that processes raw ribosomal profiling sequencing data. RiboFlow is highly portable and customizable across a large number of computational environments with built-in capabilities for parallelization. We also developed interfaces for writing and reading ribo files in the R (RiboR) and Python (RiboPy) environments. Using RiboR and RiboPy, users can efficiently access ribosome profiling quality control metrics, generate essential plots, and carry out analyses. Altogether, these components create a complete software ecosystem for researchers to study translation through ribosome profiling. Availability and Implementation: For a quickstart, please see https://ribosomeprofiling.github.io. Source code, installation instructions and links to documentation are available on GitHub: https://github.com/ribosomeprofiling

miRBaseConverter: An R/Bioconductor Package for Converting and Retrieving miRNA Name, Accession, Sequence and Family Information in Different Versions of miRBase

2018. DOI: 10.1101/407148. Cited By ~ 1

Author(s): Taosheng Xu, Ning Su, Lin Liu, Junpeng Zhang, Hongqiang Wang, ...

Keyword(s): Web Application, Mature Mirnas, R Package, Mirna Sequence, Bioconductor Package, Related Data, Link Type, Comprehensive Information, Wide Range, Different Sources

Abstract. Background: miRBase is the primary repository for published miRNA sequence and annotation data, and serves as the “go-to” place for miRNA research. However, the definition and annotation of miRNAs have changed significantly across different versions of miRBase. The changes cause inconsistency in miRNA-related data between different databases and articles published at different times. Several tools have been developed for different purposes of querying and converting the information of miRNAs between different miRBase versions, but none of them individually can provide comprehensive information about miRNAs in miRBase, and users need to use a number of different tools in their analyses. Results: We introduce miRBaseConverter, an R package integrating the latest miRBase version 22 available in Bioconductor to provide a suite of functions for converting and retrieving miRNA name (ID), accession, sequence, species, version and family information in different versions of miRBase. The package is implemented in R and available under the GPL-2 license from the Bioconductor website (http://bioconductor.org/packages/miRBaseConverter/). A Shiny-based GUI suitable for non-R users is also available as a standalone application from the package and also as a web application at http://nugget.unisa.edu.au:3838/miRBaseConverter. miRBaseConverter has a built-in database for querying miRNA information in all species and for both pre-mature and mature miRNAs defined by miRBase. In addition, it is the first tool for batch querying the miRNA family information. The package aims to provide a comprehensive and easy-to-use tool for the miRNA research community, where researchers often utilize published miRNA data from different sources. Conclusions: The Bioconductor package miRBaseConverter and the Shiny-based web application are presented to provide a suite of functions for converting and retrieving miRNA name, accession, sequence, species, version and family information in different versions of miRBase. The package will serve a wide range of applications in miRNA research and could provide a full view of the miRNAs of interest.

Building applications for interactive data exploration in systems biology

2017. DOI: 10.1101/141630. Cited By ~ 1

Author(s): Bjørn Fjukstad, Vanessa Dumeaux, Karina Standahl Olsen, Michael Hallet, Eiliv Lund, ...

Keyword(s): Systems Biology, User Interfaces, Web Application, Data Exploration, Analysis Tool, Biological Databases, Breast Cancer Patients, Transcriptional Profiles, Link Type, Interactive Data

Abstract. As the systems biology community generates and collects data at an unprecedented rate, there is a growing need for interactive data exploration tools to explore the datasets. These tools need to combine advanced statistical analyses, relevant knowledge from biological databases, and interactive visualizations in an application with clear user interfaces. To answer specific research questions, tools must provide specialized user interfaces and visualizations. While these are application-specific, the underlying components of a data analysis tool can be shared and reused later. Application developers can therefore compose applications of reusable services rather than implementing a single monolithic application from the ground up for each project. Our approach for developing data exploration applications in systems biology builds on the microservice architecture. Microservice architectures separate an application into smaller components that communicate using language-agnostic protocols. We show that this design is suitable in bioinformatics, where applications often use different tools, written in different languages, by different research groups. Packaging each service in a software container enables re-use and sharing of key components between applications, reducing development, deployment, and maintenance time. We demonstrate the viability of our approach through a web application, MIxT blood-tumor, for exploring and comparing transcriptional profiles from blood and tumor samples in breast cancer patients. The application integrates advanced statistical software, up-to-date information from biological databases, and modern data visualization libraries. The web application for exploring transcriptional profiles, MIxT, is online at mixt-blood-tumor.bci.mcgill.ca and open-sourced at github.com/fjukstad/mixt. Packages to build the supporting microservices are open-sourced as a part of Kvik at github.com/fjukstad/kvik.

Identifying A- and P-site locations on ribosome-protected mRNA fragments using Integer Programming

2018. DOI: 10.1101/490755. Cited By ~ 1

Author(s): Nabeel Ahmed, Pietro Sormanni, Prajwal Ciryam, Michele Vendruscolo, Christopher M. Dobson, ...

Keyword(s): Integer Programming, Narrow Range, Signal To Noise Ratio, Embryonic Stem, Site Location, Signal To Noise, Translation Elongation, Reading Frame, Ribosome Density, A Site

Abstract. Identifying the A- and P-site locations on ribosome-protected mRNA fragments from Ribo-Seq experiments is a fundamental step in the quantitative analysis of transcriptome-wide translation properties at the codon level. Many analyses of Ribo-Seq data have utilized heuristic approaches applied to a narrow range of fragment sizes to identify the A-site. In this study, we use Integer Programming to identify the A-site by maximizing an objective function that reflects the fact that the ribosome’s A-site on ribosome-protected fragments must reside between the second and stop codons of an mRNA. This identifies the A-site location as a function of the fragment’s size and its 5′ end reading frame in Ribo-Seq data generated from S. cerevisiae and mouse embryonic stem cells. The correctness of the identified A-site locations is demonstrated by showing that this method, as compared to others, yields the largest ribosome density at established stalling sites. By providing greater accuracy and utilization of a wider range of fragment sizes, our approach increases the signal-to-noise ratio of underlying biological signals associated with translation elongation at the codon length scale.
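The idea of choosing an A-site offset per fragment size and 5′ reading frame can be sketched in Python as follows. This is a deliberately simplified illustration, not the authors' Integer Programming formulation: it assumes a single transcript, a small hypothetical set of candidate offsets, and an inclusive window from the second codon through the stop codon, and it picks the best offset for each (size, frame) group by plain enumeration. All names are illustrative.

```python
from collections import defaultdict


def choose_offsets(reads, cds_start, stop_start, candidates=(9, 12, 15, 18)):
    """Pick, per (fragment size, 5' frame), the offset that places the most
    reads with their A-site between the second codon and the stop codon.

    reads      - iterable of (five_prime_position, fragment_length)
    cds_start  - index of the first nucleotide of the start codon
    stop_start - index of the first nucleotide of the stop codon
    candidates - hypothetical candidate offsets (nt from the 5' end)
    """
    lo = cds_start + 3   # first nucleotide of the second codon
    hi = stop_start + 2  # last nucleotide of the stop codon (assumed inclusive)
    counts = defaultdict(lambda: defaultdict(int))
    for five_prime, length in reads:
        frame = (five_prime - cds_start) % 3
        group = counts[(length, frame)]
        for offset in candidates:
            # The A-site must lie inside the fragment and inside the window.
            if offset < length and lo <= five_prime + offset <= hi:
                group[offset] += 1
    # Ties are broken in favor of the earliest candidate in the tuple.
    return {
        key: max(candidates, key=lambda o: per_offset.get(o, 0))
        for key, per_offset in counts.items()
    }
```

Reads near the start and stop codons are the informative ones: an offset that is too small pushes A-sites of initiating ribosomes upstream of the second codon, and one that is too large pushes A-sites of terminating ribosomes past the stop codon.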

Faculty Opinions recommendation of Ribosome profiling reveals pervasive and regulated stop codon readthrough in Drosophila melanogaster.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature, 2014. DOI: 10.3410/f.718198088.793491153

Author(s): Yves Barral, Fabrice Caudron

Keyword(s): Drosophila Melanogaster, Stop Codon, Ribosome Profiling, Stop Codon Readthrough

AB0210 ACREULAR: AN R PACKAGE FOR THE CALCULATION AND VISUALISATION OF ACR/EULAR RELATED RHEUMATOID ARTHRITIS MEASURES

Annals of the Rheumatic Diseases, 2020, Vol 79 (Suppl 1), pp. 1405.1-1406. DOI: 10.1136/annrheumdis-2020-eular.2326

Author(s): F. Morton, J. Nijjar, C. Goodyear, D. Porter

Keyword(s): Rheumatoid Arthritis, Functional Status, Rheumatic Diseases, Web Application, R Package, Diagnostic Classification, Microsoft Excel, Link Type, Large Joint, Programming Skills

Background: The American College of Rheumatology (ACR) and the European League Against Rheumatism (EULAR) individually and collaboratively have produced/recommended diagnostic classification, response and functional status criteria for a range of different rheumatic diseases. While there are a number of different resources available for performing these calculations individually, currently there are no tools available that we are aware of to easily calculate these values for whole patient cohorts. Objectives: To develop a new software tool, which will enable both data analysts and also researchers and clinicians without programming skills to calculate ACR/EULAR related measures for a number of different rheumatic diseases. Methods: Criteria that had been developed by ACR and/or EULAR that had been approved for the diagnostic classification, measurement of treatment response and functional status in patients with rheumatoid arthritis were identified. Methods were created using the R programming language to allow the calculation of these criteria, which were incorporated into an R package. Additionally, an R/Shiny web application was developed to enable the calculations to be performed via a web browser using data presented as CSV or Microsoft Excel files. Results: acreular is a freely available, open source R package (downloadable from https://github.com/fragla/acreular) that facilitates the calculation of ACR/EULAR related RA measures for whole patient cohorts. Measures, such as the ACR/EULAR (2010) RA classification criteria, can be determined using precalculated values for each component (small/large joint counts, duration in days, normal/abnormal acute-phase reactants, negative/low/high serology classification) or by providing “raw” data (small/large joint counts, onset/assessment dates, ESR/CRP and CCP/RF laboratory values). Other measures, including EULAR response and ACR20/50/70 response, can also be calculated by providing the required information. The accompanying web application is included as part of the R package but is also externally hosted at https://fragla.shinyapps.io/shiny-acreular. This enables researchers and clinicians without any programming skills to easily calculate these measures by uploading either a Microsoft Excel or CSV file containing their data. Furthermore, the web application allows the incorporation of additional study covariates, enabling the automatic calculation of multigroup comparative statistics and the visualisation of the data through a number of different plots, both of which can be downloaded. Figure 1. The Data tab following the upload of data. Criteria are calculated by selecting the appropriate checkbox. Figure 2. A density plot of DAS28 scores grouped by ACR/EULAR 2010 RA classification. Statistical analysis has been performed and shows a significant difference in DAS28 score between the two groups. Conclusion: The acreular R package facilitates the easy calculation of ACR/EULAR RA related disease measures for whole patient cohorts. Calculations can be performed either from within R or by using the accompanying web application, which also enables the graphical visualisation of data and the calculation of comparative statistics. We plan to further develop the package by adding additional RA related criteria and by adding ACR/EULAR related measures for other rheumatic disorders. Disclosure of Interests: Fraser Morton: None declared, Jagtar Nijjar Shareholder of: GlaxoSmithKline plc, Consultant of: Janssen Pharmaceuticals UK, Employee of: GlaxoSmithKline plc, Paid instructor for: Janssen Pharmaceuticals UK, Speakers bureau: Janssen Pharmaceuticals UK, AbbVie, Carl Goodyear: None declared, Duncan Porter: None declared
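To make the "precalculated component" route concrete, the following Python sketch scores the ACR/EULAR 2010 RA classification criteria from the four components named in the abstract. It is an illustration only: the function names and argument conventions are hypothetical and do not reproduce the acreular package's R interface, and it assumes the patient already meets the entry condition of the criteria (clinical synovitis in at least one joint that is not better explained by another disease).

```python
# Points for the serology component (RF / ACPA classification).
SEROLOGY_POINTS = {"negative": 0, "low": 2, "high": 3}


def joint_points(large_joints, small_joints):
    """Joint-involvement points from counts of involved large and small joints."""
    if large_joints + small_joints > 10 and small_joints >= 1:
        return 5  # more than 10 joints, at least one of them small
    if small_joints >= 4:
        return 3  # 4-10 small joints
    if small_joints >= 1:
        return 2  # 1-3 small joints
    if large_joints >= 2:
        return 1  # 2-10 large joints
    return 0      # at most one large joint


def acr_eular_2010(large_joints, small_joints, serology, abnormal_apr, duration_days):
    """Return (total score, True if the score is 6 or more out of 10)."""
    score = (
        joint_points(large_joints, small_joints)
        + SEROLOGY_POINTS[serology]
        + (1 if abnormal_apr else 0)          # abnormal CRP or ESR
        + (1 if duration_days >= 42 else 0)   # symptoms for at least 6 weeks
    )
    return score, score >= 6
```

Applying such a function row by row to a cohort table is what a whole-cohort tool automates; a total of 6 or more out of 10 classifies a patient as having definite RA under the 2010 criteria.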
