The Application of an R Language-Based Platform cRacker for Phosphoproteomics Data Analysis

Author(s):  
Mingjie He ◽  
Zhi Li
Keyword(s):  
2017 ◽  
Vol 3 ◽  
pp. e129 ◽  
Author(s):  
Bruno Contrino ◽  
Eric Miele ◽  
Ronald Tomlinson ◽  
M. Paola Castaldi ◽  
Piero Ricchiuto

Background: Mass spectrometry (MS)-based chemoproteomics has recently become a main tool for identifying and quantifying the interactions of cellular target proteins with ligands and drugs in drug discovery. The complexity of these new types of data requires scientists with a limited computational background to perform systematic data quality controls and to visualize the results of the analysis so that decisions can be made rapidly. To date, there are no readily accessible platforms specifically designed for chemoproteomics data analysis. Results: We developed a Shiny-based web application named DOSCHEDA (Down Stream Chemoproteomics Data Analysis) to assess the quality of chemoproteomics experiments, to filter peptide intensities based on linear correlations between replicates, and to perform statistical analysis based on the experimental design. To increase its accessibility, DOSCHEDA is designed to be used with minimal user input and does not require programming knowledge. Typical inputs can be protein fold changes or peptide intensities obtained from Proteome Discoverer, MaxQuant or similar software. DOSCHEDA aggregates the results of bioinformatics analyses performed on the input dataset into a dynamic interface that encompasses interactive graphics and enables customized output reports. Conclusions: DOSCHEDA is implemented entirely in the R language. It can be launched on any system with R installed, including Windows, Mac OS and Linux distributions. DOSCHEDA is hosted on a shiny-server at https://doscheda.shinyapps.io/doscheda and is also available as a Bioconductor package (http://www.bioconductor.org/).


2015 ◽  
Author(s):  
Fabien Campagne ◽  
William ER Digan ◽  
Manuele Simi

Data analysis tools have become essential to the study of biology. Here, we applied language workbench technology (LWT) to create data analysis languages tailored for biologists with a diverse range of experience, from beginners with no programming experience to expert bioinformaticians and statisticians. A key novelty of our approach is its ability to blend user interface with scripting in a single platform. This feature helps beginners and experts alike analyze data more productively. The approach has several advantages over the state-of-the-art approaches currently popular for data analysis: experts can design simplified data analysis languages that require no programming experience and behave like graphical user interfaces, yet have the advantages of scripting. We report on one such simple language, called MetaR, which we have used to teach complete beginners how to call differentially expressed genes and build heatmaps. We found that beginners can complete this task in less than 2 hours with MetaR, whereas more traditional teaching with R and its packages would require several training sessions (6-24 hours). Furthermore, MetaR integrates with Docker to enable reproducibility of analyses and simplified R package installations during training sessions. We used the same approach to develop the first composable R language. A composable language is a language that can be extended with micro-languages. We illustrate this capability with a Biomart micro-language designed to compose with R and help R programmers query Biomart interactively to assemble specific queries to retrieve data. (The same micro-language also composes with MetaR to help beginners query Biomart.) Our teaching experience suggests that language design with LWT can be a compelling approach for developing intelligent data analysis tools and can accelerate training for common data analysis tasks. LWT offers an interactive environment with the potential to promote exchanges between beginner and expert data analysts.


Author(s):  
Zev Ross ◽  
Hadley Wickham ◽  
David Robinson

The R language has withstood the test of time. Forty years after it was initially developed (in the form of the S language), R is being used by millions of programmers on workflows the inventors of the language could never have imagined. Although base R packages perform well in most settings, workflows can be made more efficient by developing packages with more consistent arguments, inputs and outputs, and by prioritizing constantly improving code over consistency with historical code. The universe of R packages known as the tidyverse, including dplyr, tidyr and others, aims to improve workflows and make data analysis as smooth as possible by applying a set of core programming principles in package development.


2020 ◽  
Vol 6 ◽  
pp. e300
Author(s):  
Mathieu Fortin

The R language is widely used for data analysis. However, it does not allow for complex object-oriented implementation and it tends to be slower than other languages such as Java, C and C++. Consequently, it can be more computationally efficient to run native Java code in R. There are at least two approaches to doing this. One is based on the Java Native Interface (JNI) and has been successfully implemented in the rJava package. An alternative approach consists of running a local server in Java and linking it to an R environment through a socket connection. This alternative approach has been implemented in an R package called J4R. This article shows how this approach makes it possible to simplify calls to Java methods and to integrate R vectorization. The downside is a loss of performance. However, if vectorization is used in conjunction with multithreading, this loss of performance can be compensated for.


F1000Research ◽  
2013 ◽  
Vol 2 ◽  
pp. 192 ◽  
Author(s):  
Emanuel Gonçalves ◽  
Julio Saez-Rodriguez

There is an increasing number of software packages to analyse biological experimental data in the R environment. In particular, Bioconductor, a repository of curated R packages, is one of the most comprehensive resources for bioinformatics and biostatistics. The use of these packages is increasing, but it requires a basic understanding of the R language, as well as of the syntax of the specific package used. The availability of graphical user interfaces for these packages would ease the learning curve and broaden their application. Here, we present a Cytoscape plug-in termed Cyrface that allows Cytoscape plug-ins to connect to any function and package developed in R. Cyrface can be used to run R packages from within the Cytoscape environment through a graphical user interface. Moreover, it links the R packages with the capabilities of Cytoscape and its plug-ins, in particular network visualization and analysis. Cyrface's utility has been demonstrated for two Bioconductor packages (CellNOptR and DrugVsDisease), and here we further illustrate its usage by implementing a workflow of data analysis and visualization. Download links, installation instructions and user guides can be accessed from the Cyrface homepage (http://www.ebi.ac.uk/saezrodriguez/cyrface/).


Author(s):  
Vera Costa ◽  
Rui Portocarrero Sarmento

Panel data analysis is a type of regression analysis that uses both time data and spatial data. Thus, the behavior of groups, for example enterprises or communities, is analyzed over a time scale. Panel data allows exploring variables that cannot be observed or measured, or variables that evolve over time but not across groups or communities. In this chapter, two different techniques used in panel data analysis are explored: fixed effects (FE) and random effects (RE). First, theoretical concepts of panel data are presented. Additionally, a case study example of the use of this type of regression is provided. Panel data analysis is performed with the R language, and a step-by-step approach is presented.


2021 ◽  
Vol 9 (09) ◽  
pp. 703-705
Author(s):  
Victor Sitnic ◽  
Valentina Stratan ◽  
Valeri Tutuianu ◽  
Cristina Popa ◽  
Veronica Balan

R is a freely licensed programming language of great interest as a tool for bioinformatics data analysis. It is essential in research activities related to the analysis of molecular-biological data and the identification of molecular markers. In this article we describe two simple techniques of using FASTA-type sequences and genomic data for the research of genetic markers. To apply the functions described below, it is necessary to have installed the R language, the seqRFLP and Maftools packages, and, optionally, the RStudio integrated development environment.


2021 ◽  
Author(s):  
Ana Debón ◽  
Sonia Tarazona ◽  
Josep Domenech ◽  
Fernando Polo

The Universitat Politècnica de València and its Faculty of Business Administration and Management have created a new intensification named "Intelligent Data Analysis", which gives students sufficient knowledge to integrate data analysis into the sometimes routine tasks of a company. The statistical, computer and ICT-related skills obtained through the Business Administration and Management degree are enhanced with more advanced statistical models for multivariate data analysis and with R language programming, which is well suited to such data analysis. All these skills are acquired under the Project-Based Learning methodology. This project's main achievement has been the coordination between the different subjects of the intensification to use the same software, which has given continuity to the way in which students work with RStudio, R, and R Markdown. This has given students a high level of command of data analysis and of its integration into their work routines, which will later help them become more qualified professionals.

