AlphaMap: An open-source Python package for the visual annotation of proteomics data with sequence specific knowledge

2021 ◽  
Author(s):  
Eugenia Voytik ◽  
Isabell Bludau ◽  
Sander Willems ◽  
Fynn Hansen ◽  
Andreas-David Brunner ◽  
...  

Integrating experimental information across proteomic datasets with the wealth of publicly available sequence annotations is a crucial part of many proteomic studies, yet it currently lacks an automated analysis platform. Here we present AlphaMap, a Python package that facilitates the visual exploration of peptide-level proteomics data. Identified peptides and post-translational modifications in proteomic datasets are mapped to their corresponding protein sequence and visualized together with prior knowledge from UniProt and with expected proteolytic cleavage sites. The functionality of AlphaMap can be accessed via an intuitive graphical user interface or, more flexibly, as a Python package that allows its integration into common analysis workflows for data visualization. AlphaMap produces publication-quality illustrations and can easily be customized to address a given research question. Availability and implementation: AlphaMap is implemented in Python and released under an Apache license. The source code and one-click installers are freely available at https://github.com/MannLabs/alphamap. Supplementary information: A detailed user guide for AlphaMap is provided as supplementary data.
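The mapping step described in this abstract can be illustrated with a minimal sketch. This is not AlphaMap's API: the function names `map_peptide` and `tryptic_cleavage_sites` are hypothetical, the example sequence is made up, and the cleavage rule is the common simplification for trypsin (cleave C-terminal to K or R unless the next residue is P).

```python
import re


def map_peptide(protein: str, peptide: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of every occurrence
    of `peptide` in `protein`, including overlapping occurrences."""
    hits = []
    start = protein.find(peptide)
    while start != -1:
        hits.append((start + 1, start + len(peptide)))
        start = protein.find(peptide, start + 1)
    return hits


def tryptic_cleavage_sites(protein: str) -> list[int]:
    """Return 1-based positions of K/R residues that are not followed by P
    (simplified trypsin rule; an assumption made for illustration)."""
    return [m.start() + 1 for m in re.finditer(r"[KR](?!P)", protein)]


if __name__ == "__main__":
    protein = "MKTAYIAKQRPLSEGK"  # made-up example sequence
    print(map_peptide(protein, "TAYIAK"))      # [(3, 8)]
    print(tryptic_cleavage_sites(protein))     # [2, 8, 16]
```

Positions are reported 1-based because sequence annotations such as those in UniProt are conventionally indexed from 1, which makes the peptide coordinates directly comparable with annotated features.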

2018 ◽  
Vol 35 (13) ◽  
pp. 2313-2314 ◽  
Author(s):  
Gang Peng ◽  
Rashaun Wilson ◽  
Yishuo Tang ◽  
TuKiet T Lam ◽  
Angus C Nairn ◽  
...  

Summary: Large-scale, quantitative proteomics data are being generated at ever increasing rates by high-throughput, mass spectrometry technologies. However, due to the complexity of these large datasets as well as the increasing numbers of post-translational modifications (PTMs) that are being identified, developing effective methods for proteomic visualization has been challenging. ProteomicsBrowser was designed to meet this need for comprehensive data visualization. Using peptide information files exported from mass spectrometry search engines or quantitative tools as input, the peptide sequences are aligned to an internal protein database such as UniProtKB. Each identified peptide ion including those with PTMs is then visualized along the parent protein in the Browser. A unique property of ProteomicsBrowser is the ability to combine overlapping peptides in different ways to focus analysis of sequence coverage, charge state or PTMs. ProteomicsBrowser includes other useful functions, such as a data filtering tool and basic statistical analyses to qualify quantitative data. Availability and implementation: ProteomicsBrowser is implemented in Java 8 and is available at https://medicine.yale.edu/keck/nida/proteomicsbrowser.aspx and https://github.com/peng-gang/ProteomicsBrowser. Supplementary information: Supplementary data are available at Bioinformatics online.
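One way to combine overlapping peptides for a sequence-coverage view is to merge their position intervals. The sketch below is written in Python for illustration only (ProteomicsBrowser itself is a Java tool, and its actual combination logic is not described in the abstract); the helper names `merge_intervals` and `sequence_coverage` are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent 1-based inclusive intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous block: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(block) for block in merged]


def sequence_coverage(intervals, protein_length):
    """Fraction of residues covered by at least one peptide."""
    covered = sum(end - start + 1 for start, end in merge_intervals(intervals))
    return covered / protein_length


if __name__ == "__main__":
    peptides = [(3, 8), (6, 12), (20, 25)]  # made-up peptide positions
    print(merge_intervals(peptides))        # [(3, 12), (20, 25)]
    print(sequence_coverage(peptides, 40))  # 0.4
```

Merging before counting ensures that residues shared by several peptides are counted once, which is what a coverage percentage requires.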


GigaScience ◽  
2021 ◽  
Vol 10 (5) ◽  
Author(s):  
Colin Farrell ◽  
Michael Thompson ◽  
Anela Tosevska ◽  
Adewale Oyetunde ◽  
Matteo Pellegrini

Background: Bisulfite sequencing is commonly used to measure DNA methylation. Processing bisulfite sequencing data is often challenging owing to the computational demands of mapping a low-complexity, asymmetrical library and the lack of a unified processing toolset to produce an analysis-ready methylation matrix from read alignments. To address these shortcomings, we have developed BiSulfite Bolt (BSBolt), a fast and scalable bisulfite sequencing analysis platform. BSBolt performs a pre-alignment sequencing read assessment step to improve efficiency when handling asymmetrical bisulfite sequencing libraries. Findings: We evaluated BSBolt against simulated and real bisulfite sequencing libraries. We found that BSBolt provides accurate and fast bisulfite sequencing alignments and methylation calls. We also compared BSBolt to several existing bisulfite alignment tools and found BSBolt outperforms Bismark, BSSeeker2, BISCUIT, and BWA-Meth based on alignment accuracy and methylation calling accuracy. Conclusion: BSBolt offers streamlined processing of bisulfite sequencing data through an integrated toolset that offers support for simulation, alignment, methylation calling, and data aggregation. BSBolt is implemented as a Python package and command line utility for flexibility when building informatics pipelines. BSBolt is available at https://github.com/NuttyLogic/BSBolt under an MIT license.
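The principle behind bisulfite-based methylation calling can be sketched in a few lines. This is a toy illustration and not BSBolt's implementation: it considers the forward strand only, ignores sequencing errors and incomplete conversion, and the function names are hypothetical. Bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C, so the methylation level at a site is commonly estimated as C / (C + T) over the reads covering it.

```python
def bisulfite_convert(seq: str, methylated_positions: set[int]) -> str:
    """Simulate forward-strand bisulfite conversion: every C that is not
    methylated (0-based positions given in `methylated_positions`) reads as T."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def methylation_level(base_calls: str):
    """Estimate methylation at one reference cytosine from the bases that
    reads report there: C means methylated, T means unmethylated.
    Returns None when no informative read covers the site."""
    c = base_calls.count("C")
    t = base_calls.count("T")
    return c / (c + t) if (c + t) else None


if __name__ == "__main__":
    print(bisulfite_convert("ACGTCCG", {1}))  # ACGTTTG
    print(methylation_level("CCTC"))          # 0.75
```

The conversion collapses most C positions to T, which is why the abstract describes bisulfite libraries as low-complexity and why dedicated aligners are needed.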


2021 ◽  
Author(s):  
Benbo Gao ◽  
Jing Zhu ◽  
Soumya Negi ◽  
Xinmin Zhang ◽  
Stefka Gyoneva ◽  
...  

Summary: We developed Quickomics, a feature-rich R Shiny-powered tool to enable biologists to fully explore complex omics data and perform advanced analysis in an easy-to-use interactive interface. It covers a broad range of secondary and tertiary analytical tasks after primary analysis of omics data is completed. Each functional module is equipped with customized configurations and generates both interactive and publication-ready high-resolution plots to uncover biological insights from data. The modular design makes the tool extensible with ease. Availability: Researchers can experience the functionalities with their own data or demo RNA-Seq and proteomics data sets by using the app hosted at http://quickomics.bxgenomics.com and following the tutorial, https://bit.ly/3rXIyhL. The source code under GPLv3 license is provided at https://github.com/interactivereport/[email protected], [email protected] Supplementary information: Supplementary materials are available at https://bit.ly/37HP17g.


2019 ◽  
Author(s):  
Paul CANN ◽  
Malika CHABI ◽  
Aliénor DELSART ◽  
Chrystelle LE DANVIC ◽  
Jean-Michel SALIOU ◽  
...  

Background: Small ungulates (sheep and goats) display seasonal breeding, characterised by two successive periods, sexual activity (SA) and sexual rest (SR). Odours emitted by a sexually active male can reactivate ovulation in anoestrus females. The plasticity of the olfactory system under these hormonal changes has never been explored at the peripheral level of odour reception. Because it has been shown in pigs that the olfactory secretome (proteins secreted in the nasal mucus) can be modified under hormonal control, we monitored its composition in females of both species through several reproductive seasons, using non-invasive sampling of olfactory mucus. For this purpose, two-dimensional gel electrophoresis (2D-E), western blot with specific antibodies, MALDI-TOF and high-resolution (nano-LC-MS/MS) mass spectrometry, RACE-PCR and molecular modelling were used. Results: In both species the olfactory secretome is composed of isoforms of OBP-like proteins, generated by post-translational modifications such as phosphorylation, N-glycosylation and O-GlcNAcylation. Important changes were observed in the olfactory secretome between the sexual rest and the sexual activity periods, characterised in ewes by the specific expression of SAL-like proteins and the emergence of O-GlcNAcylation of OBPs. In goats, the differences between SA and SR did not come from the expression of new proteins, but from different post-translational modifications, the main difference between the SA and SR secretomes being the number of isoforms of each protein. Proteomics data are available via ProteomeXchange with identifier PXD014833.
Conclusion: Despite common behaviour, seasonal breeding, and genetic resources, the two species seem to adapt their sensory equipment in SA by different modalities: the variation of the olfactory secretome in ewes could correspond to a specialization to detect male odours only in SA, whereas in goats the stability of the olfactory secretome could indicate a constant capacity for odour detection, suggesting that the hallmark of SA in goats might be the emission of specific odours by the sexually active male. In both species, the olfactory secretome is a phenotype reflecting the physiological status of females, and could be used by breeders to monitor their receptivity to the male effect.


2021 ◽  
Author(s):  
Klaas Jan van Wijk ◽  
Eric W Deutsch ◽  
Qi Sun ◽  
Zhi Sun ◽  
Tami Leppert ◽  
...  

We developed a new resource, the Arabidopsis PeptideAtlas (www.peptideatlas.org/builds/arabidopsis/), to solve central questions about the Arabidopsis proteome, such as the significance of protein splice forms and post-translational modifications (PTMs), or simply to obtain reliable information about specific proteins. PeptideAtlas is based on published mass spectrometry (MS) analyses collected through ProteomeXchange and reanalyzed through a uniform processing and metadata annotation pipeline. All matched MS-derived peptide data are linked to spectral, technical and biological metadata. Nearly 40 million out of ~143 million MS/MS spectra were matched to the reference genome Araport11, identifying ~0.5 million unique peptides and 17858 uniquely identified proteins (only one isoform per gene) at the highest confidence level (FDR 0.0004; 2 non-nested peptides ≥ 9 aa each), assigned canonical proteins, and 3543 lower confidence proteins. Physicochemical protein properties were evaluated for targeted identification of unobserved proteins. Additional proteins and isoforms currently not in Araport11 were identified, generated from pseudogenes, alternative start, stops and/or splice variants and sORFs; these features should be considered for updates to the Arabidopsis genome. Phosphorylation can be inspected through a sophisticated PTM viewer. This new PeptideAtlas is integrated with community resources including TAIR, tracks in JBrowse, PPDB and UniProtKB. Subsequent PeptideAtlas builds will incorporate millions more MS data.


Author(s):  
Jun Yan ◽  
Hongning Zhai ◽  
Ling Zhu ◽  
Sasha Sa ◽  
Xiaojun Ding

Motivation: Data mining and data quality evaluation are indispensable constituents of quantitative proteomics, but few integrated tools are available. Results: We introduced obaDIA, a one-step pipeline to generate visualizable and comprehensive results for quantitative proteomics data. obaDIA supports fragment-level, peptide-level and protein-level abundance matrices from the DIA technique, as well as protein-level abundance matrices from other quantitative proteomic techniques. The result contains abundance matrix statistics, differential expression analysis, protein functional annotation and enrichment analysis. Additionally, enrichment strategies that use either total proteins or expressed proteins as background are optional, and HTML-based interactive visualization of differentially expressed proteins in KEGG pathways is offered, which helps in mining biological significance. In short, obaDIA is an automatic tool for bioinformatics analysis of quantitative proteomics data. Availability and implementation: obaDIA is freely available from https://github.com/yjthu/obaDIA.git. Supplementary information: Supplementary data are available at Bioinformatics online.
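The effect of choosing total proteins or expressed proteins as the enrichment background can be illustrated with a one-sided hypergeometric test. The abstract does not state which statistical test obaDIA uses, so the test chosen here is an assumption, and the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric p-value P(X >= k).

    N: proteins in the background
    K: background proteins annotated to the term
    n: differentially expressed proteins
    k: differentially expressed proteins annotated to the term
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


if __name__ == "__main__":
    # Same hits, two backgrounds (made-up counts): a smaller "expressed
    # proteins" background versus a larger "total proteins" background.
    print(hypergeom_enrichment_p(k=2, n=3, K=4, N=10))  # 1/3
    print(hypergeom_enrichment_p(k=2, n=3, K=4, N=20))  # 100/1140, about 0.088
```

With identical hits, the larger background yields a smaller p-value, which is why the choice of background can change which terms are reported as enriched.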


Author(s):  
Michael Milton ◽  
Natalie Thorne

Summary: aCLImatise is a utility for automatically generating tool definitions compatible with bioinformatics workflow languages, by parsing command-line help output. aCLImatise also has an associated database called the aCLImatise Base Camp, which provides thousands of pre-computed tool definitions. Availability and implementation: The latest aCLImatise source code is available within a GitHub organisation, under the GPL-3.0 license: https://github.com/aCLImatise. In particular, documentation for the aCLImatise Python package is available at https://aclimatise.github.io/CliHelpParser/, and the aCLImatise Base Camp is available at https://aclimatise.github.io/BaseCamp/. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (20) ◽  
pp. 4140-4146 ◽  
Author(s):  
Ghazaleh Taherzadeh ◽  
Abdollah Dehzangi ◽  
Maryam Golchin ◽  
Yaoqi Zhou ◽  
Matthew P Campbell

Motivation: Protein glycosylation is one of the most abundant post-translational modifications and plays an important role in immune responses, intercellular signaling, inflammation and host-pathogen interactions. However, due to the poor ionization efficiency and microheterogeneity of glycopeptides, identifying glycosylation sites is a challenging task, and there is a demand for computational methods. Here, we constructed the largest dataset of human and mouse glycosylation sites to train deep learning neural networks and support vector machine classifiers to predict N-/O-linked glycosylation sites, respectively. Results: The method, called SPRINT-Gly, achieved consistent results between ten-fold cross validation and independent test for predicting human and mouse glycosylation sites. For N-glycosylation, a mouse-trained model performs equally well in human glycoproteins and vice versa; however, due to significant differences in O-linked sites, separate models were generated. Overall, SPRINT-Gly is 18% and 50% higher in Matthews correlation coefficient than the next best method compared in N-linked and O-linked sites, respectively. This improved performance is due to the inclusion of novel structure and sequence-based features. Availability and implementation: http://sparks-lab.org/server/SPRINT-Gly/ Supplementary information: Supplementary data are available at Bioinformatics online.
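For readers unfamiliar with the metric cited above, the Matthews correlation coefficient (MCC) is computed from the four cells of a binary confusion matrix. The sketch below uses the standard definition; the function name is chosen here for illustration, the counts are made up and are not results from the paper, and returning 0.0 when the denominator is zero is a common convention rather than part of the definition.

```python
from math import sqrt


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Made-up counts for a site predictor evaluated on 100 candidate sites.
    print(round(matthews_corrcoef(tp=40, tn=45, fp=5, fn=10), 4))
```

MCC is often preferred over accuracy for site prediction because true modification sites are typically far rarer than non-sites, and MCC stays informative under that class imbalance.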


2020 ◽  
Vol 36 (15) ◽  
pp. 4369-4371
Author(s):  
Andrew Whalen ◽  
Gregor Gorjanc ◽  
John M Hickey

Summary: AlphaFamImpute is an imputation package for calling, phasing and imputing genome-wide genotypes in outbred full-sib families from single nucleotide polymorphism (SNP) array and genotype-by-sequencing (GBS) data. GBS data are increasingly being used to genotype individuals, especially when SNP arrays do not exist for a population of interest. Low-coverage GBS produces data with a large number of missing or incorrect naïve genotype calls, which can be improved by identifying shared haplotype segments between full-sib individuals. Here, we present AlphaFamImpute, an algorithm specifically designed to exploit the genetic structure of full-sib families. It performs imputation using a two-step approach. In the first step, it phases and imputes parental genotypes based on the segregation states of their offspring (i.e. which pair of parental haplotypes the offspring inherited). In the second step, it phases and imputes the offspring genotypes by detecting which haplotype segments the offspring inherited from their parents. With a series of simulations, we find that AlphaFamImpute obtains high-accuracy genotypes, even when the parents are not genotyped and individuals are sequenced at <1x coverage. Availability and implementation: AlphaFamImpute is available as a Python package from the AlphaGenes website http://www.AlphaGenes.roslin.ed.ac.uk/AlphaFamImpute. Supplementary information: Supplementary data are available at Bioinformatics online.
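The notion of a segregation state can be made concrete with a toy example. The sketch below is not the AlphaFamImpute algorithm: it assumes phased parental haplotypes, error-free genotype calls and no recombination within the segment, and simply enumerates which of the four possible pairs of parental haplotypes are compatible with an offspring's observed allele dosages. The function name and data are hypothetical.

```python
from itertools import product


def consistent_segregation_states(sire_haps, dam_haps, offspring_dosages):
    """Return the (sire_hap, dam_hap) index pairs compatible with the offspring.

    sire_haps, dam_haps: two haplotypes each, as sequences of 0/1 alleles.
    offspring_dosages: allele dosages (0, 1 or 2) per locus, None if missing.
    """
    states = []
    for s, d in product((0, 1), repeat=2):
        if all(
            sire_haps[s][i] + dam_haps[d][i] == dosage
            for i, dosage in enumerate(offspring_dosages)
            if dosage is not None  # missing calls constrain nothing
        ):
            states.append((s, d))
    return states


if __name__ == "__main__":
    sire = ([0, 1, 1], [1, 0, 0])
    dam = ([0, 0, 1], [1, 1, 0])
    print(consistent_segregation_states(sire, dam, [1, 2, 1]))        # [(0, 1)]
    print(consistent_segregation_states(sire, dam, [1, None, None]))  # [(0, 1), (1, 0)]
```

The second call shows why low-coverage data are hard: with most loci missing, several segregation states remain plausible, so information must be pooled across loci and siblings, typically in a probabilistic framework that tolerates genotyping errors.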


2019 ◽  
Author(s):  
Deepank R Korandla ◽  
Jacob M Wozniak ◽  
Anaamika Campeau ◽  
David J Gonzalez ◽  
Erik S Wright

Motivation: A core task of genomics is to identify the boundaries of protein coding genes, which may cover over 90% of a prokaryote's genome. Several programs are available for gene finding, yet it is currently unclear how well these programs perform and whether any offers superior accuracy. This is in part because there is no universal benchmark for gene finding and, therefore, most developers select their own benchmarking strategy. Results: Here, we introduce AssessORF, a new approach for benchmarking prokaryotic gene predictions based on evidence from proteomics data and the evolutionary conservation of start and stop codons. We applied AssessORF to compare gene predictions offered by GenBank, GeneMarkS-2, Glimmer and Prodigal on genomes spanning the prokaryotic tree of life. Gene predictions were 88–95% in agreement with the available evidence, with Glimmer performing the worst but no clear winner. All programs were biased towards selecting start codons that were upstream of the actual start. Given these findings, there remains considerable room for improvement, especially in the detection of correct start sites. Availability and implementation: AssessORF is available as an R package via the Bioconductor package repository. Supplementary information: Supplementary data are available at Bioinformatics online.

