MIEC-SVM: automated pipeline for protein peptide/ligand interaction prediction

2015 ◽  
Vol 32 (6) ◽  
pp. 940-942 ◽  
Author(s):  
Nan Li ◽  
Richard I. Ainsworth ◽  
Meixin Wu ◽  
Bo Ding ◽  
Wei Wang

Abstract Motivation: MIEC-SVM is a structure-based method for predicting protein recognition specificity. Here, we present an automated MIEC-SVM pipeline providing an integrated and user-friendly workflow for construction and application of the MIEC-SVM models. This pipeline can handle standard amino acids and those with post-translational modifications (PTMs) or small molecules. Moreover, multi-threading and support for Sun Grid Engine (SGE) are implemented to significantly boost the computational efficiency. Availability and implementation: The program is available at http://wanglab.ucsd.edu/MIEC-SVM. Supplementary information: Supplementary data are available at Bioinformatics online.

2019 ◽  
Vol 36 (5) ◽  
pp. 1647-1648 ◽  
Author(s):  
Bilal Wajid ◽  
Hasan Iqbal ◽  
Momina Jamil ◽  
Hafsa Rafique ◽  
Faria Anwar

Abstract Motivation Metabolomics is a data analysis and interpretation field that aims to study the functions of small molecules within the organism. Consequently, metabolomics requires life-science researchers to be comfortable downloading, installing and scripting software that is mostly not user-friendly and lacks basic GUIs. As researchers struggle with these skills, there is a pressing need for software packages that can automatically install software pipelines, shortening the learning curve for building software workstations. Therefore, this paper presents MetumpX, a software package that eases the installation of 103 software tools by automatically resolving their individual dependencies while allowing users to choose which software works best for them. Results MetumpX is an Ubuntu-based software package that facilitates easy download and installation of 103 tools spread across the standard metabolomics pipeline. As far as the authors know, MetumpX is the only solution of its kind where the focus lies on automating the development of software workstations. Availability and implementation https://github.com/hasaniqbal777/MetumpX-bin. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Dan Zhang ◽  
Zhao-Chun Xu ◽  
Wei Su ◽  
Yu-He Yang ◽  
Hao Lv ◽  
...  

Abstract Motivation Protein carbonylation is one of the most important oxidative stress-induced post-translational modifications and is generally characterized by stability, irreversibility and relatively early formation. It plays a significant role in orchestrating various biological processes and has already been shown to be related to many diseases. However, the experimental technologies for identifying carbonylation sites are not only costly and time-consuming, but also incapable of processing a large number of proteins at a time. Thus, rapidly and effectively identifying carbonylation sites by computational methods will provide key clues for analyzing the occurrence and development of diseases. Results In this study, we developed a predictor called iCarPS to identify carbonylation sites based on sequence information. A novel feature encoding scheme, called residues conical coordinates combined with their physicochemical properties, was proposed to formulate carbonylated and non-carbonylated protein samples. To remove potentially redundant features and improve the prediction performance, a feature selection technique was used. The accuracy and robustness of iCarPS were demonstrated by experiments on training and independent datasets. Comparison with other published methods showed that the proposed method performs well for carbonylation site identification. Availability and implementation Based on the proposed model, a user-friendly webserver and a software package were constructed, which can be freely accessed at http://lin-group.cn/server/iCarPS. Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 15 ◽  
Author(s):  
Akshatha Prasanna ◽  
Vidya Niranjan

Background: Since bacteria are the earliest known organisms, there has been significant interest in their variety and biology, particularly concerning human health. Recent advances in metagenomics sequencing (mNGS), a culture-independent sequencing technology, have accelerated developments in clinical microbiology and our understanding of pathogens. Objective: For the implementation of mNGS in routine clinical practice to become feasible, a practical and scalable strategy for the study of mNGS data is essential. This study presents a robust automated pipeline to analyze clinical metagenomic data for pathogen identification and classification. Method: The proposed Clin-mNGS pipeline is an integrated, open-source, scalable, reproducible, and user-friendly framework scripted using the Snakemake workflow management software. The implementation avoids the hassle of manually installing and configuring the multiple command-line tools and dependencies. The approach directly screens pathogens from clinical raw reads and generates consolidated reports for each sample. Results: The pipeline is demonstrated using publicly available data and is tested on a desktop Linux system and a high-performance cluster. The study compares variability in results from different tools and versions. The versions of the tools are user-modifiable. The pipeline outputs quality checks, filtered reads, host subtraction, assembled contigs, assembly metrics, relative abundances of bacterial species, antimicrobial resistance genes, plasmid finding, and virulence factor identification. The results obtained from the pipeline are evaluated based on sensitivity and positive predictive value. Conclusion: Clin-mNGS is an automated Snakemake pipeline validated for the analysis of microbial clinical metagenomics reads to perform taxonomic classification and antimicrobial resistance prediction.
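The two evaluation metrics named in this abstract, sensitivity and positive predictive value (PPV), can be illustrated with a minimal Python sketch. It is not taken from the Clin-mNGS code: the function name `evaluate_calls` and the example species sets are hypothetical, and it assumes the evaluation compares the set of taxa a pipeline reports against a known truth set for the sample.

```python
def evaluate_calls(expected, detected):
    """Return (sensitivity, ppv) for detected taxa against a truth set.

    Hypothetical helper for illustration; not part of Clin-mNGS.
    """
    expected, detected = set(expected), set(detected)
    tp = len(expected & detected)   # true positives: expected and reported
    fn = len(expected - detected)   # false negatives: expected but missed
    fp = len(detected - expected)   # false positives: reported but not expected
    # Sensitivity = TP / (TP + FN); PPV = TP / (TP + FP).
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


# Made-up example data: three species truly present, three reported.
truth = {"Escherichia coli", "Klebsiella pneumoniae", "Staphylococcus aureus"}
calls = {"Escherichia coli", "Staphylococcus aureus", "Pseudomonas aeruginosa"}
sens, ppv = evaluate_calls(truth, calls)
print(f"sensitivity={sens:.2f}, PPV={ppv:.2f}")
```

With these made-up sets, two of the three expected species are recovered and one reported species is spurious, so both metrics equal 2/3.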


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Marcin Luzarowski ◽  
Rubén Vicente ◽  
Andrei Kiselev ◽  
Mateusz Wagner ◽  
Dennis Schlossarek ◽  
...  

Abstract Protein–metabolite interactions are of crucial importance for all cellular processes but remain understudied. Here, we applied a biochemical approach named PROMIS to address the complexity of the protein–small molecule interactome in the model yeast Saccharomyces cerevisiae. By doing so, we provide a unique dataset, which can be queried for interactions between 74 small molecules and 3982 proteins using a user-friendly interface available at https://promis.mpimp-golm.mpg.de/yeastpmi/. By interpolating PROMIS with the list of predicted protein–metabolite interactions, we provided experimental validation for 225 binding events. Remarkably, of the 74 small molecules co-eluting with proteins, 36 were proteogenic dipeptides. Targeted analysis of a representative dipeptide, Ser-Leu, revealed numerous protein interactors comprising chaperones, proteasomal subunits, and metabolic enzymes. We could further demonstrate that Ser-Leu binding increases the activity of the glycolytic enzyme phosphoglycerate kinase (Pgk1). Consistent with the binding analysis, Ser-Leu supplementation leads to acute metabolic changes and delays the timing of the diauxic shift. Supported by the dipeptide accumulation analysis, our work attests to the role of Ser-Leu as a metabolic regulator at the interface of protein degradation and central metabolism.


Author(s):  
Ferhat Alkan ◽  
Joana Silva ◽  
Eric Pintó Barberà ◽  
William J Faller

Abstract Motivation Ribosome Profiling (Ribo-seq) has revolutionized the study of RNA translation by providing information on ribosome positions across all translated RNAs with nucleotide resolution. Yet several technical limitations restrict the sequencing depth of such experiments, the most common of which is the overabundance of rRNA fragments. Various strategies can be employed to tackle this issue, including the use of commercial rRNA depletion kits. However, as they are designed for more standardized RNAseq experiments, they may perform suboptimally in Ribo-seq. To overcome this, it is possible to use custom biotinylated oligos complementary to the most abundant rRNA fragments; however, currently no computational framework exists to aid the design of optimal oligos. Results Here, we first show that a major confounding issue is that the rRNA fragments generated via Ribo-seq vary significantly with differing experimental conditions, suggesting that a “one-size-fits-all” approach may be inefficient. Therefore, we developed Ribo-ODDR, an oligo design pipeline integrated with a user-friendly interface that assists in oligo selection for efficient experiment-specific rRNA depletion. Ribo-ODDR uses preliminary data to identify the most abundant rRNA fragments, and calculates the rRNA depletion efficiency of potential oligos. We experimentally show that Ribo-ODDR-designed oligos outperform commercially available kits and lead to a significant increase in rRNA depletion in Ribo-seq. Availability Ribo-ODDR is freely accessible at https://github.com/fallerlab/Ribo-ODDR. Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 36 (12) ◽  
pp. 3913-3915
Author(s):  
Hemi Luan ◽  
Xingen Jiang ◽  
Fenfen Ji ◽  
Zhangzhang Lan ◽  
Zongwei Cai ◽  
...  

Abstract Motivation Liquid chromatography–mass spectrometry-based non-targeted metabolomics is routinely performed to qualitatively and quantitatively analyze a tremendous number of metabolite signals in complex biological samples. However, many popular software tools commonly detect false-positive peaks in the datasets as metabolite signals, resulting in unreliable measurements. Results To reduce false-positive calling, we developed an interactive web tool, termed CPVA, for visualization and accurate annotation of the detected peaks in non-targeted metabolomics data. We used a chromatogram-centric strategy to unfold the characteristics of chromatographic peaks through visualization of peak morphology metrics, with additional functions to annotate adducts, isotopes and contaminants. CPVA is a free, user-friendly tool that helps users identify peak background noise and contaminants, reducing false-positive or redundant peak calling and thereby improving the data quality of non-targeted metabolomics studies. Availability and implementation CPVA is freely available at http://cpva.eastus.cloudapp.azure.com. Source code and installation instructions are available on GitHub: https://github.com/13479776/cpva. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Gyanendra Gurung ◽  
Kshama Roy

Abstract The use of Geographic Information Systems (GIS) for managing pipeline databases and automating routine engineering processes has become standard practice in the pipeline industry. While maintaining a central database provides security, integrity, and easy management of data throughout the pipeline’s lifecycle, GIS enables spatial analysis of pipeline data in addition to streamlining access to and visualization of results. One of the major benefits of GIS integration lies in the ease of automating alignment sheet generation for pipelines. This paper introduces a simplified pipeline alignment sheet generation workflow that uses GIS datasets to produce highly customizable alignment sheets in AutoCAD, a much-preferred format in the pipeline industry. By using existing GIS and AutoCAD features to generate the alignment sheet, the need to write complicated geo-processing or plotting algorithms is minimized, which in turn reduces the risk of systematic errors. This robust and user-friendly workflow not only ensures safety but also leads to a cost-effective solution.


Author(s):  
Zhuohang Yu ◽  
Zengrui Wu ◽  
Weihua Li ◽  
Guixia Liu ◽  
Yun Tang

Abstract Summary MetaADEDB is an online database we developed to integrate comprehensive information on adverse drug events (ADEs). The first version of MetaADEDB was released in 2013 and has been widely used by researchers. However, it has not been updated for more than seven years. Here, we report its second version, built by collecting more and newer data from the U.S. FDA Adverse Event Reporting System (FAERS) and the Canada Vigilance Adverse Reaction Online Database, in addition to the original three sources. The new version consists of 744 709 drug–ADE associations between 8498 drugs and 13 193 ADEs, an increase of over 40% in drug–ADE associations compared with the previous version. We also developed a new and user-friendly web interface for data search and analysis. We hope that MetaADEDB 2.0 will provide a useful tool for drug safety assessment and related studies in drug discovery and development. Availability and implementation The database is freely available at: http://lmmd.ecust.edu.cn/metaadedb/. Supplementary information Supplementary data are available at Bioinformatics online.


2016 ◽  
Author(s):  
Stephen G. Gaffney ◽  
Jeffrey P. Townsend

Abstract Summary PathScore quantifies the level of enrichment of somatic mutations within curated pathways, applying a novel approach that identifies pathways enriched across patients. The application provides several user-friendly, interactive graphic interfaces for data exploration, including tools for comparing pathway effect sizes, significance, gene-set overlap and enrichment differences between projects. Availability and Implementation Web application available at pathscore.publichealth.yale.edu. Site implemented in Python and MySQL, with all major browsers supported. Source code available at github.com/sggaffney/pathscore with a GPLv3 license. Supplementary Information Additional documentation can be found at http://pathscore.publichealth.yale.edu/faq.


Author(s):  
Frédéric Lemoine ◽  
Luc Blassel ◽  
Jakub Voznica ◽  
Olivier Gascuel

Abstract Motivation The first cases of the COVID-19 pandemic emerged in December 2019. Until the end of February 2020, the number of available genomes was below 1,000, and their multiple alignment was easily achieved using standard approaches. Subsequently, the availability of genomes has grown dramatically. Moreover, some genomes are of low quality with sequencing/assembly errors, making accurate re-alignment of all genomes nearly impossible on a daily basis. A more efficient, yet accurate approach was clearly required to pursue all subsequent bioinformatics analyses of this crucial data. Results hCoV-19 genomes are highly conserved, with very few indels and no recombination. This makes the profile HMM approach particularly well suited to align new genomes, add them to an existing alignment and filter problematic ones. Using a core of ∼2,500 high quality genomes, we estimated a profile using HMMER, and implemented this profile in COVID-Align, a user-friendly interface to be used online or as standalone via Docker. The alignment of 1,000 genomes requires less than 20 min on our cluster. Moreover, COVID-Align provides summary statistics, which can be used to determine the sequencing quality and evolutionary novelty of input genomes (e.g. number of new mutations and indels). Availability https://covalign.pasteur.cloud, hub.docker.com/r/evolbioinfo/ Supplementary information Supplementary information is available at Bioinformatics online.

