Metabolite-Investigator: an integrated user-friendly workflow for metabolomics multi-study analysis

Bioinformatics ◽

10.1093/bioinformatics/btaa967 ◽

2020 ◽

Author(s):

Carl Beuchel ◽

Holger Kirsten ◽

Uta Ceglarek ◽

Markus Scholz

Keyword(s):

Measurement Techniques ◽

Supplementary Information ◽

Metabolomics Data ◽

Factors Affecting ◽

Quantitative Metabolomics ◽

Covariate Model ◽

Shiny App ◽

Analysis Workflow ◽

Scalable Analysis ◽

User Friendly

Abstract Motivation Many diseases have a metabolic background, which is increasingly investigated due to improved measurement techniques allowing high-throughput assessment of metabolic features in several body fluids. Integrating data from multiple cohorts is of high importance to obtain robust and reproducible results. However, considerable variability across studies due to differences in sampling, measurement techniques and study populations needs to be accounted for. Results We present Metabolite-Investigator, a scalable analysis workflow for quantitative metabolomics data from multiple studies. Our tool supports all aspects of data pre-processing including data integration, cleaning, transformation, batch analysis as well as multiple analysis methods including uni- and multivariable factor-metabolite associations, network analysis and factor prioritization in one or more cohorts. Moreover, it allows identifying critical interactions between cohorts and factors affecting metabolite levels and inferring a common covariate model, all via a graphical user interface. Availability and implementation We constructed Metabolite-Investigator as a free and open web-tool and stand-alone Shiny-app. It is hosted at https://apps.health-atlas.de/metabolite-investigator/; the source code is freely available at https://github.com/cfbeuchel/Metabolite-Investigator. Supplementary information Supplementary data are available at Bioinformatics online.


CPVA: a web-based metabolomic tool for chromatographic peak visualization and annotation

Bioinformatics ◽

10.1093/bioinformatics/btaa200 ◽

2020 ◽

Vol 36 (12) ◽

pp. 3913-3915

Author(s):

Hemi Luan ◽

Xingen Jiang ◽

Fenfen Ji ◽

Zhangzhang Lan ◽

Zongwei Cai ◽

...

Keyword(s):

False Positive ◽

Supplementary Information ◽

Liquid Chromatography Mass Spectrometry ◽

Targeted Metabolomics ◽

Metabolomics Data ◽

Web Based ◽

Tremendous Amount ◽

Chromatographic Peaks ◽

User Friendly

Abstract Motivation Liquid chromatography–mass spectrometry-based non-targeted metabolomics is routinely performed to qualitatively and quantitatively analyze a tremendous amount of metabolite signals in complex biological samples. However, many popular software packages commonly report false-positive peaks in the datasets as metabolite signals, resulting in unreliable measurements. Results To reduce false-positive calling, we developed an interactive web tool, termed CPVA, for visualization and accurate annotation of the detected peaks in non-targeted metabolomics data. We used a chromatogram-centric strategy to unfold the characteristics of chromatographic peaks through visualization of peak morphology metrics, with additional functions to annotate adducts, isotopes and contaminants. CPVA is a free, user-friendly tool that helps users identify peak background noise and contaminants, reducing false-positive or redundant peak calling and thereby improving the data quality of non-targeted metabolomics studies. Availability and implementation CPVA is freely available at http://cpva.eastus.cloudapp.azure.com. Source code and installation instructions are available on GitHub: https://github.com/13479776/cpva. Supplementary information Supplementary data are available at Bioinformatics online.


Miso: an R package for multiple isotope labeling assisted metabolomics data analysis

Bioinformatics ◽

10.1093/bioinformatics/btz092 ◽

2019 ◽

Vol 35 (18) ◽

pp. 3524-3526 ◽

Cited By ~ 3

Author(s):

Yonghui Dong ◽

Liron Feldberg ◽

Asaph Aharoni

Keyword(s):

Data Analysis ◽

Isotope Labeling ◽

R Package ◽

Mass Spectrometry Data ◽

Data Matrix ◽

Supplementary Information ◽

Metabolomics Data ◽

Biological Studies ◽

Analysis Workflow ◽

Efficient Data

Abstract Motivation The use of stable isotope labeling is highly advantageous for structure elucidation in metabolomics studies. However, computational tools dealing with multiple-precursor-based labeling studies are still missing. Hence, we developed Miso, an R package providing an automated and efficient data analysis workflow to detect the complete repertoire of labeled molecules from multiple-precursor-based labeling experiments. Results The capability of Miso is demonstrated by the analysis of liquid chromatography-mass spectrometry data obtained from duckweed plants fed with one unlabeled and two differently labeled tyrosine (unlabeled tyrosine, tyrosine-²H₄ and tyrosine-¹³C₉¹⁵N₁). The resulting data matrix generated by Miso contains sets of unlabeled and labeled ions with their retention time, m/z values and number of labeled atoms that can be directly utilized for database query and biological studies. Availability and implementation Miso is publicly available on the CRAN repository (https://cran.r-project.org/web/packages/Miso). A reproducible case study and a detailed tutorial are available from GitHub (https://github.com/YonghuiDong/Miso_example). Supplementary information Supplementary data are available at Bioinformatics online.


rCASC: reproducible Classification Analysis of Single Cell sequencing data

10.1101/430967 ◽

2018 ◽

Cited By ~ 1

Author(s):

Luca Alessandrì ◽

Marco Beccuti ◽

Maddalena Arigoni ◽

Martina Olivero ◽

Greta Romano ◽

...

Keyword(s):

Single Cell ◽

Single Cells ◽

R Package ◽

Cellular Heterogeneity ◽

Supplementary Information ◽

Sequencing Data ◽

Single Cell Sequencing ◽

Analysis Workflow ◽

User Friendly ◽

Bioinformatics Workflows

Abstract Summary Single-cell RNA sequencing has emerged as an essential tool to investigate cellular heterogeneity and to highlight cell sub-population specific signatures. Nowadays, dedicated and user-friendly bioinformatics workflows are required to exploit the deconvolution of the single-cell transcriptome. Furthermore, there is a growing need for bioinformatics workflows granting both functional reproducibility, i.e. saving information about data and analysis parameters, and computation reproducibility, i.e. storing the real image of the computation environment. Here, we present rCASC, a modular RNAseq analysis workflow allowing data analysis from counts generation to cell sub-population signatures identification, granting both functional and computation reproducibility. Availability and Implementation rCASC is part of the reproducible bioinformatics project. rCASC is a docker based application controlled by an R package available at https://github.com/kendomaniac/rCASC. Supplementary information Supplementary data are available at the rCASC GitHub repository.


Interoperable and scalable data analysis with microservices: applications in metabolomics

Bioinformatics ◽

10.1093/bioinformatics/btz160 ◽

2019 ◽

Vol 35 (19) ◽

pp. 3752-3760 ◽

Cited By ~ 10

Author(s):

Payam Emami Khoonsari ◽

Pablo Moreno ◽

Sven Bergmann ◽

Joachim Burman ◽

Marco Capuccini ◽

...

Keyword(s):

Mass Spectrometry ◽

Data Analysis ◽

Large Scale ◽

Scientific Discipline ◽

Supplementary Information ◽

Resonance Spectroscopy ◽

Research Environment ◽

Metabolomics Data ◽

Analysis Workflow ◽

Virtual Research Environment

Abstract Motivation Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator. Results We developed a Virtual Research Environment (VRE) which facilitates rapid integration of new tools and developing scalable and interoperable workflows for performing metabolomics data analysis. The environment can be launched on-demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validate our method in the field of metabolomics on two mass spectrometry, one nuclear magnetic resonance spectroscopy and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability using integration of the major software suites resulting in a turn-key workflow encompassing all steps for mass-spectrometry-based metabolomics including preprocessing, statistics and identification. Microservices is a generic methodology that can serve any scientific discipline and opens up for new types of large-scale integrative science. Availability and implementation The PhenoMeNal consortium maintains a web portal (https://portal.phenomenal-h2020.eu) providing a GUI for launching the Virtual Research Environment. The GitHub repository https://github.com/phnmnl/ hosts the source code of all projects. Supplementary information Supplementary data are available at Bioinformatics online.


ASICS: an R package for a whole analysis workflow of 1D 1H NMR spectra

Bioinformatics ◽

10.1093/bioinformatics/btz248 ◽

2019 ◽

Vol 35 (21) ◽

pp. 4356-4363 ◽

Cited By ~ 7

Author(s):

Gaëlle Lefort ◽

Laurence Liaubet ◽

Cécile Canlet ◽

Patrick Tardivel ◽

Marie-Christine Père ◽

...

Keyword(s):

Metabolic Pathways ◽

Nmr Spectra ◽

Complex Mixture ◽

R Package ◽

Statistical Analyses ◽

Supplementary Information ◽

Automatic Identification ◽

Analysis Workflow ◽

Expert Analysis ◽

New Biomarkers

Abstract Motivation In metabolomics, the detection of new biomarkers from Nuclear Magnetic Resonance (NMR) spectra is a promising approach. However, this analysis remains difficult due to the lack of a whole workflow that handles spectra pre-processing, automatic identification and quantification of metabolites and statistical analyses, in a reproducible way. Results We present ASICS, an R package that contains a complete workflow to analyse spectra from NMR experiments. It contains an automatic approach to identify and quantify metabolites in a complex mixture spectrum and uses the results of the quantification in untargeted and targeted statistical analyses. ASICS was shown to improve the precision of quantification in comparison to existing methods on two independent datasets. In addition, ASICS successfully recovered most metabolites that were found important to explain a two level condition describing the samples by a manual and expert analysis based on bucketing. It also found new relevant metabolites involved in metabolic pathways related to risk factors associated with the condition. Availability and implementation ASICS is distributed as an R package, available on Bioconductor. Supplementary information Supplementary data are available at Bioinformatics online.


BREC: an R package/Shiny app for automatically identifying heterochromatin boundaries and estimating local recombination rates along chromosomes

BMC Bioinformatics ◽

10.1186/s12859-021-04233-1 ◽

2021 ◽

Vol 22 (S6) ◽

Author(s):

Yasmine Mansour ◽

Annie Chateau ◽

Anna-Sophie Fiston-Lavier

Keyword(s):

Data Quality ◽

Data Science ◽

Fruit Fly ◽

R Package ◽

Model Organisms ◽

Data Quality Control ◽

Recombination Rates ◽

Functional Dynamics ◽

Shiny App ◽

User Friendly

Abstract Background Meiotic recombination is a vital biological process playing an essential role in the structural and functional dynamics of genomes. Genomes exhibit highly variable recombination profiles along chromosomes, associated with several chromatin states. However, eu-heterochromatin boundaries are neither available nor easily obtained for non-model organisms, especially for newly sequenced ones. Hence, we lack the accurate local recombination rates necessary to address evolutionary questions. Results Here, we propose an automated computational tool, based on the Marey maps method, that identifies heterochromatin boundaries along chromosomes and estimates local recombination rates. Our method, called BREC (heterochromatin Boundaries and RECombination rate estimates), is non-genome-specific, running even on non-model genomes as long as genetic and physical maps are available. BREC is based on pure statistics and is data-driven, implying that good input data quality remains a strong requirement. Therefore, a data pre-processing module (data quality control and cleaning) is provided. Experiments show that BREC handles varying marker density and distribution issues. Conclusions BREC's heterochromatin boundaries have been validated with cytological equivalents experimentally generated on the fruit fly Drosophila melanogaster genome, for which BREC returns congruent corresponding values. Also, BREC's recombination rates have been compared with previously reported estimates. Based on the promising results, we believe our tool has the potential to help bring data science into the service of genome biology and evolution. We introduce BREC within an R-package and a Shiny web-based user-friendly application yielding a fast, easy-to-use, and broadly accessible resource. The BREC R-package is available at the GitHub repository https://github.com/GenomeStructureOrganization.


Unified exact design with early stopping rules for single arm clinical trials with multiple endpoints

Statistical Methods in Medical Research ◽

10.1177/09622802211013062 ◽

2021 ◽

pp. 096228022110130

Author(s):

Wei Wei ◽

Denise Esserman ◽

Michael Kane ◽

Daniel Zelterman

Keyword(s):

Clinical Trials ◽

Early Phase ◽

Decision Rules ◽

Stopping Rules ◽

Multiple Endpoints ◽

Shiny App ◽

R Shiny ◽

Early Phase Clinical Trials ◽

User Friendly ◽

Exact Design

Adaptive designs are gaining popularity in early phase clinical trials because they enable investigators to change the course of a study in response to accumulating data. We propose a novel design to simultaneously monitor several endpoints. These include efficacy, futility, toxicity and other outcomes in early phase, single-arm studies. We construct a recursive relationship to compute the exact probabilities of stopping for any combination of endpoints without the need for simulation, given pre-specified decision rules. The proposed design is flexible in the number and timing of interim analyses. An R Shiny app with a user-friendly web interface has been created to facilitate the implementation of the proposed design.
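
The idea of computing exact stopping probabilities by recursion rather than simulation can be illustrated with a much simpler case than the one the abstract describes. The sketch below is not the authors' multi-endpoint design: it assumes a single binary endpoint and a futility-only rule (stop at an interim look if the cumulative number of responses is at or below a bound). The function name, stage sizes, bounds and response rate are all hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of k responses among n patients with response rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def stopping_probabilities(stage_sizes, futility_bounds, p):
    """Exact probability of stopping for futility at each interim look.

    Simplified, single-endpoint illustration (an assumption, not the
    published design): the trial stops at look k if the cumulative number
    of responses is <= futility_bounds[k].

    Returns (list of stopping probabilities per look,
             probability of passing every look).
    """
    # alive[s] = probability the trial is still running with s cumulative responses
    alive = {0: 1.0}
    stop_probs = []
    for n, bound in zip(stage_sizes, futility_bounds):
        # Recursive step: convolve the surviving distribution with the new stage.
        updated = {}
        for s, prob in alive.items():
            for x in range(n + 1):
                updated[s + x] = updated.get(s + x, 0.0) + prob * binom_pmf(x, n, p)
        # Paths at or below the bound stop here; the rest continue.
        stop_probs.append(sum(q for s, q in updated.items() if s <= bound))
        alive = {s: q for s, q in updated.items() if s > bound}
    return stop_probs, sum(alive.values())


# Hypothetical two-look design: 10 patients per stage, response rate 0.2.
stops, passed = stopping_probabilities([10, 10], [1, 5], 0.2)
```

Because the recursion tracks the full distribution of cumulative responses among trials that have not yet stopped, the stopping probabilities and the probability of passing all looks sum to one, with no Monte Carlo error.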


Ribo-ODDR: Oligo design pipeline for experiment-specific rRNA depletion in ribo-seq

Bioinformatics ◽

10.1093/bioinformatics/btab171 ◽

2021 ◽

Author(s):

Ferhat Alkan ◽

Joana Silva ◽

Eric Pintó Barberà ◽

William J Faller

Keyword(s):

Ribosome Profiling ◽

Supplementary Information ◽

Experimental Conditions ◽

Computational Framework ◽

Rna Translation ◽

Rrna Depletion ◽

Selection For ◽

Nucleotide Resolution ◽

User Friendly ◽

Oligo Design

Abstract Motivation Ribosome Profiling (Ribo-seq) has revolutionized the study of RNA translation by providing information on ribosome positions across all translated RNAs with nucleotide-resolution. Yet several technical limitations restrict the sequencing depth of such experiments, the most common of which is the overabundance of rRNA fragments. Various strategies can be employed to tackle this issue, including the use of commercial rRNA depletion kits. However, as they are designed for more standardized RNAseq experiments, they may perform suboptimally in Ribo-seq. To overcome this, it is possible to use custom biotinylated oligos complementary to the most abundant rRNA fragments; however, no computational framework currently exists to aid the design of optimal oligos. Results Here, we first show that a major confounding issue is that the rRNA fragments generated via Ribo-seq vary significantly with differing experimental conditions, suggesting that a “one-size-fits-all” approach may be inefficient. Therefore, we developed Ribo-ODDR, an oligo design pipeline integrated with a user-friendly interface that assists in oligo selection for efficient experiment-specific rRNA depletion. Ribo-ODDR uses preliminary data to identify the most abundant rRNA fragments, and calculates the rRNA depletion efficiency of potential oligos. We experimentally show that Ribo-ODDR designed oligos outperform commercially available kits and lead to a significant increase in rRNA depletion in Ribo-seq. Availability Ribo-ODDR is freely accessible at https://github.com/fallerlab/Ribo-ODDR. Supplementary information Supplementary data are available at Bioinformatics online.


The prediction of voluntary intake of grazing dairy cows

The Journal of Agricultural Science ◽

10.1017/s0021859600066788 ◽

1986 ◽

Vol 107 (1) ◽

pp. 43-54 ◽

Cited By ~ 23

Author(s):

Lindsey Caird ◽

W. Holmes

Keyword(s):

Organic Matter ◽

Measurement Techniques ◽

Live Weight ◽

Simple Equation ◽

Factors Affecting ◽

Farm Planning ◽

Sward Height ◽

The Mean ◽

Major Factors ◽

Grazing Dairy Cows

SUMMARY Information on the total organic matter intake, concentrates supplied (C), live weight (LW), week of lactation (WL), milk yield (MY), herbage organic matter digestibility (HOMD), herbage mass, sward height (SHT) or herbage allowance (HAL) measured individually for 357 cows at one of three sites was assembled. Observed intake was compared with intakes predicted by existing intake equations, and new prediction equations based on regression models or regression and least-squares constants were developed. Major factors affecting intake were MY, LW, WL, C and HAL or SHT. Although HOMD was correlated with intake, better predictions were obtained when HOMD was omitted. There were differences between sites, possibly associated with differences in measurement techniques. The predictive value of some existing equations and new equations was tested against independent sets of data. A simple equation (A) based on MY and LW (Ministry of Agriculture, Fisheries and Food, 1975) gave satisfactory average predictions but the mean square prediction error (MSPE) was high. The equations of Vadiveloo & Holmes (1979) adjusted for bias gave a relatively low MSPE. The preferred new equations for grazing cattle included MY, LW, WL, C and HAL or SHT, and their MSPE were similar to or lower than for indoor equations. The discussion indicates that a simple equation (A) would give adequate predictions for farm planning. The more detailed equations illustrate the inter-relations of animal with sward conditions and concentrate allowances. Predicted intakes may deviate from actual intakes because of short-term changes in body reserves.
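
For readers unfamiliar with the mean square prediction error used above to compare intake equations, the following minimal sketch shows the standard definition (the mean of squared differences between predicted and observed values) and what a bias adjustment does to it. It assumes that "adjusted for bias" means removing the mean prediction error, which the abstract does not spell out. The function names and the intake values are hypothetical, not data from the paper.

```python
def mean_bias(predicted, observed):
    """Average of (predicted - observed); positive means over-prediction."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(predicted)


def mspe(predicted, observed):
    """Mean square prediction error: mean of squared prediction errors."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical intakes (kg organic matter per day) for three cows.
predicted = [14.0, 15.0, 16.0]
observed = [14.0, 15.0, 18.0]

raw_error = mspe(predicted, observed)

# Removing the mean bias from every prediction lowers the MSPE by bias squared.
bias = mean_bias(predicted, observed)
adjusted = [p - bias for p in predicted]
adjusted_error = mspe(adjusted, observed)
```

This shows how an equation can give satisfactory average predictions (small bias) and still have a high MSPE, since the MSPE also reflects the scatter of individual errors.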


From Cheese-Making to Consumption: Exploring the Microbial Safety of Cheeses through Predictive Microbiology Models

Foods ◽

10.3390/foods10020355 ◽

2021 ◽

Vol 10 (2) ◽

pp. 355

Author(s):

Arícia Possas ◽

Olga María Bonilla-Luque ◽

Antonio Valero

Keyword(s):

Production Process ◽

Physicochemical Characteristics ◽

Storage Conditions ◽

Starter Cultures ◽

Predictive Microbiology ◽

Factors Affecting ◽

Modeling Approaches ◽

Modelling Studies ◽

Foodborne Outbreaks ◽

User Friendly

Cheeses are traditional products widely consumed throughout the world that have been frequently implicated in foodborne outbreaks. Predictive microbiology models are relevant tools to estimate microbial behavior in these products. The objective of this study was to conduct a review of the available modeling approaches developed for cheeses, and to identify the main microbial targets of concern and the factors affecting microbial behavior in these products. Listeria monocytogenes has been identified as the main hazard evaluated in modelling studies. The pH, water activity (aw), lactic acid concentration and temperature have been the main factors contemplated as independent variables in models. Other aspects such as the use of raw or pasteurized milk, starter cultures, and factors inherent to the contaminating pathogen have also been evaluated. In general, depending on the production process, storage conditions, and physicochemical characteristics, microorganisms can grow or die off in cheeses. The classical two-step modeling has been the most common approach performed to develop predictive models. Other modeling approaches, including microbial interaction, growth boundary, response surface methodology, and neural networks, have also been performed. Validated models have been integrated into user-friendly software tools to be used to obtain estimates of microbial behavior in a quick and easy manner. Future studies should investigate the fate of other target bacterial pathogens, such as spore-forming bacteria, and the dynamic character of the production process of cheeses, among other aspects. The information compiled in this study helps to deepen the knowledge on the predictive microbiology field in the context of cheese production and storage.
