Metabolomics Data Processing Using OpenMS

Abstract

Background: Metabolomics is gaining popularity as a standard tool for the investigation of biological systems. Yet, parsing metabolomics data in the absence of in-house computational scientists can be overwhelming and time consuming. As a consequence of manual data processing, the results are often not processed in full depth, so potential novel findings might get lost.

Methods: To tackle this problem we developed Metabolite AutoPlotter, a tool to process and visualise metabolite data. It reads pre-processed compound-intensity tables as input and accepts different experimental designs with respect to the number of compounds, conditions and replicates. The code was written in R and wrapped into a shiny-application that can be run online in a web-browser on https://mpietzke.shinyapps.io/autoplotter.

Results: We demonstrate the main features and the ease of use with two different metabolite datasets, for quantitative experiments and for stable isotope tracing experiments. We show how the plots generated by the tool can be interactively modified with respect to plot type, colours, text labels and the shown statistics. We also demonstrate the application to 13C-tracing experiments and the seamless integration of natural abundance correction, which facilitates the interpretation of stable isotope tracing experiments. The output of the tool is a zip-file containing one single plot for each compound as well as sorted and restructured tables that can be used for further analysis.

Conclusion: With the help of Metabolite AutoPlotter it is now possible to automate data processing and visualisation for a wide audience. High quality plots from complex data can be generated in a short time by pressing a few buttons. This offers dramatic improvements over manual processing. It is significantly faster and allows researchers to spend more time interpreting the results or performing follow-up experiments. Furthermore, it eliminates potential copy-and-paste errors and tedious repetitions when things need to be changed. We are sure that this tool will help to improve and speed up scientific discoveries.
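The abstract mentions natural abundance correction for 13C-tracing data but does not say how the tool implements it. The following is a minimal Python sketch of one common approach, assuming that only carbon contributes natural isotopes and that 13C has a natural abundance of about 1.07%. The function names `correction_matrix` and `correct_natural_abundance` are hypothetical and are not taken from Metabolite AutoPlotter.

```python
from math import comb

P13C = 0.0107  # assumed natural abundance of 13C (about 1.07%)


def correction_matrix(n_carbons, p=P13C):
    """Build a lower-triangular matrix C where C[i][j] is the probability
    that a molecule with j tracer-derived 13C atoms is measured as M+i,
    because k = i - j of its remaining carbons are naturally 13C."""
    size = n_carbons + 1
    matrix = [[0.0] * size for _ in range(size)]
    for j in range(size):
        free = n_carbons - j  # carbons not labelled by the tracer
        for i in range(j, size):
            k = i - j
            matrix[i][j] = comb(free, k) * p**k * (1 - p) ** (free - k)
    return matrix


def correct_natural_abundance(measured, p=P13C):
    """Return the corrected isotopologue fractions for intensities
    measured as [M+0, M+1, ..., M+n] of a compound with n carbons."""
    n = len(measured) - 1
    total = sum(measured)
    fractions = [m / total for m in measured]
    c = correction_matrix(n, p)
    corrected = []
    # C is lower triangular, so forward substitution solves C @ x = fractions.
    for i in range(n + 1):
        residual = fractions[i] - sum(c[i][j] * corrected[j] for j in range(i))
        corrected.append(residual / c[i][i])
    # Clip small negative values caused by noise, then renormalise to 1.
    clipped = [max(x, 0.0) for x in corrected]
    norm = sum(clipped)
    return [x / norm for x in clipped]
```

For an unlabelled three-carbon compound, the measured M+1 to M+3 signals stem purely from natural 13C, so the corrected distribution should collapse onto M+0. Real tools usually also account for other elements (H, N, O, S) and for tracer purity, which this sketch ignores.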


AutoTuner: High fidelity, robust, and rapid parameter selection for metabolomics data processing

10.1101/812370 ◽ 2019 ◽ Cited By ~ 3

Author(s): Craig McLean ◽ Elizabeth B. Kujawinski

Keyword(s): Data Processing ◽ Parameter Optimization ◽ R Package ◽ Parameter Selection ◽ Single Step ◽ Monte Carlo Experiment ◽ Parameter Estimates ◽ Metabolomics Data ◽ Raw Data ◽ Mass Spectral

Abstract: Untargeted metabolomics experiments provide a snapshot of cellular metabolism, but remain challenging to interpret due to the computational complexity involved in data processing and analysis. Prior to any interpretation, raw data must be processed to remove noise and to align mass-spectral peaks across samples. This step requires selection of dataset-specific parameters, as erroneous parameters can result in noise inflation. While several algorithms exist to automate parameter selection, each depends on gradient descent optimization functions. In contrast, our new parameter optimization algorithm, AutoTuner, obtains parameter estimates from raw data in a single step as opposed to many iterations. Here, we tested the accuracy and the run time of AutoTuner in comparison to isotopologue parameter optimization (IPO), the most commonly used parameter selection tool, and compared the resulting parameters’ influence on the quality of feature tables after processing. We performed a Monte Carlo experiment to test the robustness of AutoTuner parameter selection, and found that AutoTuner generated similar parameter estimates from random subsets of samples. We conclude that AutoTuner is a desirable alternative to existing tools, because it is scalable, highly robust, and very fast (∼100-1000X speed improvement from other algorithms going from days to minutes). AutoTuner is freely available as an R package through BioConductor.
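The idea behind a Monte Carlo robustness check on random subsets of samples can be illustrated generically. The Python sketch below is not AutoTuner's algorithm: the estimator (`estimate_parameter`, here simply a median), the function name `monte_carlo_robustness`, and the input values are all hypothetical stand-ins. It only shows how repeated random subsampling yields a spread of estimates that can be summarised, for example, as a coefficient of variation.

```python
import random
import statistics


def estimate_parameter(sample_values):
    """Hypothetical stand-in estimator: the median of per-sample values
    (for example, one observed peak width per sample)."""
    return statistics.median(sample_values)


def monte_carlo_robustness(per_sample_values, subset_size, n_trials=200, seed=0):
    """Repeatedly estimate the parameter from random subsets of samples.

    Returns the mean estimate and its coefficient of variation; a small
    coefficient means the estimate barely depends on which samples are used.
    """
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    estimates = []
    for _ in range(n_trials):
        subset = rng.sample(per_sample_values, subset_size)  # without replacement
        estimates.append(estimate_parameter(subset))
    mean = statistics.fmean(estimates)
    spread = statistics.pstdev(estimates)
    cv = spread / mean if mean else float("inf")
    return mean, cv
```

A call such as `monte_carlo_robustness(values, subset_size=4)` returns a mean estimate together with its relative spread across subsets; how small that spread must be to count as robust is a judgment the sketch leaves open.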


Comparison of Three Untargeted Data Processing Workflows for Evaluating LC-HRMS Metabolomics Data

Metabolites ◽ 10.3390/metabo10090378 ◽ 2020 ◽ Vol 10 (9) ◽ pp. 378 ◽ Cited By ~ 2

Author(s): Selina Hemmer ◽ Sascha K. Manier ◽ Svenja Fischmann ◽ Folker Westphal ◽ Lea Wagmann ◽ ...

Keyword(s): Data Processing ◽ Open Source ◽ High Resolution Mass Spectrometry ◽ Liver Microsomes ◽ Reversed Phase ◽ Ease Of Use ◽ Untargeted Metabolomics ◽ Metabolomics Data ◽ Metabolomics Study ◽ High Flexibility

The evaluation of liquid chromatography high-resolution mass spectrometry (LC-HRMS) raw data is a crucial step in untargeted metabolomics studies to minimize false positive findings. A variety of commercial or open source software solutions are available for such data processing. This study aims to compare three different data processing workflows (Compound Discoverer 3.1, XCMS Online combined with MetaboAnalyst 4.0, and a manually programmed tool using R) to investigate LC-HRMS data of an untargeted metabolomics study. Simple but highly standardized datasets for evaluation were prepared by incubating pooled human liver microsomes (pHLM) with the synthetic cannabinoid A-CHMINACA. LC-HRMS analysis was performed using normal- and reversed-phase chromatography followed by full scan MS in positive and negative mode. MS/MS spectra of significant features were subsequently recorded in a separate run. The outcome of each workflow was evaluated by its number of significant features, peak shape quality, and the results of the multivariate statistics. Compound Discoverer, as an all-in-one solution, is characterized by its ease of use and therefore seems suitable for simple and small metabolomic studies. The two open source solutions allowed extensive customization but, particularly in the case of R, required advanced programming skills. Nevertheless, both provided high flexibility and may be suitable for more complex studies and questions.


AutoTuner: High Fidelity and Robust Parameter Selection for Metabolomics Data Processing

Analytical Chemistry ◽ 10.1021/acs.analchem.9b04804 ◽ 2020 ◽ Vol 92 (8) ◽ pp. 5724-5732 ◽ Cited By ~ 4

Author(s): Craig McLean ◽ Elizabeth B. Kujawinski

Keyword(s): Data Processing ◽ Parameter Selection ◽ High Fidelity ◽ Metabolomics Data ◽ Robust Parameter ◽ Selection For


MET-IDEA version 2.06; improved efficiency and additional functions for mass spectrometry-based metabolomics data processing

Metabolomics ◽ 10.1007/s11306-012-0397-5 ◽ 2012 ◽ Vol 8 (S1) ◽ pp. 105-110 ◽ Cited By ~ 20

Author(s): Zhentian Lei ◽ Haiquan Li ◽ Junil Chang ◽ Patrick X. Zhao ◽ Lloyd W. Sumner

Keyword(s): Mass Spectrometry ◽ Data Processing ◽ Metabolomics Data


SmartPeak automates targeted and quantitative metabolomics data processing

10.1101/2020.07.14.202002 ◽ 2020

Author(s): Svetlana Kutuzova ◽ Pasquale Colaianni ◽ Hannes Röst ◽ Timo Sachsenberg ◽ Oliver Alka ◽ ...

Keyword(s): Data Processing ◽ List Type ◽ Data Set ◽ Metabolomics Data ◽ Quantitative Metabolomics ◽ Analytical Instruments ◽ Time Alignment ◽ Automated Processing ◽ Retention Time Alignment ◽ Novel Algorithms

Abstract: SmartPeak is an application that encapsulates advanced algorithms to enable fast, accurate, and automated processing of CE-, GC- and LC-MS(/MS) data, and HPLC data for targeted and semi-targeted metabolomics, lipidomics, and fluxomics experiments.

Highlights:
Novel algorithms for retention time alignment, calibration curve fitting, and peak integration
Enables reproducibility by reducing operator bias and ensuring high QC/QA
Automated pipeline for high throughput targeted and/or quantitative metabolomics, lipidomics, and fluxomics data processing from multiple analytical instruments
Manually curated data set of LC-MS/MS, GC-MS, and HPLC integrated peaks for further algorithm development and benchmarking


Speaq 2.0: A Complete Workflow for High-Throughput 1D NMR Spectra Processing And Quantification

10.1101/138503 ◽ 2017

Author(s): Charlie Beirnaert ◽ Pieter Meysman ◽ Trung Nghia Vu ◽ Nina Hermans ◽ Sandra Apers ◽ ...

Keyword(s): Data Analysis ◽ Data Processing ◽ High Throughput ◽ Nmr Spectra ◽ User Interaction ◽ Peak Picking ◽ Commercial Software ◽ Metabolomics Data ◽ Link Type ◽ Nmr Data

Abstract: Nuclear Magnetic Resonance (NMR) spectroscopy is, together with liquid chromatography-mass spectrometry (LC-MS), the most established platform to perform metabolomics. In contrast to LC-MS, however, NMR data is predominantly being processed with commercial software. As a result, its data processing remains tedious and dependent on user interventions. As a follow-up to speaq, a previously released workflow for NMR spectral alignment and quantitation, we present speaq 2.0. This completely revised framework to automatically analyze 1D NMR spectra uses wavelets to efficiently summarize the raw spectra with minimal information loss or user interaction. The tool offers a fast and easy workflow that starts with the common approach of peak-picking, followed by grouping. This yields a matrix consisting of features, samples and peak values that can be conveniently processed either by using included multivariate statistical functions or by using many other recently developed methods for NMR data analysis. speaq 2.0 facilitates robust and high-throughput metabolomics based on 1D NMR but is also compatible with other NMR frameworks or complementary LC-MS workflows. The methods are benchmarked using two publicly available datasets. speaq 2.0 is distributed through the existing speaq R package to provide a complete solution for NMR data processing. The package and the code for the presented case studies are freely available on CRAN (https://cran.r-project.org/package=speaq) and GitHub (https://github.com/beirnaert/speaq).

Author summary: We present speaq 2.0: a user friendly workflow for processing NMR spectra quickly and easily. By limiting the need for user interaction and allowing the construction of workflows by combining R functions, metabolomics data analysis becomes fully reproducible and shareable. Such advances are critical for the future of the metabolomics field as it needs to move towards a fully open-science approach. This is no trivial goal as many researchers are still using black-box commercial software that often requires several manual steps, thus hampering reproducibility. To encourage the shift towards open source, we deliberately made our method usable for anyone with the most basic of R experience, something that is easily acquired. speaq 2.0 allows a stand-alone analysis from spectra to statistical analysis. In addition, the package can be combined with existing tools to improve performance, as it provides a superior peak picking method compared to the standard binning approach.


Data Processing Optimization in Untargeted Metabolomics of Urine Using Voigt Lineshape Model Non-Linear Regression Analysis

Metabolites ◽ 10.3390/metabo11050285 ◽ 2021 ◽ Vol 11 (5) ◽ pp. 285

Author(s): Kristina E. Haslauer ◽ Philippe Schmitt-Kopplin ◽ Silke S. Heinzmann

Keyword(s): Nmr Spectroscopy ◽ Data Processing ◽ Large Scale ◽ Linear Regression Analysis ◽ Untargeted Metabolomics ◽ Metabolomics Data ◽ Non Linear ◽ Data Processing Tool ◽ Processing Optimization ◽ Spectral Libraries

Nuclear magnetic resonance (NMR) spectroscopy is well established for addressing questions in large-scale untargeted metabolomics. Although several approaches in data processing and analysis are available, significant issues remain. NMR spectroscopy of urine generates information-rich but complex spectra in which signals often overlap. Furthermore, slight changes in pH and salt concentrations cause peak shifting, which, in combination with baseline irregularities, introduces uninformative noise in statistical analysis. Within this work, a straightforward data processing tool addresses these problems by applying a non-linear curve fitting model based on the Voigt function line shape and integration of the underlying peak areas. This method allows a rapid untargeted analysis of urine metabolomics datasets without relying on time-consuming 2D-spectra based deconvolution or information from spectral libraries. The approach is validated with spiking experiments and tested on a human urine 1H dataset in comparison to conventionally used methods. It aims to facilitate metabolomics data analysis.
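To make the lineshape idea concrete, the Python sketch below evaluates a single peak and integrates its area numerically. It is an illustration under stated assumptions, not the authors' method: it swaps in the pseudo-Voigt approximation (a weighted sum of a Lorentzian and a Gaussian of equal width) for the true Voigt convolution, it performs no curve fitting, and the names `pseudo_voigt` and `peak_area` as well as all parameter values are hypothetical.

```python
import math


def pseudo_voigt(x, center, fwhm, amplitude, eta):
    """Pseudo-Voigt lineshape: eta * Lorentzian + (1 - eta) * Gaussian.

    Both components share the same full width at half maximum (fwhm) and
    are normalised to unit area, so `amplitude` equals the total peak area.
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # Gaussian std dev
    gamma = fwhm / 2.0  # Lorentzian half width at half maximum
    dx = x - center
    gauss = math.exp(-0.5 * (dx / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    lorentz = gamma / (math.pi * (dx * dx + gamma * gamma))
    return amplitude * (eta * lorentz + (1.0 - eta) * gauss)


def peak_area(center, fwhm, amplitude, eta, half_widths=1000, points_per_fwhm=50):
    """Integrate the peak with the trapezoidal rule over
    center +/- half_widths * fwhm."""
    step = fwhm / points_per_fwhm
    n = 2 * half_widths * points_per_fwhm  # number of intervals
    start = center - half_widths * fwhm
    ys = [pseudo_voigt(start + i * step, center, fwhm, amplitude, eta)
          for i in range(n + 1)]
    return step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
```

In a fitting workflow, the parameters `center`, `fwhm`, `amplitude` and `eta` would be adjusted by non-linear least squares until the model matches the measured spectrum, and the fitted area would then serve as the quantitative feature. The wide integration window is needed because the Lorentzian component has slowly decaying tails.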
