MetCirc: navigating mass spectral similarity in high-resolution MS/MS metabolomics data

2017 ◽  
Vol 33 (15) ◽  
pp. 2419-2420 ◽  
Author(s):  
Thomas Naake ◽  
Emmanuel Gaquerel
2015 ◽  
Vol 377 ◽  
pp. 719-727 ◽  
Author(s):  
Neha Garg ◽  
Clifford A. Kapono ◽  
Yan Wei Lim ◽  
Nobuhiro Koyama ◽  
Mark J.A. Vermeij ◽  
...  

2021 ◽  
Author(s):  
Florian Huber ◽  
Sven van der Burg ◽  
Justin J.J. van der Hooft ◽  
Lars Ridder

Mass spectrometry data is one of the key sources of information in many workflows in medicine and across the life sciences. Mass fragmentation spectra are considered characteristic signatures of the chemical compound they originate from, yet the chemical structure itself usually cannot be easily deduced from the spectrum. Often, spectral similarity measures are used as a proxy for structural similarity, but this approach is strongly limited by a generally poor correlation between the two. Here, we propose MS2DeepScore: a novel Siamese neural network to predict the structural similarity between two chemical structures solely based on their MS/MS fragmentation spectra. Using a cleaned dataset of >100,000 mass spectra of about 15,000 unique known compounds, MS2DeepScore learns to predict structural similarity scores for spectrum pairs with high accuracy. In addition, sampling different model varieties through Monte-Carlo Dropout is used to further improve the predictions and assess the model's prediction uncertainty. On 3,600 spectra of 500 unseen compounds, MS2DeepScore is able to identify highly reliable structural matches and predicts Tanimoto scores with a root mean squared error of about 0.15. The prediction uncertainty estimate can be used to select a subset of predictions with a root mean squared error of about 0.1. We demonstrate that MS2DeepScore outperforms classical spectral similarity measures in retrieving chemically related compound pairs from large mass spectral datasets, thereby illustrating its potential for spectral library matching. Finally, MS2DeepScore can also be used to create chemically meaningful mass spectral embeddings that could be used to cluster large numbers of spectra. Added to the recently introduced unsupervised Spec2Vec metric, we believe that machine learning-supported mass spectral similarity metrics have great potential for a range of metabolomics data processing pipelines.
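The two quantities this abstract reports, Tanimoto scores and root mean squared error, can be illustrated with a minimal sketch. It assumes molecular fingerprints are represented as sets of on-bit indices; the function names `tanimoto` and `rmse` are illustrative and are not taken from the MS2DeepScore code base.

```python
import math


def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rmse(predicted, reference) -> float:
    """Root mean squared error between predicted and reference scores."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical fingerprints: two of four distinct bits are shared.
reference_score = tanimoto({1, 2, 3}, {2, 3, 4})
print(reference_score)
# Hypothetical model predictions compared against reference Tanimoto scores.
print(rmse([0.45, 0.90], [reference_score, 1.0]))
```

In the setting the abstract describes, the reference scores would come from the known structures and the predictions from the model applied to spectrum pairs.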


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Florian Huber ◽  
Sven van der Burg ◽  
Justin J. J. van der Hooft ◽  
Lars Ridder

Mass spectrometry data is one of the key sources of information in many workflows in medicine and across the life sciences. Mass fragmentation spectra are generally considered to be characteristic signatures of the chemical compound they originate from, yet the chemical structure itself usually cannot be easily deduced from the spectrum. Often, spectral similarity measures are used as a proxy for structural similarity, but this approach is strongly limited by a generally poor correlation between the two. Here, we propose MS2DeepScore: a novel Siamese neural network to predict the structural similarity between two chemical structures solely based on their MS/MS fragmentation spectra. Using a cleaned dataset of >100,000 mass spectra of about 15,000 unique known compounds, we trained MS2DeepScore to predict structural similarity scores for spectrum pairs with high accuracy. In addition, sampling different model varieties through Monte-Carlo Dropout is used to further improve the predictions and assess the model’s prediction uncertainty. On 3600 spectra of 500 unseen compounds, MS2DeepScore is able to identify highly reliable structural matches and to predict Tanimoto scores for pairs of molecules based on their fragment spectra with a root mean squared error of about 0.15. Furthermore, the prediction uncertainty estimate can be used to select a subset of predictions with a root mean squared error of about 0.1. We also demonstrate that MS2DeepScore outperforms classical spectral similarity measures in retrieving chemically related compound pairs from large mass spectral datasets, thereby illustrating its potential for spectral library matching. Finally, MS2DeepScore can also be used to create chemically meaningful mass spectral embeddings that could be used to cluster large numbers of spectra. Added to the recently introduced unsupervised Spec2Vec metric, we believe that machine learning-supported mass spectral similarity measures have great potential for a range of metabolomics data processing pipelines.
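The idea of using repeated Monte-Carlo Dropout passes to keep only low-uncertainty predictions could look roughly like the sketch below. It assumes each spectrum pair already has a list of scores from several stochastic forward passes. Using the median as the point estimate and the population standard deviation as the spread is an assumption made for illustration, not necessarily the aggregation the authors use, and the helper names and threshold are hypothetical.

```python
from statistics import median, pstdev


def summarize_passes(samples):
    """Return (point estimate, spread) for one pair's repeated dropout predictions."""
    return median(samples), pstdev(samples)


def select_confident(pair_samples, max_spread):
    """Keep only pairs whose repeated predictions agree to within max_spread."""
    kept = {}
    for pair, samples in pair_samples.items():
        score, spread = summarize_passes(samples)
        if spread <= max_spread:
            kept[pair] = score
    return kept


# Hypothetical scores from three stochastic passes per spectrum pair.
pair_samples = {
    ("spec_a", "spec_b"): [0.80, 0.82, 0.81],  # passes agree closely
    ("spec_a", "spec_c"): [0.20, 0.60, 0.90],  # passes disagree strongly
}
print(select_confident(pair_samples, max_spread=0.05))
```

Tightening `max_spread` trades coverage for reliability, which mirrors the abstract's point that a filtered subset of predictions reaches a lower error.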


2015 ◽  
Vol 10 (9) ◽  
pp. 1934578X1501000 ◽  
Author(s):  
Venkata Sai Prakash Chaturvedula ◽  
Srinivasa Rao Meneni

A systematic phytochemical study of the commercial extract of Luo Han Guo (Siraitia grosvenorii) resulted in the isolation of an additional new minor cucurbitane glycoside, mogroside V A1 (1). The structure of the new compound was characterized on the basis of 1D (1H and 13C NMR) and 2D (COSY, HMQC, HMBC and NOESY) NMR and high-resolution mass spectral (HRMS) data, as well as hydrolysis studies.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mihir Mongia ◽  
Hosein Mohimani

Various studies have shown associations between molecular features and phenotypes of biological samples. These studies, however, focus on a single phenotype per study and are not applicable to repository-scale metabolomics data. Here we report MetSummarizer, a method for predicting (i) the biological phenotypes of environmental and host-oriented samples, and (ii) the raw ingredient composition of complex mixtures. We show that the aggregation of various metabolomic datasets can improve the accuracy of predictions. Since these datasets have been collected using different standards at various laboratories, it is crucial to detect and discard standard-specific features during the classification step in order to get unbiased results. We further report high accuracy in prediction of the raw ingredient composition of complex foods from the Global Foodomics Project.


1973 ◽  
Vol 135 (1) ◽  
pp. 133-143 ◽  
Author(s):  
Hans J. Förster ◽  
Klaus Biemann ◽  
W. Geoffrey Haigh ◽  
Neil H. Tattrie ◽  
J. Ross Colvin

A novel C35 terpene and its monounsaturated analogue were isolated from cultures of Acetobacter xylinum, together with traces of their C36 homologues. These substances were found to be hopane derivatives substituted by a five-carbon chain bearing four vicinal hydroxyl groups. For the parent hydrocarbon the term bacteriohopane is proposed. The elucidation of the structures utilized high-resolution mass spectrometry of the terpenes, degradation to C32 hydrocarbons and detailed mass-spectrometric comparison of these with C32 hydrocarbons synthesized from known pentacyclic triterpenes. High-resolution mass-spectral data of the terpenes are presented. N.m.r. data are in agreement with the proposed structures, which are further supported by the isolation from the same organism of 22-hydroxyhopane and derivative hopene(s).


2020 ◽  
Author(s):  
Xin Hu ◽  
Douglas Walker ◽  
YongLiang Liang ◽  
Matthew Smith ◽  
Michael Orr ◽  
...  

Complementing the genome with an understanding of the human exposome is an important challenge for contemporary science and technology. Tens of thousands of chemicals are used in commerce, yet the cost of targeted environmental chemical analysis limits surveillance to a few hundred known hazards. To overcome limitations that prevent scaling to thousands of chemicals, we developed a single-step express liquid extraction (XLE), gas chromatography high-resolution mass spectrometry (GC-HRMS) analysis and computational pipeline to operationalize the human exposome. We show that the workflow supports quantification of environmental chemicals in small human plasma (200 µL) and tissue (≤ 100 mg) samples. The method also provides high resolution, sensitivity and selectivity for exposome epidemiology of mass spectral features without a priori knowledge of chemical identity. The simplicity of the method can facilitate harmonization of environmental biomonitoring between laboratories and enable population-level human exposome research with limited sample volume.

