MEMO: Mass Spectrometry-based Sample Vectorization to Explore Chemodiverse Datasets

2021 ◽  
Author(s):  
Arnaud Gaudry ◽  
Florian Huber ◽  
Louis-Felix Nothias ◽  
Sylvian Cretton ◽  
Marcel Kaiser ◽  
...  

In natural products research, chemodiverse extracts from multiple organisms are explored for novel bioactive molecules, sometimes over extended periods. Samples are usually analyzed by liquid chromatography coupled with fragmentation mass spectrometry to acquire informative mass spectral ensembles. Such data are then exploited to establish relationships among analytes or samples (e.g. via molecular networking) and to annotate metabolites. However, comparing samples profiled in different batches is challenging with current metabolomics methods: experimental variation, such as changes in chromatographic or mass spectrometric conditions, often hinders direct comparison of the profiled samples. Here we introduce MEMO (MS2 BasEd SaMple VectOrization), a method that clusters large numbers of chemodiverse samples based on their LC-MS/MS profiles in a retention-time-agnostic manner. This method is particularly suited to heterogeneous and chemodiverse sample sets. MEMO demonstrated clustering performance similar to that of state-of-the-art metrics that take fragmentation spectra into account. More importantly, this performance was achieved without a prior feature alignment step and in a significantly shorter computational time. MEMO thus allows the comparison of vast ensembles of samples, even when analyzed over long periods of time and on different chromatographic or mass spectrometry platforms. This new addition to the computational metabolomics toolbox should drastically expand the scope of large-scale comparative analysis.
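
The abstract does not spell out how a sample becomes a vector, so the following is only a minimal sketch of one retention-time-agnostic approach. It assumes that each sample is reduced to a bag of rounded fragment peaks and neutral losses, counted once per MS2 spectrum, and that samples are compared by cosine distance. The function names, the `peak@`/`loss@` word format and the rounding precision are hypothetical and are not taken from the MEMO software.

```python
from collections import Counter
from math import sqrt


def vectorize_sample(spectra, decimals=1):
    """Reduce one sample to a bag of MS2 'words' (illustrative only).

    spectra: list of (precursor_mz, [fragment_mz, ...]) tuples.
    Retention time is never used, so the result does not depend on
    chromatographic alignment between batches.
    """
    counts = Counter()
    for precursor_mz, fragments in spectra:
        words = set()
        for mz in fragments:
            words.add(f"peak@{round(mz, decimals)}")
            loss = precursor_mz - mz
            if loss > 0:
                words.add(f"loss@{round(loss, decimals)}")
        # Each word is counted at most once per spectrum.
        counts.update(words)
    return counts


def cosine_distance(vec_a, vec_b):
    """Cosine distance between two sparse count vectors."""
    shared = set(vec_a) & set(vec_b)
    dot = sum(vec_a[w] * vec_b[w] for w in shared)
    norm_a = sqrt(sum(v * v for v in vec_a.values()))
    norm_b = sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


# Toy samples with made-up m/z values.
sample_a = [(300.0, [100.02, 150.04]), (450.0, [100.01, 200.0])]
sample_b = [(500.0, [321.7])]
vec_a = vectorize_sample(sample_a)
vec_b = vectorize_sample(sample_b)
```

Because every sample maps to a vector over the same vocabulary of words, pairwise distances can be computed directly, with no feature alignment step between samples.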

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Olga Permiakova ◽  
Romain Guibert ◽  
Alexandra Kraut ◽  
Thomas Fortin ◽  
Anne-Marie Hesse ◽  
...  

Background: The clustering of data produced by liquid chromatography coupled to mass spectrometry analyses (LC-MS data) has recently gained interest as a way to extract meaningful chemical or biological patterns. However, recent instrumental pipelines deliver data whose size, dimensionality and expected number of clusters are too large to be processed by classical machine learning algorithms, so that most state-of-the-art methods rely on single-pass linkage-based algorithms. Results: We propose a clustering algorithm that solves the powerful but computationally demanding kernel k-means objective function in a scalable way. As a result, it can process LC-MS data in an acceptable time on a multicore machine. To do so, we combine three essential features: a compressive data representation, Nyström approximation and a hierarchical strategy. In addition, we propose new kernels based on optimal transport, which can be interpreted as intuitive similarity measures between chromatographic elution profiles. Conclusions: Our method, referred to as CHICKN, is evaluated on proteomics data produced in our lab, as well as on benchmark data from the literature. From a computational viewpoint, it is particularly efficient on raw LC-MS data. From a data analysis viewpoint, it provides clusters which differ from those resulting from state-of-the-art methods, while achieving similar performance. This highlights the complementarity of algorithms based on different principles for extracting the best from complex LC-MS data.
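
To illustrate how an optimal-transport kernel can act as a similarity measure between elution profiles, here is a minimal sketch. It assumes that both profiles are sampled on the same regular time grid, that the transport cost is the one-dimensional Wasserstein-1 distance, and that the kernel has a Laplacian form. The function names and the `gamma` parameter are illustrative, and this is not necessarily the exact kernel used in CHICKN.

```python
from itertools import accumulate
from math import exp


def wasserstein_1d(profile_a, profile_b, step=1.0):
    """Wasserstein-1 distance between two non-negative elution profiles.

    Both profiles must share one regular time grid with spacing `step`.
    In one dimension the distance equals the area between the two
    cumulative distributions of the normalized profiles.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must be sampled on the same grid")
    total_a, total_b = sum(profile_a), sum(profile_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("profiles must have positive total intensity")
    cdf_a = accumulate(x / total_a for x in profile_a)
    cdf_b = accumulate(x / total_b for x in profile_b)
    return step * sum(abs(ca - cb) for ca, cb in zip(cdf_a, cdf_b))


def transport_kernel(profile_a, profile_b, gamma=1.0, step=1.0):
    """Laplacian-type kernel: 1 for identical shapes, decaying with shift."""
    return exp(-gamma * wasserstein_1d(profile_a, profile_b, step))


# Toy profiles: the same peak shifted by one and by two time steps.
peak = [0.0, 1.0, 0.0, 0.0]
shift_one = [0.0, 0.0, 1.0, 0.0]
shift_two = [0.0, 0.0, 0.0, 1.0]
```

Unlike a pointwise comparison, the transport distance grows with the size of the time shift between two peaks, so the similarity decays gradually as elution profiles drift apart.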


Circulation ◽  
2007 ◽  
Vol 116 (suppl_16) ◽  
Author(s):  
Margaret B Lucitt ◽  
Tom S Price ◽  
Angel Pizarro ◽  
Weichen Wu ◽  
Anastasia Yocum ◽  
...  

Zebrafish is an attractive vertebrate model organism for studies into the molecular mechanisms of cardiovascular development, pathology and pharmacology. Studies into the genetics of protein expression are largely constrained by the availability of specific antibodies. Mass spectrometry-based proteomics methods have the potential to overcome these hurdles. This first requires an accurate characterization of the proteins accessible to targeted quantitative analysis. We applied mass spectrometric proteomic methodology and statistical analysis to create profiles of proteins expressed during zebrafish embryonic development. We detected 1307 proteins from 327,906 peptide sequence identifications at 72 hpf and 120 hpf, with false identification rates of less than 1%, using two-dimensional chromatography-tandem mass spectrometry. Close to two thirds of all detected proteins were derived from hypothetical or predicted gene models or were entirely unannotated. Comparison of protein expression in embryos by two-dimensional gel electrophoresis with differential in-gel analysis (DIGE) revealed that proteins involved in energy production and transcription/translation were relatively more abundant at 72 hpf, consistent with the faster synthesis of cellular proteins during organismal growth. Pathway analysis revealed similar expression at both stages of proteins that relate to calcium, insulin receptor, ERK/MAP kinase, vascular endothelial growth factor and WNT/β-catenin signaling. Similarly, both stages expressed proteins of the complement and coagulation cascades, of GM-CSF, PTEN and sonic hedgehog signaling, and of inflammatory signals. The data are accessible in a fully searchable database (http://bioinf.itmat.upenn.edu/zebrafish) that links protein identifications to existing resources, including the Zebrafish Model Organism Database.
This new resource should facilitate the selection of candidate proteins for targeted quantitation and may refine systematic genetic network analysis in vertebrate development and biology. This is the first large-scale proteome analysis of embryonic zebrafish tissue to reveal previously uncharacterized proteins and detect regulated proteins with relevance for cardiovascular function and development.


Marine Drugs ◽  
2021 ◽  
Vol 19 (3) ◽  
pp. 142 ◽  
Author(s):  
Max Crüsemann

Bacterial natural products possess potent bioactivities and high structural diversity and are typically encoded in biosynthetic gene clusters. Traditional natural product discovery approaches rely on UV- and bioassay-guided fractionation and are limited in terms of dereplication. Recent advances in mass spectrometry, sequencing and bioinformatics have led to a large-scale accumulation of genomic and mass spectral data. These data are increasingly used in signature-based or correlation-based mass spectrometry genome mining approaches, which enable rapid linking of metabolomic and genomic information to accelerate and rationalize natural product discovery. In this mini-review, these approaches are presented and discovery examples are provided. Finally, future opportunities and challenges for paired omics-based natural products discovery workflows are discussed.


2013 ◽  
Vol 8 (6) ◽  
pp. 1934578X1300800
Author(s):  
Vladimir A. Khripach ◽  
Danuše Tarkowská ◽  
Vladimir N. Zhabinskii ◽  
Olga V. Gulyakevich ◽  
Yurii V. Ermolovich ◽  
...  

New analogues of brassinolide biosynthetic precursors with three deuterium atoms at non-exchangeable positions have been synthesized to be used as standards for quantification of natural brassinosteroids by liquid chromatography-mass spectrometry. [26-²H₃](22R,23R,24S)-22,23-Dihydroxy-6β-methoxy-24-methyl-3α,5-cyclo-5α-cholestane was used as a starting material for the preparation of campestane derivatives having a 22R,23R-diol functionality and either a hydroxy or keto group at C-3 and labeled at C-26. The mass spectrometric behavior of the newly synthesized compounds has been studied.


2020 ◽  
Vol 21 (4) ◽  
pp. 1524 ◽  
Author(s):  
Van-An Duong ◽  
Jong-Moon Park ◽  
Hookeun Lee

Proteomics is the large-scale study of proteins, aiming at the description and characterization of all expressed proteins in biological systems. The expressed proteins are typically highly complex and span a wide abundance range. To achieve the high accuracy and sensitivity required for proteome analysis, hybrid platforms combining multidimensional (MD) separations and mass spectrometry have provided the most powerful solution. Multidimensional separations provide enhanced peak capacity and reduce sample complexity, which enables mass spectrometry to analyze more proteins with high sensitivity. Although two-dimensional (2D) separations have been widely used since the early period of proteomics, three-dimensional (3D) separation was rarely used because of the low reproducibility of separation and the increased analysis time in mass spectrometry. With the development of novel microscale techniques such as nano-UPLC and improvements in mass spectrometry, 3D separation has become a reliable and practical option. This review summarizes existing offline and online 3D-LC platforms developed for proteomics and their applications. In detail, setups and implementation of those systems as well as their advances are outlined. The performance of those platforms is also discussed and compared with the state-of-the-art 2D-LC. In addition, we provide some perspectives on the future developments and applications of 3D-LC in proteomics.
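
As a rough illustration of why adding a separation dimension enhances peak capacity, the idealized product rule for fully orthogonal dimensions can be written as follows. The symbols and the numbers in the example are illustrative assumptions, not values from the review, and real systems fall short of this upper bound because the dimensions are never perfectly orthogonal.

```latex
% Idealized peak capacity of a three-dimensional separation, assuming
% fully orthogonal dimensions with individual peak capacities n_1, n_2, n_3.
\[
  n_{\mathrm{3D}} \approx n_1 \, n_2 \, n_3
\]
% Hypothetical example: n_1 = 10, n_2 = 10, n_3 = 200 gives
% n_{3D} \approx 20\,000, versus n_2 n_3 = 2000 for the 2D system alone.
```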


2019 ◽  
Author(s):  
Liqun Cao ◽  
Jinzhe Zeng ◽  
Mingyuan Xu ◽  
Chih-Hao Chin ◽  
Tong Zhu ◽  
...  

Combustion is an important class of reactions that affects people's daily lives and the development of aerospace. Exploring the reaction mechanism contributes to the understanding of combustion and to the more efficient use of fuels. Ab initio quantum mechanical (QM) calculation is precise but limited by its computational time for large-scale systems. In order to carry out reactive molecular dynamics (MD) simulation of combustion accurately and quickly, we develop the MFCC-combustion method in this study, which calculates the interaction between atoms using a QM method at the MN15/6-31G(d) level. Each molecule in the system is treated as a fragment, and when the distance between any two atoms in different molecules is less than 3.5 Å, a new fragment involving the two molecules is produced in order to account for the two-body interaction. The deviations of MFCC-combustion from full-system calculations are within a few kcal/mol, and the results clearly show that the energies calculated with MFCC-combustion for the different systems are close to convergence once the distance threshold for the two-body QM interactions exceeds 3.5 Å. Methane combustion was studied with the MFCC-combustion method to explore the combustion mechanism of the methane-oxygen system.
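
The fragmentation rule can be sketched as follows. This is only an illustration under stated assumptions: a two-molecule fragment is formed for molecules whose closest atoms lie within the 3.5 Å cutoff, and the total energy follows a standard two-body expansion (monomer energies plus pair corrections). The abstract does not give the energy expression, and `monomer_energy` and `dimer_energy` are hypothetical callables standing in for the QM calculations.

```python
from itertools import combinations
from math import dist

CUTOFF_ANGSTROM = 3.5  # distance threshold quoted in the abstract


def min_distance(mol_a, mol_b):
    """Smallest interatomic distance between two molecules.

    Each molecule is a list of (x, y, z) coordinates in angstroms.
    """
    return min(dist(p, q) for p in mol_a for q in mol_b)


def two_body_pairs(molecules, cutoff=CUTOFF_ANGSTROM):
    """Index pairs of molecules close enough to form a two-molecule fragment."""
    return [
        (i, j)
        for i, j in combinations(range(len(molecules)), 2)
        if min_distance(molecules[i], molecules[j]) < cutoff
    ]


def fragment_energy(molecules, monomer_energy, dimer_energy,
                    cutoff=CUTOFF_ANGSTROM):
    """Two-body expansion: sum of monomers plus pair interaction terms."""
    singles = [monomer_energy(mol) for mol in molecules]
    total = sum(singles)
    for i, j in two_body_pairs(molecules, cutoff):
        pair = dimer_energy(molecules[i], molecules[j])
        total += pair - singles[i] - singles[j]
    return total


# Toy system: three single-atom "molecules" with made-up energies.
toy = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)], [(10.0, 0.0, 0.0)]]
toy_total = fragment_energy(
    toy,
    monomer_energy=lambda mol: -1.0,
    dimer_energy=lambda a, b: -2.5,
)
```

Only pairs within the cutoff trigger a dimer calculation, so the number of QM evaluations grows with the number of close contacts rather than with the number of all molecule pairs.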


2018 ◽  
Author(s):  
Gilian T. Thomas ◽  
Landon MacGillivray ◽  
Natalie L. Dean ◽  
Rhonda L. Stoddard ◽  
Lars Yunker ◽  
...  

Reactions carried out in the presence of rubber septa run the risk of additives being leached out by the solvent. Normally, such species are present at low enough levels that they do not interfere with the reaction significantly. However, when studying reactions using sensitive methods such as mass spectrometry, the appearance of even trace amounts of material can confuse dynamic analyses of reactions. A wide variety of additives are present in rubber along with the polymer: antioxidants, dyes, detergents, and vulcanization agents, and these are all especially problematic in negative ion mode. A redesigned Schlenk flask for pressurized sample infusion (PSI) is presented as a means of practically eliminating the presence of contaminants during reaction analyses.


2020 ◽  
Vol 86 (7) ◽  
pp. 12-19
Author(s):  
I. V. Plyushchenko ◽  
D. G. Shakhmatov ◽  
I. A. Rodin

The rapid development of statistical data processing, computing capabilities, chromatography-mass spectrometry, and omics technologies (technologies based on the achievements of genomics, transcriptomics, proteomics and metabolomics) in recent decades has not led to the formation of a unified protocol for untargeted profiling. Systematic errors reduce the reproducibility and reliability of the results obtained and, at the same time, hinder the consolidation and analysis of data gathered in large-scale multi-day experiments. We propose an algorithm for conducting omics profiling to identify potential markers in samples of complex composition and present a case study of urine samples obtained from different clinical groups of patients. Profiling was carried out by liquid chromatography-mass spectrometry. The markers were selected using multivariate analysis methods, including machine learning and feature selection. The approach was tested on an independent dataset by clustering and projection onto principal components.

