MetPC: Metabolite Pipeline Consisting of Metabolite Identification and Biomarker Discovery Under the Control of Two-Dimensional FDR

Metabolites ◽  
2019 ◽  
Vol 9 (5) ◽  
pp. 103
Author(s):  
Jaehwi Kim ◽  
Jaesik Jeong

Due to the complex features of metabolomics data, the development of a unified platform covering everything from preprocessing to data analysis has been in high demand over the last few decades. Thus, we developed a new bioinformatics tool that includes several preprocessing steps and a biomarker discovery procedure. For metabolite identification, we consider a hierarchical statistical model coupled with an Expectation–Maximization (EM) algorithm to handle latent variables. For biomarker metabolite discovery, our procedure controls the two-dimensional false discovery rate (fdr2d) when testing multiple hypotheses simultaneously.
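The abstract does not reproduce the hierarchical model itself, but the role EM plays in handling a latent variable can be sketched with a minimal two-component Gaussian mixture, where the unobserved component label of each observation is the latent variable. All initial values and settings below are illustrative, not the authors' MetPC implementation:

```python
import numpy as np

def em_two_component(x, n_iter=100):
    """Fit a two-component Gaussian mixture by EM; the latent variable is
    the unobserved component label of each observation."""
    mu = np.array([x.min(), x.max()])     # crude but deterministic start
    sigma = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])              # mixing weights
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        dens = w * np.exp(-(x[:, None] - mu) ** 2 / (2 * sigma ** 2)) \
               / (sigma * np.sqrt(2 * np.pi))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the soft assignments
        n_k = r.sum(axis=0)
        w = n_k / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n_k
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)
    return w, mu, sigma, r
```

The E-step computes posterior probabilities over the latent labels given current parameters; the M-step maximizes the expected complete-data likelihood, the same alternation the pipeline's hierarchical model relies on.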

Author(s):  
Florence Anne Castelli ◽  
Giulio Rosati ◽  
Christian Moguet ◽  
Celia Fuentes ◽  
Jose Marrugo-Ramírez ◽  
...  

Abstract Metabolomics refers to the large-scale detection, quantification, and analysis of small molecules (metabolites) in biological media. Although metabolomics, alone or combined with other omics data, has already demonstrated its relevance for patient stratification in research projects and clinical studies, much remains to be done to bring this approach into clinical practice. This is especially true in the perspective of personalized/precision medicine, which aims to stratify patients according to their risk of developing diseases and to tailor medical treatments to individual characteristics in order to improve their efficacy and limit their toxicity. In this review article, we discuss the main analytical-chemistry challenges that need to be addressed to foster the implementation of metabolomics in the clinic and the use of the data produced by this approach in personalized medicine. First, there are well-known issues related to untargeted metabolomics workflows at the levels of data production (lack of standardization), metabolite identification (small proportion of annotated features and identified metabolites), and data processing (from automatic detection of features to multi-omics data integration) that hamper the interoperability and reusability of metabolomics data. Furthermore, the outputs of metabolomics workflows are complex molecular signatures of a few tens of metabolites, often with small abundance variations, obtained with expensive laboratory equipment. It is therefore necessary to simplify these molecular signatures so that they can be produced and used in the field. This last point, which is still poorly addressed by the metabolomics community, may become crucial in the near future with the increased availability of molecular signatures of medical relevance and the growing societal demand for participatory medicine.


Metabolites ◽  
2019 ◽  
Vol 9 (12) ◽  
pp. 308 ◽  
Author(s):  
Julijana Ivanisevic ◽  
Elizabeth J. Want

Untargeted metabolomics (including lipidomics) is a holistic approach to biomarker discovery and mechanistic insights into disease onset and progression, and response to intervention. Each step of the analytical and statistical pipeline is crucial for the generation of high-quality, robust data. Metabolite identification remains the bottleneck in these studies; therefore, confidence in the data produced is paramount in order to maximize the biological output. Here, we outline the key steps of the metabolomics workflow and provide details on important parameters and considerations. Studies should be designed carefully to ensure appropriate statistical power and adequate controls. Subsequent sample handling and preparation should avoid the introduction of bias, which can significantly affect downstream data interpretation. It is not possible to cover the entire metabolome with a single platform; therefore, the analytical platform should reflect the biological sample under investigation and the question(s) under consideration. The large, complex datasets produced need to be pre-processed in order to extract meaningful information. Finally, the most time-consuming steps are metabolite identification, as well as metabolic pathway and network analysis. Here we discuss some widely used tools and the pitfalls of each step of the workflow, with the ultimate aim of guiding the reader towards the most efficient pipeline for their metabolomics studies.


Author(s):  
Partho Sen ◽  
Santosh Lamichhane ◽  
Vivek B Mathema ◽  
Aidan McGlinchey ◽  
Alex M Dickens ◽  
...  

Abstract Deep learning (DL), an emerging area of investigation in the fields of machine learning and artificial intelligence, has advanced markedly in recent years. DL techniques are being applied to assist medical professionals and researchers in improving clinical diagnosis, disease prediction and drug discovery. It is expected that DL will help to provide actionable knowledge from a variety of ‘big data’, including metabolomics data. In this review, we discuss the applicability of DL to metabolomics, presenting and discussing several examples from recent research. We emphasize the use of DL in tackling bottlenecks in metabolomics data acquisition, processing and metabolite identification, as well as in metabolic phenotyping and biomarker discovery. Finally, we discuss how DL is used in genome-scale metabolic modelling and in the interpretation of metabolomics data. The DL-based approaches discussed here may assist computational biologists with integrating, predicting and drawing statistical inferences about biological outcomes based on metabolomics data.


2021 ◽  
Vol 87 (9) ◽  
pp. 615-630
Author(s):  
Longjie Ye ◽  
Ka Zhang ◽  
Wen Xiao ◽  
Yehua Sheng ◽  
Dong Su ◽  
...  

This paper proposes a ground filtering method based on a Gaussian mixture model with hierarchical curvature constraints. First, the thin plate spline function is iteratively applied to interpolate the reference surface. Second, a gradually changing grid size and curvature threshold are used to construct hierarchical constraints. Finally, an adaptive height-difference classifier based on the Gaussian mixture model is proposed. Using the latent variables obtained by the expectation-maximization algorithm, the posterior probability of each point is computed, so that ground and object points can be labeled according to the calculated probability. Fifteen data samples provided by the International Society for Photogrammetry and Remote Sensing are used to verify the proposed method, which is also compared with eight classical filtering algorithms. Experimental results demonstrate that the average total error and average Cohen's kappa coefficient of the proposed method are 6.91% and 80.9%, respectively. In general, the method performs better in areas with terrain discontinuities and bridges.
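The final classification step described above can be sketched as a posterior decision under a two-component mixture of height differences, assuming the ground/object component parameters and mixing weight have already been estimated by EM. The parameter values in the sketch are hypothetical, not those of the paper:

```python
import numpy as np
from scipy.stats import norm

def classify_points(dh, params_ground, params_object, w_ground=0.7):
    """Label height differences dh as ground or object from their posterior
    probability under a two-component Gaussian mixture (illustrative
    parameters; the paper estimates them adaptively via EM)."""
    mu_g, sd_g = params_ground
    mu_o, sd_o = params_object
    p_g = w_ground * norm.pdf(dh, mu_g, sd_g)          # ground likelihood
    p_o = (1 - w_ground) * norm.pdf(dh, mu_o, sd_o)    # object likelihood
    post_ground = p_g / (p_g + p_o)                    # posterior P(ground)
    labels = np.where(post_ground >= 0.5, "ground", "object")
    return post_ground, labels
```

A point close to the interpolated reference surface (small height difference) receives a high ground posterior, while points well above it are marked as objects.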


2018 ◽  
Vol 2018 ◽  
pp. 1-16
Author(s):  
Tao Wu ◽  
Zhenghong Deng ◽  
Qingyue Gu ◽  
Jiwei Xu

We explore the estimation of a two-dimensional (2D) nonsymmetric coherently distributed (CD) source using L-shaped arrays. Compared with a symmetric source, the modeling and estimation of a nonsymmetric source are more practical. A nonsymmetric CD source is established by modeling the deterministic angular signal distribution function as a summation of Gaussian probability density functions. Parameter estimation of the nonsymmetric distributed source is proposed under an expectation-maximization (EM) framework. Each cycle of the proposed EM iteration contains three steps. First, the nominal azimuth angles and nominal elevation angles of the Gaussian components in the nonsymmetric source are obtained from the relationship of rotational invariance matrices. Then, angular spreads can be solved through one-dimensional (1D) searching based on the nominal angles. Finally, the powers of the Gaussian components are obtained by solving least-squares estimators. Simulations are conducted to verify the effectiveness of the nonsymmetric CD model and estimation technique.
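The source model underlying the estimator, an angular distribution built as a weighted sum of Gaussian densities, can be written down directly. The component centers, spreads and powers below are illustrative values, not the paper's simulation settings:

```python
import numpy as np

def angular_distribution(theta, centers, spreads, powers):
    """Nonsymmetric angular signal distribution modeled as a weighted sum of
    Gaussian PDFs: f(theta) = sum_k powers[k] * N(theta; centers[k], spreads[k]^2).
    Asymmetry arises from unequal component powers and spreads."""
    theta = np.asarray(theta, dtype=float)[:, None]
    dens = powers * np.exp(-(theta - centers) ** 2 / (2 * spreads ** 2)) \
           / (spreads * np.sqrt(2 * np.pi))
    return dens.sum(axis=1)
```

Because each component integrates to its power, the distribution integrates to the total source power, which is the quantity recovered by the least-squares step of the EM cycle.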


2016 ◽  
Vol 12 (1) ◽  
pp. 1-7 ◽  
Author(s):  
Ned Kock

Path models with and without latent variables are extensively used in e-collaboration research. Both direct and moderating relationships can be included in such path models. A moderating relationship involves three latent variables: the moderating variable and a pair of variables connected through a direct link. This paper discusses the visualization of moderating relationships through two-dimensional and three-dimensional graphs. The software WarpPLS version 5.0 is used in this discussion, since it provides an extensive set of graphs that can be used to visualize moderating effects.
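WarpPLS itself is commercial software, but the quantity its moderation graphs display, the slope of the direct link evaluated at low and high values of the moderator, can be sketched with an ordinary interaction regression. The coefficients and data below are synthetic and illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 400
x = rng.normal(size=n)            # predictor in the direct link
m = rng.normal(size=n)            # moderating variable
# True model with an interaction (moderation) effect of 0.8
y = 1.0 + 0.5 * x + 0.3 * m + 0.8 * x * m + rng.normal(scale=0.1, size=n)

# Fit y = b0 + b1*x + b2*m + b3*(x*m) by least squares
X = np.column_stack([np.ones(n), x, m, x * m])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Simple slopes of y on x at -1 SD and +1 SD of the moderator:
# these are the two lines a 2-D moderation graph plots
slope_low = b[1] + b[3] * (-m.std())
slope_high = b[1] + b[3] * m.std()
```

A three-dimensional graph generalizes this by plotting the fitted surface y(x, m) rather than two slices of it.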

