Towards a metagenomics machine learning interpretable model for understanding the transition from adenoma to colorectal cancer

2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Carlos S. Casimiro-Soriguer ◽  
Carlos Loucera ◽  
María Peña-Chilet ◽  
Joaquin Dopazo

Abstract The gut microbiome is gaining interest because of its links with several diseases, including colorectal cancer (CRC), and because of its potential as a source of non-intrusive predictive disease biomarkers. Here we performed a meta-analysis of 1042 fecal metagenomic samples from seven publicly available studies. We used an interpretable machine learning approach based on functional profiles, instead of the conventional taxonomic profiles, to produce a highly accurate predictor of CRC with better precision than previous proposals. Moreover, this approach can also discriminate samples with adenoma, which makes it very promising for CRC prevention, because it detects early stages in which intervention is easier and more effective. In addition, interpretable machine learning methods allow the extraction of features relevant for the classification, which reveals basic molecular mechanisms accounting for the changes undergone by the microbiome functional landscape in the transition from healthy gut to adenoma and CRC. Functional profiles have demonstrated higher accuracy than taxonomic profiles in predicting CRC and adenoma and, in a context of explainable machine learning, additionally provide useful hints on the molecular mechanisms operating in the microbiota behind these conditions.
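The idea of extracting classification-relevant features from an interpretable model can be illustrated with a minimal sketch. The abstract does not name the model or the feature set, so everything here is an assumption made for illustration: the data are synthetic, the classifier is a plain logistic regression (swapped in because its weights are directly readable), and the names `make_synthetic_profiles`, `fit_logistic`, and `rank_features` are hypothetical.

```python
import math
import random


def make_synthetic_profiles(n_samples=200, n_features=5, signal_feature=2, seed=0):
    """Build toy 'functional profile' rows (synthetic, not real metagenomic data).

    Only `signal_feature` is shifted in the case class, so a useful
    interpretable model should rank it first.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # 0 = control, 1 = case (hypothetical labels)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        if label == 1:
            row[signal_feature] += 2.0  # planted class difference
        X.append(row)
        y.append(label)
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit logistic regression by full-batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - label
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rank_features(w):
    """Order feature indices by absolute weight (features share one scale here)."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


X, y = make_synthetic_profiles()
weights, bias = fit_logistic(X, y)
ranking = rank_features(weights)
print("features ranked by |weight|:", ranking)
```

Ranking by absolute weight is only meaningful when the features are on a comparable scale, as they are in this toy data; real abundance profiles would need normalization first.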

Author(s):  
Carlos S. Casimiro-Soriguer ◽  
Carlos Loucera ◽  
María Peña-Chilet ◽  
Joaquin Dopazo

Abstract Background: The gut microbiome is gaining interest because of its links with several diseases, including colorectal cancer (CRC). Results: Here we performed a meta-analysis of 851 fecal metagenomic samples from five publicly available studies. We used an interpretable machine learning approach based on functional profiles, instead of the conventional taxonomic profiles, to produce a highly accurate predictor of CRC with better precision than previous proposals. Moreover, this approach can also discriminate samples with adenoma, which makes it very promising for CRC prevention, because it detects early stages in which intervention is easier and more effective. In addition, interpretable machine learning methods allow the extraction of features relevant for the classification, which reveals basic molecular mechanisms accounting for the changes undergone by the microbiome functional landscape in the transition from healthy gut to adenoma and CRC. Conclusion: Functional profiles provide higher accuracy than taxonomic profiles in predicting CRC and adenoma and, in a context of explainable machine learning, additionally provide useful hints on the molecular mechanisms operating in the microbiota behind these conditions.


2021 ◽  
Vol 11 (1) ◽  
pp. 133-152
Author(s):  
Devesh Singh

Abstract To advance interpretable machine learning (IML), this research proposes local interpretable model-agnostic explanations (LIME) as a visualization technique for analyzing foreign direct investment (FDI) inflow in a novel, informative way. The article examines the determinants of FDI inflow in Hungary through IML with a supervised learning method, using the open-source artificial intelligence platform H2O. The author used three ML algorithms, namely a general linear model (GLM), a gradient boosting machine (GBM), and a random forest (RF) classifier, to analyze FDI inflow from 2001 to 2018. The results show that, among the three classifiers, GBM performs best in analyzing FDI inflow determinants. The variable "value of production in a region" is the most influential determinant of FDI inflow in Hungarian regions. Explanatory visualizations of the analyzed dataset are presented, which supports their use in decision-making.
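The core idea behind LIME, explaining one prediction by fitting a simple surrogate model in its neighborhood, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the `lime` package's implementation and not the H2O workflow used in the article: the black-box model `toy_black_box` is invented, the perturbations are Gaussian, and the helper names `solve_linear_system` and `explain_locally` are hypothetical.

```python
import math
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def explain_locally(predict, instance, n_samples=500, scale=0.5,
                    kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around `instance`.

    Returns [intercept, coef_0, coef_1, ...]; each coefficient approximates
    the local effect of one feature on the black-box prediction.
    """
    rng = random.Random(seed)
    d = len(instance)
    # Accumulate the weighted normal equations (Z^T W Z) beta = Z^T W y.
    A = [[0.0] * (d + 1) for _ in range(d + 1)]
    rhs = [0.0] * (d + 1)
    for _ in range(n_samples):
        delta = [rng.gauss(0.0, scale) for _ in range(d)]
        z = [xi + di for xi, di in zip(instance, delta)]
        # Exponential kernel: nearby perturbations get larger weights.
        weight = math.exp(-sum(di * di for di in delta) / kernel_width ** 2)
        feats = [1.0] + delta  # features centered on the explained instance
        target = predict(z)
        for i in range(d + 1):
            rhs[i] += weight * feats[i] * target
            for j in range(d + 1):
                A[i][j] += weight * feats[i] * feats[j]
    return solve_linear_system(A, rhs)


def toy_black_box(x):
    """Invented nonlinear model standing in for a trained classifier."""
    return 1.0 / (1.0 + math.exp(-(2.0 * x[0] - 3.0 * x[1])))


local = explain_locally(toy_black_box, [0.0, 0.0])
print("intercept and local coefficients:", local)
```

The sign and size of each local coefficient indicate how the corresponding feature pushes this one prediction, which is the kind of information an explanatory visualization would then display.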


2021 ◽  
Vol 108 (Supplement_3) ◽  
Author(s):  
J Bote ◽  
J F Ortega-Morán ◽  
C L Saratxaga ◽  
B Pagador ◽  
A Picón ◽  
...  

Abstract INTRODUCTION Clinicians demand new non-invasive technologies for improving early diagnosis of colorectal cancer (CRC). Optical Coherence Tomography (OCT) provides sub-surface structural information and offers diagnostic capabilities for colon polyps, further improved by machine learning methods. Databases of OCT images are necessary to facilitate algorithm development and testing. MATERIALS AND METHODS A database has been acquired from rat colonic samples with a Thorlabs OCT system with a 930 nm centre wavelength that provides a 1.2 kHz A-scan rate, 7 μm axial resolution in air, 4 μm lateral resolution, 1.7 mm imaging depth in air, a 6 mm × 6 mm FOV, and 107 dB sensitivity. The colon was excised from anaesthetised animals, and samples were extracted and preserved for ex-vivo analysis with the OCT equipment. RESULTS This database consists of OCT 3D volumes (C-scans) and 2D images (B-scans) of murine samples from: 1) healthy tissue, for ground-truth comparison (18 samples; 66 C-scans; 17,478 B-scans); 2) hyperplastic polyps, obtained from an induced colorectal hyperplastic murine model (47 samples; 153 C-scans; 42,450 B-scans); 3) neoplastic polyps (adenomatous and adenocarcinomatous), obtained from the clinically validated Pirc F344/NTac-Apcam1137 rat model (232 samples; 564 C-scans; 158,557 B-scans); and 4) unknown tissue (polyp adjacent, presumably healthy) (98 samples; 157 C-scans; 42,070 B-scans). CONCLUSIONS A novel, extensive ex-vivo OCT database of a murine CRC model has been obtained and will be openly published for the research community. It can be used for classification/segmentation machine learning methods, for correlating OCT features with histopathological structures, and for developing new non-invasive in-situ methods for diagnosing colorectal cancer.
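The per-class counts reported above can be tallied to see the overall size of the database and how unevenly the classes are represented, which matters when training a classifier on it. The totals and the class share below are derived here by simple addition and are not stated in the abstract; the dictionary name `database` is an assumption for illustration.

```python
# (samples, C-scans, B-scans) per tissue class, copied from the abstract.
database = {
    "healthy": (18, 66, 17_478),
    "hyperplastic polyps": (47, 153, 42_450),
    "neoplastic polyps": (232, 564, 158_557),
    "unknown (polyp adjacent)": (98, 157, 42_070),
}

# Column-wise sums over the four classes.
total_samples, total_c_scans, total_b_scans = (
    sum(column) for column in zip(*database.values())
)

# Fraction of all B-scans that come from neoplastic polyps.
neoplastic_share = database["neoplastic polyps"][2] / total_b_scans

print(total_samples, total_c_scans, total_b_scans)
print(f"neoplastic share of B-scans: {neoplastic_share:.1%}")
```

Because neoplastic polyps contribute the majority of B-scans, a classifier trained on the raw images would likely need class weighting or resampling.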


Molecules ◽  
2019 ◽  
Vol 24 (11) ◽  
pp. 2097 ◽  
Author(s):  
Ambrose Plante ◽  
Derek M. Shore ◽  
Giulia Morra ◽  
George Khelashvili ◽  
Harel Weinstein

G protein-coupled receptors (GPCRs) play a key role in many cellular signaling mechanisms, and must select among multiple coupling possibilities in a ligand-specific manner in order to carry out a myriad of functions in diverse cellular contexts. Much has been learned about the molecular mechanisms of ligand-GPCR complexes from Molecular Dynamics (MD) simulations. However, exploring ligand-specific differences in the response of a GPCR to diverse ligands, as is required to understand ligand bias and functional selectivity, necessitates generating very large amounts of data from large-scale simulations. The high-dimensional analysis of the accumulated trajectories then becomes a Big Data problem. Here we describe a new machine learning (ML) approach to the problem that is based on transforming the analysis of GPCR function-related, ligand-specific differences encoded in the MD simulation trajectories into a representation recognizable by state-of-the-art deep learning object recognition technology. We illustrate this method by applying it to recognize the pharmacological classification of ligands bound to the 5-HT2A and D2 subtypes of class-A GPCRs from the serotonin and dopamine families. The ML-based approach is shown to perform the classification task with high accuracy, and we identify the molecular determinants of the classifications in the context of GPCR structure and function. This study builds a framework for the efficient computational analysis of MD Big Data collected for the purpose of understanding ligand-specific GPCR activity.

