INFLAMMATORY BOWEL DISEASE CLASSIFICATION USING THE GUT MICROBIOME: A BENCHMARK OF MICROBIAL DATA ANALYSIS METHODS

2021 ◽  
Vol 27 (Supplement_1) ◽  
pp. S40-S40
Author(s):  
Ryszard Kubinski ◽  
Jean Djamen ◽  
Timur Zhanabaev ◽  
Ryan Martin

Abstract: The prevalence of inflammatory bowel disease (IBD) is increasing throughout the developed world. For the newly diagnosed, the interval between the appearance of symptoms and diagnosis can be months and can involve invasive procedures. There is an urgent need to develop a simple, low-cost, accurate and non-invasive diagnostic test. With decreasing costs of next-generation sequencing, many studies have compared IBD gut microbiomes to those of healthy controls, successfully identifying bacterial biomarkers for IBD. Unfortunately, a majority of these studies apply machine learning and statistical methods to either single or small-sample-size datasets. The resulting disease classification models overfit heavily and therefore have minimal clinical application to new patient cohorts. Several data preprocessing methods are available for data normalization and reduction of cohort-specific signals (batch reduction), which can address this lack of cross-dataset performance. With an abundance of potential methods, there is a need to benchmark the performance and generalizability of various machine learning pipelines (combinations of data preprocessing and model) for microbiome-based IBD diagnostic tools. We used a collection of 12 IBD-associated North American microbiome datasets (~4000 samples) to benchmark several machine learning pipelines. Raw sequencing data were processed, collapsed at the OTU or genus level and merged using QIIME2. Datasets were then normalized using either sum-scaling or log-based methods, and batch reduction was performed using either zero-centering or Empirical Bayes’ approaches. Performance of pipelines was evaluated using binary accuracy, AUC, F1 score and MCC. Generalizability of pipelines was evaluated using leave-one-dataset-out cross-validation, in which data from one study were held out of the training set and used for testing.
The best-performing and most generalizable pipeline combined a Random Forest model with centered log-ratio normalization and Empirical Bayes’ batch reduction. This combination, along with others, performed as well as or better than more complex models involving deep neural networks (DNNs). In addition to benchmarking our pipelines, we also explore their limitations, such as the reliance of zero-centered batch reduction on balanced input data and the tendency of Empirical Bayes’ based methods to introduce artificial signals into data, which indicates that certain methods are poor tools for clinical use. To our knowledge, this is the first comprehensive benchmark of data preprocessing and machine learning methods for microbiome-based disease classification of IBD. These findings will help improve the generalizability of machine learning models as we move towards non-invasive diagnostic and disease management tools for patients with IBD.
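
The two preprocessing and evaluation ideas named in this abstract, centered log-ratio normalization and holding out one study at a time, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names `clr` and `leave_one_study_out`, the pseudocount of 1.0 used to avoid taking the log of zero counts, and the toy data are all assumptions made for the example.

```python
import math
from typing import Iterator, List, Sequence, Tuple


def clr(counts: Sequence[float], pseudocount: float = 1.0) -> List[float]:
    """Centered log-ratio transform of one sample's taxon counts.

    Each value becomes log(count) minus the mean of the logs, i.e. the log
    of the count divided by the sample's geometric mean. The pseudocount is
    an assumption of this sketch; it keeps zero counts finite.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def leave_one_study_out(
    study_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_study, train_indices, test_indices) for each study.

    Every sample from the held-out study goes to the test set, so the model
    is always evaluated on a cohort it has never seen.
    """
    for held_out in sorted(set(study_ids)):
        train = [i for i, s in enumerate(study_ids) if s != held_out]
        test = [i for i, s in enumerate(study_ids) if s == held_out]
        yield held_out, train, test


if __name__ == "__main__":
    # Hypothetical toy data: four samples drawn from three studies.
    samples = [[0, 9, 99], [5, 5, 5], [1, 20, 3], [7, 0, 2]]
    studies = ["study_A", "study_A", "study_B", "study_C"]

    normalized = [clr(sample) for sample in samples]
    for held_out, train_idx, test_idx in leave_one_study_out(studies):
        print(held_out, "train:", train_idx, "test:", test_idx)
```

Splitting by study rather than by sample is what makes the estimate a test of cross-cohort generalizability: a random sample-level split would let cohort-specific signal leak from training into testing.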

Biosensors ◽  
2019 ◽  
Vol 9 (2) ◽  
pp. 55 ◽  
Author(s):  
Tiele ◽  
Wicaksono ◽  
Kansara ◽  
Arasaradnam ◽  
Covington

Early diagnosis of inflammatory bowel disease (IBD), including Crohn’s disease (CD) and ulcerative colitis (UC), remains a clinical challenge with current tests being invasive and costly. The analysis of volatile organic compounds (VOCs) in exhaled breath and biomarkers in stool (faecal calprotectin (FCP)) show increasing potential as non-invasive diagnostic tools. The aim of this pilot study is to evaluate the efficacy of breath analysis and determine if FCP can be used as an additional non-invasive parameter to supplement breath results, for the diagnosis of IBD. Thirty-nine subjects were recruited (14 CD, 16 UC, 9 controls). Breath samples were analysed using an in-house built electronic nose (Wolf eNose) and commercial gas chromatograph–ion mobility spectrometer (G.A.S. BreathSpec GC-IMS). Both technologies could consistently separate IBD and controls [AUC ± 95%, sensitivity, specificity], eNose: [0.81, 0.67, 0.89]; GC-IMS: [0.93, 0.87, 0.89]. Furthermore, we could separate CD from UC, eNose: [0.88, 0.71, 0.88]; GC-IMS: [0.71, 0.86, 0.62]. Including FCP did not improve distinction between CD vs UC; eNose: [0.74, 1.00, 0.56], but rather, improved separation of CD vs controls and UC vs controls; eNose: [0.77, 0.55, 1.00] and [0.72, 0.89, 0.67] without FCP, [0.81, 0.73, 0.78] and [0.90, 1.00, 0.78] with FCP, respectively. These results confirm the utility of breath analysis to distinguish between IBD-related diagnostic groups. FCP does not add significant diagnostic value to breath analysis within this study.


2004 ◽  
Vol 39 (Supplement 1) ◽  
pp. S324 ◽  
Author(s):  
R. Berni Canani ◽  
L. Tanturri de Horatio ◽  
M. T. Romano ◽  
F. Manguso ◽  
L. Rapacciuolo ◽  
...  

2021 ◽  
Vol 71 (10) ◽  
pp. 2350-2354
Author(s):  
Huma Arshad Cheema ◽  
Nadia Waheed ◽  
Anjum Saeed ◽  
Zafar Fayyaz ◽  
Muhammad Nadeem Anjum ◽  
...  

Background: Very early-onset inflammatory bowel disease (VEO-IBD) is defined as diagnosis of Ulcerative Colitis (UC) or Crohn’s Disease (CD) in children under six years of age. Genome-wide association studies have identified a strong genetic component in VEO-IBD. Approximately 30-40% of children with VEO-IBD have underlying immunodeficiency states. We aimed to study the spectrum of presentation, underlying monogenetic defects and outcome in VEO-IBD. Methods: This is a prospective, observational study conducted at the Division of Gastroenterology, the Children's Hospital & the Institute of Child Health, Lahore, over 2 years. Children developing features of IBD under six years of age were included. Data included demography, clinical presentation, diagnostic tools and outcome. Gastroscopy and colonoscopy were performed in all patients in addition to the basic work-up for associated immunodeficiency states and molecular genetics. SPSS version 21 was used for analysis. Continuous...


2021 ◽  
Vol 8 ◽  
Author(s):  
Shaun S. C. Ho ◽  
Michael Ross ◽  
Jacqueline I. Keenan ◽  
Andrew S. Day

Introduction: Fecal calprotectin (FC) is a useful non-invasive screening test but elevated levels are not specific to inflammatory bowel disease (IBD). The study aimed to evaluate the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of FC alone or FC in combination with other standard blood tests in the diagnosis of IBD. Methods: Children aged <17 years who had FC (normal range <50 μg/g) measured and underwent endoscopy over 33 months in Christchurch, New Zealand were identified retrospectively (consecutive sampling). Medical records were reviewed for patient final diagnoses. Results: One hundred and two children were included; mean age was 12.3 years and 53 were male. Fifty-eight (57%) of the 102 children were diagnosed with IBD: 49 with Crohn's disease, eight with ulcerative colitis and one with IBD-unclassified. An FC threshold of 50 μg/g provided a sensitivity of 96.6% [95% confidence interval (CI) 88.3–99.4%] and PPV of 72.7% (95% CI 61.9–81.4%) in diagnosing IBD. Two children with IBD, however, were found to have FC <50 μg/g. Sensitivity in diagnosing IBD was further improved to 98.3% (95% CI 90.7–99.1%) when including FC >50 μg/g or elevated platelet count. Furthermore, PPVs in diagnosing IBD improved when FC at various thresholds was combined with either low albumin or high platelet count. Conclusion: Although FC alone is a useful screening test for IBD, a normal FC alone does not exclude IBD. Extending FC to include albumin or platelet count may improve sensitivity, specificity, PPV and NPV in diagnosing IBD. However, prospective studies are required to validate this conclusion.
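
The reported sensitivity and PPV can be reproduced from the counts in this abstract, which may help when comparing thresholds. The sketch below is illustrative only. The 56 true positives and 2 false negatives follow from the text (58 children with IBD, two of whom had FC <50 μg/g). The 21 false positives are not reported; they are back-calculated here from the stated PPV of 72.7% and should be treated as an assumption.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased patients that the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def ppv(true_pos: int, false_pos: int) -> float:
    """Fraction of positive test results that are truly diseased."""
    return true_pos / (true_pos + false_pos)


# From the abstract: 58 children with IBD, 2 of them below the threshold.
TRUE_POS = 58 - 2
FALSE_NEG = 2
# Assumption: not stated in the abstract, inferred from the reported PPV.
FALSE_POS = 21

if __name__ == "__main__":
    print(f"sensitivity = {100 * sensitivity(TRUE_POS, FALSE_NEG):.1f}%")
    print(f"PPV         = {100 * ppv(TRUE_POS, FALSE_POS):.1f}%")
```

With these counts, sensitivity is 56/58 and PPV is 56/77, which round to the 96.6% and 72.7% given in the abstract.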


2020 ◽  
Vol 9 (11) ◽  
pp. 3427 ◽  
Author(s):  
Youn I Choi ◽  
Sung Jin Park ◽  
Jun-Won Chung ◽  
Kyoung Oh Kim ◽  
Jae Hee Cho ◽  
...  

Background: The incidence and global burden of inflammatory bowel disease (IBD) have steadily increased in the past few decades. Improved methods to stratify risk and predict disease-related outcomes are required for IBD. Aim: The aim of this study was to develop and validate a machine learning (ML) model to predict the 5-year risk of starting biologic agents in IBD patients. Method: We applied an ML method to the database of the Korean common data model (K-CDM) network, a data sharing consortium of tertiary centers in Korea, to develop a model to predict the 5-year risk of starting biologic agents in IBD patients. The records analyzed were those of patients diagnosed with IBD between January 2006 and June 2017 at Gil Medical Center (GMC; n = 1299) or present in the K-CDM network (n = 3286). The ML algorithm was developed to predict 5-year risk of starting biologic agents in IBD patients using data from GMC and externally validated with the K-CDM network database. Result: The ML model for prediction of IBD-related outcomes at 5 years after diagnosis yielded an area under the curve (AUC) of 0.86 (95% CI: 0.82–0.92), in an internal validation study carried out at GMC. The model performed consistently across a range of other datasets, including that of the K-CDM network (AUC = 0.81; 95% CI: 0.80–0.85), in an external validation study. Conclusion: The ML-based prediction model can be used to identify IBD-related outcomes in patients at risk, enabling physicians to perform close follow-up based on the patient’s risk level, estimated through the ML algorithm.

