INFLAMMATORY BOWEL DISEASE CLASSIFICATION USING THE GUT MICROBIOME: A BENCHMARK OF MICROBIAL DATA ANALYSIS METHODS

Abstract: The prevalence of inflammatory bowel disease (IBD) is increasing throughout the developed world. For the newly diagnosed, the time between the appearance of symptoms and diagnosis can take months and involve invasive procedures. There is an urgent need to develop a simple, low-cost, accurate and non-invasive diagnostic test. With decreasing costs of next-generation sequencing, many studies have compared IBD gut microbiomes to healthy controls, successfully identifying bacterial biomarkers for IBD. Unfortunately, a majority of these studies apply machine learning and statistical methods to either single or low-sample-size datasets. The resulting disease classification models are heavily overfitted and therefore have minimal clinical application to new patient cohorts. Several data preprocessing methods are available for data normalization and reduction of cohort-specific signals (batch reduction), which can address this lack of cross-dataset performance. With an abundance of potential methods, there is a need to benchmark the performance and generalizability of various machine learning pipelines (combinations of data preprocessing and model) for microbiome-based IBD diagnostic tools. We used a collection of 12 IBD-associated North American microbiome datasets (~4000 samples) to benchmark several machine learning pipelines. Raw sequencing data were processed, collapsed at the OTU or genus level and merged using QIIME2. Datasets were then normalized using either sum-scaling or log-based methods, and batch reduction was performed using either zero-centering or Empirical Bayes approaches. Performance of pipelines was evaluated using binary accuracy, AUC, F1 score and MCC. Generalizability of pipelines was evaluated using leave-one-dataset-out cross-validation, in which data from one study were left out of the training set and used for testing.
The best performing and most generalizable pipeline paired a Random Forest model with centered log-ratio normalization and Empirical Bayes batch reduction. This combination, along with others, performed as well as or better than more complex models involving deep neural networks (DNNs). In addition to benchmarking our pipelines, we also explore their limitations, such as the reliance of zero-centered batch reduction on balanced input data and the tendency of Empirical Bayes methods to introduce artificial signals into data, which indicates that certain methods are poorly suited to clinical use. To our knowledge, this is the first comprehensive benchmark of data preprocessing and machine learning methods for microbiome-based disease classification of IBD. These findings will help improve the generalizability of machine learning models as we move towards non-invasive diagnostic and disease management tools for patients with IBD.
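The preprocessing and evaluation steps named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the pseudocount, the function names, and the toy count table are all hypothetical, and the Empirical Bayes batch reduction and the Random Forest model are omitted.

```python
import math
from collections import defaultdict


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount is an assumption made here to avoid log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def zero_center_by_study(rows, studies):
    """Subtract each study's per-feature mean so every study is centered at 0."""
    by_study = defaultdict(list)
    for i, study in enumerate(studies):
        by_study[study].append(i)
    out = [None] * len(rows)
    for idx in by_study.values():
        n_feat = len(rows[idx[0]])
        means = [sum(rows[i][j] for i in idx) / len(idx) for j in range(n_feat)]
        for i in idx:
            out[i] = [rows[i][j] - means[j] for j in range(n_feat)]
    return out


def leave_one_study_out(studies):
    """Yield (held_out_study, train_indices, test_indices) for each study."""
    for held_out in sorted(set(studies)):
        train = [i for i, s in enumerate(studies) if s != held_out]
        test = [i for i, s in enumerate(studies) if s == held_out]
        yield held_out, train, test


# Hypothetical toy table: 4 samples x 3 taxa, drawn from two studies.
counts = [[10, 0, 5], [3, 7, 1], [0, 2, 8], [4, 4, 4]]
studies = ["A", "A", "B", "B"]

transformed = zero_center_by_study([clr(row) for row in counts], studies)
splits = list(leave_one_study_out(studies))
```

Because per-study centering subtracts the study mean from every sample, a study made up mostly of cases would have part of its disease signal removed along with the batch effect, which is one way to read the balanced-data limitation noted in the abstract.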

Breath Analysis Using eNose and Ion Mobility Technology to Diagnose Inflammatory Bowel Disease—A Pilot Study

Biosensors ◽ 10.3390/bios9020055 ◽ 2019 ◽ Vol 9 (2) ◽ pp. 55 ◽ Cited By ~ 11

Author(s): Tiele ◽ Wicaksono ◽ Kansara ◽ Arasaradnam ◽ Covington

Keyword(s): Inflammatory Bowel Disease ◽ Pilot Study ◽ Ion Mobility ◽ Bowel Disease ◽ Breath Analysis ◽ Diagnostic Tools ◽ Exhaled Breath ◽ Faecal Calprotectin ◽ Non Invasive ◽ Inflammatory Bowel

Early diagnosis of inflammatory bowel disease (IBD), including Crohn’s disease (CD) and ulcerative colitis (UC), remains a clinical challenge with current tests being invasive and costly. The analysis of volatile organic compounds (VOCs) in exhaled breath and biomarkers in stool (faecal calprotectin (FCP)) show increasing potential as non-invasive diagnostic tools. The aim of this pilot study is to evaluate the efficacy of breath analysis and determine if FCP can be used as an additional non-invasive parameter to supplement breath results, for the diagnosis of IBD. Thirty-nine subjects were recruited (14 CD, 16 UC, 9 controls). Breath samples were analysed using an in-house built electronic nose (Wolf eNose) and commercial gas chromatograph–ion mobility spectrometer (G.A.S. BreathSpec GC-IMS). Both technologies could consistently separate IBD and controls [AUC ± 95%, sensitivity, specificity], eNose: [0.81, 0.67, 0.89]; GC-IMS: [0.93, 0.87, 0.89]. Furthermore, we could separate CD from UC, eNose: [0.88, 0.71, 0.88]; GC-IMS: [0.71, 0.86, 0.62]. Including FCP did not improve distinction between CD vs UC; eNose: [0.74, 1.00, 0.56], but rather, improved separation of CD vs controls and UC vs controls; eNose: [0.77, 0.55, 1.00] and [0.72, 0.89, 0.67] without FCP, [0.81, 0.73, 0.78] and [0.90, 1.00, 0.78] with FCP, respectively. These results confirm the utility of breath analysis to distinguish between IBD-related diagnostic groups. FCP does not add significant diagnostic value to breath analysis within this study.

P0702 CLINICAL UTILITY OF NON-INVASIVE DIAGNOSTIC TOOLS IN CHILDREN WITH SUSPECTED INFLAMMATORY BOWEL DISEASE

Journal of Pediatric Gastroenterology and Nutrition ◽ 10.1097/00005176-200406001-00826 ◽ 2004 ◽ Vol 39 (Supplement 1) ◽ pp. S324 ◽ Cited By ~ 1

Author(s): R. Berni Canani ◽ L. Tanturri de Horatio ◽ M. T. Romano ◽ F. Manguso ◽ L. Rapacciuolo ◽ ...

Keyword(s): Inflammatory Bowel Disease ◽ Bowel Disease ◽ Clinical Utility ◽ Diagnostic Tools ◽ Non Invasive ◽ Inflammatory Bowel

Clinical applications of artificial intelligence and machine learning‐based methods in inflammatory bowel disease

Journal of Gastroenterology and Hepatology ◽ 10.1111/jgh.15405 ◽ 2021 ◽ Vol 36 (2) ◽ pp. 279-285

Author(s): Shirley Cohen‐Mekelburg ◽ Sameer Berry ◽ Ryan W Stidham ◽ Ji Zhu ◽ Akbar K Waljee

Keyword(s): Artificial Intelligence ◽ Machine Learning ◽ Inflammatory Bowel Disease ◽ Bowel Disease ◽ Clinical Applications ◽ Inflammatory Bowel

Very early onset inflammatory bowel disease (VEO-IBD); spectrum of clinical presentation, diagnostic tools and outcome in children

Journal of the Pakistan Medical Association ◽ 10.47391/jpma.05-725 ◽ 2021 ◽ Vol 71 (10) ◽ pp. 2350-2354

Author(s): Huma Arshad Cheema ◽ Nadia Waheed ◽ Anjum Saeed ◽ Zafar Fayyaz ◽ Muhammad Nadeem Anjum ◽ ...

Keyword(s): Inflammatory Bowel Disease ◽ Bowel Disease ◽ Early Onset ◽ Clinical Presentation ◽ Association Studies ◽ Diagnostic Tools ◽ Genome Wide Association Studies ◽ Genome Wide ◽ Basic Work ◽ Inflammatory Bowel

Background: Very early-onset inflammatory bowel disease (VEO-IBD) is defined as a diagnosis of Ulcerative Colitis (UC) or Crohn’s Disease (CD) in children under six years of age. Genome-wide association studies have identified a strong genetic component in VEO-IBD. Approximately 30-40% of children with VEO-IBD have underlying immunodeficiency states. We aimed to study the spectrum of presentation, underlying monogenetic defects and outcome in VEO-IBD. Methods: This is a prospective, observational study conducted at the Division of Gastroenterology, the Children's Hospital & the Institute of Child Health, Lahore, over 2 years. Children developing features of IBD under six years of age were included. Data included demography, clinical presentation, diagnostic tools and outcome. Gastroscopy and colonoscopy were performed in all patients, in addition to the basic work-up for associated immunodeficiency states and molecular genetics. SPSS version 21 was used for analysis. Continuous...

Fecal Calprotectin in Combination With Standard Blood Tests in the Diagnosis of Inflammatory Bowel Disease in Children

Frontiers in Pediatrics ◽ 10.3389/fped.2020.609279 ◽ 2021 ◽ Vol 8

Author(s): Shaun S. C. Ho ◽ Michael Ross ◽ Jacqueline I. Keenan ◽ Andrew S. Day

Keyword(s): Inflammatory Bowel Disease ◽ Platelet Count ◽ Bowel Disease ◽ Predictive Value ◽ Screening Test ◽ Fecal Calprotectin ◽ Non Invasive ◽ Blood Tests ◽ Inflammatory Bowel ◽ Sensitivity Specificity

Introduction: Fecal calprotectin (FC) is a useful non-invasive screening test, but elevated levels are not specific to inflammatory bowel disease (IBD). The study aimed to evaluate the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of FC alone or FC in combination with other standard blood tests in the diagnosis of IBD. Methods: Children aged <17 years who had FC (normal range <50 μg/g) measured and underwent endoscopy over 33 months in Christchurch, New Zealand were identified retrospectively (consecutive sampling). Medical records were reviewed for patient final diagnoses. Results: One hundred and two children were included; mean age was 12.3 years and 53 were male. Fifty-eight (57%) of the 102 children were diagnosed with IBD: 49 with Crohn's disease, eight with ulcerative colitis and one with IBD-unclassified. An FC threshold of 50 μg/g provided a sensitivity of 96.6% [95% confidence interval (CI) 88.3–99.4%] and a PPV of 72.7% (95% CI 61.9–81.4%) in diagnosing IBD. Two children with IBD, however, were found to have FC <50 μg/g. Sensitivity in diagnosing IBD was further improved to 98.3% (95% CI 90.7–99.1%) when including FC >50 μg/g or elevated platelet count. Furthermore, PPVs in diagnosing IBD improved when FC at various thresholds was combined with either low albumin or high platelet count. Conclusion: Although FC alone is a useful screening test for IBD, a normal FC alone does not exclude IBD. Extending FC to include albumin or platelet count may improve sensitivity, specificity, PPV and NPV in diagnosing IBD. However, prospective studies are required to validate this conclusion.
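The reported sensitivity and PPV can be reproduced from a 2×2 table, sketched below. Only part of that table is stated in the abstract: 58 of 102 children had IBD and two of them had FC <50 μg/g, giving 56 true positives and 2 false negatives. The false-positive and true-negative counts (21 and 23) are inferred here from the reported PPV of 72.7% and the 44 children without IBD, so they should be read as assumptions; the function name is also hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard screening-test metrics from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased who test positive
        "specificity": tn / (tn + fp),  # non-diseased who test negative
        "ppv": tp / (tp + fp),          # positives who are diseased
        "npv": tn / (tn + fn),          # negatives who are non-diseased
    }


# tp and fn follow from the abstract; fp and tn are inferred (assumption).
m = diagnostic_metrics(tp=56, fn=2, fp=21, tn=23)

for name, value in m.items():
    print(f"{name}: {value:.1%}")
```

With these counts the sensitivity and PPV round to the reported 96.6% and 72.7%. The specificity and NPV the function returns are not reported in the abstract for FC alone and depend entirely on the inferred cells.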

Development of Machine Learning Model to Predict the 5-Year Risk of Starting Biologic Agents in Patients with Inflammatory Bowel Disease (IBD): K-CDM Network Study

Journal of Clinical Medicine ◽ 10.3390/jcm9113427 ◽ 2020 ◽ Vol 9 (11) ◽ pp. 3427 ◽ Cited By ~ 1

Author(s): Youn I Choi ◽ Sung Jin Park ◽ Jun-Won Chung ◽ Kyoung Oh Kim ◽ Jae Hee Cho ◽ ...

Keyword(s): Machine Learning ◽ Inflammatory Bowel Disease ◽ Validation Study ◽ Bowel Disease ◽ Medical Center ◽ External Validation ◽ Biologic Agents ◽ Risk Level ◽ Common Data Model ◽ Inflammatory Bowel

Background: The incidence and global burden of inflammatory bowel disease (IBD) have steadily increased in the past few decades. Improved methods to stratify risk and predict disease-related outcomes are required for IBD. Aim: The aim of this study was to develop and validate a machine learning (ML) model to predict the 5-year risk of starting biologic agents in IBD patients. Method: We applied an ML method to the database of the Korean common data model (K-CDM) network, a data-sharing consortium of tertiary centers in Korea, to develop a model to predict the 5-year risk of starting biologic agents in IBD patients. The records analyzed were those of patients diagnosed with IBD between January 2006 and June 2017 at Gil Medical Center (GMC; n = 1299) or present in the K-CDM network (n = 3286). The ML algorithm was developed to predict the 5-year risk of starting biologic agents in IBD patients using data from GMC and was externally validated with the K-CDM network database. Result: The ML model for prediction of IBD-related outcomes at 5 years after diagnosis yielded an area under the curve (AUC) of 0.86 (95% CI: 0.82–0.92) in an internal validation study carried out at GMC. The model performed consistently across a range of other datasets, including that of the K-CDM network (AUC = 0.81; 95% CI: 0.80–0.85), in an external validation study. Conclusion: The ML-based prediction model can be used to identify IBD-related outcomes in patients at risk, enabling physicians to perform close follow-up based on the patient’s risk level, estimated through the ML algorithm.