Training and deploying a deep learning model for endoscopic severity grading in ulcerative colitis using multicenter clinical trial data

2021 ◽  
Vol 14 ◽  
pp. 263177452199062
Author(s):  
Benjamin Gutierrez Becker ◽  
Filippo Arcadu ◽  
Andreas Thalhammer ◽  
Citlalli Gamez Serna ◽  
Owen Feehan ◽  
...  

Introduction: The Mayo Clinic Endoscopic Subscore is a commonly used grading system to assess the severity of ulcerative colitis. Correctly grading colonoscopies using the Mayo Clinic Endoscopic Subscore is a challenging task, with suboptimal rates of interrater and intrarater variability observed even among experienced and sufficiently trained experts. In recent years, several machine learning algorithms have been proposed in an effort to improve the standardization and reproducibility of Mayo Clinic Endoscopic Subscore grading. Methods: Here we propose an end-to-end fully automated system based on deep learning to predict a binary version of the Mayo Clinic Endoscopic Subscore directly from raw colonoscopy videos. Unlike previous studies, the proposed method mimics the assessment done in practice by a gastroenterologist, that is, traversing the whole colonoscopy video, identifying visually informative regions and computing an overall Mayo Clinic Endoscopic Subscore. The proposed deep learning–based system has been trained and deployed on raw colonoscopies using Mayo Clinic Endoscopic Subscore ground truth provided only at the colon section level, without manually selecting frames driving the severity scoring of ulcerative colitis. Results and Conclusion: Our evaluation on 1672 endoscopic videos from a multisite data set drawn from the etrolizumab Phase II Eucalyptus and Phase III Hickory and Laurel clinical trials shows that our proposed methodology can grade endoscopic videos with a high degree of accuracy and robustness (Area Under the Receiver Operating Characteristic Curve = 0.84 for Mayo Clinic Endoscopic Subscore ⩾ 1, 0.85 for Mayo Clinic Endoscopic Subscore ⩾ 2 and 0.85 for Mayo Clinic Endoscopic Subscore ⩾ 3) and reduced amounts of manual annotation.
Plain language summary: Artificial intelligence can be used to automatically assess full endoscopic videos and estimate the severity of ulcerative colitis. In this work, we present an artificial intelligence algorithm for the automatic grading of ulcerative colitis in full endoscopic videos. Our artificial intelligence models were trained and evaluated on a large and diverse set of colonoscopy videos obtained from concluded clinical trials. We demonstrate not only that artificial intelligence is able to accurately grade full endoscopic videos, but also that using diverse data sets obtained from multiple sites is critical to train robust AI models that could potentially be deployed on real-world data.
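
The abstract above reports one Area Under the ROC Curve per binarized Mayo Clinic Endoscopic Subscore (MCES) threshold. As a rough illustration only, the sketch below binarizes MCES labels at each threshold and computes the AUC as the probability that a positive example is scored above a negative one. The function names and the toy data are hypothetical, and a single predicted score is reused for all thresholds for brevity; this is not the authors' pipeline.

```python
def binarize_mces(score, threshold):
    """Return 1 if the MCES (0-3) is at or above the threshold, else 0."""
    return 1 if score >= threshold else 0


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: expert MCES per colon section and a model's severity score.
mces = [0, 1, 3, 2, 0, 2]
predicted = [0.10, 0.40, 0.90, 0.35, 0.20, 0.80]

for threshold in (1, 2, 3):
    labels = [binarize_mces(s, threshold) for s in mces]
    print(f"MCES >= {threshold}: AUC = {roc_auc(labels, predicted):.2f}")
```

The pairwise formulation is quadratic in the number of examples, which is acceptable for a toy illustration; a rank-based implementation would be preferable at scale.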

2021 ◽  
Vol 15 (Supplement_1) ◽  
pp. S173-S174
Author(s):  
B Gutierrez Becker ◽  
E Giuffrida ◽  
M Mangia ◽  
F Arcadu ◽  
V Whitehill ◽  
...  

Abstract Background Endoscopic assessment is a critical procedure to assess the improvement of mucosa and response to therapy, and therefore a pivotal component of clinical trial endpoints for IBD. Central scoring of endoscopic videos is challenging and time consuming. We evaluated the feasibility of using an Artificial Intelligence (AI) algorithm to automatically produce filtered videos where the non-readable portions of the video are removed, with the aim of accelerating the scoring of endoscopic videos. Methods The AI algorithm was based on a Convolutional Neural Network trained to perform a binary classification task. This task consisted of assigning the frames in a colonoscopy video to one of two classes: “readable” or “unreadable.” The algorithm was trained using annotations performed by two data scientists (BG, FA). The criteria to consider a frame “readable” were: i) the colon walls were within the field of view; ii) contrast and sharpness of the frame were sufficient to visually inspect the mucosa, and iii) no presence of artifacts completely obstructing the visibility of the mucosa. The frames were extracted randomly from 351 colonoscopy videos of the etrolizumab EUCALYPTUS (NCT01336465) Phase II ulcerative colitis clinical trial. Evaluation of the performance of the AI algorithm was performed on colonoscopy videos obtained as part of the etrolizumab HICKORY (NCT02100696) and LAUREL (NCT02165215) Phase III ulcerative colitis clinical trials. Each video was filtered using the AI algorithm, resulting in a shorter video where the sections considered unreadable by the AI algorithm were removed. Each of three annotators (EG, MM and MD) was randomly assigned an equal number of AI-filtered videos and raw videos. Each gastroenterologist was tasked to score temporal segments of the video according to the Mayo Clinic Endoscopic Subscore (MCES). Annotations were performed by means of an online annotation platform (Virgo Surgical Video Solutions, Inc).
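
The filtering step described in the Methods can be sketched as follows, assuming a frame-level classifier has already produced a "readable" probability for every frame. The 0.5 threshold, the function names and the example probabilities are assumptions made for illustration; the abstract does not state how the classifier output was turned into a filtered video.

```python
def filter_readable_frames(readable_probs, threshold=0.5):
    """Return indices of frames whose predicted 'readable' probability meets the threshold."""
    return [i for i, p in enumerate(readable_probs) if p >= threshold]


def to_segments(frame_indices):
    """Group sorted frame indices into inclusive (start, end) runs of consecutive frames."""
    segments = []
    for idx in frame_indices:
        if segments and idx == segments[-1][1] + 1:
            # Extend the current run when the frame directly follows it.
            segments[-1] = (segments[-1][0], idx)
        else:
            segments.append((idx, idx))
    return segments


# Hypothetical per-frame probabilities from a binary readable/unreadable classifier.
probs = [0.9, 0.8, 0.2, 0.1, 0.7, 0.95, 0.3]
kept = filter_readable_frames(probs)
print(kept)               # [0, 1, 4, 5]
print(to_segments(kept))  # [(0, 1), (4, 5)]
```

Concatenating the kept segments yields the shorter video; in practice some temporal smoothing of the probabilities would likely be needed to avoid flickering cuts.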
Results We measured the time it took the annotators to score raw and AI-filtered videos. We observed a statistically significant reduction (Mann-Whitney U test p-value = 0.039) in the median time spent by the annotators scoring AI-filtered videos (9.51 ± 0.92 minutes) with respect to the time spent scoring raw videos (10.59 ± 0.94 minutes), with substantial intra-rater agreement when evaluating highlight and raw videos (Cohen’s kappa 0.92 and 0.55 for experienced and junior gastroenterologists, respectively). Conclusion Our analysis shows that AI can be used reliably as an assisting tool to automatically remove non-readable time segments from full colonoscopy videos. The use of our proposed algorithm can lead to reduced annotation times in the task of centrally reading colonoscopy videos.
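
Cohen's kappa, the agreement statistic quoted in the Results, compares the observed agreement between two sets of ratings with the agreement expected by chance. The sketch below is a plain-Python version of the unweighted statistic; the ratings are invented for illustration and whether the study used a weighted variant is not stated in the abstract.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa between two equally long lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same rating both times.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal frequency of each category.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both lists use a single identical category.
    return (observed - expected) / (1.0 - expected)


# Hypothetical MCES scores given by one reader to the same segments in raw and filtered form.
raw_scores = [0, 1, 2, 3, 2, 1, 0, 3]
filtered_scores = [0, 1, 2, 3, 2, 2, 0, 3]
print(round(cohens_kappa(raw_scores, filtered_scores), 3))
```

Here the reader agrees with themselves on 7 of 8 segments (observed agreement 0.875) against a chance agreement of 0.25, so kappa is (0.875 - 0.25) / 0.75.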


Machine Learning is formulated by combining mathematical models, Artificial Intelligence approaches, and previously recorded data sets. It uses different learning algorithms for different types of data and has been classified into three types. An advantage of this kind of learning is that it can use an Artificial Neural Network, which adjusts its weights based on the error rate to improve over successive epochs. However, Machine Learning works well only when the features are defined accurately. Deciding which features to select requires good domain knowledge, which makes Machine Learning dependent on the developer; a lack of domain knowledge degrades performance. This dependency inspired the invention of Deep Learning. Deep Learning can detect features through self-training models and can give better results than classical Artificial Intelligence or Machine Learning approaches. It relies on components such as the ReLU activation function, gradient descent, and optimizers, which make it the best approach available so far. To apply such optimizers efficiently, one should understand the mathematical computations and convolutions running behind the layers. It also uses different pooling layers to extract features. These modern approaches, however, need a high level of computation, which requires CPUs and GPUs. If hardware with such computational power is not available, one can use the Google Colaboratory framework. The Deep Learning approach is shown to improve skin cancer detection, as demonstrated in this paper. The paper also aims to give the reader background knowledge of the various practices mentioned above.
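
The idea of adjusting weights from the error over successive epochs can be made concrete with a single neuron. The following is a minimal sketch, assuming a ReLU activation, a squared-error loss and per-sample gradient descent; the data, learning rate and function names are illustrative choices and are not taken from the paper.

```python
def relu(x):
    """Rectified linear unit: pass positive values, clamp the rest to zero."""
    return x if x > 0.0 else 0.0


def relu_grad(x):
    """Derivative of ReLU (taken as 0 at x == 0)."""
    return 1.0 if x > 0.0 else 0.0


def train_neuron(samples, weight=0.5, bias=0.0, learning_rate=0.05, epochs=500):
    """Fit y ~ relu(weight * x + bias) by per-sample gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in samples:
            z = weight * x + bias
            error = relu(z) - y
            # Chain rule: d(error^2)/dz = 2 * error * relu'(z).
            grad_z = 2.0 * error * relu_grad(z)
            weight -= learning_rate * grad_z * x
            bias -= learning_rate * grad_z
    return weight, bias


# Toy data generated from y = 2x, so the neuron should approach weight 2 and bias 0.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, b = train_neuron(samples)
print(round(w, 2), round(b, 2))
```

Deep learning frameworks automate exactly this loop across many layers and usually replace plain gradient descent with optimizers such as momentum-based variants.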


BMJ ◽  
2020 ◽  
pp. m689 ◽  
Author(s):  
Myura Nagendran ◽  
Yang Chen ◽  
Christopher A Lovejoy ◽  
Anthony C Gordon ◽  
Matthieu Komorowski ◽  
...  

Abstract Objective To systematically examine the design, reporting standards, risk of bias, and claims of studies comparing the performance of diagnostic deep learning algorithms for medical imaging with that of expert clinicians. Design Systematic review. Data sources Medline, Embase, Cochrane Central Register of Controlled Trials, and the World Health Organization trial registry from 2010 to June 2019. Eligibility criteria for selecting studies Randomised trial registrations and non-randomised studies comparing the performance of a deep learning algorithm in medical imaging with a contemporary group of one or more expert clinicians. Medical imaging has seen a growing interest in deep learning research. The main distinguishing feature of convolutional neural networks (CNNs) in deep learning is that when CNNs are fed with raw data, they develop their own representations needed for pattern recognition. The algorithm learns for itself the features of an image that are important for classification rather than being told by humans which features to use. The selected studies aimed to use medical imaging for predicting absolute risk of existing disease or classification into diagnostic groups (eg, disease or non-disease). For example, raw chest radiographs tagged with a label such as pneumothorax or no pneumothorax and the CNN learning which pixel patterns suggest pneumothorax. Review methods Adherence to reporting standards was assessed by using CONSORT (consolidated standards of reporting trials) for randomised studies and TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) for non-randomised studies. Risk of bias was assessed by using the Cochrane risk of bias tool for randomised studies and PROBAST (prediction model risk of bias assessment tool) for non-randomised studies. 
Results Only 10 records were found for deep learning randomised clinical trials, two of which have been published (with low risk of bias, except for lack of blinding, and high adherence to reporting standards) and eight are ongoing. Of 81 non-randomised clinical trials identified, only nine were prospective and just six were tested in a real world clinical setting. The median number of experts in the comparator group was only four (interquartile range 2-9). Full access to all datasets and code was severely limited (unavailable in 95% and 93% of studies, respectively). The overall risk of bias was high in 58 of 81 studies and adherence to reporting standards was suboptimal (<50% adherence for 12 of 29 TRIPOD items). 61 of 81 studies stated in their abstract that performance of artificial intelligence was at least comparable to (or better than) that of clinicians. Only 31 of 81 studies (38%) stated that further prospective studies or trials were required. Conclusions Few prospective deep learning studies and randomised trials exist in medical imaging. Most non-randomised trials are not prospective, are at high risk of bias, and deviate from existing reporting standards. Data and code availability are lacking in most studies, and human comparator groups are often small. Future studies should diminish risk of bias, enhance real world clinical relevance, improve reporting and transparency, and appropriately temper conclusions. Study registration PROSPERO CRD42019123605.


Author(s):  
Evren Dağlarli

Explainable artificial intelligence (xAI) is one of the interesting issues that has emerged recently. Many researchers are approaching the subject from different dimensions, and interesting results have come out. However, we are still at the beginning of the path to understanding these types of models. The forthcoming years are expected to be years in which the openness of deep learning models is discussed. In classical artificial intelligence approaches, we frequently encounter the deep learning methods available today. These deep learning methods can yield highly effective results depending on the data set size, data set quality, the methods used in feature extraction, the hyperparameter set used in deep learning models, the activation functions, and the optimization algorithms. However, current deep learning models have important shortcomings. These artificial neural network-based models are black box models that generalize the data transmitted to them and learn from the data. Therefore, the relational link between input and output is not observable. This is an important open point in artificial neural networks and deep learning models. For these reasons, it is necessary to make serious efforts on the explainability and interpretability of black box models.


Blood ◽  
2006 ◽  
Vol 108 (11) ◽  
pp. 131-131
Author(s):  
Brian G. Van Ness ◽  
John C. Crowley ◽  
Christine Ramos ◽  
Suzanne M. Grindle ◽  
Antje Hoering ◽  
...  

Abstract While there are certain common clinical features in myeloma, the disease shows significant heterogeneity with regard to disease progression, and responses to therapy, affecting both survival and toxicities. Heritable variations in a wide variety of genes and pathways affecting cellular functions and drug responses likely impact patient outcomes. In the Bank On A Cure (BOAC) program we have developed a custom chip that assesses 3,404 SNPs representing variations in cellular functions and pathways that may be involved in myeloma progression and response. The chip has gone through rigorous quality control checks for high call rates, accuracy, and reproducibility that will be presented. Using the BOAC chip, we have conducted studies to look for SNPs that may identify biologic variations that are associated with good or poor response across a variety of treatments. In this study we looked for SNPs that may distinguish short term and long term survivors in two phase III clinical trials: ECOG E9486 and intergroup trial S9321. E9486 patients were treated with VBMCP followed by randomization to no further treatment, IFN-alpha, or cyclophosphamide; and, although there was variation in survival, no significant differences in survival were noted among the 3 arms of the trial. Patients included in this SNP study from S9321 received VAD induction followed by randomization to VBMCP or high dose melphalan + TBI. SNP profiles were obtained for patients with less than 1 year EFS (n=20 in E9486; n=50 in S9321) and patients showing greater than 3 years EFS (n=32 in E9486; n=41 in S9321). Statistical approaches were performed to identify single and groups of SNPs that best discriminated the survival groups. Previous studies have suggested genetic variations in drug metabolism genes, p-glycoprotein transport, and DNA repair genes may influence survival outcomes.
Our results show significant survival associations of genetic variations in genes within these functional categories (e.g., GST, XRCC, ABCB, and CYP genes). Although genetic variations were found that were uniquely associated with each clinical trial, several of these genetic variations show survival associations that increase in significance when the two trials are examined as a combined data set. Grouping genetic variations through common pathway approaches using gene set enrichment analysis, as well as clustering or partitioning algorithms, further improves the value of the SNPs as potential prognostic markers of survival outcomes. These results and statistical approaches will be presented, and represent steps toward identifying patient variations in biologic mechanisms important in predicting therapeutic outcomes.


2019 ◽  
Vol 12 ◽  
pp. 175628481984863 ◽  
Author(s):  
Ferdinando D’Amico ◽  
Tommaso Lorenzo Parigi ◽  
Gionata Fiorino ◽  
Laurent Peyrin-Biroulet ◽  
Silvio Danese

Tofacitinib is an oral small molecule directed against the JAK/STAT pathway, blocking the inflammatory cascade. Oral formulation of tofacitinib has recently been approved for the treatment of patients with moderate–severe ulcerative colitis. Its efficacy and safety have been demonstrated in three phase III clinical trials and confirmed by promising real-life data. The purpose of this review is to summarize the available evidence on the efficacy and safety of tofacitinib and to define its role and position in the treatment algorithms for patients with ulcerative colitis.


Author(s):  
D J Samatha Naidu ◽  
M.Gurivi Reddy

Farmers are the backbone of the nation, but the majority of crops cultivated in India are affected by various diseases at various stages of cultivation. Recent research shows that existing disease-identification approaches do not provide accurate results, and the few that do identify diseases do not offer optimized solutions. In the proposed work, recent developments in Artificial Intelligence through Deep Learning show that AIR (Automatic Image Recognition) systems using CNN models can be very beneficial in such scenarios. Because a data set of rice leaf disease images is not readily available for automation, we created our own training data set. It is small in size, so we used transfer learning to develop the proposed deep learning model. The proposed CNN architecture is based on the VGG-16 model and is trained and tested on a data set collected from rice fields and the internet. The proposed model achieves an accuracy of 92.46%.


2021 ◽  
Vol 93 (6) ◽  
pp. AB196-AB197
Author(s):  
Michael F. Byrne ◽  
James E. East ◽  
Marietta Iacucci ◽  
Remo Panaccione ◽  
Rakesh Kalapala ◽  
...  

2012 ◽  
Vol 30 (19) ◽  
pp. 2334-2339 ◽  
Author(s):  
Joleen Hubbard ◽  
David M. Thomas ◽  
Greg Yothers ◽  
Erin Green ◽  
Charles Blanke ◽  
...  

Purpose Limited data exist regarding the outcomes of adjuvant therapy in younger patients with stage II and III colon cancer. We examined disease-free survival (DFS), overall survival (OS), recurrence-free interval (RFI), and grade 3+ adverse events (AEs) in younger patients in the 33,574-patient Adjuvant Colon Cancer Endpoints Group data set. Patients and Methods Individual patient data from 24 randomized phase III clinical trials were obtained for survival outcomes, which included 10 clinical trials for AE outcomes. Two age-based cutoff points were used to define younger patients: age younger than 40 years and younger than 50 years. Adjuvant therapy benefit analyses were limited to the nine clinical trials in which the investigational chemotherapeutic arm demonstrated benefit. Results One thousand seven hundred fifty-eight patients (5.2%) were younger than 40 years, 5,817 patients (17.3%) were younger than 50 years, and only 299 patients (0.9%) were younger than 30 years. No meaningful differences in sex or stage were noted in younger versus older patients. Younger and older patients did not differ in RFI (age, < 40 years: hazard ratio [HR], 1.0; P = .62 and age < 50 years: HR, 1.02; P = .35). Younger patients (both cutoff points) had longer OS and DFS than older patients. In trials demonstrating adjuvant therapy benefit, similar DFS benefit was observed by age. Younger patients experienced less leukopenia and stomatitis, but more frequent nausea/vomiting. Conclusion Among patients on clinical trials, younger and older patients with stage II and III colon cancer had similar RFI and adjuvant therapy benefit. Younger patients have longer OS and DFS, which is likely primarily because of fewer competing causes of death. Adjuvant therapy is beneficial for colon cancer in patients younger than 50 years who meet typical clinical trial eligibility criteria.

