On evaluation metrics for medical applications of artificial intelligence

Author(s):  
Steven Hicks ◽  
Inga Strüke ◽  
Vajira Thambawita ◽  
Malek Hammou ◽  
Pål Halvorsen ◽  
...  

Clinicians and model developers need to understand how proposed machine learning (ML) models could improve patient care. However, no single metric captures all the desirable properties of a model, so several metrics are typically reported to summarize a model's performance. Unfortunately, these measures are not easily understandable by many clinicians. Moreover, comparing models across studies in an objective manner is challenging, and no tool exists to compare models using the same performance metrics. This paper reviews previous ML studies in gastroenterology, explains what the different metrics mean in the context of those studies, and describes in detail how each metric should be interpreted. We also release an open-source web-based tool that may be used to aid in calculating the most relevant metrics presented in this paper so that other researchers and clinicians may easily incorporate them into their research.
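
As a minimal illustration of why several metrics are usually reported together, the sketch below computes a few common ones from the four cells of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical; they are not taken from the paper or from its web-based tool.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""

    def safe_div(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": safe_div(tp, tp + fn),   # recall, true positive rate
        "specificity": safe_div(tn, tn + fp),   # true negative rate
        "precision": safe_div(tp, tp + fp),     # positive predictive value
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),  # Matthews correlation
    }


# Hypothetical, heavily imbalanced example: 10 positives among 1000 cases.
metrics = binary_metrics(tp=5, fp=5, tn=985, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, accuracy is 0.99 while sensitivity is only 0.5, which is the kind of gap a single headline number can hide.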

2020 ◽  
Author(s):  
Abdulrahman Takiddin ◽  
Jens Schneider ◽  
Yin Yang ◽  
Alaa Abd-Alrazaq ◽  
Mowafa Househ

BACKGROUND Skin cancer is the most common cancer type affecting humans. Traditional skin cancer diagnosis methods are costly, require a professional physician, and take time. Hence, to aid in diagnosing skin cancer, Artificial Intelligence (AI) tools are being used, including shallow and deep machine learning-based techniques that are trained to detect and classify skin cancer using computer algorithms and deep neural networks. OBJECTIVE The aim of this study is to identify and group the different types of AI-based technologies used to detect and classify skin cancer. The study also examines the reliability of the selected papers by studying the correlation between the dataset size and number of diagnostic classes with the performance metrics used to evaluate the models. METHODS We conducted a systematic search for articles using IEEE Xplore, ACM DL, and Ovid MEDLINE databases following the PRISMA Extension for Scoping Reviews (PRISMA-ScR) guidelines. Studies included in this scoping review had to fulfill several selection criteria: they had to be specifically about skin cancer, address detecting or classifying skin cancer, and use AI technologies. Study selection and data extraction were conducted by two reviewers independently. Extracted data were synthesized narratively, where studies were grouped based on the diagnostic AI techniques and their evaluation metrics. RESULTS We retrieved 906 papers from the 3 databases, of which 53 studies were eligible for this review. While shallow techniques were used in 14 studies, deep techniques were utilized in 39 studies. The studies used accuracy (n=43/53), the area under receiver operating characteristic curve (n=5/53), sensitivity (n=3/53), and F1-score (n=2/53) to assess the proposed models. Studies that use smaller datasets and fewer diagnostic classes tend to have higher reported accuracy scores. CONCLUSIONS The adaptation of AI in the medical field facilitates the diagnosis process of skin cancer. However, the reliability of most AI tools is questionable since small datasets or low numbers of diagnostic classes are used. In addition, a direct comparison between methods is hindered by a varied use of different evaluation metrics and image types.


Author(s):  
Roman David Bülow ◽  
Daniel Dimitrov ◽  
Peter Boor ◽  
Julio Saez-Rodriguez

IgA nephropathy (IgAN) is the most common glomerulonephritis. It is characterized by the deposition of immune complexes containing immunoglobulin A (IgA) in the kidney’s glomeruli, triggering an inflammatory process. In many patients, the disease has a progressive course, eventually leading to end-stage kidney disease. The current understanding of IgAN’s pathophysiology is incomplete, with the involvement of several potential players, including the mucosal immune system, the complement system, and the microbiome. Dissecting this complex pathophysiology requires an integrated analysis across molecular, cellular, and organ scales. Such data can be obtained by employing emerging technologies, including single-cell sequencing, next-generation sequencing, proteomics, and complex imaging approaches. These techniques generate complex “big data,” requiring advanced computational methods for their analyses and interpretation. Here, we introduce such methods, focusing on the broad areas of bioinformatics and artificial intelligence and discuss how they can advance our understanding of IgAN and ultimately improve patient care. The close integration of advanced experimental and computational technologies with medical and clinical expertise is essential to improve our understanding of human diseases. We argue that IgAN is a paradigmatic disease to demonstrate the value of such a multidisciplinary approach.


2021 ◽  
pp. 036354652110086
Author(s):  
Prem N. Ramkumar ◽  
Bryan C. Luu ◽  
Heather S. Haeberle ◽  
Jaret M. Karnuta ◽  
Benedict U. Nwachukwu ◽  
...  

Artificial intelligence (AI) represents the fourth industrial revolution and the next frontier in medicine poised to transform the field of orthopaedics and sports medicine, though widespread understanding of the fundamental principles and adoption of applications remain nascent. Recent research efforts into implementation of AI in the field of orthopaedic surgery and sports medicine have demonstrated great promise in predicting athlete injury risk, interpreting advanced imaging, evaluating patient-reported outcomes, reporting value-based metrics, and augmenting the patient experience. Not unlike the recent emphasis thrust upon physicians to understand the business of medicine, the future practice of sports medicine specialists will require a fundamental working knowledge of the strengths, limitations, and applications of AI-based tools. With appreciation, caution, and experience applying AI to sports medicine, the potential to automate tasks and improve data-driven insights may be realized to fundamentally improve patient care. In this Current Concepts review, we discuss the definitions, strengths, limitations, and applications of AI from the current literature as it relates to orthopaedic sports medicine.


Circulation ◽  
2021 ◽  
Vol 144 (Suppl_2) ◽  
Author(s):  
Alexandra Weissman ◽  
Mariam Bramah Lawani ◽  
Thomas Rohan ◽  
Clifton W Callaway

Introduction: Pneumonia is common after out-of-hospital cardiac arrest (OHCA) but is difficult to diagnose in the first 72 hours following return of spontaneous circulation (ROSC); this results in early untargeted antibiotic administration based on non-specific imaging and laboratory findings. Antibiotic resistance is rising, is influenced by untargeted antibiotic administration, and can increase patient morbidity and mortality as well as healthcare costs. Precision methods of bacterial pathogen detection in OHCA patients are needed to improve patient care. This proof-of-concept pilot study aimed to assess feasibility of bacterial pathogen sequencing and comparability of sequencing results to clinical culture after OHCA. Methods: Blood and bronchoalveolar lavage (BAL) were obtained from residual clinical specimens collected within 12 hours of ROSC. Bacterial DNA was extracted using the Qiagen PowerLyzer PowerSoil DNA kit, sequenced using the MinION nanopore sequencer, and analyzed with Oxford Nanopore Technologies’ EPI2ME bioinformatics software. Sequencing results were compared to culture results using McNemar’s chi-square statistic. Study-defined pneumonia was based on presence of at least two characteristics within 72 hours of ROSC: fever (temperature ≥38°C); persistent leukocytosis >15,000 or leukopenia <3,500 for 48 hours; persistent chest radiography infiltrates for 48 hours per clinical radiology read; bacterial pathogen cultured. Results: We enrolled 38 consecutive OHCA subjects: mean age 61.8 years (18.0); 16 (42%) female; 25 (66%) White, 7 (18%) Black, 6 (16%) “Other” race; 7 subjects (18%) survived and 31 (82%) died; 16 (42%) subjects had pneumonia. Sequencing results were available in 12 hours while culture results were available in 48-72 hours after collection. There was a non-significant difference in the proportion of the same pathogens identified for each method per McNemar’s chi-square: p = 0.38, difference of 0.095 (-0.095, 0.286). Conclusions: Nanopore sequencing detects pathogenic bacteria comparable to clinical microbiologic culture and in less time. This technology can produce a paradigm shift in early bacterial pathogen detection in OHCA survivors, which can improve patient care. The technology is applicable to other patient populations and for viral and fungal pathogens.
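
For readers unfamiliar with McNemar's chi-square test, the sketch below shows how such a paired comparison can be computed from the two discordant counts (specimens positive by one method only). It assumes the uncorrected form of the statistic; the abstract does not say whether a continuity correction or an exact variant was used. The function name `mcnemar_chi2` and the counts are hypothetical and are not the study's data.

```python
import math


def mcnemar_chi2(b, c):
    """McNemar's chi-square test (no continuity correction) for paired binary results.

    b: pairs positive by method A only; c: pairs positive by method B only.
    Returns (statistic, p_value), referring the statistic to chi-square with 1 df.
    """
    n = b + c
    if n == 0:
        # No discordant pairs: the two methods never disagree.
        return 0.0, 1.0
    statistic = (b - c) ** 2 / n
    # For 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)), so no external library is needed.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical discordant counts, chosen only for illustration.
stat, p = mcnemar_chi2(b=15, c=5)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

Only the discordant pairs enter the statistic; specimens on which both methods agree do not affect it.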

