Development and evaluation of an artificial intelligence system for COVID-19 diagnosis

2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Cheng Jin ◽  
Weixiang Chen ◽  
Yukun Cao ◽  
Zhanwei Xu ◽  
Zimeng Tan ◽  
...  

Abstract Early detection of COVID-19 based on chest CT enables timely treatment of patients and helps control the spread of the disease. We proposed an artificial intelligence (AI) system for rapid COVID-19 detection and performed extensive statistical analysis of CTs of COVID-19 based on the AI system. We developed and evaluated our system on a large dataset with more than 10 thousand CT volumes from COVID-19, influenza-A/B, non-viral community acquired pneumonia (CAP) and non-pneumonia subjects. In such a difficult multi-class diagnosis task, our deep convolutional neural network-based system is able to achieve an area under the receiver operating characteristic curve (AUC) of 97.81% for multi-way classification on a test cohort of 3,199 scans, and AUCs of 92.99% and 93.25% on two publicly available datasets, CC-CCII and MosMedData, respectively. In a reader study involving five radiologists, the AI system outperforms all of the radiologists in more challenging tasks at a speed two orders of magnitude above theirs. The diagnostic performance of chest x-ray (CXR) is compared to that of CT. A detailed interpretation of the deep network is also performed to relate system outputs to CT presentations. The code is available at https://github.com/ChenWWWeixiang/diagnosis_covid19.

Author(s):  
Cheng Jin ◽  
Weixiang Chen ◽  
Yukun Cao ◽  
Zhanwei Xu ◽  
Xin Zhang ◽  
...  

Abstract Early detection of COVID-19 based on chest CT will enable timely treatment of patients and help control the spread of the disease. With the rapid spread of COVID-19 in many countries, however, CT volumes of suspected patients are increasing at a speed much faster than the availability of human experts. Here, we propose an artificial intelligence (AI) system for fast COVID-19 diagnosis with an accuracy comparable to experienced radiologists. A large dataset was constructed by collecting 970 CT volumes of 496 patients with confirmed COVID-19 and 260 negative cases from three hospitals in Wuhan, China, and 1,125 negative cases from two publicly available chest CT datasets. Trained using only 312 cases, our diagnosis system, which is based on a deep convolutional neural network, is able to achieve an accuracy of 94.98%, an area under the receiver operating characteristic curve (AUC) of 97.91%, a sensitivity of 94.06%, and a specificity of 95.47% on an independent external verification dataset of 1,255 cases. In a reader study involving five radiologists, only one radiologist is slightly more accurate than the AI system. The AI system is two orders of magnitude faster than radiologists and the code is available at https://github.com/ChenWWWeixiang/diagnosis_covid19.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Albert T. Young ◽  
Kristen Fernandez ◽  
Jacob Pfau ◽  
Rasika Reddy ◽  
Nhat Anh Cao ◽  
...  

Abstract Artificial intelligence models match or exceed dermatologists in melanoma image classification. Less is known about their robustness against real-world variations, and clinicians may incorrectly assume that a model with an acceptable area under the receiver operating characteristic curve or related performance metric is ready for clinical use. Here, we systematically assessed the performance of dermatologist-level convolutional neural networks (CNNs) on real-world non-curated images by applying computational “stress tests”. Our goal was to create a proxy environment in which to comprehensively test the generalizability of off-the-shelf CNNs developed without training or evaluation protocols specific to individual clinics. We found inconsistent predictions on images captured repeatedly in the same setting or subjected to simple transformations (e.g., rotation). Such transformations resulted in false positive or negative predictions for 6.5–22% of skin lesions across test datasets. Our findings indicate that models meeting conventionally reported metrics need further validation with computational stress tests to assess clinic readiness.


Author(s):  
Kangkang Hong ◽  
Ziping Shu ◽  
Laodong Li ◽  
Yu Zhong ◽  
Weiqian Chen ◽  
...  

Scrub typhus is often misdiagnosed in febrile patients, leading to antibiotic abuse and multiple complications. We conducted a retrospective record review at the Fourth Affiliated Hospital of Guangxi Medical University in China. Data were collected on 52 patients with a confirmed diagnosis of scrub typhus and complete clinical data. In addition, data were collected on 52 patients with bloodstream infection, 25 patients with HIV infection, 112 patients with common community-acquired pneumonia (CCAP), and 36 patients with severe community-acquired pneumonia (SCAP) to serve as control groups. The peripheral blood CD4 and CD8 counts, CD4/CD8 ratio, C-reactive protein, procalcitonin, alanine aminotransferase, aspartate aminotransferase, creatinine, and β2 microglobulin levels; and the white blood cell count and neutrophil percentage were compared between the scrub typhus and the control groups. The value of these biomarkers in the diagnosis of scrub typhus was assessed using receiver–operating characteristic curve analysis. The scrub typhus group had a significantly lower CD4 count and CD4/CD8 ratio than the bloodstream infection, CCAP, and SCAP groups, and a significantly greater CD4 count and CD4/CD8 ratio than the HIV infection group. In contrast, the scrub typhus group had a significantly greater CD8 count than the bloodstream infection and CCAP and SCAP groups, and it had a lower level of CD8 than the HIV infection group. The areas under the curve of CD4/CD8 were more than 0.93 in the receiver–operating characteristic curve analysis. These findings suggest that the CD4/CD8 ratio is a useful ancillary test for diagnosing scrub typhus.


Diagnostics ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 530
Author(s):  
Christian Salvatore ◽  
Matteo Interlenghi ◽  
Caterina B. Monti ◽  
Davide Ippolito ◽  
Davide Capra ◽  
...  

We assessed the role of artificial intelligence applied to chest X-rays (CXRs) in supporting the diagnosis of COVID-19. We trained and cross-validated a model with an ensemble of 10 convolutional neural networks with CXRs of 98 COVID-19 patients, 88 community-acquired pneumonia (CAP) patients, and 98 subjects without either COVID-19 or CAP, collected in two Italian hospitals. The system was tested on two independent cohorts, namely, 148 patients (COVID-19, CAP, or negative) collected by one of the two hospitals (independent testing I) and 820 COVID-19 patients collected by a multicenter study (independent testing II). On the training and cross-validation dataset, sensitivity, specificity, and area under the curve (AUC) were 0.91, 0.87, and 0.93 for COVID-19 versus negative subjects, 0.85, 0.82, and 0.94 for COVID-19 versus CAP. On the independent testing I, sensitivity, specificity, and AUC were 0.98, 0.88, and 0.98 for COVID-19 versus negative subjects, 0.97, 0.96, and 0.98 for COVID-19 versus CAP. On the independent testing II, the system correctly diagnosed 652 COVID-19 patients versus negative subjects (0.80 sensitivity) and correctly differentiated 674 COVID-19 versus CAP patients (0.82 sensitivity). This system appears promising for the diagnosis and differential diagnosis of COVID-19, showing its potential as a second opinion tool in conditions of the variable prevalence of different types of infectious pneumonia.
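The sensitivities quoted for independent testing II follow directly from the stated counts, since that cohort contained only COVID-19 patients (so sensitivity is the only rate that can be computed on it; specificity would need negative or CAP cases). A minimal sketch of that arithmetic, with variable names chosen here purely for illustration:

```python
# Counts taken from the abstract; variable names are illustrative only.
n_patients = 820            # COVID-19 patients in independent testing II
correct_vs_negative = 652   # correctly diagnosed in the COVID-19 vs negative task
correct_vs_cap = 674        # correctly differentiated in the COVID-19 vs CAP task

# Sensitivity = true positives / all truly positive cases.
sens_vs_negative = correct_vs_negative / n_patients
sens_vs_cap = correct_vs_cap / n_patients

print(f"{sens_vs_negative:.2f} {sens_vs_cap:.2f}")
```

Rounded to two decimals, both values agree with the 0.80 and 0.82 reported in the abstract.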


10.2196/24163 ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. e24163
Author(s):  
Md Mohaimenul Islam ◽  
Hsuan-Chia Yang ◽  
Tahmina Nasrin Poly ◽  
Yu-Chuan Jack Li

Background Laboratory tests are considered an essential part of patient safety as patients’ screening, diagnosis, and follow-up are solely based on laboratory tests. Diagnosis of patients could be wrong, missed, or delayed if laboratory tests are performed erroneously. However, the value of correct laboratory test ordering remains underestimated by policymakers and clinicians. Nowadays, artificial intelligence methods such as machine learning and deep learning (DL) have been extensively used as powerful tools for pattern recognition in large data sets. Therefore, developing an automated laboratory test recommendation tool using available data from electronic health records (EHRs) could support current clinical practice. Objective The objective of this study was to develop an artificial intelligence–based automated model that can provide laboratory test recommendations based on simple variables available in EHRs. Methods A retrospective analysis of the National Health Insurance database between January 1, 2013, and December 31, 2013, was performed. We reviewed the records of all patients who visited the cardiology department at least once and were prescribed laboratory tests. The data set was split into training and testing sets (80:20) to develop the DL model. In the internal validation, 25% of data were randomly selected from the training set to evaluate the performance of this model. Results We used the area under the receiver operating characteristic curve, precision, recall, and Hamming loss as comparative measures. A total of 129,938 prescriptions were used in our model. The DL-based automated recommendation system for laboratory tests achieved a significantly higher area under the receiver operating characteristic curve (AUROCmacro and AUROCmicro of 0.76 and 0.87, respectively). Using a low cutoff, the model identified appropriate laboratory tests with 99% sensitivity. Conclusions The developed artificial intelligence model based on DL exhibited good discriminative capability for predicting laboratory tests using routinely collected EHR data. Utilization of DL approaches can facilitate optimal laboratory test selection for patients, which may in turn improve patient safety. However, future study is recommended to assess the cost-effectiveness of implementing this model in real-world clinical settings.
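The abstract reports macro- and micro-averaged AUROC and Hamming loss without defining them. The sketch below illustrates the standard definitions on a toy two-label problem; the data, the 0.5 threshold, and the function names are all made up for illustration and are not taken from the study.

```python
def auc(labels, scores):
    """AUROC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy multi-label data: rows are prescriptions, columns are two hypothetical tests.
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.4, 0.35], [0.6, 0.3], [0.1, 0.25]]
n_labels = len(y_true[0])

# Macro average: compute one AUROC per label, then take the unweighted mean.
per_label = [
    auc([row[j] for row in y_true], [row[j] for row in y_score])
    for j in range(n_labels)
]
auroc_macro = sum(per_label) / n_labels

# Micro average: pool every (label, score) pair and compute a single AUROC.
flat_true = [y for row in y_true for y in row]
flat_score = [s for row in y_score for s in row]
auroc_micro = auc(flat_true, flat_score)

# Hamming loss: fraction of individual label slots predicted wrongly at a threshold.
threshold = 0.5  # illustrative cutoff, not the one used in the study
errors = sum(
    (s >= threshold) != bool(y)
    for row_t, row_s in zip(y_true, y_score)
    for y, s in zip(row_t, row_s)
)
hamming = errors / len(flat_true)

print(auroc_macro, auroc_micro, hamming)
```

In this toy case each label is ranked perfectly on its own, so the macro average is 1.0, but the second label's scores sit on a lower scale, which drags the pooled micro average down to 0.875. The two averages can therefore diverge in either direction, as they do in the abstract (0.76 versus 0.87).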


2020 ◽  
Vol 7 (5) ◽  
Author(s):  
Linghang Wang ◽  
Yao Liu ◽  
Ting Zhang ◽  
Yuyong Jiang ◽  
Siyuan Yang ◽  
...  

Abstract Background There is currently a lack of nonspecific laboratory indicators as a quantitative standard to distinguish between the 2019 coronavirus disease (COVID-19) and an influenza A or B virus infection. Thus, the aim of this study was to establish a nomogram to detect COVID-19. Methods A nomogram was established using data collected from 457 patients (181 with COVID-19 and 276 with influenza A or B infection) in China. The nomogram used age, lymphocyte percentage, and monocyte count to differentiate COVID-19 from influenza. Results Our nomogram predicted probabilities of COVID-19 with an area under the receiver operating characteristic curve of 0.913 (95% confidence interval [CI], 0.883–0.937), greater than that of the lymphocyte:monocyte ratio (0.849; 95% CI, 0.812–0.880; P = .0007), lymphocyte percentage (0.808; 95% CI, 0.768–0.843; P < .0001), monocyte count (0.780; 95% CI, 0.739–0.817; P < .0001), or age (0.656; 95% CI, 0.610–0.699; P < .0001). The predicted probability conformed to the real observation outcomes of COVID-19, according to the calibration curves. Conclusions We found that age, lymphocyte percentage, and monocyte count are risk factors for the early-stage prediction of patients infected with the 2019 novel coronavirus. As such, our research provides a useful test for doctors to differentiate COVID-19 from influenza.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Yiqiu Shen ◽  
Farah E. Shamout ◽  
Jamie R. Oliver ◽  
Jan Witowski ◽  
Kawshik Kannan ◽  
...  

Abstract Though consistently shown to detect mammographically occult cancers, breast ultrasound has been noted to have high false-positive rates. In this work, we present an AI system that achieves radiologist-level accuracy in identifying breast cancer in ultrasound images. Developed on 288,767 exams, consisting of 5,442,907 B-mode and Color Doppler images, the AI achieves an area under the receiver operating characteristic curve (AUROC) of 0.976 on a test set consisting of 44,755 exams. In a retrospective reader study, the AI achieves a higher AUROC than the average of ten board-certified breast radiologists (AUROC: 0.962 AI, 0.924 ± 0.02 radiologists). With the help of the AI, radiologists decrease their false positive rates by 37.3% and reduce requested biopsies by 27.8%, while maintaining the same level of sensitivity. This highlights the potential of AI in improving the accuracy, consistency, and efficiency of breast ultrasound diagnosis.


Author(s):  
Agustín Julián-Jiménez ◽  
Juan González del Castillo ◽  
Eric Jorge García-Lamberechts ◽  
Rafael Rubio Díaz ◽  
...  

Objective. To analyse a new risk score to predict bacteremia in patients with community-acquired pneumonia (CAP) in emergency departments. Patients and methods. Prospective, multicenter observational cohort study of the blood cultures ordered in 74 Spanish emergency departments for patients with CAP seen from November 1, 2019, to March 31, 2020. The predictive ability of the model was analyzed with the area under the receiver operating characteristic curve (AUC-ROC). The prognostic performance for true bacteremia was calculated at the chosen cut-off in terms of sensitivity, specificity, positive predictive value and negative predictive value. Results. A total of 1,020 blood samples were cultured. True cases of bacteremia were confirmed in 162 (15.9%). The remaining 858 cultures (84.1%) were negative, and 59 (5.8%) were judged to be contaminated. The model's area under the receiver operating characteristic curve was 0.915 (95% CI, 0.898-0.933). The prognostic performance with a model cut-off value of ≥ 5 points achieved 97.5% (95% CI, 95.1-99.9) sensitivity, 73.2% (95% CI, 70.2-76.2) specificity, 40.9% (95% CI, 36.4-45.1) positive predictive value and 99.4% (95% CI, 99.1-99.8) negative predictive value. Conclusion. The 5MPB-Toledo score is useful for predicting bacteremia in patients with CAP seen in emergency departments.
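The predictive values in this abstract depend on the prevalence of bacteremia in the cohort as well as on sensitivity and specificity. As a rough consistency check, the sketch below applies Bayes' rule to the rounded figures quoted above; the function name is illustrative, and because the inputs are rounded percentages rather than the study's raw counts, the result is only expected to land close to the reported values.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    tp = sensitivity * prevalence              # true-positive fraction of the cohort
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)

# Figures quoted in the abstract: 162 true bacteremias among 1,020 cultures.
prevalence = 162 / 1020
ppv, npv = predictive_values(sensitivity=0.975, specificity=0.732, prevalence=prevalence)

print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these inputs the PPV comes out at roughly 0.41 and the NPV at roughly 0.99, in line with the reported 40.9% and 99.4%. The low PPV alongside a high NPV is what one would expect at a prevalence near 16%.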


2021 ◽  
Author(s):  
Yiqiu Shen ◽  
Farah E. Shamout ◽  
Jamie R. Oliver ◽  
Jan Witowski ◽  
Kawshik Kannan ◽  
...  

Abstract Ultrasound is an important imaging modality for the detection and characterization of breast cancer. Though consistently shown to detect mammographically occult cancers, especially in women with dense breasts, breast ultrasound has been noted to have high false-positive rates. In this work, we present an artificial intelligence (AI) system that achieves radiologist-level accuracy in identifying breast cancer in ultrasound images. To develop and validate this system, we curated a dataset consisting of 288,767 ultrasound exams from 143,203 patients examined at NYU Langone Health, between 2012 and 2019. On a test set consisting of 44,755 exams, the AI system achieved an area under the receiver operating characteristic curve (AUROC) of 0.976. In a reader study, the AI system achieved a higher AUROC than the average of ten board-certified breast radiologists (AUROC: 0.962 AI, 0.924±0.02 radiologists). With the help of the AI, radiologists decreased their false positive rates by 37.4% and reduced the number of requested biopsies by 27.8%, while maintaining the same level of sensitivity. To confirm its generalizability, we evaluated our system on an independent external test dataset where it achieved an AUROC of 0.911. This highlights the potential of AI in improving the accuracy, consistency, and efficiency of breast ultrasound diagnosis worldwide.


2020 ◽  
Vol 77 (9) ◽  
pp. 597-602
Author(s):  
Xiaohua Wang ◽  
Juezhao Yu ◽  
Qiao Zhu ◽  
Shuqiang Li ◽  
Zanmei Zhao ◽  
...  

Objectives To investigate the potential of deep learning in assessing pneumoconiosis depicted on digital chest radiographs and to compare its performance with certified radiologists. Methods We retrospectively collected a dataset consisting of 1881 chest X-ray images in the form of digital radiography. These images were acquired in a screening setting on subjects who had a history of working in an environment that exposed them to harmful dust. Among these subjects, 923 were diagnosed with pneumoconiosis, and 958 were normal. To identify the subjects with pneumoconiosis, we applied a classical deep convolutional neural network (CNN) called Inception-V3 to these image sets and validated the classification performance of the trained models using the area under the receiver operating characteristic curve (AUC). In addition, we asked two certified radiologists to independently interpret the images in the testing dataset and compared their performance with the computerised scheme. Results The Inception-V3 CNN architecture, which was trained on the combination of the three image sets, achieved an AUC of 0.878 (95% CI 0.811 to 0.946). The performance of the two radiologists in terms of AUC was 0.668 (95% CI 0.555 to 0.782) and 0.772 (95% CI 0.677 to 0.866), respectively. The agreement between the two readers was moderate (kappa: 0.423, p<0.001). Conclusion Our experimental results demonstrated that the deep learning solution could achieve a relatively better performance in classification as compared with other models and the certified radiologists, suggesting the feasibility of deep learning techniques in screening pneumoconiosis.

