A preliminary study of commercially available general-purpose chest radiography artificial intelligence-based software for detecting airspace opacity lesions in COVID-19 patients

Author(s):  
Munemura Suzuki ◽  
Aruta Niimura ◽  
Yusuke Nakamura ◽  
Yujiro Otsuka

Purpose To validate commercially available general-purpose artificial intelligence (AI)-based software for detecting airspace opacity in chest radiographs (CXRs) of COVID-19 patients. Materials and Methods We used the ieee8023-covid-chestxray-dataset to validate commercial AI software capable of detecting "Nodule/Mass" and "Airspace opacity" as regions of interest with probability scores. From this dataset, we excluded computed tomography images and CXR images taken using an anteroposterior spine view and analyzed CXR images tagged with "Pneumonia/Viral/COVID-19" and "no findings". A radiologist then reviewed the images and rated them on a 3-point opacity score for the presence of airspace opacity. The maximum probability score of airspace opacity for each image was calculated using this software. The difference in each maximum probability for each opacity score was evaluated using Wilcoxon's rank sum test. The threshold of the probability score was determined by receiver operator characteristic curve analysis for the presence or absence of COVID-19, and the true positive rate (TPR) and false positive rate (FPR) were determined for the individual and overall opacity scores. Results Images from 342 patients with COVID-19 and 15 normal images were included. Opacity scores of 1, 2, and 3 were observed in 44, 70, and 243 images, respectively, of which 33 (75%), 66 (94.2%), and 243 (100%), respectively, were from COVID-19 patients. The overall TPR and FPR were 0.82 and 0.13, respectively, at an area under the curve of 0.88 and a threshold of 0.06, while the FPR for opacity score 1 was 0.18 and the TPR for score 3 was 0.97. Conclusion Using a public database containing CXR images of COVID-19 patients, commercial AI software was shown to be able to detect airspace opacity in severe pneumonia. Summary Commercially available AI software was capable of detecting airspace opacity in CXR images of COVID-19 patients in a public database.
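As an illustration of how a TPR and FPR follow from a probability threshold, the minimal Python sketch below counts both rates for a list of per-image scores. The function name, the toy scores, and the labels are hypothetical and are not taken from the study; only the threshold value of 0.06 echoes the abstract, and the rule "score at or above the threshold counts as positive" is an assumption.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (TPR, FPR) when cases with score >= threshold are called positive.

    labels: 1 for a diseased case, 0 for a normal case (assumed encoding).
    """
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    # Guard against empty classes so the division is always defined.
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpr, fpr


# Hypothetical toy data: maximum airspace-opacity probability per image.
toy_scores = [0.01, 0.05, 0.07, 0.30, 0.90]
toy_labels = [0, 1, 0, 1, 1]
tpr, fpr = rates_at_threshold(toy_scores, toy_labels, threshold=0.06)
print(f"TPR={tpr:.2f} FPR={fpr:.2f}")
```

Repeating the call over a grid of thresholds gives the points of an empirical ROC curve.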

2021 ◽  
Author(s):  
Ying-Shi Sun ◽  
Yu-Hong Qu ◽  
Dong Wang ◽  
Yi Li ◽  
Lin Ye ◽  
...  

Abstract Background: Computer-aided diagnosis using deep learning algorithms has been initially applied in the field of mammography, but there is no large-scale clinical application. Methods: This study aimed to develop and verify an artificial intelligence model based on mammography. First, retrospectively collected mammograms from six centers were randomized to a training dataset and a validation dataset for establishing the model. Second, the model was tested by comparing the performance of 12 radiologists with and without it. Finally, prospectively collected multicenter mammograms were diagnosed by radiologists with the model. The detection and diagnostic capabilities were evaluated using the free-response receiver operating characteristic (FROC) curve and ROC curve. Results: The sensitivity of the model for detecting lesions after matching was 0.908 at a false positive rate of 0.25 in unilateral images. The area under the ROC curve (AUC) for distinguishing benign from malignant lesions was 0.855 (95% CI: 0.830, 0.880). The performance of the 12 radiologists with the model was higher than that of radiologists alone (AUC: 0.852 vs. 0.808, P = 0.005). The mean reading time with the model was shorter than that of reading alone (80.18 s vs. 62.28 s, P = 0.03). In prospective application, the sensitivity of detection reached 0.887 at a false positive rate of 0.25; the AUC of radiologists with the model was 0.983 (95% CI: 0.978, 0.988), with sensitivity, specificity, PPV, and NPV of 94.36%, 98.07%, 87.76%, and 99.09%, respectively. Conclusions: The artificial intelligence model exhibits high accuracy for detecting and diagnosing breast lesions, improves diagnostic accuracy, and saves time. Trial registration: NCT, NCT03708978. Registered 17 April 2018, https://register.clinicaltrials.gov/prs/app/ NCT03708978


2020 ◽  
Vol 9 (12) ◽  
pp. 3860
Author(s):  
J. Luis Espinoza ◽  
Le Thanh Dong

Nearly one-quarter of all cancer deaths worldwide are due to lung cancer, making this disease the leading cause of cancer death among both men and women. The most important determinant of survival in lung cancer is the disease stage at diagnosis; thus, developing an effective screening method for early diagnosis has been a long-term goal in lung cancer care. In the last decade, based on the results of large clinical trials, lung cancer screening programs using low-dose computed tomography (LDCT) in high-risk individuals have been implemented in some clinical settings. However, this method has various limitations, especially a high false-positive rate, which eventually results in a number of unnecessary diagnostic and therapeutic interventions among the screened subjects. By using complex algorithms and software, artificial intelligence (AI) is capable of emulating human cognition in the analysis, interpretation, and comprehension of complicated data, and it is currently being applied successfully in various healthcare settings. Given its ability to quantify information from images and its superior capability in recognizing complex patterns in images compared to humans, AI has the potential to aid clinicians in the interpretation of LDCT images obtained in the setting of lung cancer screening. In the last decade, several AI models aimed at improving lung cancer detection have been reported. Some algorithms performed as well as, or even outperformed, experienced radiologists in distinguishing benign from malignant lung nodules, and some of those models improved diagnostic accuracy and decreased the false-positive rate. Here, we discuss recent publications in which AI algorithms are used to assess chest computed tomography (CT) scans obtained in the setting of lung cancer screening.


PEDIATRICS ◽  
1991 ◽  
Vol 88 (1) ◽  
pp. 121-131
Author(s):  
Michael R. DeBaun ◽  
Harold C. Sox

Erythrocyte protoporphyrin (EP) was introduced in the 1970s as an inexpensive screening test for lead poisoning. As greater knowledge of lead poisoning has accumulated, the recommended EP level at which further evaluation for lead poisoning should be initiated has been lowered from ≥50 µg/dL to ≥35 µg/dL. The purpose of this study was to evaluate the utility of this EP threshold. A receiver operator characteristic curve was constructed to assess the relationship between the true-positive rate and false-positive rate of EP at various decision thresholds. The receiver operator characteristic curve was constructed with data from the second National Health and Nutrition Examination Survey from 1976 to 1980, which included 2673 children 6 years of age or younger who had both blood lead and EP level determinations. Decision analysis was then used to determine the optimal EP decision threshold for detecting a blood lead level ≥25 µg/dL. The receiver operator characteristic curve demonstrated that EP is a poor predictor of a blood lead level ≥25 µg/dL. At the currently recommended EP decision threshold of 35 µg/dL, the true-positive rates and false-positive rates of EP are 0.23 and 0.04, respectively. As a result of the inadequate performance of EP screening for lead poisoning, when the prevalence of lead poisoning is greater than 8%, there is no EP decision threshold that optimizes the relationship between the cost of screening normal children and the benefit of detecting lead-poisoned children. Erythrocyte protoporphyrin measurement is not sufficiently sensitive to be recommended uniformly as a screening test for lead poisoning.


2021 ◽  
Vol 17 (1) ◽  
Author(s):  
N. Khantavee ◽  
C. Chanthick ◽  
A. Tungtrongchitr ◽  
N. Techakriengkrai ◽  
S. Suradhat ◽  
...  

Abstract Background In dogs with atopic dermatitis, intradermal testing (IDT) or allergen-specific IgE serological testing is routinely employed to identify causative allergens. These allergens can then be used for allergen-specific immunotherapy and allergy management. The clinical relevance of this testing is affected by the source of allergen, and other biomarkers that are more related to specific allergens still need to be identified. The aim of this study was to investigate levels of specific IgE, total IgG, and IgG1 and IgG2 subclasses against the local house dust mites (HDM) Dermatophagoides farinae (DF) and D. pteronyssinus (DP) as biomarkers by using in-house ELISAs in healthy (n = 33) and atopic dogs (AD) (n = 44) that were either positive or negative by IDT to HDM. Results Being over 3 years of age was a risk factor for AD (Odds Ratio (OR) = 4.10, 95% Confidence interval (CI) 1.57–10.75, p = 0.0049), but there was no relation to IDT outcomes (OR = 0.9091, 95% CI 0.22–3.74, p = 1.00). High levels of all antibody isotypes (IgE, IgG, IgG1 and IgG2) against HDM were found in aged healthy dogs (> 3 years old). In AD, HDM-IgE and IgG1 levels were higher in dogs that were IDT positive to HDM than in IDT negative animals. Levels of IgE and IgG1 could be used to distinguish the specific allergens, whereas total IgG and IgG2 levels were not different between IDT-positive and IDT-negative AD. By the receiver operating characteristic curve at a false-positive rate = 0.10, both IgE and IgG1 showed better sensitivity than IgG and IgG2. Similar to IgE, serum IgG1 concentration was also relevant to IDT outcomes. Conclusions Our in-house ELISAs coated with local HDM were useful for evaluating antibody levels, and we propose use of the HDM-specific IgG1 subclass as a biomarker to detect HDM-specific allergens in AD, potentially together with an IgE-based platform.


2021 ◽  
Author(s):  
Hisashi Ohseto ◽  
Mami Ishikuro ◽  
Taku Obara ◽  
Keiko Murakami ◽  
Tomomi Onuma ◽  
...  

Abstract Objective The aim of our study was to develop a prediction model for preeclampsia (PE) using items routinely examined in early pregnancy, especially the dipstick test for proteinuria. Method The Tohoku Medical Megabank Project Birth and Three-Generation Cohort Study recruited pregnant women, and we included 9,086 of them in the analysis. Maternal basic characteristics were obtained by self-report, and blood pressure and dipstick test results for proteinuria were obtained from medical records at regular antenatal care. The outcome was defined as PE, including superimposed preeclampsia. We developed a prediction model without the dipstick test for proteinuria (model 1) and a model with it (model 2), and we compared them by the mean area under the receiver operating characteristic curve (mAUROC) using five-fold cross validation. Results The mAUROC of model 1 was 0.769 (95% CI: 0.741 to 0.797) and that of model 2 was 0.785 (95% CI: 0.758 to 0.812). The difference between the two mAUROCs was 0.016 (95% CI: 0.004 to 0.028). In model 2, detection rates at false-positive rates of 5%, 10% and 20% were 40%, 49% and 64%, respectively. Conclusions We developed a prediction model for PE using routine antenatal care items, and it was improved by including the dipstick test for proteinuria.
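A detection rate at a fixed false-positive rate can be read off predicted risks as in the minimal Python sketch below. It is only an illustration under stated assumptions: the function name and the toy risks are hypothetical, the cut-off is taken from the ranked scores of unaffected cases, and nothing here reproduces the study's model or data.

```python
import math


def detection_rate_at_fpr(scores, labels, max_fpr):
    """Share of affected cases flagged when at most max_fpr of unaffected cases are flagged.

    labels: 1 for an affected case, 0 for an unaffected case (assumed encoding).
    A case is flagged when its score is strictly above the chosen cut-off.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    if not pos or not neg:
        raise ValueError("need at least one affected and one unaffected case")
    # Number of unaffected cases we may flag; the small epsilon absorbs float rounding.
    allowed_fp = math.floor(max_fpr * len(neg) + 1e-9)
    if allowed_fp >= len(neg):
        cutoff = float("-inf")
    else:
        # Flagging everything above this score flags at most allowed_fp unaffected cases.
        cutoff = neg[allowed_fp]
    detected = sum(1 for s in pos if s > cutoff)
    return detected / len(pos)


# Hypothetical toy risks: ten unaffected and four affected pregnancies.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 0.85, 0.95, 0.5, 0.75]
toy_labels = [0] * 10 + [1] * 4
print(detection_rate_at_fpr(toy_scores, toy_labels, max_fpr=0.20))
```

The strict comparison keeps the realized false-positive rate at or below the target even when unaffected cases share the cut-off score.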


2020 ◽  
Vol 34 (04) ◽  
pp. 4239-4246
Author(s):  
Tomoharu Iwata ◽  
Akinori Fujino ◽  
Naonori Ueda

The partial area under a receiver operating characteristic curve (pAUC) is a performance measurement for binary classification problems that summarizes the true positive rate over a specific range of the false positive rate. Obtaining classifiers that achieve high pAUC is important in a wide variety of applications, such as cancer screening and spam filtering. Although many methods have been proposed for maximizing the pAUC, existing methods require many labeled data for training. In this paper, we propose a semi-supervised learning method for maximizing the pAUC, which trains a classifier with a small amount of labeled data and a large amount of unlabeled data. To exploit the unlabeled data, we derive two approximations of the pAUC: the first is calculated from positive and unlabeled data, and the second is calculated from negative and unlabeled data. A classifier is trained by maximizing the weighted sum of the two approximations of the pAUC and the pAUC that is calculated from positive and negative data. With experiments using various datasets, we demonstrate that the proposed method achieves higher test pAUCs than existing methods.
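To make the quantity concrete, the minimal Python sketch below computes an empirical pAUC from fully labeled scores by trapezoidal integration of the ROC curve between two false positive rates. It is not the semi-supervised method proposed in the paper; the function names and the toy data are hypothetical, and linear interpolation between ROC points is an assumption.

```python
def roc_points(scores, labels):
    """Empirical ROC curve as a list of (FPR, TPR) points, starting at (0, 0)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        # Consume all examples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def partial_auc(points, fpr_low, fpr_high):
    """Area under the ROC curve restricted to fpr_low <= FPR <= fpr_high."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        lo, hi = max(x0, fpr_low), min(x1, fpr_high)
        if hi <= lo:
            continue  # vertical segment or segment outside the range
        slope = (y1 - y0) / (x1 - x0)
        y_lo = y0 + slope * (lo - x0)
        y_hi = y0 + slope * (hi - x0)
        area += 0.5 * (y_lo + y_hi) * (hi - lo)
    return area


# Hypothetical toy data: two positives and two negatives.
curve = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(partial_auc(curve, 0.0, 0.5), partial_auc(curve, 0.0, 1.0))
```

Setting the range to the full interval from 0 to 1 recovers the ordinary AUC, which is a convenient sanity check.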


Water ◽  
2018 ◽  
Vol 11 (1) ◽  
pp. 51 ◽  
Author(s):  
Nguyen Long ◽  
Florimond De Smedt

Rainfall-induced landslides form an important natural threat in Vietnam. The purpose of this study is to explore regional landslide susceptibility mapping in the mountainous district of A Luoi in Thua Thien Hue Province, where data on the occurrence and causes of landslides are very limited. Three methods are applied to examine landslide susceptibility: statistical index, logistic regression and certainty factor. Nine causative factors are considered: elevation, slope, geological strata, fault density, geomorphic landforms, weathering crust, land use, distance to rivers and annual precipitation. The reliability of the landslide susceptibility maps is evaluated by a receiver operating characteristic curve and the area under the curve is used to quantify and compare the prediction accuracy of the models. The certainty factor model performs best. This model is optimized by maximizing the difference between the true positive rate and the false positive rate. The optimal model correctly identifies 84% of the observed landslides. The results are verified with a validation test, whereby the model is calibrated with 75% randomly selected observed landslides, while the remaining 25% of the observed landslides are used for validation. The validation test correctly identifies 81% of the observed landslides in the training set and 73% of the observed landslides in the validation set.
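Maximizing the difference between the true positive rate and the false positive rate can be sketched in Python as a search over candidate cut-offs. The sketch is illustrative only: the function name and the toy susceptibility values are hypothetical, and treating every observed score as a candidate cut-off is an assumption rather than the authors' procedure.

```python
def best_cutoff_by_tpr_minus_fpr(scores, labels):
    """Return (cutoff, TPR, FPR) maximizing TPR - FPR.

    A location is predicted as landslide-prone when its score >= cutoff.
    labels: 1 for an observed landslide, 0 otherwise (assumed encoding).
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative location")
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
        tpr, fpr = tp / n_pos, fp / n_neg
        # Keep the first cutoff that reaches the largest difference.
        if best is None or (tpr - fpr) > (best[1] - best[2]):
            best = (cutoff, tpr, fpr)
    return best


# Hypothetical toy susceptibility scores for seven map cells.
toy_scores = [0.1, 0.2, 0.65, 0.7, 0.8, 0.3, 0.25]
toy_labels = [0, 0, 0, 1, 1, 0, 1]
print(best_cutoff_by_tpr_minus_fpr(toy_scores, toy_labels))
```

The maximized difference is commonly known as Youden's J statistic; it weights missed landslides and false alarms equally, which may not suit every hazard-mapping application.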


2019 ◽  
Vol 11 (16) ◽  
pp. 4386 ◽  
Author(s):  
Pham ◽  
Shirzadi ◽  
Shahabi ◽  
Omidvar ◽  
Singh ◽  
...  

Landslides have multidimensional effects on the socioeconomic as well as environmental conditions of the impacted areas. The aim of this study is the spatial prediction of landslides using hybrid machine learning models, including bagging (BA), random subspace (RS) and rotation forest (RF), with an alternating decision tree (ADTree) as the base classifier, in the northern part of the Pithoragarh district, Uttarakhand, Himalaya, India. To construct the database, ten conditioning factors and a total of 103 landslide locations with a ratio of 70/30 were used. The significant factors were determined by the chi-square attribute evaluation (CSEA) technique. The validity of the hybrid models was assessed by true positive rate (TP Rate), false positive rate (FP Rate), recall (sensitivity), precision, F-measure and area under the receiver operating characteristic curve (AUC). The results showed that land cover was the most important factor, while curvature had no effect on landslide occurrence in the study area and was removed from the modelling process. Additionally, the results indicated that although all ensemble models enhanced the predictive power of the ADTree classifier (AUCtraining = 0.859; AUCvalidation = 0.813), the RS ensemble model (AUCtraining = 0.883; AUCvalidation = 0.842) outperformed the RF (AUCtraining = 0.871; AUCvalidation = 0.840) and BA (AUCtraining = 0.865; AUCvalidation = 0.836) ensemble models. The obtained results would be helpful for recognizing landslide-prone areas in the future, to better manage and decrease the damage and negative impacts on the environment.


2020 ◽  
Author(s):  
Prasannavenkatesan Theerthagiri ◽  
I.Jeena Jacob ◽  
A.Usha Ruby ◽  
Y.Vamsidhar

Abstract This paper studies different machine learning classification algorithms for predicting COVID-19 recovered and deceased cases. The k-fold cross-validation resampling technique is used to validate the prediction model. The prediction scores of each algorithm are evaluated with performance metrics such as prediction accuracy, precision, recall, mean square error, confusion matrix, and kappa score. For the given dataset, the k-nearest neighbour (KNN) classification algorithm achieves a prediction accuracy of 80.4 %, an improvement of 1.5 to 3.3 % over the other algorithms. The KNN algorithm predicts 92 % (true positive rate) of the deceased cases correctly, with 0.077 % misclassification. Further, the KNN algorithm produces the lowest error rate (0.19) in predicting COVID-19 cases among the algorithms compared. Its receiver operator characteristic curve also yields an output value of 82 %.


10.2196/10010 ◽  
2019 ◽  
Vol 7 (3) ◽  
pp. e10010 ◽  
Author(s):  
Jiayi Shen ◽  
Casper J P Zhang ◽  
Bangsheng Jiang ◽  
Jiebin Chen ◽  
Jian Song ◽  
...  

Background Artificial intelligence (AI) has been extensively used in a range of medical fields to promote therapeutic development. The development of diverse AI techniques has also contributed to early detections, disease diagnoses, and referral management. However, concerns about the value of advanced AI in disease diagnosis have been raised by health care professionals, medical service providers, and health policy decision makers. Objective This review aimed to systematically examine the literature, in particular, focusing on the performance comparison between advanced AI and human clinicians to provide an up-to-date summary regarding the extent of the application of AI to disease diagnoses. By doing so, this review discussed the relationship between the current advanced AI development and clinicians with respect to disease diagnosis and thus therapeutic development in the long run. Methods We systematically searched articles published between January 2000 and March 2019 following the Preferred Reporting Items for Systematic reviews and Meta-Analysis in the following databases: Scopus, PubMed, CINAHL, Web of Science, and the Cochrane Library. According to the preset inclusion and exclusion criteria, only articles comparing the medical performance between advanced AI and human experts were considered. Results A total of 9 articles were identified. A convolutional neural network was the commonly applied advanced AI technology. Owing to the variation in medical fields, there is a distinction between individual studies in terms of classification, labeling, training process, dataset size, and algorithm validation of AI. Performance indices reported in articles included diagnostic accuracy, weighted errors, false-positive rate, sensitivity, specificity, and the area under the receiver operating characteristic curve. The results showed that the performance of AI was at par with that of clinicians and exceeded that of clinicians with less experience. 
Conclusions Current AI development has a diagnostic performance that is comparable with medical experts, especially in image recognition-related fields. Further studies can be extended to other types of medical imaging such as magnetic resonance imaging and other medical practices unrelated to images. With the continued development of AI-assisted technologies, the clinical implications underpinned by clinicians’ experience and guided by patient-centered health care principle should be constantly considered in future AI-related and other technology-based medical research.

