Automation of Lung Ultrasound Interpretation via Deep Learning for the Classification of Normal versus Abnormal Lung Parenchyma: A Multicenter Study

Diagnostics ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 2049
Author(s):  
Robert Arntfield ◽  
Derek Wu ◽  
Jared Tschirhart ◽  
Blake VanBerlo ◽  
Alex Ford ◽  
...  

Lung ultrasound (LUS) is an accurate thoracic imaging technique distinguished by its handheld size, low cost, and lack of radiation. User dependence and poor access to training have limited the impact and dissemination of LUS outside of acute care hospital environments. Automated interpretation of LUS using deep learning can overcome these barriers by increasing accuracy while allowing point-of-care use by non-experts. In this multicenter study, we seek to automate the clinically vital distinction between the A line (normal parenchyma) and B line (abnormal parenchyma) patterns on LUS by training a customized neural network on 272,891 labelled LUS images. After external validation on 23,393 frames, pragmatic clinical application at the clip level was performed on 1162 videos. The trained classifier demonstrated an area under the receiver operating characteristic curve (AUC) of 0.96 (+/−0.02) through 10-fold cross-validation on local frames and an AUC of 0.93 on the external validation dataset. Clip-level inference yielded sensitivities and specificities of 90% and 92% (local) and 83% and 82% (external), respectively, for detecting the B line pattern. This study demonstrates accurate deep-learning-enabled discrimination between normal and abnormal lung parenchyma on ultrasound frames while delivering diagnostically important sensitivity and specificity at the video clip level.
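Clip-level inference from a frame-level classifier typically aggregates per-frame predictions into a single call for the whole video. The sketch below shows one plausible aggregation rule in Python; both thresholds and the rule itself are illustrative assumptions, not the values tuned in the study.

```python
import numpy as np

def classify_clip(frame_probs: np.ndarray,
                  frame_threshold: float = 0.5,
                  clip_fraction: float = 0.4) -> str:
    """Aggregate per-frame B-line probabilities into a clip-level call.

    frame_probs: model outputs in [0, 1], one per frame of a single clip.
    The two thresholds are hypothetical, not the study's tuned values.
    """
    positive_frames = (frame_probs >= frame_threshold).sum()
    if positive_frames / len(frame_probs) >= clip_fraction:
        return "B line (abnormal parenchyma)"
    return "A line (normal parenchyma)"

# Example: a 60-frame clip where a third of the frames score high
probs = np.concatenate([np.full(20, 0.9), np.full(40, 0.1)])
print(classify_clip(probs))  # -> "A line (normal parenchyma)"
```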

2020 ◽  
Vol 12 (11) ◽  
pp. 1315-1322
Author(s):  
Zhuoxin Liang ◽  
Wenqiang Zhang ◽  
Yongjiang Jiang ◽  
Ping Wu ◽  
Senxiong Zhang ◽  
...  

Community-acquired pneumonia (CAP) refers to an infection contracted outside the hospital that leads to inflammation of the lung parenchyma. The clinical characteristics of Mycoplasma pneumoniae (M. pneumoniae) infection in CAP patients have rarely been reported. The aim of this study was to describe the clinical characteristics and the impact of co-infections of M. pneumoniae with viral and bacterial pathogens in hospitalized children with CAP in Liuzhou, China. This study retrospectively reviewed children diagnosed with CAP due to M. pneumoniae infection at a tertiary maternal and child health care hospital. Data on co-infecting pathogens, demographics, clinical characteristics, and hospitalization costs were collected from the hospital's electronic medical record system. A total of 983 children were diagnosed with mycoplasmal CAP in 2017. Among them, 18.2% had a bacterial-M. pneumoniae co-infection and 11.3% had a viral-M. pneumoniae co-infection. The infection rate of M. pneumoniae peaked at 19.1% in February and March, while the rates of bacterial-M. pneumoniae and viral-M. pneumoniae co-infections peaked at 3.6% in December and 2.3% in January, respectively. The prevalence of coughing and wheezing differed significantly between the bacterial- or viral-M. pneumoniae co-infection groups and the mono-infection group. Furthermore, the rates of chest X-ray progression, pleural effusion, respiratory failure, and ventilation were higher in the respiratory viral- and bacterial-M. pneumoniae co-infection groups than in the mono-infection group. Children with a bacterial or respiratory viral co-infection had longer hospitalizations and higher treatment costs than those with an M. pneumoniae mono-infection (P < 0.001). We conclude that children with mycoplasmal CAP and a bacterial or viral co-infection who present with coughing and wheezing and show radiographic progression are at risk of severe disease progression and should be specifically treated and managed.


Diagnostics ◽  
2021 ◽  
Vol 11 (9) ◽  
pp. 1672
Author(s):  
Luya Lian ◽  
Tianer Zhu ◽  
Fudong Zhu ◽  
Haihua Zhu

Objectives: Deep learning methods have achieved impressive diagnostic performance in the field of radiology. The current study aimed to use deep learning methods to detect caries lesions, classify their radiographic extension on panoramic films, and compare the classification results with those of expert dentists. Methods: A total of 1160 dental panoramic films were evaluated by three expert dentists. All caries lesions in the films were marked with circles, whose combination was defined as the reference dataset. A training and validation dataset (1071 films) and a test dataset (89 films) were then established from the reference dataset. A convolutional neural network, nnU-Net, was applied to detect caries lesions, and DenseNet121 was applied to classify the lesions according to their depth (lesions in the outer, middle, or inner third of dentin: D1/D2/D3). The performance of the trained nnU-Net and DenseNet121 models on the test dataset was compared with the results of six expert dentists in terms of the intersection over union (IoU), Dice coefficient, accuracy, precision, recall, negative predictive value (NPV), and F1-score metrics. Results: nnU-Net yielded caries lesion segmentation IoU and Dice coefficient values of 0.785 and 0.663, respectively, and the accuracy and recall rate of nnU-Net were 0.986 and 0.821, respectively. The expert dentists and the neural network showed no significant differences in accuracy, precision, recall, NPV, or F1-score. For caries depth classification, DenseNet121 showed an overall accuracy of 0.957 for D1 lesions, 0.832 for D2 lesions, and 0.863 for D3 lesions. The recall for D1/D2/D3 lesions was 0.765, 0.652, and 0.918, respectively. All metric values, including accuracy, precision, recall, NPV, and F1-score, did not differ significantly from those of the experienced dentists. Conclusion: In detecting and classifying caries lesions on dental panoramic radiographs, the performance of deep learning methods was similar to that of expert dentists. The impact of applying these well-trained neural networks for disease diagnosis and treatment decision making should be explored.
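The two segmentation overlap metrics reported above are standard and straightforward to compute from binary masks. A minimal sketch in Python; function and variable names are illustrative, not from the study's code:

```python
import numpy as np

def iou_and_dice(pred: np.ndarray, truth: np.ndarray) -> tuple[float, float]:
    """Overlap metrics for two binary segmentation masks of the same shape.

    IoU  = |pred AND truth| / |pred OR truth|
    Dice = 2 * |pred AND truth| / (|pred| + |truth|)
    """
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    iou = intersection / union
    dice = 2 * intersection / (pred.sum() + truth.sum())
    return float(iou), float(dice)

# Example: two 4-pixel lesions overlapping in 3 pixels
pred = np.array([1, 1, 1, 1, 0, 0], dtype=bool)
truth = np.array([0, 1, 1, 1, 1, 0], dtype=bool)
print(iou_and_dice(pred, truth))  # (0.6, 0.75)
```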


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Yan Luo ◽  
Zhiyu Wang ◽  
Cong Wang

Abstract Background Prognostication is an essential tool for risk adjustment and decision making in intensive care units (ICUs). In order to improve patient outcomes, we have been trying to develop a model that measures the severity of ICU patients more effectively than the Acute Physiology and Chronic Health Evaluation (APACHE) II. The aim of the present study was to provide a mortality prediction model for ICU patients and to assess its performance relative to prediction based on the APACHE II scoring system. Methods We used the Medical Information Mart for Intensive Care version III (MIMIC-III) database to build our model. After comparing APACHE II with 6 typical machine learning (ML) methods, the best-performing model was selected for external validation on another independent dataset. Performance measures were calculated using cross-validation to avoid biased assessments. The primary outcome was hospital mortality. Finally, we used the TreeSHAP algorithm to explain the variable relationships in the extreme gradient boosting (XGBoost) model. Results We picked out 14 variables with 24,777 cases to form our basic dataset. When the variables were the same as those contained in APACHE II, the accuracy of XGBoost (0.858) was higher than that of APACHE II (0.742) and the other algorithms. In addition, XGBoost exhibited better calibration than the other methods, with an area under the ROC curve (AUC) of 0.76. We then expanded the variable set by adding five new variables to improve the performance of our model. The accuracy, precision, recall, F1, and AUC of the XGBoost model increased and remained higher than those of the other models (0.866, 0.853, 0.870, 0.845, and 0.81, respectively). On the external validation dataset, the AUC was 0.79 and calibration was good. Conclusions Compared with the conventional APACHE II severity score, our XGBoost model offers improved performance for predicting hospital mortality in ICU patients. Furthermore, TreeSHAP can enhance the understanding of our model by providing detailed insights into the impact of different features on disease risk. In sum, our model could help clinicians determine prognosis and improve patient outcomes.
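To make the XGBoost-plus-TreeSHAP workflow concrete, here is a minimal, self-contained Python sketch on synthetic tabular data; the hyperparameters and the synthetic 14-feature matrix are illustrative assumptions, not the study's MIMIC-III pipeline:

```python
import shap
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 14 tabular ICU variables; not the MIMIC-III data.
X, y = make_classification(n_samples=5000, n_features=14, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = xgb.XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# TreeSHAP decomposes each prediction into per-feature contributions.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)  # global view of feature impact
```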


2020 ◽  
Vol 9 (5) ◽  
pp. 1593 ◽  
Author(s):  
Young Joo Yang ◽  
Bum-Joo Cho ◽  
Myung-Je Lee ◽  
Ju Han Kim ◽  
Hyun Lim ◽  
...  

Background: Classification of colorectal neoplasms during colonoscopic examination is important to avoid unnecessary endoscopic biopsy or resection. This study aimed to develop and validate deep learning models that automatically classify colorectal lesions histologically on white-light colonoscopy images. Methods: White-light colonoscopy images of colorectal lesions with pathological results were collected and classified into seven categories: stage T1-T4 colorectal cancer (CRC), high-grade dysplasia (HGD), tubular adenoma (TA), and non-neoplasms. The images were then re-classified into four categories: advanced CRC, early CRC/HGD, TA, and non-neoplasms. Two convolutional neural network models were trained, and their performance was evaluated on an internal test dataset and an external validation dataset. Results: In total, 3828 images were collected from 1339 patients. The mean accuracies of the ResNet-152 model for the seven-category and four-category classifications were 60.2% and 67.3% on the internal test dataset, and 74.7% and 79.2% on the external validation dataset (240 images), respectively. In the external validation, ResNet-152 outperformed two endoscopists for four-category classification and showed a higher mean area under the curve (AUC) for detecting TA+ lesions (0.818) than the worst-performing endoscopist. The mean AUC for detecting HGD+ lesions reached 0.876 with Inception-ResNet-v2. Conclusions: A deep learning model showed promising performance in classifying colorectal lesions on white-light colonoscopy images; this model could help endoscopists build optimal treatment strategies.
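A common way to build such a classifier is to fine-tune an ImageNet-pretrained backbone with a new classification head. Below is a minimal PyTorch sketch for the four-category task, assuming torchvision and 224x224 RGB inputs; the study's actual preprocessing, augmentation, and training schedule are not reproduced here:

```python
import torch
import torch.nn as nn
from torchvision import models

num_classes = 4  # advanced CRC, early CRC/HGD, TA, non-neoplasm
model = models.resnet152(weights=models.ResNet152_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, num_classes)  # replace the head

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on a dummy batch of 224x224 RGB images.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```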


2021 ◽  
Author(s):  
Joon-myoung Kwon ◽  
Ye Rang Lee ◽  
Min-Seung Jung ◽  
Yoon-Ji Lee ◽  
Yong-Yeon Jo ◽  
...  

Abstract Background: Sepsis is a life-threatening organ dysfunction and a major healthcare burden worldwide. Although sepsis is a medical emergency that requires immediate management, screening for its occurrence is difficult. In this study, we propose a deep learning-based model (DLM) for screening sepsis using electrocardiography (ECG). Methods: This retrospective cohort study included 46,017 patients admitted to two hospitals, of whom 1,548 developed sepsis and 639 developed septic shock. The DLM was developed using 73,727 ECGs of 18,142 patients, and internal validation was conducted using 7,774 ECGs of 7,774 patients. Furthermore, we conducted an external validation with 20,101 ECGs of 20,101 patients from another hospital to verify the applicability of the DLM across centers. Results: During internal and external validation, the areas under the receiver operating characteristic curve (AUCs) of the DLM using 12-lead ECG for screening sepsis were 0.901 (95% confidence interval 0.882–0.920) and 0.863 (0.846–0.879), respectively. During internal and external validation, the AUCs of the DLM for detecting septic shock were 0.906 (95% CI = 0.877–0.936) and 0.899 (95% CI = 0.872–0.925), respectively. The AUCs of the DLM for detecting sepsis using 6-lead and single-lead ECGs were 0.845–0.882. A sensitivity map showed that the QRS complex and T wave were associated with sepsis. A subgroup analysis was conducted using ECGs from 4,609 patients admitted with infectious disease; the AUC of the DLM for predicting in-hospital mortality was 0.817 (0.793–0.840). There was a significant difference in the DLM's prediction score according to the presence of infection in the validation dataset (0.277 vs 0.574, p<0.001), including severe acute respiratory syndrome coronavirus 2 (0.260 vs 0.725, p=0.018). Conclusions: The DLM demonstrated reasonable performance for screening sepsis using 12-, 6-, and single-lead ECGs. The results suggest that sepsis can be screened using not only conventional ECG devices but also diverse wearable-type ECG devices employing the DLM, thereby preventing irreversible disease progression and mortality.
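The sensitivity map mentioned above is commonly computed as the gradient of the model's output with respect to the input signal. A minimal PyTorch sketch, assuming a model that maps a (1, 12, n_samples) ECG tensor to a scalar sepsis probability; the study's exact attribution method is not specified here:

```python
import torch

def ecg_saliency(model: torch.nn.Module, ecg: torch.Tensor) -> torch.Tensor:
    """Return |d(sepsis score)/d(input)| per lead and time step.

    ecg: tensor of shape (1, 12, n_samples), e.g. 10 s sampled at 500 Hz.
    Large values mark the samples (e.g. around the QRS complex or T wave)
    that most influence the model's predicted sepsis probability.
    """
    model.eval()
    ecg = ecg.clone().requires_grad_(True)
    score = model(ecg).squeeze()  # assumed scalar sepsis probability
    score.backward()
    return ecg.grad.abs()
```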


2021 ◽  
Vol 149 (5) ◽  
pp. 3626-3634
Author(s):  
Federico Mento ◽  
Tiziano Perrone ◽  
Anna Fiengo ◽  
Andrea Smargiassi ◽  
Riccardo Inchingolo ◽  
...  

2021 ◽  
Author(s):  
Edward Korot ◽  
Nikolas Pontikos ◽  
Xiaoxuan Liu ◽  
Siegfried K Wagner ◽  
Livia Faes ◽  
...  

Abstract Deep learning may transform health care, but model development has largely been dependent on the availability of advanced technical expertise. Herein we present the development of a deep learning model by clinicians without coding, which predicts reported sex from retinal fundus photographs. A model was trained on 84,743 retinal fundus photos from the UK Biobank dataset. External validation was performed on 252 fundus photos from a tertiary ophthalmic referral center. For internal validation, the area under the receiver operating characteristic curve (AUROC) of the code-free deep learning (CFDL) model was 0.93. Sensitivity, specificity, positive predictive value (PPV), and accuracy (ACC) were 88.8%, 83.6%, 87.3%, and 86.5% for internal validation, and 83.9%, 72.2%, 78.2%, and 78.6% for external validation, respectively. Clinicians are currently unaware of distinct retinal feature variations between males and females, highlighting the importance of model explainability for this task. The model performed significantly worse when foveal pathology was present in the external validation dataset (ACC 69.4%, compared to 85.4% in healthy eyes; OR 0.36, 95% CI 0.19–0.70, p = 0.0022), suggesting the fovea is a salient region for model performance. Automated machine learning (AutoML) may enable clinician-driven automated discovery of novel insights and disease biomarkers.
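Even when a code-free platform produces the model, the reported operating metrics can be verified from a confusion matrix. A minimal Python sketch; the labels below are dummy values, not the study's data:

```python
import numpy as np

def operating_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Sensitivity, specificity, PPV, and accuracy from binary predictions."""
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "accuracy": (tp + tn) / len(y_true),
    }

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_pred = np.array([1, 1, 0, 0, 0, 1, 1, 0])
print(operating_metrics(y_true, y_pred))
```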


2020 ◽  
Vol 11 ◽  
pp. 374
Author(s):  
Masahito Katsuki ◽  
Yukinari Kakizawa ◽  
Akihiro Nishikawa ◽  
Yasunaga Yamamoto ◽  
Toshiya Uchiyama

Background: Reliable prediction models of subarachnoid hemorrhage (SAH) outcomes are needed for treatment decision making. The SAFIRE score, which uses only four variables, is a good prediction scoring system. However, building such prediction models requires a large number of samples and time-consuming statistical analysis. Deep learning (DL), a branch of artificial intelligence, is attractive, but there have been no reports on prediction models for SAH outcomes using DL. We herein built a prediction model using the DL software Prediction One (Sony Network Communications Inc., Tokyo, Japan) and compared it to the SAFIRE score. Methods: We used data from 153 consecutive aneurysmal SAH patients treated at our hospital between 2012 and 2019. A modified Rankin Scale (mRS) score of 0–3 at 6 months was defined as a favorable outcome. We randomly divided the patients into a training dataset of 102 patients and an external validation dataset of 51 patients. Prediction One built the prediction model from the training dataset with internal cross-validation. We used both the created model and the SAFIRE score to predict outcomes on the external validation set, and the areas under the curve (AUCs) were compared. Results: The model built by Prediction One using 28 variables had an AUC of 0.848, and its AUC for the validation dataset was 0.953 (95%CI 0.900–1.000). AUCs calculated using the SAFIRE score were 0.875 for the training dataset and 0.960 for the validation dataset, respectively. Conclusion: We easily and quickly built prediction models using Prediction One, even with a small single-center dataset. The accuracy of the model was comparable to that of previous statistically derived prediction models.
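Comparing the DL model with the SAFIRE score on the same validation set reduces to comparing two AUCs. A minimal, self-contained Python sketch with dummy arrays standing in for the 51-patient validation set; the scores are illustrative, not the study's data:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Dummy stand-ins: 1 = favorable outcome (mRS 0-3 at 6 months); the two
# score vectors mimic a model output and a clinical point score.
y_true = rng.integers(0, 2, size=51)
dl_scores = np.clip(y_true * 0.6 + rng.normal(0.2, 0.2, 51), 0, 1)
safire_scores = y_true * 0.5 + rng.normal(0.25, 0.25, 51)

print("DL model AUC:", roc_auc_score(y_true, dl_scores))
print("SAFIRE AUC:  ", roc_auc_score(y_true, safire_scores))
```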


2020 ◽  
Author(s):  
Hsin-Hsiung Chang ◽  
Chia-Lin Wu ◽  
Ping-Fang Chiu

Abstract Background: Creatinine is widely used to estimate renal function, but this is not practical in critical illness. Low creatinine has been associated with mortality in many clinical settings. However, the associations between predialysis creatinine level, Sepsis-related Organ Failure Assessment (SOFA) score, and mortality in acute kidney injury patients receiving dialysis therapy (AKI-D) have not been fully addressed. Methods: We extracted data for AKI-D patients in the eICU (n = 1,992) and MIMIC (n = 1,001) databases. We conducted a retrospective observational cohort study using the eICU dataset. The study cohort was divided into a high-creatinine group and a low-creatinine group by the median value (4 mg/dL). Baseline patient information included demographic data, laboratory parameters, medications, and comorbid conditions. The independent association of creatinine level with mortality was examined using multivariate logistic regression analysis. We also carried out a sensitivity analysis using the MIMIC dataset. Results: Among all 1,992 eICU participants, the 30-day intensive care unit mortality rate was 32.2%. The crude overall mortality rate in the low-creatinine group (43.7%) was significantly higher than that in the high-creatinine group (20.6%; P < 0.001). In the fully adjusted models, the high-creatinine group was associated with a lower risk of all-cause mortality (odds ratio, 0.56; 95% confidence interval, 0.42–0.75; P < 0.001) compared with the low-creatinine group. The nonrenal SOFA score was higher in the low-creatinine group. The results were consistent when the MIMIC dataset was used as an external validation dataset. Conclusions: AKI-D patients with a low predialysis creatinine value had a significantly higher risk of mortality, which might be associated with greater organ dysfunction. Moreover, SOFA and nonrenal SOFA scores did not sufficiently reflect the severity of illness without considering the impact of the creatinine value in AKI-D patients.
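The adjusted odds ratio reported above comes from a multivariate logistic regression. A minimal statsmodels sketch on synthetic data; the covariates and the simulated effect size are illustrative assumptions, not the study's full adjustment set:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
# Synthetic cohort; "high_creatinine" = 1 if predialysis creatinine >= 4 mg/dL.
df = pd.DataFrame({
    "high_creatinine": rng.integers(0, 2, 500),
    "age": rng.normal(65, 12, 500),
    "nonrenal_sofa": rng.integers(0, 16, 500),
})
# Simulate mortality with a protective high-creatinine effect (hypothetical).
logit = 0.5 - 0.6 * df["high_creatinine"] + 0.1 * df["nonrenal_sofa"]
df["died"] = (rng.random(500) < 1 / (1 + np.exp(-logit))).astype(int)

X = sm.add_constant(df[["high_creatinine", "age", "nonrenal_sofa"]])
fit = sm.Logit(df["died"], X).fit(disp=0)
print(np.exp(fit.params))      # adjusted odds ratios
print(np.exp(fit.conf_int()))  # 95% confidence intervals
```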


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Suhong Kim ◽  
Peter Lee ◽  
Kyeong Taek Oh ◽  
Min Soo Byun ◽  
Dahyun Yi ◽  
...  

Abstract Background Considering the limited accessibility of amyloid positron emission tomography (PET) in patients with dementia, we proposed a deep learning (DL)-based amyloid PET positivity classification model using PET images with 2-deoxy-2-[fluorine-18]fluoro-D-glucose (2-[18F]FDG). Methods We used 2-[18F]FDG PET datasets from the Alzheimer's Disease Neuroimaging Initiative and the Korean Brain Aging Study for the Early Diagnosis and Prediction of Alzheimer's Disease for model development, along with an independent dataset from another hospital. A 2.5-D deep learning architecture was constructed using 291 submodules, with images along the three axes as input. We conducted a voxel-wise analysis to assess the regions with substantial differences in glucose metabolism between the amyloid PET-positive and PET-negative participants, which facilitated an understanding of the deep model's classifications. In addition, we compared these regions with the classification probabilities from the submodules. Results There were 686 out of 1433 (47.9%) and 50 out of 100 (50%) amyloid PET-positive participants in the training and internal validation datasets and the external validation dataset, respectively. Over 50 iterations of model training and validation, the model achieved an area under the curve (AUC) of 0.811 (95% confidence interval (CI) 0.803–0.819) and 0.798 (95% CI, 0.789–0.807) on the internal and external validation datasets, respectively. When the best-performing model (AUC 0.864) was tested on the external validation dataset, the AUC was 0.860, with 75.0% accuracy, 76.0% sensitivity, 74.0% specificity, and a 75.0% F1-score. We found an overlap between the regions within the default mode network and those generating high classification values. Conclusion The proposed model, based on 2-[18F]FDG PET imaging data and a DL framework, might successfully classify amyloid PET positivity in clinical practice without performing amyloid PET, which has limited accessibility.
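A 2.5-D architecture typically feeds a 2-D network with orthogonal slices taken from the 3-D volume instead of processing the full volume in 3-D. A minimal NumPy sketch of this input construction; the shapes and slice choices are illustrative, not the study's 291-submodule design:

```python
import numpy as np

volume = np.random.rand(96, 96, 96)          # dummy FDG-PET volume (x, y, z)
cx, cy, cz = (s // 2 for s in volume.shape)  # pick the central slices

axial    = volume[:, :, cz]
coronal  = volume[:, cy, :]
sagittal = volume[cx, :, :]

# Stack the three orthogonal views as channels for a 2-D CNN submodule.
inputs = np.stack([axial, coronal, sagittal], axis=0)  # shape (3, 96, 96)
print(inputs.shape)
```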

