A calibrated deep learning ensemble for abnormality detection in musculoskeletal radiographs

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Minliang He ◽  
Xuming Wang ◽  
Yijun Zhao

Abstract Musculoskeletal disorders affect the locomotor system and are the leading contributor to disability worldwide. Patients suffer chronic pain and limitations in mobility, dexterity, and functional ability. Musculoskeletal (bone) X-ray is an essential tool in diagnosing these abnormalities. In recent years, deep learning algorithms have increasingly been applied in musculoskeletal radiology and have produced remarkable results. In our study, we introduce a new calibrated ensemble of deep learners for the task of identifying abnormal musculoskeletal radiographs. Our model leverages the strengths of three baseline deep neural networks (ConvNet, ResNet, and DenseNet), which are typically employed either directly or as the backbone architecture in existing deep learning-based approaches in this domain. Experimental results on the public MURA dataset demonstrate that our proposed model outperforms the three individual models and a traditional ensemble learner, achieving an overall performance of (AUC: 0.93, Accuracy: 0.87, Precision: 0.93, Recall: 0.81, Cohen’s kappa: 0.74). The model also outperforms expert radiologists in three of the seven upper extremity anatomical regions, with a leading performance of (AUC: 0.97, Accuracy: 0.93, Precision: 0.90, Recall: 0.97, Cohen’s kappa: 0.85) in the humerus region. We further apply the class activation map technique to highlight the areas essential to our model’s decision-making process. Given that the best radiologist performance is between 0.73 and 0.78 in Cohen’s kappa statistic, our study provides convincing results supporting the utility of a calibrated ensemble approach for assessing abnormalities in musculoskeletal X-rays.
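To make the ensembling idea concrete, below is a minimal PyTorch sketch of a calibrated soft-voting ensemble over three CNN backbones. It assumes a recent torchvision, uses VGG-16 as a stand-in for the plain ConvNet, and applies per-member temperature scaling as the calibration step; it illustrates the general approach only and is not the authors' exact architecture or training recipe.

```python
import torch
import torch.nn as nn
from torchvision import models

class CalibratedEnsemble(nn.Module):
    """Soft-voting ensemble of three CNN backbones with per-member
    temperature scaling (a common post-hoc calibration method).
    Hypothetical sketch -- not the paper's exact architecture."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Three baseline learners; final layers replaced for binary output.
        resnet = models.resnet50(weights="IMAGENET1K_V1")
        resnet.fc = nn.Linear(resnet.fc.in_features, num_classes)
        densenet = models.densenet121(weights="IMAGENET1K_V1")
        densenet.classifier = nn.Linear(densenet.classifier.in_features, num_classes)
        convnet = models.vgg16(weights="IMAGENET1K_V1")  # stand-in for a plain ConvNet
        convnet.classifier[-1] = nn.Linear(convnet.classifier[-1].in_features, num_classes)
        self.members = nn.ModuleList([resnet, densenet, convnet])
        # One learnable temperature per member, fitted on a validation split.
        self.temperatures = nn.Parameter(torch.ones(3))

    def forward(self, x):
        probs = []
        for member, t in zip(self.members, self.temperatures):
            logits = member(x) / t.clamp(min=1e-3)   # temperature scaling
            probs.append(torch.softmax(logits, dim=1))
        return torch.stack(probs).mean(dim=0)        # soft vote over members
```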

2021 ◽  
Vol 11 (6) ◽  
pp. 2723
Author(s):  
Fatih Uysal ◽  
Fırat Hardalaç ◽  
Ozan Peker ◽  
Tolga Tolunay ◽  
Nil Tokgöz

The shoulder has a wider range of motion than other joints in the body, and fractures occur in this area for various reasons. To diagnose these fractures, data gathered from X-radiation (X-ray), magnetic resonance imaging (MRI), or computed tomography (CT) are used. This study aims to help physicians by using artificial intelligence to classify shoulder images taken from X-ray devices as fracture/non-fracture. For this purpose, the performances of 26 deep learning-based pre-trained models in the detection of shoulder fractures were evaluated on the musculoskeletal radiographs (MURA) dataset, and two ensemble learning models (EL1 and EL2) were developed. The pre-trained models used are ResNet, ResNeXt, DenseNet, VGG, Inception, MobileNet, and their spinal fully connected (Spinal FC) versions. For the EL1 and EL2 models, which were built from the best-performing pre-trained models, test accuracy was 0.8455 and 0.8472, Cohen’s kappa was 0.6907 and 0.6942, and the area under the receiver operating characteristic (ROC) curve (AUC) for the fracture class was 0.8862 and 0.8695, respectively. Across the 28 classifications in total, the highest test accuracy and Cohen’s kappa values were obtained with the EL2 model, and the highest AUC value was obtained with the EL1 model.
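As a hedged illustration of how the reported metrics relate to model outputs, the scikit-learn snippet below computes test accuracy, Cohen's kappa, and the fracture-class AUC from held-out predictions; the labels and probabilities here are invented for demonstration and are not from the MURA experiments.

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, roc_auc_score

# Hypothetical held-out predictions for a fracture/non-fracture test set.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])          # 1 = fracture
p_fracture = np.array([0.91, 0.22, 0.65, 0.80, 0.35, 0.10, 0.48, 0.05])
y_pred = (p_fracture >= 0.5).astype(int)              # threshold at 0.5

print("accuracy      :", accuracy_score(y_true, y_pred))
print("Cohen's kappa :", cohen_kappa_score(y_true, y_pred))
# AUC computed against the predicted probability of the fracture class.
print("AUC (fracture):", roc_auc_score(y_true, p_fracture))
```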


Stroke ◽  
2021 ◽  
Author(s):  
Maximilian Nielsen ◽  
Moritz Waldmann ◽  
Andreas M. Frölich ◽  
Fabian Flottmann ◽  
Evelin Hristova ◽  
...  

Background and Purpose: Mechanical thrombectomy is an established procedure for treatment of acute ischemic stroke. Mechanical thrombectomy success is commonly assessed by the Thrombolysis in Cerebral Infarction (TICI) score, assigned by visual inspection of X-ray digital subtraction angiography data. However, expert-based TICI scoring is highly observer-dependent. This represents a major obstacle for mechanical thrombectomy outcome comparison in, for instance, multicentric clinical studies. Focusing on occlusions of the M1 segment of the middle cerebral artery, the present study aimed to develop a deep learning (DL) solution for automated and, therefore, objective TICI scoring, to evaluate the agreement of DL- and expert-based scoring, and to compare the corresponding numbers to published scoring variability of clinical experts. Methods: The study comprises 2 independent datasets. For DL system training and initial evaluation, an in-house dataset of 491 digital subtraction angiography series and modified TICI scores of 236 patients with M1 occlusions was collected. To test the model’s generalization capability, an independent external dataset with 95 digital subtraction angiography series was analyzed. Key characteristics of the DL system were the modeling of TICI scoring as ordinal regression, explicit consideration of the temporal image information, integration of physiological knowledge, and modeling of inherent TICI scoring uncertainties. Results: For the in-house dataset, the DL system yields Cohen’s kappa, overall accuracy, and specific agreement values of 0.61, 71%, and 63% to 84%, respectively, compared with the gold standard: the expert rating. Values drop slightly to 0.52/64%/43% to 87% when the model is applied, without changes, to the external dataset. After model updating, they increase to 0.65/74%/60% to 90%. Literature Cohen’s kappa values for expert-based TICI scoring agreement are on the order of 0.6. Conclusions: The agreement of DL- and expert-based modified TICI scores within the range of published interobserver variability of clinical experts highlights the potential of the proposed DL solution for automated TICI scoring.
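Because the method hinges on treating TICI grades as ordered rather than independent classes, here is a minimal, hypothetical PyTorch sketch of a cumulative ("is the grade greater than k?") ordinal-regression head of the kind commonly used for ordered grades. It only illustrates the general formulation, not the authors' model; the feature extractor, temporal modeling, uncertainty handling, and the exact number of modified TICI grades are all assumptions here.

```python
import torch
import torch.nn as nn

class OrdinalHead(nn.Module):
    """Minimal ordinal-regression head in the cumulative style:
    one shared score plus (num_grades - 1) ordered thresholds."""

    def __init__(self, in_features: int, num_grades: int = 4):
        super().__init__()
        self.score = nn.Linear(in_features, 1)
        self.thresholds = nn.Parameter(torch.arange(num_grades - 1, dtype=torch.float))

    def forward(self, features):
        s = self.score(features)                      # (batch, 1) shared score
        # Cumulative probabilities P(grade > k), one per threshold k.
        return torch.sigmoid(s - self.thresholds)     # (batch, num_grades - 1)

def to_grade(cum_probs, threshold=0.5):
    # Predicted grade = number of "greater than k" decisions that fire.
    return (cum_probs > threshold).sum(dim=1)

head = OrdinalHead(in_features=128, num_grades=4)
print(to_grade(head(torch.randn(2, 128))))            # two predicted grades in 0..3
```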


Author(s):  
Abhinav Sharma ◽  
Emily Oulousian ◽  
Jiayi Ni ◽  
Renato Lopes ◽  
Matthew Pellan Cheng ◽  
...  

Abstract Aims Artificial intelligence (A.I.)-driven voice-based assistants may facilitate data capture in clinical care and trials; however, the feasibility and accuracy of using such devices in a healthcare environment are unknown. We explored the feasibility of using the Amazon Alexa (‘Alexa’) A.I. voice-assistant to screen for risk factors or symptoms relating to SARS-CoV-2 exposure in quaternary care cardiovascular clinics. Methods We enrolled participants to be screened for signs and symptoms of SARS-CoV-2 exposure by a healthcare provider and then subsequently by the Alexa. Our primary outcome was the interrater reliability of Alexa versus healthcare provider screening, assessed using Cohen’s kappa statistic. Participants rated the Alexa in a post-study survey (scale of 1 to 5, with 5 reflecting strongly agree). This study was approved by the McGill University Health Centre ethics board. Results We prospectively enrolled 215 participants. The mean age was 46 years (standard deviation [SD] 17.7 years), 55% were female, and 31% were French speakers (the others were English speakers). In total, 645 screening questions were delivered by Alexa. The Alexa misidentified one response. The simple and weighted Cohen’s kappa statistics between Alexa and healthcare provider screening were 0.989 (95% CI: 0.982, 0.997) and 0.992 (95% CI: 0.985, 0.999), respectively. The participants gave an overall mean rating of 4.4 (out of 5, SD 0.9). Conclusion Our study demonstrates the feasibility of an A.I.-driven multilingual voice-based assistant to collect data in the context of SARS-CoV-2 exposure screening. Future studies integrating such devices in cardiovascular healthcare delivery and clinical trials are warranted. Registration https://clinicaltrials.gov/ct2/show/NCT04508972
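For readers unfamiliar with the agreement statistic used as the primary outcome, the scikit-learn snippet below computes simple and weighted Cohen's kappa between two raters; the paired yes/no answers are hypothetical and do not come from the study data.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical paired yes/no screening answers (1 = symptom reported).
provider = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
alexa    = [1, 0, 0, 1, 0, 1, 0, 1, 1, 0]   # one mismatched response

print("simple kappa:  ", cohen_kappa_score(provider, alexa))
# Weighted kappa gives partial credit to near-misses on an ordinal scale; with
# only two answer levels it coincides with the simple statistic, but it differs
# once responses take more than two ordered levels.
print("weighted kappa:", cohen_kappa_score(provider, alexa, weights="quadratic"))
```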


Nutrients ◽  
2020 ◽  
Vol 12 (5) ◽  
pp. 1436 ◽  
Author(s):  
Adelle M. Gadowski ◽  
Tracy A. McCaffrey ◽  
Stephane Heritier ◽  
Andrea J. Curtis ◽  
Natalie Nanayakkara ◽  
...  

The aim of this study was to assess the relative validity and reproducibility of a six-item Australian Short Dietary Screener (Aus-SDS). The Aus-SDS assessed the daily intake of core food groups (vegetables, fruits, legumes and beans, cereals, protein sources and dairy sources) in 100 Australians (52 males and 48 females) aged ≥70 years. Relative validity was assessed by comparing intakes from the Aus-SDS1 with an average of three 24-h recalls (24-HRs), and reproducibility using two administrations of the Aus-SDS (Aus-SDS1 and Aus-SDS2). Cohen’s kappa statistic between the Aus-SDS1 and 24-HRs showed moderate to good agreement, ranging from 0.44 for fruits and dairy to 0.64 for protein. There was poor agreement for legume intake (0.12). Bland–Altman plots demonstrated acceptable limits of agreement between the Aus-SDS1 and 24-HRs for all food groups. Median intakes obtained from Aus-SDS1 and Aus-SDS2 did not differ. For all food groups, Cohen’s kappa statistic ranged from 0.68 to 0.89, indicating acceptable agreement between the Aus-SDS1 and Aus-SDS2. Spearman’s correlation coefficient between Aus-SDS1 and 24-HRs across all food groups ranged from 0.64 for fruit to 0.83 for protein. We found the Aus-SDS to be a useful tool in assessing daily intake of core food groups in this population.
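For context on the Bland–Altman analysis mentioned above, the short NumPy sketch below computes the bias and 95% limits of agreement between two hypothetical sets of daily serving estimates; the numbers are invented for illustration and are not Aus-SDS data.

```python
import numpy as np

# Hypothetical daily vegetable servings from the screener and the 24-h recalls.
aus_sds    = np.array([3.0, 2.5, 4.0, 1.5, 3.5, 2.0, 5.0, 2.5])
recall_24h = np.array([2.5, 3.0, 4.5, 1.0, 3.0, 2.5, 4.5, 3.0])

diff = aus_sds - recall_24h
bias = diff.mean()                                       # mean difference (bias)
sd = diff.std(ddof=1)                                    # SD of the differences
loa_low, loa_high = bias - 1.96 * sd, bias + 1.96 * sd   # 95% limits of agreement

print(f"bias: {bias:.2f} servings")
print(f"95% limits of agreement: [{loa_low:.2f}, {loa_high:.2f}]")
```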


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1593 ◽  
Author(s):  
Yanlei Gu ◽  
Huiyang Zhang ◽  
Shunsuke Kamijo

Image-based human behavior and activity understanding has been a hot topic in the fields of computer vision and multimedia. As an important part of this, skeleton estimation, also called pose estimation, has attracted considerable interest. For pose estimation, most deep learning approaches focus mainly on the joint feature. However, the joint feature alone is not sufficient, especially when an image contains multiple people and poses are occluded or not fully visible. This paper proposes a novel multi-task framework for multi-person pose estimation. The proposed framework is developed based on Mask Region-based Convolutional Neural Networks (R-CNN) and extended to integrate the joint feature, body boundary, body orientation, and occlusion condition together. To further improve the performance of multi-person pose estimation, this paper proposes organizing the different information in serial multi-task models instead of the widely used parallel multi-task network. The proposed models are trained on the public Common Objects in Context (COCO) dataset, which is further augmented with ground truths for body orientation and mutual-occlusion masks. Experiments demonstrate the performance of the proposed method for multi-person pose estimation and body orientation estimation. The proposed method achieves a Percentage of Correct Keypoints (PCK) of 84.6% and a Correct Detection Rate (CDR) of 83.7%. Comparisons further illustrate that the proposed model reduces over-detection compared with other methods.
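To illustrate the PCK metric reported above, here is a small NumPy sketch of one common definition: a keypoint counts as correct when it lies within a fraction of a reference scale of the ground truth. The threshold convention, reference scale, and data below are assumptions for demonstration, not the paper's exact evaluation protocol.

```python
import numpy as np

def pck(pred, gt, ref_scale, alpha=0.5):
    """Percentage of Correct Keypoints: a predicted joint counts as correct
    when its distance to the ground truth is below alpha * reference scale
    (e.g., head segment or torso size). Conventions vary across benchmarks."""
    dist = np.linalg.norm(pred - gt, axis=-1)          # (num_people, num_joints)
    return float((dist < alpha * ref_scale).mean())

# Hypothetical 2 people x 3 joints, pixel coordinates, with noisy predictions.
rng = np.random.default_rng(0)
gt = np.array([[[10, 10], [20, 20], [30, 30]],
               [[50, 50], [60, 60], [70, 70]]], dtype=float)
pred = gt + rng.normal(scale=2.0, size=gt.shape)
print("PCK:", pck(pred, gt, ref_scale=10.0))
```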


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. e16605-e16605
Author(s):  
Choongheon Yoon ◽  
Jasper Van ◽  
Michelle Bardis ◽  
Param Bhatter ◽  
Alexander Ushinsky ◽  
...  

Background: Prostate cancer is the most commonly diagnosed male cancer in the U.S. Multiparametric magnetic resonance imaging (mpMRI) is increasingly used for both prostate cancer evaluation and biopsy guidance. The PI-RADS v2 scoring paradigm was developed to stratify prostate lesions on MRI and to predict lesion grade. Prostate organ and lesion segmentation is an essential step in pre-biopsy surgical planning. Deep learning convolutional neural networks (CNN) are becoming an increasingly common machine learning method for image recognition. In this study, we develop a comprehensive deep learning pipeline of 3D/2D CNNs based on the U-Net architecture for automatic localization and segmentation of the prostate, detection of prostate lesions, and PI-RADS v2 lesion scoring of mpMRIs. Methods: This IRB-approved retrospective review included a total of 303 prostate nodules from 217 patients who had a prostate mpMRI between September 2014 and December 2016 and an MR-guided transrectal biopsy. For each T2-weighted image, a board-certified abdominal radiologist manually segmented the prostate and each prostate lesion. The T2-weighted and ADC series were co-registered, and each lesion was assigned an overall PI-RADS score, a T2-weighted PI-RADS score, and an ADC PI-RADS score. After a U-Net neural network segmented the prostate organ, a mask regional convolutional neural network (R-CNN) was applied. The mask R-CNN is composed of three neural networks: a feature pyramid network, a region proposal network, and a head network. The mask R-CNN detected the prostate lesion, segmented it, and estimated its PI-RADS score; rather than classifying the score directly, it was implemented to regress along the dimensions of the PI-RADS criteria. The mask R-CNN performance was assessed with AUC, the Sørensen–Dice coefficient, and Cohen’s kappa for PI-RADS scoring agreement. Results: The AUC for prostate nodule detection was 0.79. By varying detection thresholds, sensitivity/PPV were 0.94/0.54 and 0.60/0.87 at either end of the spectrum. For detected nodules, the segmentation Sørensen–Dice coefficient was 0.76 (0.72–0.80). Weighted Cohen’s kappa for PI-RADS scoring agreement was 0.63, 0.71, and 0.51 for the composite, T2-weighted, and ADC scores, respectively. Conclusions: These results demonstrate the feasibility of implementing a comprehensive 3D/2D CNN-based deep learning pipeline for evaluation of prostate mpMRI. This method is highly accurate for organ segmentation. The results for lesion detection and categorization are modest; however, the PI-RADS v2 score accuracy is comparable to previously published human interobserver agreement.
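As a brief aside on the segmentation metric, the NumPy sketch below computes the Sørensen–Dice coefficient between two small hypothetical binary lesion masks; it illustrates the standard definition rather than the study's evaluation code.

```python
import numpy as np

def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Sørensen–Dice coefficient between two binary masks:
    2 * |A ∩ B| / (|A| + |B|)."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)

# Hypothetical 4x4 lesion masks (1 = lesion pixel).
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
true = np.array([[0, 1, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
print(f"Dice: {dice_coefficient(pred, true):.3f}")   # 2*3 / (4+3) ≈ 0.857
```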


2021 ◽  
Vol 11 (21) ◽  
pp. 10301
Author(s):  
Muhammad Shoaib Farooq ◽  
Attique Ur Rehman ◽  
Muhammad Idrees ◽  
Muhammad Ahsan Raza ◽  
Jehad Ali ◽  
...  

COVID-19 has been difficult to diagnose and treat at an early stage all over the world. The number of patients presenting with COVID-19 symptoms has left hospital facilities overcrowded or unavailable, which is a major challenge. Recent studies have shown that COVID-19 can be diagnosed with the aid of chest X-ray images. To combat the COVID-19 outbreak, developing a deep learning (DL)-based model for automated COVID-19 diagnosis on chest X-rays is beneficial. In this research, we propose a customized convolutional neural network (CNN) model to detect COVID-19 from chest X-ray images. The model consists of nine layers and uses binary classification to differentiate between COVID-19 and normal chest X-rays. It enables early COVID-19 detection so that patients can be admitted in a timely fashion. The proposed model was trained and tested on two publicly available datasets, and cross-dataset studies were used to assess its robustness in a real-world context. Six hundred X-ray images were used for training and two hundred X-rays were used for validation of the model. The X-ray images of the dataset were preprocessed to improve the results and visualized for better analysis. The developed algorithm reached 98% precision, recall, and F1-score. The cross-dataset studies also demonstrate the resilience of deep learning algorithms in a real-world context, with 98.5% accuracy. Furthermore, a comparison table was created showing that our proposed model outperforms other related models in terms of accuracy. The speed and high performance of our proposed customized DL-based model allow COVID-19 patients to be identified quickly, which is helpful in controlling the COVID-19 outbreak.
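As a concrete but hypothetical illustration of a compact binary-classification CNN for chest X-rays, the PyTorch sketch below defines a small network with a single-logit output; the layer counts, channel sizes, and input resolution are assumptions and do not reproduce the authors' nine-layer model.

```python
import torch
import torch.nn as nn

class SmallCovidCNN(nn.Module):
    """Compact CNN for binary (COVID-19 vs. normal) chest X-ray classification.
    Illustrative sketch only; sizes are assumptions, not the paper's model."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, 1)   # single logit for binary output

    def forward(self, x):                    # x: (batch, 1, H, W) grayscale X-ray
        h = self.features(x).flatten(1)
        return self.classifier(h)            # train with BCEWithLogitsLoss

model = SmallCovidCNN()
logits = model(torch.randn(4, 1, 224, 224))
print(logits.shape)                          # torch.Size([4, 1])
```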


2021 ◽  
Vol 7 ◽  
pp. e551
Author(s):  
Nihad Karim Chowdhury ◽  
Muhammad Ashad Kabir ◽  
Md. Muhtadir Rahman ◽  
Noortaz Rezoana

The goal of this research is to develop and implement a highly effective deep learning model for detecting COVID-19. To achieve this goal, in this paper, we propose an ensemble of Convolutional Neural Networks (CNNs) based on EfficientNet, named ECOVNet, to detect COVID-19 from chest X-rays. To make the proposed model more robust, we have used one of the largest open-access chest X-ray data sets, named COVIDx, containing three classes—COVID-19, normal, and pneumonia. For feature extraction, we have applied an effective CNN structure, namely EfficientNet, with ImageNet pre-training weights. The generated features are transferred into custom fine-tuned top layers, followed by a set of model snapshots. The predictions of the model snapshots (which are created during a single training) are consolidated through two ensemble strategies, i.e., hard ensemble and soft ensemble, to enhance classification performance. In addition, a visualization technique is incorporated to highlight areas that distinguish classes, thereby enhancing the understanding of the primal components related to COVID-19. The results of our empirical evaluations show that the proposed ECOVNet model outperforms the state-of-the-art approaches and significantly improves detection performance, with 100% recall for COVID-19 and an overall accuracy of 96.07%. We believe that ECOVNet can enhance the detection of COVID-19 disease and thus underpin a fully automated and efficacious COVID-19 detection system.
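To clarify the two ensemble strategies mentioned above, the short NumPy sketch below contrasts a soft ensemble (averaging snapshot probabilities) with a hard ensemble (majority vote over snapshot predictions); the per-snapshot probabilities are invented for illustration.

```python
import numpy as np

# Hypothetical class probabilities from 3 snapshots for one chest X-ray,
# over the classes [COVID-19, normal, pneumonia].
snapshot_probs = np.array([
    [0.70, 0.20, 0.10],
    [0.55, 0.30, 0.15],
    [0.40, 0.45, 0.15],
])

# Soft ensemble: average the probabilities, then take the argmax.
soft_pred = snapshot_probs.mean(axis=0).argmax()

# Hard ensemble: each snapshot votes with its argmax; the majority wins.
votes = snapshot_probs.argmax(axis=1)
hard_pred = np.bincount(votes, minlength=3).argmax()

print("soft ensemble prediction:", soft_pred)   # 0 -> COVID-19 in this example
print("hard ensemble prediction:", hard_pred)   # majority of snapshot votes
```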


AI ◽  
2020 ◽  
Vol 1 (3) ◽  
pp. 418-435
Author(s):  
Khandaker Haque ◽  
Ahmed Abdelgawad

Deep learning has improved multi-fold in recent years and has been playing a great role in image classification, which also includes medical imaging. Convolutional Neural Networks (CNNs) have performed well in detecting many diseases, including coronary artery disease, malaria, Alzheimer’s disease, different dental diseases, and Parkinson’s disease. As in these cases, CNNs have substantial potential for detecting COVID-19 patients from medical images such as chest X-rays and CTs. Coronavirus disease, or COVID-19, has been declared a global pandemic by the World Health Organization (WHO). As of 8 August 2020, the total number of COVID-19 confirmed cases was 19.18 M and deaths were 0.716 M worldwide. Detecting coronavirus-positive patients is very important in preventing the spread of the virus. To this end, a CNN model is proposed to detect COVID-19 patients from chest X-ray images. Two more CNN models with different numbers of convolutional layers and three other models based on pretrained ResNet50, VGG-16, and VGG-19 are evaluated in a comparative analysis. All six models are trained and validated with Dataset 1 and Dataset 2. Dataset 1 has 201 normal and 201 COVID-19 chest X-rays, whereas Dataset 2 is comparatively larger, with 659 normal and 295 COVID-19 chest X-ray images. The proposed model achieves an accuracy of 98.3% and a precision of 96.72% with Dataset 2. It gives an area under the Receiver Operating Characteristic (ROC) curve of 0.983 and an F1-score of 98.3 with Dataset 2. Moreover, this work presents a comparative analysis of how changes in the number of convolutional layers and increases in dataset size affect classification performance.
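For readers wanting a starting point for the pretrained baselines compared above, here is a minimal, hypothetical PyTorch/torchvision sketch of a ResNet50 transfer-learning setup with a frozen backbone and a new two-class head; the freezing policy and head size are assumptions rather than the paper's exact configuration.

```python
import torch.nn as nn
from torchvision import models

# Transfer-learning baseline of the kind compared in the paper: a pretrained
# ResNet50 with its final layer replaced for COVID-19 vs. normal classification.
backbone = models.resnet50(weights="IMAGENET1K_V1")
for param in backbone.parameters():
    param.requires_grad = False                          # freeze convolutional features
backbone.fc = nn.Linear(backbone.fc.in_features, 2)      # new trainable two-class head
```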


2006 ◽  
Vol 58 (3-4) ◽  
pp. 151-170 ◽  
Author(s):  
Bikas K. Sinha ◽  
Pornpis Yimprayoon ◽  
Montip Tiensuwan
