Classification of Shoulder X-ray Images with Deep Learning Ensemble Models

2021 ◽  
Vol 11 (6) ◽  
pp. 2723
Author(s):  
Fatih Uysal ◽  
Fırat Hardalaç ◽  
Ozan Peker ◽  
Tolga Tolunay ◽  
Nil Tokgöz

Fractures of the shoulder, which has a wider range of motion than any other joint in the body, occur for various reasons. To diagnose these fractures, data gathered from X-radiation (X-ray), magnetic resonance imaging (MRI), or computed tomography (CT) are used. This study aims to help physicians by using artificial intelligence to classify shoulder images taken from X-ray devices as fracture/non-fracture. For this purpose, the performances of 26 deep learning-based pre-trained models in the detection of shoulder fractures were evaluated on the musculoskeletal radiographs (MURA) dataset, and two ensemble learning models (EL1 and EL2) were developed. The pre-trained models used are ResNet, ResNeXt, DenseNet, VGG, Inception, MobileNet, and their spinal fully connected (Spinal FC) versions. For the EL1 and EL2 models, which were built from the best-performing pre-trained models, test accuracies were 0.8455 and 0.8472, Cohen’s kappa values were 0.6907 and 0.6942, and the areas under the receiver operating characteristic (ROC) curve (AUC) for the fracture class were 0.8862 and 0.8695, respectively. Across all 28 classifications, the highest test accuracy and Cohen’s kappa were obtained with the EL2 model, and the highest AUC with the EL1 model.
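
The abstract does not specify how EL1 and EL2 combine their member networks, so the following is only a minimal sketch of one common ensembling scheme: averaging the softmax probabilities of two fine-tuned torchvision backbones from the model families named above. The member choice, weighting, and binary head are assumptions, not the authors' construction.

```python
# A minimal probability-averaging ensemble sketch (PyTorch); the paper's
# exact EL1/EL2 construction is not reproduced here.
import torch
import torch.nn as nn
from torchvision import models

class AveragingEnsemble(nn.Module):
    def __init__(self, num_classes: int = 2):  # fracture / non-fracture
        super().__init__()
        self.resnet = models.resnet34(weights="IMAGENET1K_V1")
        self.resnet.fc = nn.Linear(self.resnet.fc.in_features, num_classes)
        self.densenet = models.densenet169(weights="IMAGENET1K_V1")
        self.densenet.classifier = nn.Linear(
            self.densenet.classifier.in_features, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Average the member probabilities instead of trusting one model.
        p1 = torch.softmax(self.resnet(x), dim=1)
        p2 = torch.softmax(self.densenet(x), dim=1)
        return (p1 + p2) / 2

probs = AveragingEnsemble()(torch.randn(4, 3, 224, 224))  # 4 X-ray crops
print(probs.argmax(dim=1))  # predicted fracture/non-fracture labels
```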

2021 ◽  
Vol 11 (9) ◽  
pp. 3863
Author(s):  
Ali Emre Öztürk ◽  
Ergun Erçelebi

A large amount of training image data is required to solve image classification problems with deep learning (DL) networks. In this study, we aimed to train DL networks with synthetic images generated with a game engine and to determine their performance when solving real-image classification problems. The study presents the results of using corner detection and nearest three-point selection (CDNTS) layers to classify bird and rotary-wing unmanned aerial vehicle (RW-UAV) images, provides a comprehensive comparison of two different experimental setups, and emphasizes the significant performance improvements that the inclusion of a CDNTS layer brings to deep learning-based networks. Experiment 1 corresponds to training commonly used deep learning-based networks with synthetic data and testing image classification on real data. Experiment 2 corresponds to training the CDNTS layer and commonly used deep learning-based networks with synthetic data and testing image classification on real data. In experiment 1, the best area under the curve (AUC) value for image classification test accuracy was measured as 72%. In experiment 2, using the CDNTS layer, the AUC value was measured as 88.9%. A total of 432 different training combinations were investigated across the experimental setups: various DL networks were trained with four different optimizers, considering all combinations of the batch size, learning rate, and dropout hyperparameters. The test accuracy AUC values for networks in experiment 1 ranged from 55% to 74%, whereas those for experiment 2 networks with a CDNTS layer ranged from 76% to 89.9%. The CDNTS layer thus has a considerable effect on the image classification accuracy of deep learning-based networks. AUC, F-score, and test accuracy measures were used to validate the success of the networks.
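
The internals of the CDNTS layer are not detailed in the abstract, so the sketch below only illustrates a corner-detection preprocessing step in its spirit: Shi-Tomasi corners are found with OpenCV and the three points nearest the strongest corner are kept. The selection rule, parameter values, and file name are assumptions, not the authors' method.

```python
# A hedged CDNTS-style sketch: detect corners, keep the three nearest
# to the strongest one, and hand their coordinates to a classifier.
import cv2
import numpy as np

def corner_nearest_three(image_path: str, max_corners: int = 50) -> np.ndarray:
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Shi-Tomasi corners, returned in decreasing order of quality.
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=max_corners,
                                      qualityLevel=0.01, minDistance=5)
    pts = corners.reshape(-1, 2)
    anchor = pts[0]                               # strongest corner
    dists = np.linalg.norm(pts - anchor, axis=1)
    return pts[np.argsort(dists)[1:4]]            # nearest three points

print(corner_nearest_three("bird_or_rw_uav.png"))  # hypothetical input file
```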


Stroke ◽  
2021 ◽  
Author(s):  
Maximilian Nielsen ◽  
Moritz Waldmann ◽  
Andreas M. Frölich ◽  
Fabian Flottmann ◽  
Evelin Hristova ◽  
...  

Background and Purpose: Mechanical thrombectomy is an established procedure for treatment of acute ischemic stroke. Mechanical thrombectomy success is commonly assessed by the Thrombolysis in Cerebral Infarction (TICI) score, assigned by visual inspection of X-ray digital subtraction angiography data. However, expert-based TICI scoring is highly observer-dependent. This represents a major obstacle for mechanical thrombectomy outcome comparison in, for instance, multicentric clinical studies. Focusing on occlusions of the M1 segment of the middle cerebral artery, the present study aimed to develop a deep learning (DL) solution for automated and, therefore, objective TICI scoring, to evaluate the agreement of DL- and expert-based scoring, and to compare the corresponding numbers to the published scoring variability of clinical experts. Methods: The study comprises two independent datasets. For DL system training and initial evaluation, an in-house dataset of 491 digital subtraction angiography series and modified TICI scores of 236 patients with M1 occlusions was collected. To test the model generalization capability, an independent external dataset with 95 digital subtraction angiography series was analyzed. Key characteristics of the DL system were the modeling of TICI scoring as ordinal regression, explicit consideration of the temporal image information, integration of physiological knowledge, and modeling of the inherent TICI scoring uncertainties. Results: For the in-house dataset, the DL system yields Cohen’s kappa, overall accuracy, and specific agreement values of 0.61, 71%, and 63% to 84%, respectively, compared with the gold standard: the expert rating. Values drop slightly to 0.52/64%/43% to 87% when the model is applied, without changes, to the external dataset. After model updating, they increase to 0.65/74%/60% to 90%. Literature Cohen’s kappa values for expert-based TICI scoring agreement are on the order of 0.6. Conclusions: The agreement of DL- and expert-based modified TICI scores, which lies within the range of the published interobserver variability of clinical experts, highlights the potential of the proposed DL solution for automated TICI scoring.
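
The abstract states that TICI scoring was modeled as ordinal regression; one standard way to do this, sketched below under that assumption, is to decompose an ordered grade into cumulative binary targets, one per threshold. The feature size, grade count, and loss are illustrative, not the authors' exact formulation.

```python
# Ordinal regression via cumulative binary targets (PyTorch sketch).
import torch
import torch.nn as nn

NUM_GRADES = 5  # e.g., TICI grades 0, 1, 2a, 2b, 3

class OrdinalHead(nn.Module):
    def __init__(self, in_features: int, num_grades: int = NUM_GRADES):
        super().__init__()
        # K-1 outputs: sigmoid k approximates P(grade > k).
        self.fc = nn.Linear(in_features, num_grades - 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.fc(feats))

def ordinal_targets(labels: torch.Tensor, num_grades: int = NUM_GRADES):
    # Grade 2 of 5 becomes [1, 1, 0, 0]: it exceeds thresholds 0 and 1.
    return (labels.unsqueeze(1) > torch.arange(num_grades - 1)).float()

head = OrdinalHead(in_features=128)
feats = torch.randn(3, 128)              # pooled spatio-temporal DSA features
labels = torch.tensor([0, 2, 4])         # ordinal grade indices
loss = nn.BCELoss()(head(feats), ordinal_targets(labels))
grades = (head(feats) > 0.5).sum(dim=1)  # predicted grade = thresholds passed
```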


2021 ◽  
Author(s):  
Liangrui Pan ◽  
Boya Ji ◽  
Xiaoqi Wang ◽  
Shaoliang Peng

The use of chest X-ray images (CXI) to detect Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the virus that causes Coronavirus Disease 2019 (COVID-19), is of life-saving importance for both patients and doctors. This research proposes a multi-channel feature deep neural network (MFDNN) algorithm to screen people infected with COVID-19. The algorithm integrates data oversampling technology and a multi-channel feature deep neural network model to carry out the training process in an end-to-end manner. In the experiments, we used a publicly available CXI database with 10,192 Normal, 6012 Lung Opacity (non-COVID lung infection), and 1345 Viral Pneumonia images. Compared with traditional deep learning models (DenseNet201, ResNet50, VGG19, GoogLeNet), the MFDNN model obtains an average test accuracy of 93.19% on all data. Furthermore, in each type of screening, the precision, recall, and F1-score of the MFDNN model are also better than those of the traditional deep learning networks. Second, in the experiment detecting the four categories of COVID-19-infected persons, the MFDNN algorithm scores 1.91% higher than the latest CoroDet model. Finally, our experimental code will be placed at https://github.com/panliangrui/covid19.
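
The abstract names data oversampling as one ingredient of the pipeline but gives no details, so the following is a minimal sketch of one common realization: class-balanced oversampling with PyTorch's WeightedRandomSampler. The label tensor and class count are stand-ins, not the study data.

```python
# Class-balanced oversampling sketch: rarer classes are drawn more often.
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

labels = torch.randint(0, 4, (1000,))          # stand-in CXI class labels
class_counts = torch.bincount(labels).float()  # imbalanced class sizes
weights = 1.0 / class_counts[labels]           # inverse-frequency weights
sampler = WeightedRandomSampler(weights, num_samples=len(labels),
                                replacement=True)

dataset = TensorDataset(torch.randn(1000, 3, 224, 224), labels)
loader = DataLoader(dataset, batch_size=32, sampler=sampler)
# Batches drawn from this loader are approximately class-balanced.
```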


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. e16605-e16605
Author(s):  
Choongheon Yoon ◽  
Jasper Van ◽  
Michelle Bardis ◽  
Param Bhatter ◽  
Alexander Ushinsky ◽  
...  

Background: Prostate cancer is the most commonly diagnosed male cancer in the U.S. Multiparametric magnetic resonance imaging (mpMRI) is increasingly used for both prostate cancer evaluation and biopsy guidance. The PI-RADS v2 scoring paradigm was developed to stratify prostate lesions on MRI and to predict lesion grade. Prostate organ and lesion segmentation is an essential step in pre-biopsy surgical planning. Deep learning convolutional neural networks (CNN) for image recognition are becoming a more common method of machine learning. In this study, we develop a comprehensive deep learning pipeline of 3D/2D CNNs based on the U-Net architecture for automatic localization and segmentation of the prostate, detection of prostate lesions, and PI-RADS v2 lesion scoring of mpMRIs. Methods: This IRB-approved retrospective review included a total of 303 prostate nodules from 217 patients who had a prostate mpMRI between September 2014 and December 2016 and an MR-guided transrectal biopsy. For each T2-weighted image, a board-certified abdominal radiologist manually segmented the prostate and each prostate lesion. The T2-weighted and ADC series were co-registered, and each lesion was assigned an overall PI-RADS score, a T2-weighted PI-RADS score, and an ADC PI-RADS score. After a U-Net neural network segmented the prostate organ, a mask regional convolutional neural network (R-CNN) was applied. The mask R-CNN is composed of three neural networks: a feature pyramid network, a region proposal network, and a head network. The mask R-CNN detected the prostate lesion, segmented it, and estimated its PI-RADS score. Rather than classifying discrete score categories, the mask R-CNN was implemented to regress along the dimensions of the PI-RADS criteria. The mask R-CNN performance was assessed with AUC, the Sørensen–Dice coefficient, and Cohen’s kappa for PI-RADS scoring agreement. Results: The AUC for prostate nodule detection was 0.79. By varying detection thresholds, sensitivity/PPV were 0.94/0.54 and 0.60/0.87 at either end of the spectrum. For detected nodules, the segmentation Sørensen–Dice coefficient was 0.76 (0.72–0.80). Weighted Cohen’s kappa for PI-RADS scoring agreement was 0.63, 0.71, and 0.51 for composite, T2-weighted, and ADC scores, respectively. Conclusions: These results demonstrate the feasibility of implementing a comprehensive 3D/2D CNN-based deep learning pipeline for the evaluation of prostate mpMRI. This method is highly accurate for organ segmentation. The results for lesion detection and categorization are modest; however, the PI-RADS v2 score accuracy is comparable to previously published human interobserver agreement.
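
For reference, here is a minimal sketch of the two agreement metrics named above: the Sørensen–Dice coefficient for binary segmentation masks and a weighted Cohen's kappa via scikit-learn. The masks and scores are illustrative, and the quadratic weighting is an assumption (the abstract does not state the weighting scheme).

```python
# Sørensen–Dice for segmentation overlap and weighted kappa for scoring.
import numpy as np
from sklearn.metrics import cohen_kappa_score

def dice(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for boolean masks."""
    inter = np.logical_and(pred, truth).sum()
    return (2.0 * inter + eps) / (pred.sum() + truth.sum() + eps)

pred_mask = np.random.rand(256, 256) > 0.5   # stand-in model mask
true_mask = np.random.rand(256, 256) > 0.5   # stand-in radiologist mask
print("Dice:", dice(pred_mask, true_mask))

model_scores = [3, 4, 4, 5, 2]               # illustrative PI-RADS estimates
rater_scores = [3, 4, 5, 5, 3]               # illustrative expert scores
print("Weighted kappa:",
      cohen_kappa_score(model_scores, rater_scores, weights="quadratic"))
```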


2021 ◽  
Vol 4 (2) ◽  
pp. 147-153
Author(s):  
Vina Ayumi ◽  
Ida Nurhaida

Early detection of patients showing indications of COVID-19 symptoms needs to be carried out to reduce the spread of the virus. One way to detect the COVID-19 virus is to study chest X-ray images of patients with COVID-19 symptoms. Chest X-ray images are considered capable of depicting the lung condition of COVID-19 patients and serve as an aid to clinical diagnosis. This study proposes a deep learning approach based on a convolutional neural network (CNN) to classify COVID-19 symptoms from chest X-ray images. The performance of the proposed method is evaluated using accuracy, precision, recall, F1-score, and Cohen's kappa. This study uses a CNN model with two convolution and max-pooling layers and a fully-connected output layer. The parameters used include batch_size = 32, epoch = 50, learning_rate = 0.001, with the Adam optimizer. The best validation accuracy (val_acc) was obtained at epoch 49 with a value of 0.9606, a validation loss (val_loss) of 0.1471, a training accuracy (acc) of 0.9405, and a training loss (loss) of 0.2558.
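
Given the architecture and hyperparameters stated above (two convolution + max-pooling blocks, a fully-connected output, batch_size = 32, 50 epochs, learning rate 0.001, Adam), a Keras sketch might look as follows; the filter counts, input size, and binary output are assumptions not given in the abstract.

```python
# Keras sketch of the described two-block CNN for chest X-ray screening.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(150, 150, 1)),           # grayscale chest X-ray
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),       # COVID-19 vs. non-COVID
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=50, batch_size=32)
```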


2021 ◽  
Vol 13 (3) ◽  
Author(s):  
Viktor Dalen ◽  
Anne-Sofie Vegsgaard Olsen ◽  
Claude-Pierre Jerome ◽  
Jonn-Terje Geitung ◽  
Anders E.A. Dahm

Skeletal disease is common in multiple myeloma. We investigated the inter-observer agreement and diagnostic accuracy of spinal fractures diagnosed by computed tomography (CT) and magnetic resonance imaging (MRI) in 12 myeloma patients. Two radiologists independently assessed the images. CT, MRI, and other images were combined into a gold standard. The inter-observer agreement was assessed with Cohen’s kappa. Radiologist 1 diagnosed 20 malignant spinal fractures on CT and 26 on MRI, while radiologist 2 diagnosed 12 malignant spinal fractures on CT and 22 on MRI. In comparison, the gold standard diagnosed 10 malignant spinal fractures. The sensitivity for malignant fractures varied from 0.5 to 1 for CT and MRI, and the specificity varied from 0.17 to 0.67. On MRI, the specificity for malignant spinal fractures was 0.17 for both radiologists. The inter-observer agreement (Cohen’s kappa) for malignant spinal fractures was -0.42 on CT and -0.13 on MRI, while for osteoporotic fractures it was -0.24 for CT and 0.53 for MRI. We conclude that malignant spinal fractures were over-diagnosed on CT and MRI. The inter-observer agreement was extremely poor.
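
For reference, the inter-observer statistic used here can be computed with scikit-learn as sketched below; the per-fracture labels are illustrative, not the study data. A negative kappa, as reported for CT, indicates agreement below chance level.

```python
# Cohen's kappa between two raters (illustrative labels).
from sklearn.metrics import cohen_kappa_score

rad1 = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]  # radiologist 1: 1 = malignant fracture
rad2 = [0, 1, 1, 0, 0, 0, 1, 1, 1, 0]  # radiologist 2
print(cohen_kappa_score(rad1, rad2))   # negative when disagreement dominates
```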


Sensors ◽  
2019 ◽  
Vol 19 (13) ◽  
pp. 2845 ◽  
Author(s):  
Michael B. Del Rosario ◽  
Nigel H. Lovell ◽  
Stephen J. Redmond

Features were developed that account for the changing orientation of the inertial measurement unit (IMU) relative to the body, and these demonstrably improved the performance of models for human activity recognition (HAR). The method is proficient at separating periods of standing and sedentary activity (i.e., sitting and/or lying) using only one IMU, even if it is arbitrarily oriented or subsequently re-oriented relative to the body; since the body is upright during walking, learning the IMU orientation during walking provides a reference orientation against which sitting and/or lying can be inferred. Thus, the two activities can be identified (irrespective of the cohort) by analyzing the magnitude of the angle of shortest rotation that would be required to bring the upright direction into coincidence with the average orientation from the most recent 2.5 s of IMU data. Models for HAR were trained using data obtained from a cohort of 37 older adults (83.9 ± 3.4 years) or 20 younger adults (21.9 ± 1.7 years). Test data were generated from the training data by virtually re-orienting the IMU so that it was representative of carrying the phone in five different orientations (relative to the thigh). The overall performance of the model for HAR was consistent whether the model was trained with the data from the younger cohort and tested with the virtually re-oriented data from the older cohort (Cohen’s kappa 95% confidence interval [0.782, 0.793]; total class sensitivity 95% confidence interval [84.9%, 85.6%]), or in the reciprocal scenario, trained with the data from the older cohort and tested with the virtually re-oriented data from the younger cohort (Cohen’s kappa 95% confidence interval [0.765, 0.784]; total class sensitivity 95% confidence interval [82.3%, 83.7%]).
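
The tilt test described above can be sketched as follows: compute the angle of shortest rotation between the upright reference learned during walking and the mean sensor direction over the most recent 2.5 s window. The sampling rate, window contents, and decision threshold are illustrative assumptions, not the paper's values.

```python
# Angle-of-shortest-rotation sketch for standing vs. sedentary inference.
import numpy as np

def shortest_rotation_angle(upright_ref: np.ndarray,
                            window_acc: np.ndarray) -> float:
    """Angle (rad) between the reference 'up' axis and the mean
    acceleration direction of the window (gravity-dominated at rest)."""
    mean_dir = window_acc.mean(axis=0)
    mean_dir = mean_dir / np.linalg.norm(mean_dir)
    ref = upright_ref / np.linalg.norm(upright_ref)
    return float(np.arccos(np.clip(ref @ mean_dir, -1.0, 1.0)))

upright = np.array([0.1, 0.98, 0.05])                      # learned walking
window = np.random.randn(125, 3) * 0.1 + [0.9, 0.2, 0.0]   # ~2.5 s at 50 Hz
angle = shortest_rotation_angle(upright, window)
print("sedentary" if np.degrees(angle) > 45 else "standing")  # assumed cutoff
```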


SLEEP ◽  
2020 ◽  
Vol 43 (11) ◽  
Author(s):  
Maurice Abou Jaoude ◽  
Haoqi Sun ◽  
Kyle R Pellerin ◽  
Milena Pavlova ◽  
Rani A Sarkis ◽  
...  

Study Objectives: Develop a high-performing, automated sleep scoring algorithm that can be applied to long-term scalp electroencephalography (EEG) recordings. Methods: Using a clinical dataset of polysomnograms from 6,431 patients (MGH–PSG dataset), we trained a deep neural network to classify sleep stages based on scalp EEG data. The algorithm consists of a convolutional neural network for feature extraction, followed by a recurrent neural network that extracts temporal dependencies of sleep stages. The algorithm’s inputs are four scalp EEG bipolar channels (F3-C3, C3-O1, F4-C4, and C4-O2), which can be derived from any standard PSG or scalp EEG recording. We initially trained the algorithm on the MGH–PSG dataset and used transfer learning to fine-tune it on a dataset of long-term (24–72 h) scalp EEG recordings from 112 patients (scalpEEG dataset). Results: The algorithm achieved a Cohen’s kappa of 0.74 on the MGH–PSG holdout testing set and a cross-validated Cohen’s kappa of 0.78 after optimization on the scalpEEG dataset. The algorithm also performed well on two publicly available PSG datasets, demonstrating high generalizability. Performance on all datasets was comparable to the inter-rater agreement of human sleep staging experts (Cohen’s kappa ~ 0.75 ± 0.11). The algorithm’s performance on long-term scalp EEGs was robust over a wide age range and across common EEG background abnormalities. Conclusion: We developed a deep learning algorithm that achieves human-expert-level sleep staging performance on long-term scalp EEG recordings. This algorithm, which we have made publicly available, greatly facilitates the use of large long-term EEG clinical datasets for sleep-related research.
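
A minimal sketch of the stated architecture follows: a convolutional encoder over each epoch of the four bipolar channels, a recurrent network over the epoch sequence, and per-epoch stage logits. Layer sizes, kernel choices, and the 100 Hz / 30 s epoch length are assumptions, not the authors' configuration.

```python
# CNN feature extractor + RNN temporal model for sleep staging (PyTorch).
import torch
import torch.nn as nn

class SleepStager(nn.Module):
    def __init__(self, n_channels: int = 4, n_stages: int = 5):
        super().__init__()
        self.features = nn.Sequential(   # per-epoch encoder over raw EEG
            nn.Conv1d(n_channels, 32, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.rnn = nn.GRU(64, 64, batch_first=True, bidirectional=True)
        self.classify = nn.Linear(128, n_stages)  # W, N1, N2, N3, REM

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, c, s = x.shape                 # (batch, epochs, chans, samples)
        f = self.features(x.view(b * t, c, s)).squeeze(-1).view(b, t, -1)
        out, _ = self.rnn(f)                 # temporal context across epochs
        return self.classify(out)            # per-epoch stage logits

logits = SleepStager()(torch.randn(2, 20, 4, 3000))  # 20 epochs, 30 s at 100 Hz
print(logits.shape)  # torch.Size([2, 20, 5])
```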


Magnetic resonance imaging (MRI) is a type of scan that produces comprehensive images of the inside of the body using a steady magnetic field and radio waves. Computed tomography (CT) scans, on the other hand, are assembled from a series of X-ray images, which use a form of radiation called ionizing radiation; this radiation can damage the DNA in cells and increase the chance that they turn cancerous. MRI is therefore a safer option than CT, as it involves no radiation exposure. In this paper, we propose the use of Generative Adversarial Networks (GANs) to translate MRI images into equivalent CT images. We compare this approach with past techniques for MRI-to-CT conversion and elaborate on why GANs produce more realistic CT images while modeling the nonlinear relationship from MRI to CT.
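
Below is a hedged sketch of the paired image-to-image GAN objective commonly used for such translation: a pix2pix-style adversarial term plus an L1 reconstruction term. The network bodies are deliberately tiny stand-ins, and the L1 weight of 100 is a conventional choice, not necessarily this paper's setup.

```python
# Paired MRI-to-CT GAN loss sketch (pix2pix-style, PyTorch).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(64, 1, 3, padding=1))   # stand-in generator
D = nn.Sequential(nn.Conv2d(2, 64, 4, stride=2), nn.LeakyReLU(0.2),
                  nn.Conv2d(64, 1, 4))              # stand-in patch critic
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

mri = torch.randn(4, 1, 64, 64)                     # paired training slices
ct = torch.randn(4, 1, 64, 64)
fake_ct = G(mri)

# Discriminator: tell real (MRI, CT) pairs from synthesized ones.
d_real = D(torch.cat([mri, ct], dim=1))
d_fake = D(torch.cat([mri, fake_ct.detach()], dim=1))
d_loss = (bce(d_real, torch.ones_like(d_real)) +
          bce(d_fake, torch.zeros_like(d_fake)))

# Generator: fool D while staying close to the true CT in L1.
g_adv = D(torch.cat([mri, fake_ct], dim=1))
g_loss = bce(g_adv, torch.ones_like(g_adv)) + 100.0 * l1(fake_ct, ct)
```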


2021 ◽  
Author(s):  
Hieu H. Pham ◽  
Dung V. Do ◽  
Ha Q. Nguyen

X-ray imaging in Digital Imaging and Communications in Medicine (DICOM) format is the most commonly used imaging modality in clinical practice, resulting in vast, non-normalized databases. This is an obstacle to deploying artificial intelligence (AI) solutions for analyzing medical images, which often require identifying the right body part before feeding the image into a specified AI model. This challenge raises the need for an automated and efficient approach to classifying body parts from X-ray scans. Unfortunately, to the best of our knowledge, there is no open tool or framework for this task to date. To fill this gap, we introduce a DICOM Imaging Router that deploys deep convolutional neural networks (CNNs) for categorizing unknown DICOM X-ray images into five anatomical groups: abdominal, adult chest, pediatric chest, spine, and others. To this end, a large-scale X-ray dataset consisting of 16,093 images was collected and manually classified. We then trained a set of state-of-the-art deep CNNs on a training set of 11,263 images. These networks were evaluated on an independent test set of 2,419 images and showed superior performance in classifying the body parts. Specifically, our best-performing model (i.e., MobileNet-V1) achieved a recall of 0.982 (95% CI, 0.977–0.988), a precision of 0.985 (95% CI, 0.975–0.989), and an F1-score of 0.981 (95% CI, 0.976–0.987), whilst requiring less computation for inference (0.0295 seconds per image). Our external validation on 1,000 X-ray images shows the robustness of the proposed approach across hospitals. These remarkable performances indicate that deep CNNs can accurately and effectively differentiate human body parts from X-ray scans, thereby providing potential benefits for a wide range of applications in clinical settings. The dataset, codes, and trained deep learning models from this study will be made publicly available on our project website at https://vindr.ai/datasets/bodypartxr.
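
The routing step itself can be sketched as below: read a DICOM file with pydicom, normalize the pixel data, and classify it into one of the five groups with a MobileNet backbone. Since torchvision ships MobileNet-V2 rather than the paper's MobileNet-V1, the backbone here is a substitution, and the untrained weights, preprocessing, and file name are placeholders.

```python
# DICOM body-part routing sketch (pydicom + torchvision).
import numpy as np
import pydicom
import torch
from torchvision import models, transforms

GROUPS = ["abdominal", "adult chest", "pediatric chest", "spine", "others"]

def route_dicom(path: str, model: torch.nn.Module) -> str:
    ds = pydicom.dcmread(path)
    img = ds.pixel_array.astype(np.float32)
    img = (img - img.min()) / (img.max() - img.min() + 1e-7)  # scale to [0, 1]
    x = torch.from_numpy(img)[None, None]                     # (1, 1, H, W)
    x = transforms.Resize((224, 224))(x).repeat(1, 3, 1, 1)   # 3-channel input
    with torch.no_grad():
        return GROUPS[model(x).argmax(dim=1).item()]

model = models.mobilenet_v2(weights=None, num_classes=len(GROUPS)).eval()
print(route_dicom("study.dcm", model))  # hypothetical file; e.g. "adult chest"
```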

