Demographic Reporting in Publicly Available Chest Radiograph Data Sets: Opportunities for Mitigating Sex and Racial Disparities in Deep Learning Models

2022 ◽  
Vol 19 (1) ◽  
pp. 192-200
Author(s):  
Paul H. Yi ◽  
Tae Kyung Kim ◽  
Eliot Siegel ◽  
Noushin Yahyavi-Firouz-Abadi
2021 ◽  
Vol 11 (3) ◽  
pp. 999
Author(s):  
Najeeb Moharram Jebreel ◽  
Josep Domingo-Ferrer ◽  
David Sánchez ◽  
Alberto Blanco-Justicia

Many organizations devote significant resources to building high-fidelity deep learning (DL) models. Therefore, they have a great interest in making sure the models they have trained are not appropriated by others. Embedding watermarks (WMs) in DL models is a useful means of protecting the intellectual property (IP) of their owners. In this paper, we propose KeyNet, a novel watermarking framework that satisfies the main requirements for effective and robust watermarking. In KeyNet, any sample in a WM carrier set can take more than one label, depending on where the owner signs it. The signature is the hashed value of the owner’s information and her model. We leverage multi-task learning (MTL) to learn the original classification task and the watermarking task together. A second model (called the private model) is added to the original one and acts as a private key. The two models are trained together to embed the WM while preserving the accuracy of the original task. To extract a WM from a marked model, we pass the predictions of the marked model on a signed sample to the private model; the private model then provides the position of the signature. We perform an extensive evaluation of KeyNet’s performance on the CIFAR10 and FMNIST5 data sets and demonstrate its effectiveness and robustness. Empirical results show that KeyNet preserves the utility of the original task and embeds a robust WM.
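As a concrete illustration of the signature idea, a hash of the owner's information and model identifier can deterministically select one of several signature positions. A minimal sketch (the function name and inputs are hypothetical illustrations, not KeyNet's actual API):

```python
import hashlib

def signature_position(owner_info: str, model_id: str, n_positions: int) -> int:
    """Map the owner's information and model identifier to one of
    n_positions possible signature locations via a cryptographic hash."""
    digest = hashlib.sha256((owner_info + model_id).encode()).hexdigest()
    return int(digest, 16) % n_positions

# The same owner/model pair always maps to the same position, so a
# carrier sample can take a different label for each possible position.
pos = signature_position("Alice", "resnet18-v1", n_positions=10)
print(pos)
```

Because the mapping is deterministic, the owner can later reproduce the position during WM extraction without storing it.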


2021 ◽  
Vol 18 (6) ◽  
pp. 9264-9293
Author(s):  
Michael James Horry ◽  
Subrata Chakraborty ◽  
Biswajeet Pradhan ◽  
Maryam Fallahpoor ◽  
...  

The COVID-19 pandemic has inspired unprecedented data collection and computer vision modelling efforts worldwide, focused on the diagnosis of COVID-19 from medical images. However, these models have found limited, if any, clinical application, due in part to unproven generalization to data sets beyond their source training corpus. This study investigates the generalizability of deep learning models using publicly available COVID-19 Computed Tomography data through cross-dataset validation. The predictive ability of these models for COVID-19 severity is assessed using an independent dataset that is stratified for COVID-19 lung involvement. Each inter-dataset study is performed using histogram equalization, and contrast-limited adaptive histogram equalization with and without a learning Gabor filter. We show that under certain conditions deep learning models can generalize well to an external dataset, with F1 scores of up to 86%. The best-performing model shows predictive accuracy of between 75% and 96% for lung-involvement scoring against an external, expertly stratified dataset. From these results we identify the key factors promoting deep learning generalization: primarily, the uniform acquisition of training images and, secondly, diversity in CT slice position.
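The preprocessing step above relies on histogram equalization. The sketch below shows plain (global) histogram equalization in NumPy; CLAHE additionally operates on local tiles and clips the histogram, which is omitted here for brevity:

```python
import numpy as np

def histogram_equalize(img: np.ndarray, levels: int = 256) -> np.ndarray:
    """Plain histogram equalization: remap grey levels so the cumulative
    distribution of intensities becomes roughly uniform."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf = cdf / cdf[-1]                            # normalise to [0, 1]
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)
    return lut[img]                                # apply lookup table

# A low-contrast image crammed into [100, 140] spreads across [0, 255].
rng = np.random.default_rng(0)
img = rng.integers(100, 141, size=(64, 64), dtype=np.uint8)
eq = histogram_equalize(img)
print(img.min(), img.max(), eq.min(), eq.max())
```

Equalizing intensity distributions across scanners is one plausible reason such preprocessing helps models generalize to externally acquired CT data.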


2021 ◽  
Vol 13 (19) ◽  
pp. 10690
Author(s):  
Heelak Choi ◽  
Sang-Ik Suh ◽  
Su-Hee Kim ◽  
Eun Jin Han ◽  
Seo Jin Ki

This study aimed to investigate the applicability of deep learning algorithms to (monthly) surface water quality forecasting. A comparison was made between the performance of an autoregressive integrated moving average (ARIMA) model and four deep learning models. All prediction algorithms, except the ARIMA model, which operates on a single variable, were tested both with univariate inputs consisting of one of two dependent variables and with multivariate inputs containing both dependent and independent variables. We found that deep learning models (6.31–18.78%, in terms of the mean absolute percentage error) showed better performance than the ARIMA model (27.32–404.54%) on univariate data sets, regardless of the dependent variable. However, prediction accuracy did not improve for all dependent variables when other associated water quality variables were present. In addition, changes in the number of input variables, sliding window size (i.e., input and output time steps), and relevant variables (e.g., meteorological and discharge parameters) resulted in wide variation in the predictive accuracy of the deep learning models, with errors reaching as high as 377.97%. Therefore, a refined search identifying the optimal values of such influencing factors is recommended to achieve the best performance of any deep learning model on a given multivariate data set.
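Two pieces of machinery in the comparison above are easy to make concrete: the mean absolute percentage error (MAPE) used to score the models, and the sliding-window transform that turns a monthly series into supervised input/output pairs. A minimal sketch:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, the metric reported above."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def sliding_windows(series, n_in, n_out):
    """Turn a series into (input window, output window) pairs for
    supervised training; n_in and n_out are the input and output
    time steps of the sliding window."""
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i:i + n_in])
        y.append(series[i + n_in:i + n_in + n_out])
    return np.array(X), np.array(y)

X, y = sliding_windows(np.arange(10.0), n_in=3, n_out=1)
print(X.shape, y.shape)              # (7, 3) (7, 1)
print(mape([100, 200], [90, 220]))   # 10.0
```

Note that MAPE divides by the observed value, which is one reason errors can balloon past 100% (e.g., the 404.54% above) when the series approaches zero.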


2021 ◽  
Author(s):  
Nanditha Mallesh ◽  
Max Zhao ◽  
Lisa Meintker ◽  
Alexander Höllein ◽  
Franz Elsner ◽  
...  

Multi-parameter flow cytometry (MFC) is a cornerstone of clinical decision making for hematological disorders such as leukemia and lymphoma. MFC data analysis requires trained experts to manually gate cell populations of interest, which is time-consuming and subjective. Manual gating is often limited to a two-dimensional space. In recent years, deep learning models have been developed to analyze the data in high-dimensional space and are highly accurate. Such models have been used successfully in histology, cytopathology, image flow cytometry, and conventional MFC analysis. However, current AI models used for subtype classification based on MFC data are limited to the antibody (flow cytometry) panel they were trained on. A key challenge in deploying AI models in routine diagnostics is therefore the robustness and adaptability of such models. In this study, we present a workflow to extend our previous model to four additional MFC panels. We employ knowledge transfer to adapt the model to smaller data sets: we trained a model for each of the data sets by transferring the features learned by our base model. With our workflow, we could increase the models' overall performance and, more notably, accelerate learning for very small training sizes.
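The knowledge-transfer step can be pictured as freezing the feature extractor learned on the large source panel and fitting only a small head on the new, smaller panel. The sketch below imitates this with a fixed random projection standing in for the pretrained base and a logistic head trained by gradient descent (all names and data are synthetic illustrations, not the authors' pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "base model": a fixed projection standing in for features
# learned on the large source panel (never updated during transfer).
W_base = rng.normal(size=(4, 8))

def features(x):
    return np.maximum(x @ W_base, 0.0)   # ReLU features

# Small target-panel data set: two Gaussian blobs in 4-D.
x = np.vstack([rng.normal(-1, 1, (40, 4)), rng.normal(1, 1, (40, 4))])
y = np.array([0] * 40 + [1] * 40)

# Transfer step: train only a logistic head on the frozen features.
f = features(x)
w, b = np.zeros(8), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(f @ w + b)))   # sigmoid
    g = p - y                            # gradient of log-loss
    w -= 0.1 * f.T @ g / len(y)
    b -= 0.1 * g.mean()

acc = np.mean((1 / (1 + np.exp(-(f @ w + b))) > 0.5) == y)
print(acc)
```

Because only the small head is trained, far fewer labeled samples are needed than when training the whole network from scratch, which is the practical point of the workflow.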


Author(s):  
Rasha M. Al-Eidan ◽  
Hend Al-Khalifa ◽  
AbdulMalik Alsalman

The traditional standards employed for pain assessment have many limitations, one of which is poor reliability due to inter-observer variability. Therefore, there have been many approaches to automating the task of pain recognition. Recently, deep learning methods have emerged to address many of the associated challenges, such as feature selection and cases with small data sets. This study provides a systematic review of pain-recognition systems based on deep learning models, covering the last two years only. Furthermore, it presents the major deep learning methods used in the reviewed papers. Finally, it discusses the remaining challenges and open issues.


2020 ◽  
Vol 10 (11) ◽  
pp. 4177-4190
Author(s):  
Osval Antonio Montesinos-López ◽  
José Cricelio Montesinos-López ◽  
Pawan Singh ◽  
Nerida Lozano-Ramirez ◽  
Alberto Barrón-López ◽  
...  

The paradigm called genomic selection (GS) is a revolutionary way of developing new plants and animals. It is a predictive methodology, since it uses learning methods to perform its task. Unfortunately, there is no universal model that can be used for all types of predictions; for this reason, specific methodologies are required for each type of output (response variable). Since there is a lack of efficient methodologies for multivariate count-data outcomes, in this paper a multivariate Poisson deep neural network (MPDN) model is proposed for the genomic prediction of several count outcomes simultaneously. The MPDN model uses the negative log-likelihood of a Poisson distribution as its loss function, rectified linear unit (ReLU) activations in the hidden layers to capture nonlinear patterns, and an exponential activation in the output layer to produce outputs on the same scale as the counts. The proposed MPDN model was compared to conventional generalized Poisson regression models and univariate Poisson deep learning models on two experimental count data sets. We found that the proposed MPDN outperformed the univariate Poisson deep neural network models but did not outperform, in terms of prediction, the univariate generalized Poisson regression models. All deep learning models were implemented with TensorFlow as the back end and Keras as the front end, which allows these models to be applied to moderate and large data sets, a significant advantage over previous GS models for multivariate count data.
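The loss/activation pairing described above, a negative Poisson log-likelihood combined with an exponential output activation, can be written in a few lines. A minimal NumPy sketch (dropping the log y! term, which is constant in the network parameters):

```python
import numpy as np

def poisson_nll(y, log_rate):
    """Negative log-likelihood of counts y under Poisson rates
    exp(log_rate). The exponential output activation guarantees
    strictly positive rates on the count scale; the log-factorial
    term is constant in the parameters and is dropped."""
    rate = np.exp(log_rate)          # exponential output activation
    return np.mean(rate - y * log_rate)

# The loss is minimised when the predicted rate equals the observed
# count: d/deta (e^eta - y*eta) = 0  =>  e^eta = y.
y = np.array([2.0, 5.0, 1.0])
loss_opt = poisson_nll(y, np.log(y))        # at the optimum
loss_off = poisson_nll(y, np.log(y) + 0.5)  # any offset increases it
print(loss_opt, loss_off)
```

This is the same objective exposed in Keras as the built-in Poisson loss; pairing it with an `exponential` output activation mirrors the log-link of a classical Poisson regression.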


2020 ◽  
Vol 12 (6) ◽  
pp. 923 ◽  
Author(s):  
Kuiliang Gao ◽  
Bing Liu ◽  
Xuchu Yu ◽  
Jinchun Qin ◽  
Pengqiang Zhang ◽  
...  

Deep learning has achieved great success in hyperspectral image classification. However, when processing new hyperspectral images, existing deep learning models must be retrained from scratch with sufficient samples, which is inefficient and undesirable in practical tasks. This paper explores how to accurately classify new hyperspectral images with only a few labeled samples, i.e., few-shot classification of hyperspectral images. Specifically, we design a new deep classification model based on a relation network and train it with the idea of meta-learning. First, the feature learning module and the relation learning module of the model make full use of the spatial–spectral information in hyperspectral images and carry out relation learning by comparing the similarity between samples. Second, the task-based learning strategy enables the model to continuously enhance its ability to learn how to learn, using a large number of tasks randomly generated from different data sets. Benefiting from these two points, the proposed method has excellent generalization ability and can obtain satisfactory classification results with only a few labeled samples. To verify the performance of the proposed method, experiments were carried out on three public data sets. The results indicate that the proposed method achieves better classification results than traditional semisupervised support vector machines and semisupervised deep learning models.
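The relation-learning idea, classifying a query by comparing its embedding with the support samples of each class, can be sketched with a fixed similarity in place of the learned relation module (cosine similarity here; the actual model learns the comparison function end to end):

```python
import numpy as np

def relation_scores(query, supports):
    """Toy stand-in for a relation module: score the similarity between
    a query embedding and each class's support embedding. Cosine
    similarity replaces the learned comparison network for illustration."""
    q = query / np.linalg.norm(query)
    s = supports / np.linalg.norm(supports, axis=1, keepdims=True)
    return s @ q

supports = np.array([[1.0, 0.0, 0.0],    # class 0 support embedding
                     [0.0, 1.0, 0.0],    # class 1 support embedding
                     [0.0, 0.0, 1.0]])   # class 2 support embedding
query = np.array([0.9, 0.1, 0.0])
pred = int(np.argmax(relation_scores(query, supports)))
print(pred)  # 0
```

Because classification reduces to comparison, a new class only requires a few labeled support samples rather than retraining, which is what makes the approach suitable for few-shot settings.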


2020 ◽  
Vol 38 (15_suppl) ◽  
pp. e16097-e16097
Author(s):  
Andrew J. Kruger ◽  
Lingdao Sha ◽  
Madhavi Kannan ◽  
Rohan P. Joshi ◽  
Benjamin D. Leibowitz ◽  
...  

e16097 Background: Using gene expression, consensus molecular subtypes (CMS) divide colorectal cancers (CRC) into four categories with prognostic and therapy-predictive clinical utility. These subtypes also manifest as different morphological phenotypes in whole-slide images (WSIs). Here, we implemented and trained a novel deep multiple instance learning (MIL) framework that requires only a single label per WSI to identify morphological biomarkers and accelerate CMS classification. Methods: Deep learning models can be trained by MIL frameworks to classify tissue in localized tiles from large (>1 GB) WSIs using only weakly supervised, slide-level classification labels. Here we demonstrate a novel framework that advances instance-based MIL by using a multi-phase approach to training deep learning models. The framework allows us to train on WSIs that contain multiple CMS classes while also identifying previously undiscovered tissue features that have low or no correlation with any subtype. Identifying these uncorrelated features yields improved insight into the specific tissue features most associated with the four CMS classes and a more accurate classification of CMS status. Results: We trained and validated (n = 735 WSIs and 184 withheld WSIs, respectively) a ResNet34 convolutional neural network to classify 224x224-pixel tiles distributed across tumor, lymphocyte, and stroma tissue regions. The slide-level CMS classification probability was calculated by aggregating the tiles correlated with each of the four subtypes. The receiver operating characteristic curves had the following one-vs-all AUCs: CMS1 = 0.854, CMS2 = 0.921, CMS3 = 0.850, and CMS4 = 0.866, for an average AUC of 0.873. Initial tests of generalization to other data sets, such as TCGA, are promising and constitute one of the future directions of this work.
Conclusions: The MIL framework robustly identified tissue features correlated with CMS groups, allowing for a more efficient classification of CRC samples. We also demonstrated that the morphological features indicative of different molecular subtypes can be identified from the deep neural network.
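The slide-level aggregation step can be illustrated with the simplest instance-based MIL pooling rule, mean pooling over tile probabilities (the abstract does not specify the exact aggregation, so this pooling choice is an assumption for illustration):

```python
import numpy as np

def slide_probabilities(tile_probs):
    """Aggregate tile-level class probabilities into a slide-level
    prediction by mean pooling: each tile votes with its predicted
    distribution over the four CMS classes."""
    return np.asarray(tile_probs).mean(axis=0)

# Three tiles from one WSI, four CMS classes per tile.
tiles = [[0.7, 0.1, 0.1, 0.1],
         [0.6, 0.2, 0.1, 0.1],
         [0.2, 0.5, 0.2, 0.1]]
slide = slide_probabilities(tiles)
print(slide, int(np.argmax(slide)))
```

Mean pooling lets a slide containing a mixture of subtypes still yield a dominant slide-level call, while tiles uncorrelated with any subtype dilute all classes equally.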


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Qingyu Zhao ◽  
Ehsan Adeli ◽  
Kilian M. Pohl

The presence of confounding effects (or biases) is one of the most critical challenges in using deep learning to advance discovery in medical imaging studies. Confounders affect the relationship between input data (e.g., brain MRIs) and output variables (e.g., diagnosis); improper modeling of those relationships often results in spurious and biased associations. Traditional machine learning and statistical models minimize the impact of confounders by, for example, matching data sets, stratifying data, or residualizing imaging measurements. Alternative strategies are needed for state-of-the-art deep learning models that use end-to-end training to automatically extract informative features from large sets of images. In this article, we introduce an end-to-end approach for deriving features invariant to confounding factors while accounting for intrinsic correlations between the confounder(s) and the prediction outcome. The method does so by exploiting concepts from traditional statistical methods and recent fair machine learning schemes. We evaluate the method on predicting the diagnosis of HIV solely from magnetic resonance images (MRIs), on identifying morphological sex differences in adolescence from data of the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA), and on determining bone age from X-ray images of children. The results show that our method can predict accurately while reducing biases associated with confounders. The code is available at https://github.com/qingyuzhao/br-net.
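One of the traditional strategies mentioned above, residualizing measurements against a confounder, amounts to regressing out its linear effect and keeping only the residual. A minimal sketch:

```python
import numpy as np

def residualize(x, confounder):
    """Remove the linear effect of a confounder from a measurement:
    fit x = a*c + b by least squares and return the residual."""
    c = np.column_stack([confounder, np.ones_like(confounder)])
    coef, *_ = np.linalg.lstsq(c, x, rcond=None)
    return x - c @ coef

rng = np.random.default_rng(0)
age = rng.uniform(10, 18, 200)             # confounder (e.g., age)
x = 2.0 * age + rng.normal(0, 0.1, 200)    # measurement driven by age
r = residualize(x, age)
print(np.corrcoef(r, age)[0, 1])           # ~0 after residualization
```

This only removes linear dependence on a known confounder; the article's contribution is achieving an analogous invariance inside an end-to-end trained deep network, where such a closed-form correction is unavailable.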


2019 ◽  
Vol 16 (10) ◽  
pp. 4202-4213
Author(s):  
Priyanka Malhotra ◽  
Sheifali Gupta ◽  
Deepika Koundal

Pneumonia is a deadly chest disease and a major culprit behind numerous deaths every year. Chest radiographs (CXRs) are commonly used for quick and cheap diagnosis of chest diseases, but interpreting CXRs to diagnose pneumonia is difficult. This has created interest in computer-aided diagnosis (CAD) for CXR images. In this study, a brief review of the literature on computer-aided analysis of chest radiograph images for the identification of pneumonia using different machine learning and deep learning models is presented, along with a comparison of these techniques. In addition, the study presents various publicly available chest X-ray data sets for training, testing, and validating deep learning models.

