Modeling clinical assessor intervariability using deep hypersphere encoder–decoder networks

2019 ◽  
Vol 32 (14) ◽  
pp. 10705-10717 ◽  
Author(s):  
Joost van der Putten ◽  
Fons van der Sommen ◽  
Jeroen de Groof ◽  
Maarten Struyvenberg ◽  
Svitlana Zinger ◽  
...  

Abstract: In medical imaging, a proper gold-standard ground truth, such as annotated segmentations by assessors or experts, is often lacking or only scarcely available, and the segmentations that do exist suffer from large inter-observer variability. Most state-of-the-art segmentation models do not take this variability into account and are fully deterministic in nature. In this work, we propose hypersphere encoder–decoder networks in combination with dynamic leaky ReLUs as a new method to explicitly incorporate inter-observer variability into a segmentation model. With this model, we can generate multiple segmentation proposals based on the inter-observer agreement. As a result, the output segmentations of the proposed model can be tuned to the typical margins inherent in the ambiguity of the data. For experimental validation, we provide a proof of concept on a toy data set and show improved segmentation results on two medical data sets. The proposed method has several advantages over current state-of-the-art segmentation models, such as interpretability of the uncertainty in segmentation borders. Experiments on a medical localization problem show that it yields improved biopsy localizations, which are on average 12% closer to the optimal biopsy location.
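For illustration only, the PyTorch sketch below shows one way an activation with a runtime-adjustable negative slope could be wired into an encoder–decoder so that sweeping the slope yields multiple segmentation proposals. The network layout, the slope values, and the proposal sweep are assumptions for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicLeakyReLU(nn.Module):
    """Leaky ReLU whose negative slope is chosen at call time."""
    def forward(self, x, slope):
        return F.leaky_relu(x, negative_slope=slope)

class TinyEncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Conv2d(1, 8, 3, padding=1)   # toy encoder
        self.dec = nn.Conv2d(8, 1, 3, padding=1)   # toy decoder
        self.act = DynamicLeakyReLU()

    def forward(self, x, slope=0.01):
        return torch.sigmoid(self.dec(self.act(self.enc(x), slope)))

model = TinyEncoderDecoder()
image = torch.randn(1, 1, 64, 64)
# Sweeping the slope yields a family of segmentation proposals,
# e.g. from tighter to wider boundaries (values are illustrative).
proposals = [model(image, slope=s) for s in (0.0, 0.1, 0.3)]
```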

2021 ◽  
Vol 47 (1) ◽  
pp. 141-179
Author(s):  
Matej Martinc ◽  
Senja Pollak ◽  
Marko Robnik-Šikonja

Abstract: We present a set of novel neural supervised and unsupervised approaches for determining the readability of documents. In the unsupervised setting, we leverage neural language models, whereas in the supervised setting, we test three different neural classification architectures. We show that the proposed neural unsupervised approach is robust, transferable across languages, and allows adaptation to a specific readability task and data set. Through a systematic comparison of several neural architectures on a number of benchmark and new labeled readability data sets in two languages, this study also offers a comprehensive analysis of different neural approaches to readability classification. We expose their strengths and weaknesses, compare their performance to current state-of-the-art classification approaches to readability, which in most cases still rely on extensive feature engineering, and propose possible improvements.
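As a rough illustration of the unsupervised idea, the sketch below scores a document by neural language-model perplexity. The choice of GPT-2 and of perplexity as the ranking score are assumptions for illustration, not necessarily the models or scoring used in the paper.

```python
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text):
    """Mean-token perplexity; higher values suggest less readable text."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy per token
    return math.exp(loss.item())

print(perplexity("The cat sat on the mat."))
print(perplexity("Notwithstanding epistemological ramifications, heterogeneity persists."))
```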


2021 ◽  
Vol 7 (3) ◽  
pp. 93-100
Author(s):  
Pedro Costa ◽  
Asim Smailagic ◽  
Jaime Cardoso ◽  
Aurélio Campilho

Current state-of-the-art medical image segmentation methods require high-quality datasets to obtain good performance. However, medical specialists often disagree on diagnoses; hence, datasets frequently contain contradictory annotations. This, in turn, complicates the optimization of deep learning models and hinders performance. We propose a method to estimate uncertainty in Convolutional Neural Network (CNN) segmentation models that makes training more robust to contradictory annotations. In this work, we model two types of uncertainty, heteroscedastic and epistemic, without adding any supervisory signal other than the ground-truth segmentation mask. As expected, the estimated uncertainty is higher near vessel boundaries and on thinner, less visible vessels, where medical specialists are more likely to disagree. Our method is therefore well suited to learning from datasets created by heterogeneous annotators. We show that there is a correlation between the uncertainty estimated by our method and the disagreement between the segmentations provided by two different medical specialists. Furthermore, by explicitly modeling the uncertainty, the Intersection over Union of the segmentation network improves by 5.7 percentage points.
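One established way to model heteroscedastic uncertainty without extra supervision is the Monte Carlo logit-noise loss of Kendall and Gal; the sketch below follows that formulation under the assumption that it approximates the paper's approach (the authors' exact loss may differ, and the tensor shapes are illustrative).

```python
import torch
import torch.nn.functional as F

def heteroscedastic_loss(logits, log_var, target, n_samples=10):
    """Monte Carlo cross-entropy under per-pixel Gaussian logit noise.

    logits, log_var: (B, C, H, W) tensors from two output heads of the
    segmentation network; target: (B, H, W) integer ground-truth mask.
    """
    sigma = torch.exp(0.5 * log_var)  # predicted per-pixel noise scale
    losses = [
        F.cross_entropy(logits + sigma * torch.randn_like(logits), target)
        for _ in range(n_samples)
    ]
    return torch.stack(losses).mean()

# Toy shapes: batch of 2, binary vessel segmentation on 32x32 patches.
logits = torch.randn(2, 2, 32, 32, requires_grad=True)
log_var = torch.zeros(2, 2, 32, 32, requires_grad=True)
target = torch.randint(0, 2, (2, 32, 32))
loss = heteroscedastic_loss(logits, log_var, target)
```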


Author(s):  
Kyungkoo Jun

Background & Objective: This paper proposes a Fourier-transform-inspired method to classify human activities from time-series sensor data.

Methods: Our method begins by decomposing the 1D input signal into 2D patterns, a step motivated by the Fourier transform. The decomposition is aided by a Long Short-Term Memory (LSTM) network, which captures the temporal dependencies in the signal and produces encoded sequences. These sequences, once arranged into a 2D array, represent fingerprints of the signals. The benefit of this transformation is that we can exploit recent advances in deep learning models for image classification, such as Convolutional Neural Networks (CNNs).

Results: The proposed model is therefore a combination of an LSTM and a CNN. We evaluate it on two data sets. On the first data set, which is more standardized than the other, our model outperforms or at least matches previous work. For the second data set, we devise schemes to generate training and testing data by varying the window size, the sliding step, and the labeling scheme.

Conclusion: The evaluation results show that the accuracy exceeds 95% in some cases. We also analyze the effect of these parameters on performance.
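A minimal PyTorch sketch of the LSTM-to-2D-to-CNN pipeline described above: the LSTM hidden states over time are treated as a one-channel image and classified by a small CNN. The layer sizes, window length, and number of classes are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class LSTM2DCNN(nn.Module):
    """Encode a 1D sensor signal with an LSTM, arrange the hidden
    states as a 2D 'fingerprint', then classify it with a small CNN."""
    def __init__(self, hidden=64, n_classes=6):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(16 * 8 * 8, n_classes),
        )

    def forward(self, x):           # x: (batch, time, 1)
        seq, _ = self.lstm(x)       # (batch, time, hidden)
        img = seq.unsqueeze(1)      # treat the sequence as a 1-channel image
        return self.cnn(img)

model = LSTM2DCNN()
logits = model(torch.randn(4, 128, 1))  # 4 windows of 128 samples each
```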


2021 ◽  
Vol 7 (2) ◽  
pp. 21
Author(s):  
Roland Perko ◽  
Manfred Klopschitz ◽  
Alexander Almer ◽  
Peter M. Roth

Many scientific studies deal with person counting and density estimation from single images, and convolutional neural networks (CNNs) have recently been applied to these tasks. Even though better results are often reported, it is frequently unclear where the improvements come from and whether the proposed approaches would generalize. Thus, the main goal of this paper was to identify the critical aspects of these tasks and to show how they limit state-of-the-art approaches. Based on these findings, we show how to mitigate these limitations. To this end, we implemented a CNN-based baseline approach, which we extended to deal with the identified problems. These include bias in the reference data sets, ambiguity in ground-truth generation, and a mismatch between the evaluation metrics and the training loss function. The experimental results show that our modifications allow for significantly outperforming the baseline in terms of the accuracy of person counts and density estimation. In this way, we gain a deeper understanding of CNN-based person density estimation beyond the network architecture. Furthermore, our insights can help advance the field of person density estimation in general by highlighting current limitations in its evaluation protocols.
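The ground-truth ambiguity discussed above arises in the standard density-map construction step, sketched below: point annotations of head positions are smoothed into a map whose integral equals the person count. The fixed Gaussian bandwidth is exactly the kind of arbitrary choice at issue; the value here is illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(shape, head_points, sigma=4.0):
    """Turn annotated head positions into a density map whose
    integral equals the person count (a common CNN training target)."""
    dmap = np.zeros(shape, dtype=np.float64)
    for y, x in head_points:
        dmap[int(y), int(x)] += 1.0
    return gaussian_filter(dmap, sigma=sigma)

dm = density_map((240, 320), [(50, 60), (51, 65), (120, 200)])
print(dm.sum())  # ~3.0: smoothing approximately preserves the count
```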


Author(s):  
Yinfei Yang ◽  
Gustavo Hernandez Abrego ◽  
Steve Yuan ◽  
Mandy Guo ◽  
Qinlan Shen ◽  
...  

In this paper, we present an approach to learning multilingual sentence embeddings using a bi-directional dual encoder with additive margin softmax. The embeddings achieve state-of-the-art results on the United Nations (UN) parallel corpus retrieval task. In all the languages tested, the system achieves a P@1 of 86% or higher. We use pairs retrieved by our approach to train NMT models that achieve performance similar to models trained on gold pairs. We also explore simple document-level embeddings constructed by averaging our sentence embeddings. On the UN document-level retrieval task, these document embeddings achieve a P@1 of around 97% for all language pairs tested. Lastly, we evaluate the proposed model on the BUCC mining task. The learned embeddings with raw cosine similarity scores achieve competitive results compared to current state-of-the-art models, and with a second-stage scorer we achieve a new state of the art on this task.
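A compact sketch of a bidirectional additive margin softmax over in-batch cosine scores, the loss named in the abstract: the margin is subtracted from the matching (diagonal) pairs, and cross-entropy is applied in both retrieval directions. The margin and scale values and the toy embeddings are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def additive_margin_loss(src_emb, tgt_emb, margin=0.3, scale=20.0):
    """In-batch softmax over cosine scores; the margin is subtracted
    from the matching (diagonal) pair to tighten the decision boundary."""
    src = F.normalize(src_emb, dim=1)
    tgt = F.normalize(tgt_emb, dim=1)
    scores = src @ tgt.t()                          # (B, B) cosine similarities
    scores = scores - margin * torch.eye(scores.size(0))
    labels = torch.arange(scores.size(0))
    # Bidirectional: source-to-target plus target-to-source
    return (F.cross_entropy(scale * scores, labels)
            + F.cross_entropy(scale * scores.t(), labels))

loss = additive_margin_loss(torch.randn(8, 512), torch.randn(8, 512))
```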


2021 ◽  
Author(s):  
Brigid A McDonald ◽  
Carlos Cardenas ◽  
Nicolette O'Connell ◽  
Sara Ahmed ◽  
Mohamed A. Naser ◽  
...  

Purpose: In order to accurately accumulate delivered dose for head and neck cancer patients treated with the Adapt to Position workflow on the 1.5T magnetic resonance imaging (MRI)-linear accelerator (MR-linac), the low-resolution T2-weighted MRIs used for daily setup must be segmented to enable reconstruction of the delivered dose at each fraction. In this study, our goal is to evaluate various autosegmentation methods for head and neck organs at risk (OARs) on on-board setup MRIs from the MR-linac for off-line reconstruction of delivered dose.

Methods: Seven OARs (parotid glands, submandibular glands, mandible, spinal cord, and brainstem) were contoured on 43 images by seven observers each. Ground-truth contours were generated using a simultaneous truth and performance level estimation (STAPLE) algorithm. Twenty autosegmentation methods were evaluated in ADMIRE: (1-9) atlas-based autosegmentation using a population atlas library (PAL) of 5/10/15 patients with STAPLE, patch fusion (PF), or random forest (RF) for label fusion; (10-19) autosegmentation using images from a patient's 1-4 prior fractions (individualized patient prior, IPP) with STAPLE/PF/RF; and (20) deep learning (DL) (a 3D ResUNet trained on the 43 ground-truth structure sets plus 45 sets contoured by one observer). Execution time was measured for each method. Autosegmented structures were compared to ground-truth structures using the Dice similarity coefficient, mean surface distance, Hausdorff distance, and Jaccard index. For each metric and OAR, performance was compared to the inter-observer variability using Dunn's test with control. Methods were compared pairwise using the Steel-Dwass test for each metric pooled across all OARs. Further dosimetric analysis was performed on three high-performing autosegmentation methods (DL; IPP with RF and 4 fractions, IPP_RF_4; IPP with 1 fraction, IPP_1) and one low-performing method (PAL with STAPLE and 5 atlases, PAL_ST_5). For five patients, delivered doses from clinical plans were recalculated on setup images with ground-truth and autosegmented structure sets. Differences in maximum and mean dose to each structure between the ground-truth and autosegmented structures were calculated and correlated with the geometric metrics.

Results: DL and IPP methods performed best overall, all significantly outperforming the inter-observer variability, with no significant differences between them in pairwise comparison. PAL methods performed worst overall; most were not significantly different from the inter-observer variability or from each other. DL was the fastest method (33 seconds per case) and PAL methods the slowest (3.7-13.8 minutes per case). Execution time increased with the number of prior fractions/atlases for IPP and PAL. For DL, IPP_1, and IPP_RF_4, the majority (95%) of dose differences were within 250 cGy of ground truth, but outlier differences up to 785 cGy occurred. Dose differences were much higher for PAL_ST_5, with outlier differences up to 1920 cGy. Dose differences showed weak but significant correlations with all geometric metrics (R² between 0.030 and 0.314).

Conclusions: The autosegmentation methods offering the best combination of performance and execution time are DL and IPP_1. Dose reconstruction on on-board T2-weighted MRIs is feasible with autosegmented structures, with minimal dosimetric variation from ground truth, but contours should be visually inspected prior to dose reconstruction in an end-to-end dose accumulation workflow.
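Two of the four geometric metrics used in the study, the Dice similarity coefficient and the Jaccard index, reduce to a few lines on binary masks; a minimal sketch follows (the toy masks are illustrative, not the study's contours).

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def jaccard(a, b):
    """Jaccard index (intersection over union) between two binary masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union

auto = np.zeros((64, 64), bool); auto[10:40, 10:40] = True
gt = np.zeros((64, 64), bool); gt[15:45, 15:45] = True
print(dice(auto, gt), jaccard(auto, gt))
```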


2012 ◽  
Vol 263-266 ◽  
pp. 2173-2178
Author(s):  
Xin Guang Li ◽  
Min Feng Yao ◽  
Li Rui Jian ◽  
Zhen Jiang Li

A probabilistic neural network (PNN) speech recognition model based on a partition clustering algorithm is proposed in this paper. The most important advantage of a PNN is that training is easy and instantaneous; a PNN is therefore capable of real-time speech recognition. Moreover, the selection of the training data set is one of the most important issues in improving PNN performance, so this paper proposes using a partition clustering algorithm to select the data. The proposed model is tested on two data sets of spoken Arabic numbers, with promising results. Its performance is compared to a single back-propagation neural network and an integrated back-propagation neural network. The final comparison shows that the proposed model performs better than the other two networks, with an accuracy rate of 92.41%.
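A minimal sketch of the PNN decision rule itself: one Gaussian kernel per training pattern, summed per class, with the largest class sum winning. The smoothing parameter and the toy feature vectors are illustrative assumptions, and the partition-clustering step that selects the training patterns is omitted here.

```python
import numpy as np

def pnn_classify(x, train_X, train_y, sigma=0.5):
    """Probabilistic neural network: a Gaussian kernel per training
    pattern, averaged per class; predict the class with the largest score."""
    scores = {}
    for c in np.unique(train_y):
        d2 = ((train_X[train_y == c] - x) ** 2).sum(axis=1)
        scores[c] = np.exp(-d2 / (2.0 * sigma ** 2)).mean()
    return max(scores, key=scores.get)

# Toy features standing in for speech feature vectors (e.g. MFCCs)
X = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
y = np.array([0, 0, 1, 1])
print(pnn_classify(np.array([0.85, 0.85]), X, y))  # -> 1
```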


Sensors ◽  
2020 ◽  
Vol 20 (3) ◽  
pp. 879 ◽  
Author(s):  
Uwe Köckemann ◽  
Marjan Alirezaie ◽  
Jennifer Renoux ◽  
Nicolas Tsiftes ◽  
Mobyen Uddin Ahmed ◽  
...  

As research in smart homes and activity recognition increases, it is ever more important to have benchmark systems and data upon which researchers can compare methods. While synthetic data can be useful for certain method developments, real data sets that are open and shared are equally important. This paper presents the E-care@home system, its installation in a real home setting, and a series of data sets collected using the system. Our first contribution, the E-care@home system, is a collection of software modules for data collection, labeling, and various reasoning tasks such as activity recognition, person counting, and configuration planning. It supports a heterogeneous set of sensors that can be extended easily and connects the collected sensor data to higher-level Artificial Intelligence (AI) reasoning modules. Our second contribution is a series of open data sets that can be used to recognize activities of daily living. In addition to these data sets, we describe the technical infrastructure we have developed to collect the data and the physical environment. Each data set is annotated with ground-truth information, making it relevant for researchers interested in benchmarking different activity recognition algorithms.


Separations ◽  
2018 ◽  
Vol 5 (3) ◽  
pp. 44 ◽  
Author(s):  
Alyssa Allen ◽  
Mary Williams ◽  
Nicholas Thurn ◽  
Michael Sigman

Computational models for determining the strength of fire debris evidence based on likelihood ratios (LRs) were developed and validated against in-silico-generated data sets derived from different distributions of ASTM E1618-14-designated ignitable liquid classes and substrate pyrolysis contributions. All of the models perform well in cross-validation against the distributions used to generate them. However, a model built from data that lacks representatives of some ASTM E1618-14 classes does not perform well when validated against data sets that contain the missing classes. A quadratic discriminant model based on a balanced data set (ignitable liquid versus substrate pyrolysis), with a uniform distribution of the ASTM E1618-14 classes, performed well (receiver operating characteristic area under the curve of 0.836) when tested against laboratory-developed, casework-relevant samples of known ground truth.
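A hedged sketch of how a quadratic discriminant model can yield a likelihood-ratio-style score: with balanced classes (equal priors, matching the paper's balanced design), the posterior odds from QDA equal the likelihood ratio. The toy features standing in for in-silico spectra are illustrative assumptions, not the paper's data or exact model.

```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

# Toy stand-ins for in-silico spectra: class 1 = ignitable liquid
# present, class 0 = substrate pyrolysis only.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 5)), rng.normal(1, 1, (200, 5))])
y = np.array([0] * 200 + [1] * 200)

qda = QuadraticDiscriminantAnalysis().fit(X, y)
p = qda.predict_proba(rng.normal(0.8, 1, (1, 5)))[0]
lr = p[1] / p[0]  # posterior odds = likelihood ratio under equal priors
print(lr)
```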


Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4408 ◽  
Author(s):  
Hyun-Myung Cho ◽  
Heesu Park ◽  
Suh-Yeon Dong ◽  
Inchan Youn

The goals of this study are to suggest a better classification method for detecting stressed states from raw electrocardiogram (ECG) data and a method for training a deep neural network (DNN) with a smaller data set. We suggest an end-to-end architecture to detect stress from raw ECGs, consisting of successive stages of convolutional layers. In this study, two kinds of data sets are used to train and validate the model: a driving data set and a mental arithmetic data set, which is smaller than the driving data set. We apply a transfer learning method to train the model with the small data set. The proposed model shows better performance, based on receiver operating characteristic curves, than conventional methods. Compared with other DNN methods using raw ECGs, the proposed model improves the accuracy from 87.39% to 90.19%. The transfer learning method improves accuracy by 12.01% and 10.06% when 10 s and 60 s of ECG signals, respectively, are used in the model. In conclusion, our model outperforms previous models using raw ECGs from a small data set, and we therefore believe that it can contribute significantly to mobile healthcare for stress management in daily life.
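A minimal sketch of the transfer-learning recipe described: pretrain a small 1D CNN on the large driving data set, then freeze the convolutional feature extractor and fine-tune only the classification head on the smaller mental-arithmetic set. The architecture and the checkpoint path are illustrative assumptions, not the paper's exact network.

```python
import torch
import torch.nn as nn

class ECGNet(nn.Module):
    """Small 1D CNN over raw ECG samples."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, 7, stride=2), nn.ReLU(),
            nn.Conv1d(16, 32, 7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):  # x: (batch, 1, samples)
        return self.head(self.features(x))

# Transfer learning: reuse the feature extractor pretrained on the large
# driving set, then fine-tune only the head on the small arithmetic set.
model = ECGNet()
# model.features.load_state_dict(torch.load("driving_pretrained.pt"))  # hypothetical checkpoint
for p in model.features.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)
```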

