Validation of image segmentation by estimating rater bias and variance

Author(s):  
Simon K Warfield ◽  
Kelly H Zou ◽  
William M Wells

The accuracy and precision of segmentations of medical images have been difficult to quantify in the absence of a ‘ground truth’ or reference standard segmentation for clinical data. Although physical or digital phantoms can help by providing a reference standard, they do not allow the reproduction of the full range of imaging and anatomical characteristics observed in clinical data. An alternative assessment approach is to compare against segmentations generated by domain experts. Segmentations may be generated by raters who are trained experts or by automated image analysis algorithms. Typically, these segmentations differ due to intra-rater and inter-rater variability. The most appropriate way to compare such segmentations has been unclear. We present here a new algorithm that estimates performance characteristics, and a true labelling, from observations of segmentations of imaging data whose labels may be ordered or continuous measures. This approach may be used with, among others, surface, distance-transform or level-set representations of segmentations, and can be used to assess whether a rater consistently overestimates or underestimates the position of a boundary.
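A minimal sketch of the estimation idea, under simplifying assumptions: each rater's continuous label (e.g. a signed distance to the boundary) is modelled as the true value plus a rater-specific bias and Gaussian noise, and truth, biases and variances are estimated by simple alternation. The function name and the alternating scheme are illustrative, not the authors' published algorithm.

```python
import numpy as np

def estimate_rater_bias_variance(D, n_iter=50):
    """Alternating estimator for the toy model d_ij = t_i + b_j + eps_ij,
    eps_ij ~ N(0, sigma_j**2). D is an (n_points, n_raters) array of
    continuous labels, e.g. signed distances to a boundary."""
    n_points, n_raters = D.shape
    b = np.zeros(n_raters)                 # per-rater bias
    s2 = np.ones(n_raters)                 # per-rater variance
    for _ in range(n_iter):
        w = 1.0 / s2                       # precision weights
        t = (D - b) @ w / w.sum()          # precision-weighted truth estimate
        resid = D - t[:, None]
        b = resid.mean(axis=0)
        b -= b.mean()                      # gauge fix: biases sum to zero
        s2 = (resid - b).var(axis=0) + 1e-12
    return t, b, s2

# Synthetic check: rater 1 overestimates and rater 3 underestimates a boundary.
rng = np.random.default_rng(0)
truth = rng.normal(size=2000)
true_bias = np.array([0.4, -0.1, -0.3])    # chosen to sum to zero (gauge)
D = truth[:, None] + true_bias + rng.normal(scale=[0.1, 0.3, 0.2], size=(2000, 3))
t_hat, b_hat, s2_hat = estimate_rater_bias_variance(D)
print(b_hat.round(2))                       # approximately [ 0.4, -0.1, -0.3]
```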

Data Mining ◽  
2013 ◽  
pp. 1794-1818
Author(s):  
William H. Horsthemke ◽  
Daniela S. Raicu ◽  
Jacob D. Furst ◽  
Samuel G. Armato

Evaluating the success of computer-aided decision support systems depends upon a reliable reference standard, a ground truth. The ideal gold standard is expected to result from the marking, labeling, and rating of the image of interest by domain experts. However, experts often disagree, and this lack of agreement challenges the development and evaluation of image-based feature prediction of expert-defined “truth.” The following discussion addresses the successes and limitations of developing computer-aided models to characterize suspicious pulmonary nodules based upon ratings provided by multiple expert radiologists. These prediction models attempt to bridge the semantic gap between images and medically meaningful, descriptive opinions about the visual characteristics of nodules. The resultant computer-aided diagnostic characterizations (CADc) are directly usable for indexing and retrieval in content-based medical image retrieval and for supporting computer-aided diagnosis. The predictive performance of CADc models is directly related to the extent of agreement between radiologists: the models better predict radiologists’ opinions when the radiologists agree more with each other about the characteristics of nodules.
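A hypothetical sketch of the kind of experiment this discussion describes: predict a panel's consensus rating from image features and relate prediction error to inter-rater disagreement. The features, ratings and model choice below are placeholders, not the authors' setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 16))               # stand-in image features
ratings = rng.integers(1, 6, size=(200, 4))  # 4 radiologists, 1-5 scale

y = np.median(ratings, axis=1)               # consensus target
spread = ratings.std(axis=1)                 # inter-rater disagreement

pred = cross_val_predict(RandomForestRegressor(random_state=0), X, y, cv=5)
err = np.abs(pred - y)

# The paper's observation suggests errors should grow with disagreement;
# with random stand-in data this correlation is of course near zero.
print(np.corrcoef(err, spread)[0, 1])
```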


2020 ◽  
Vol 133 (22) ◽  
pp. jcs241422
Author(s):  
Claire Mitchell ◽  
Lauryanne Caroff ◽  
Jose Alonso Solis-Lemus ◽  
Constantino Carlos Reyes-Aldasoro ◽  
Alessandra Vigilante ◽  
...  

Accurate measurements of cell morphology and behaviour are fundamentally important for understanding how disease, molecules and drugs affect cell function in vivo. Here, by using muscle stem cell (muSC) responses to injury in zebrafish as our biological paradigm, we established a ‘ground truth’ for muSC behaviour. This revealed that segmentation and tracking algorithms from commonly used programs are error-prone, leading us to develop a fast semi-automated image analysis pipeline that allows user-defined parameters for segmentation and correction of cell tracking. Cell Tracking Profiler (CTP) is a package that runs two existing programs, HK Means and Phagosight, within the Icy image analysis suite, to enable user-managed cell tracking from 3D time-lapse datasets to provide measures of cell shape and movement. We demonstrate how CTP can be used to reveal changes to cell behaviour of muSCs in response to manipulation of the cell cytoskeleton by small-molecule inhibitors. CTP and the associated tools we have developed for analysis of outputs thus provide a powerful framework for analysing complex cell behaviour in vivo from 4D datasets that are not amenable to straightforward analysis.
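To illustrate the kind of movement measures such a pipeline can report, here is a small sketch computing generic per-track statistics from centroid positions; these are standard measures, not necessarily CTP's exact outputs.

```python
import numpy as np

def track_measures(track, dt):
    """Simple movement measures for one cell track: an (n_frames, 3) array of
    x, y, z centroid positions sampled every dt minutes."""
    steps = np.diff(track, axis=0)
    step_len = np.linalg.norm(steps, axis=1)
    path_len = step_len.sum()
    net_disp = np.linalg.norm(track[-1] - track[0])
    return {
        "mean_speed": step_len.mean() / dt,                   # e.g. um/min
        "path_length": path_len,
        "meandering_index": net_disp / max(path_len, 1e-12),  # 1 = straight
    }

# Random-walk track as placeholder data.
track = np.cumsum(np.random.default_rng(2).normal(size=(60, 3)), axis=0)
print(track_measures(track, dt=2.0))
```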


2019 ◽  
Author(s):  
Claire Mitchell ◽  
Lauryanne Caroff ◽  
Alessandra Vigilante ◽  
Jose Alonso Solis-Lemus ◽  
Constantino Carlos Reyes-Aldasoro ◽  
...  

Accurate measurements of cell morphology and behaviour are fundamentally important for understanding how disease, molecules and drugs affect cell function in vivo. Using muscle stem cell (muSC) responses to injury in zebrafish as our biological paradigm, we have established a ground truth for muSC behaviour. This revealed that segmentation and tracking algorithms from commonly used programs are error-prone, leading us to develop a fast semi-automated image analysis pipeline that allows user-defined parameters for segmentation and correction of cell tracking. Cell Tracking Profiler (CTP) operates through the freely available Icy platform, and allows user-managed cell tracking from 3D time-lapse datasets to provide measures of cell shape and movement. Using dimensionality reduction methods and multiple correlation and regression analyses, we identify myosin II-dependent parameters of muSC behaviour during regeneration. CTP and the associated statistical tools we have developed thus provide a powerful framework for analysing complex cell behaviour in vivo from 4D datasets.

Summary: Analysis of cell shape and movement from 3D time-lapse datasets is currently very challenging. We therefore designed Cell Tracking Profiler for analysing cell behaviour from complex datasets and demonstrate its effectiveness by analysing stem cell behaviour during muscle regeneration in zebrafish.
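A sketch of the statistical step described here, with placeholder data: reduce many per-cell behaviour parameters to a few principal components and compare treated versus control cells. The feature matrix and group labels are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
params = rng.normal(size=(120, 10))   # 120 cells x 10 behaviour measures
treated = np.repeat([0, 1], 60)       # control vs. inhibitor-treated cells

# Standardize, then project onto the leading principal components.
Z = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(params))
for k in range(3):
    d = Z[treated == 1, k].mean() - Z[treated == 0, k].mean()
    print(f"PC{k + 1}: treated - control = {d:+.2f}")
```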


2019 ◽  
Author(s):  
Hesam Mazidi ◽  
Tianben Ding ◽  
Arye Nehorai ◽  
Matthew D. Lew

The resolution and accuracy of single-molecule localization microscopes (SMLMs) are routinely benchmarked using simulated data, calibration “rulers,” or comparisons to secondary imaging modalities. However, these methods cannot quantify the nanoscale accuracy of an arbitrary SMLM dataset. Here, we show that by computing localization stability under a well-chosen perturbation with accurate knowledge of the imaging system, we can robustly measure the confidence of individual localizations without ground-truth knowledge of the sample. We demonstrate that our method, termed Wasserstein-induced flux (WIF), measures the accuracy of various reconstruction algorithms directly on experimental 2D and 3D data of microtubules and amyloid fibrils. We further show that WIF confidences can be used to evaluate the mismatch between computational models and imaging data, enhance the accuracy and resolution of reconstructed structures, and discover hidden molecular heterogeneities. As a computational methodology, WIF is broadly applicable to any SMLM dataset, imaging system, and localization algorithm.
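As a toy illustration of the underlying idea, measuring how stably a localization returns to its estimate after perturbation, the sketch below perturbs a 2D position, refits a Gaussian spot by least squares, and scores the average drift. This is a simplified stand-in, not the authors' Wasserstein-induced flux computation.

```python
import numpy as np
from scipy.optimize import minimize

def fit_centroid(img, x0, sigma=1.3):
    """Refit a 2D Gaussian spot position to the image by least squares."""
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    def loss(p):
        model = np.exp(-((xx - p[0]) ** 2 + (yy - p[1]) ** 2) / (2 * sigma ** 2))
        model *= img.sum() / model.sum()   # match total intensity
        return ((img - model) ** 2).sum()
    return minimize(loss, x0, method="Nelder-Mead").x

def stability_score(img, loc, eps=1.0, n_dirs=8):
    """Perturb the estimate in n_dirs directions; a trustworthy localization
    should flow back close to its original position."""
    angles = np.linspace(0, 2 * np.pi, n_dirs, endpoint=False)
    drifts = []
    for a in angles:
        start = loc + eps * np.array([np.cos(a), np.sin(a)])
        refit = fit_centroid(img, start)
        drifts.append(np.linalg.norm(refit - loc))
    return 1.0 - np.mean(drifts) / eps     # ~1 stable, <=0 unstable

# A clean synthetic spot should score close to 1.
yy, xx = np.mgrid[0:15, 0:15]
spot = np.exp(-((xx - 7.2) ** 2 + (yy - 7.8) ** 2) / (2 * 1.3 ** 2))
print(stability_score(spot, np.array([7.2, 7.8])))
```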


Author(s):  
John Chiverton ◽  
Kevin Wells

This chapter applies a Bayesian formulation of the Partial Volume (PV) effect, based on the Benford distribution, to the statistical classification of nuclear medicine imaging data: specifically, Positron Emission Tomography (PET) acquired as part of a PET-CT phantom imaging procedure. The Benford distribution is a discrete probability distribution of great interest for medical imaging, because it describes the probabilities of occurrence of the leading digits in many sources of data. The chapter describes the PET-CT imaging and post-processing steps used to derive a gold standard, which is then used as a ground truth for assessing a Benford classifier formulation. The use of this gold standard shows that the classification of both the simulated and real phantom imaging data is well described by the Benford distribution.
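For reference, the Benford distribution assigns probability log10(1 + 1/d) to leading digit d. A minimal sketch of a digit-based log-likelihood score follows; the chapter's Bayesian PV classifier is more involved.

```python
import numpy as np

# First-digit probabilities under Benford's law: P(d) = log10(1 + 1/d).
digits = np.arange(1, 10)
benford_p = np.log10(1 + 1 / digits)

def leading_digit(x):
    x = abs(x)
    while x >= 10:
        x /= 10
    while 0 < x < 1:
        x *= 10
    return int(x)

def benford_loglik(values):
    """Log-likelihood of the sample's leading digits under Benford; a simple
    classifier can compare this against an alternative digit model."""
    lead = np.array([leading_digit(v) for v in values if v != 0])
    return np.log(benford_p[lead - 1]).sum()

print(benford_p.round(3))     # [0.301 0.176 0.125 ... 0.046]
print(benford_loglik([0.23, 1.7, 312.0, 0.011]))
```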


2019 ◽  
Vol 46 (13) ◽  
pp. 2722-2730 ◽  
Author(s):  
Andreas Holzinger ◽  
Benjamin Haibe-Kains ◽  
Igor Jurisica

Minerals ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 1265
Author(s):  
Sebastian Iwaszenko ◽  
Leokadia Róg

The study of the petrographic structure of medium- and high-rank coals is important from both a cognitive and a utilitarian point of view. The petrographic constituents and their individual characteristics and features are responsible for the properties of coal and for the way it behaves in various technological processes. This paper considers the application of convolutional neural networks to the segmentation of coal petrographic images. A U-Net-based segmentation model is proposed, and the network is trained to segment inertinite, liptinite, and vitrinite. Segmentations prepared manually by a domain expert were used as the ground truth. The results show that inertinite and vitrinite can be segmented successfully, with minimal difference from the ground truth; liptinite proved much more difficult to segment, and only moderate results were obtained even after transfer learning. Nevertheless, the application of the U-Net-based network to petrographic image segmentation was successful, and the results are good enough for the method to be considered a supporting tool for domain experts in their everyday work.
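A minimal sketch of a U-Net-style architecture for this kind of multi-class segmentation, in PyTorch; the depth, channel counts and training setup are assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """Two-level encoder-decoder with skip connections; 4 output classes
    (background, inertinite, liptinite, vitrinite)."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.enc1, self.enc2 = block(3, 32), block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.mid = block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = block(64, 32)
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        m = self.mid(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(m), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)               # per-pixel class logits

logits = TinyUNet()(torch.randn(1, 3, 256, 256))
print(logits.shape)                        # torch.Size([1, 4, 256, 256])
```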


2019 ◽  
Author(s):  
Cody Baker ◽  
Emmanouil Froudarakis ◽  
Dimitri Yatsenko ◽  
Andreas S. Tolias ◽  
Robert Rosenbaum

A major goal in neuroscience is to estimate neural connectivity from large-scale extracellular recordings of neural activity in vivo. This is challenging in part because any such activity is modulated by the unmeasured external synaptic input to the network, known as the common input problem. Many different measures of functional connectivity have been proposed in the literature, but their direct relationship to synaptic connectivity is often assumed or ignored. For in vivo data, measuring this relationship would require knowledge of ground-truth connectivity, which is nearly always unavailable. Instead, many studies use in silico simulations as benchmarks for investigation, but such approaches necessarily rely upon a variety of simplifying assumptions about the simulated network and can depend on numerous simulation parameters. We combine neuronal network simulations, mathematical analysis, and calcium imaging data to address the question of when and how functional connectivity, synaptic connectivity, and latent external input variability can be untangled. We show numerically and analytically that, even though the precision matrix of recorded spiking activity does not uniquely determine synaptic connectivity, it is often closely related to synaptic connectivity in practice under various network models. This relation becomes more pronounced when the spatial structure of neuronal variability is considered jointly with the precision matrix.
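A toy linear-network version of the central comparison: build a sparse connectivity matrix, add an unmeasured common input, and compare the off-diagonal precision matrix of the resulting activity with the symmetrized connectivity. The paper's analysis covers richer spiking models; this is only a caricature.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
W = 0.1 * rng.normal(size=(n, n)) * (rng.random((n, n)) < 0.2)  # sparse weights
np.fill_diagonal(W, 0)

A = np.linalg.inv(np.eye(n) - W)        # linear-network response matrix
common = 0.5 * np.ones((n, 1))          # unmeasured shared (common) input
cov = A @ (np.eye(n) + common @ common.T) @ A.T
prec = np.linalg.inv(cov)

# At weak coupling the off-diagonal precision is roughly -(W + W.T), so the
# correlation below should be strongly negative despite the common input.
mask = ~np.eye(n, dtype=bool)
print(np.corrcoef(prec[mask], (W + W.T)[mask])[0, 1])
```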


2017 ◽  
Author(s):  
Stephanie Reynolds ◽  
Therese Abrahamsson ◽  
P. Jesper Sjöström ◽  
Simon R. Schultz ◽  
Pier Luigi Dragotti

In recent years, the development of algorithms to detect neuronal spiking activity from two-photon calcium imaging data has received much attention. Meanwhile, few researchers have examined the metrics used to assess the similarity of detected spike trains with the ground truth. We highlight the limitations of the two most commonly used metrics, the spike train correlation and success rate, and propose an alternative, which we refer to as CosMIC. Rather than operating on the true and estimated spike trains directly, the proposed metric assesses the similarity of the pulse trains obtained by convolving the spike trains with a smoothing pulse. The pulse width, which is derived from the statistics of the imaging data, reflects the temporal tolerance of the metric. The final metric score is the size of the commonalities of the pulse trains as a fraction of their average size. Viewed through the lens of set theory, CosMIC resembles a continuous Sørensen-Dice coefficient, an index commonly used to assess the similarity of discrete, presence/absence data. We demonstrate the ability of the proposed metric to discriminate the precision and recall of spike train estimates. Unlike the spike train correlation, which appears to reward overestimation, the proposed metric score is maximised when the correct number of spikes has been detected. Furthermore, we show that CosMIC is more sensitive to the temporal precision of estimates than the success rate.
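A sketch of the metric as described above: convolve true and estimated spike trains with a smoothing pulse and score the overlap as a continuous Sørensen-Dice coefficient. The Gaussian pulse and its width are assumptions here; the paper derives the pulse from the imaging statistics.

```python
import numpy as np

def cosmic_like(true_spikes, est_spikes, t, pulse_width):
    """true_spikes, est_spikes: spike times; t: uniform evaluation grid."""
    def pulse_train(spikes):
        return sum(np.exp(-0.5 * ((t - s) / pulse_width) ** 2) for s in spikes)
    p, q = pulse_train(true_spikes), pulse_train(est_spikes)
    # size of the commonalities as a fraction of the trains' average size
    return np.minimum(p, q).sum() / (0.5 * (p.sum() + q.sum()))

t = np.linspace(0, 10, 2000)
print(cosmic_like([2.0, 5.0, 7.5], [2.05, 5.1, 7.4], t, pulse_width=0.1))  # near 1
print(cosmic_like([2.0, 5.0, 7.5], [2.0, 5.0, 7.5, 9.0], t, 0.1))          # penalised
```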

