Limitations of CNNs for Approximating the Ideal Observer Despite Quantity of Training Data or Depth of Network

Author(s):  
Khalid Omer ◽  
Luca Caucci ◽  
Meredith Kupinski

The performance of a convolutional neural network (CNN) on an image texture detection task is investigated as a function of linear image processing and the number of training images. Performance is quantified by the area under the receiver operating characteristic (ROC) curve (AUC). The Ideal Observer (IO) maximizes AUC but depends on high-dimensional image likelihoods. In many cases, CNN performance can approximate IO performance. This work demonstrates counterexamples in which a full-rank linear transform degrades CNN performance below the IO even in the limit of large quantities of training data and network layers. A subsequent linear transform changes the images’ correlation structure, improves the AUC, and again demonstrates the CNN’s dependence on linear processing. Compression strictly decreases or maintains IO detection performance, whereas compression can increase CNN performance, especially for small quantities of training data. Results indicate an optimal compression ratio for the CNN based on task difficulty, compression method, and number of training images.
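The IO’s invariance to full-rank linear processing can be checked numerically. The sketch below assumes, for illustration, zero-mean Gaussian texture classes with unequal covariances, the setting in which the IO reduces to a closed-form quadratic test statistic; the dimensions, seed, and helper names are invented here, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 5000  # illustrative ambient dimension and samples per class

# Two zero-mean Gaussian "texture" classes with unequal covariances.
A0, A1 = rng.standard_normal((d, d)), rng.standard_normal((d, d))
K0, K1 = A0 @ A0.T + d * np.eye(d), A1 @ A1.T + d * np.eye(d)
x0 = rng.multivariate_normal(np.zeros(d), K0, n)  # class-0 samples
x1 = rng.multivariate_normal(np.zeros(d), K1, n)  # class-1 samples

def io_stat(x, K0, K1):
    # IO test statistic for zero-mean Gaussians: the quadratic form
    # x^T (K0^-1 - K1^-1) x, which is monotone in the likelihood ratio.
    Q = np.linalg.inv(K0) - np.linalg.inv(K1)
    return np.einsum('ij,jk,ik->i', x, Q, x)

def auc(s0, s1):
    # Empirical AUC: fraction of (class-1, class-0) score pairs ranked correctly.
    return (s1[:, None] > s0[None, :]).mean()

a_orig = auc(io_stat(x0, K0, K1), io_stat(x1, K0, K1))

# Apply an (almost surely) invertible linear transform T to every image;
# the IO recomputed on the transformed covariances yields the same statistic.
T = rng.standard_normal((d, d))
y0, y1 = x0 @ T.T, x1 @ T.T
K0t, K1t = T @ K0 @ T.T, T @ K1 @ T.T
a_trans = auc(io_stat(y0, K0t, K1t), io_stat(y1, K0t, K1t))

print(a_orig, a_trans)  # equal up to floating-point error
```

Algebraically, y^T (K0t^-1 - K1t^-1) y collapses back to x^T (K0^-1 - K1^-1) x, so every sample keeps its score and the AUC is unchanged; a CNN trained on the transformed images enjoys no such guarantee.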

2020 ◽  
Vol 2020 (10) ◽  
pp. 310-1-310-7
Author(s):  
Khalid Omer ◽  
Luca Caucci ◽  
Meredith Kupinski

This work reports on convolutional neural network (CNN) performance on an image texture classification task as a function of linear image processing and the number of training images. Detection performance of single- and multi-layer CNNs (sCNN/mCNN) is compared to that of optimal observers. Performance is quantified by the area under the receiver operating characteristic (ROC) curve, also known as the AUC; AUC = 1.0 corresponds to perfect detection and AUC = 0.5 to guessing. The Ideal Observer (IO) maximizes AUC but is prohibitive in practice because it depends on high-dimensional image likelihoods. IO performance is invariant to any full-rank, invertible linear image processing. This work demonstrates the existence of full-rank, invertible linear transforms that can degrade both sCNN and mCNN performance even in the limit of large quantities of training data. A subsequent invertible linear transform changes the images’ correlation structure again and can improve this AUC. Stationary textures sampled from zero-mean, unequal-covariance Gaussian distributions allow closed-form analytic expressions for the IO and for optimal linear compression. Linear compression is a mitigation technique for high-dimension, low-sample-size (HDLSS) applications. By definition, compression strictly decreases or maintains IO detection performance. For small quantities of training data, linear image compression prior to the sCNN architecture can increase AUC from 0.56 to 0.93. Results indicate an optimal compression ratio for the CNN based on task difficulty, compression method, and number of training images.
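The zero-mean Gaussian setting suggests a concrete linear compression rule. The sketch below is one plausible heuristic, not necessarily the paper’s optimal compression: simultaneously diagonalize the two class covariances via a generalized eigendecomposition and keep the m directions whose eigenvalue lies farthest from 1 (eigenvalue 1 means the classes agree along that direction). All dimensions and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m, n = 16, 4, 4000  # ambient dim, compressed dim, samples per class

A0, A1 = rng.standard_normal((d, d)), rng.standard_normal((d, d))
K0, K1 = A0 @ A0.T + d * np.eye(d), A1 @ A1.T + d * np.eye(d)
x0 = rng.multivariate_normal(np.zeros(d), K0, n)
x1 = rng.multivariate_normal(np.zeros(d), K1, n)

def io_auc(x0, x1, K0, K1):
    # Closed-form IO for zero-mean Gaussians, scored by empirical AUC.
    Q = np.linalg.inv(K0) - np.linalg.inv(K1)
    s0 = np.einsum('ij,jk,ik->i', x0, Q, x0)
    s1 = np.einsum('ij,jk,ik->i', x1, Q, x1)
    return (s1[:, None] > s0[None, :]).mean()

# Generalized eigenproblem K1 v = lam * K0 v via Cholesky whitening
# (numpy-only equivalent of scipy.linalg.eigh(K1, K0)).
L = np.linalg.cholesky(K0)
Li = np.linalg.inv(L)
lam, U = np.linalg.eigh(Li @ K1 @ Li.T)
V = Li.T @ U  # columns are generalized eigenvectors

# Heuristic "detectability" ranking: keep eigendirections with lam far from 1.
keep = np.argsort(np.abs(np.log(lam)))[::-1][:m]
C = V[:, keep].T  # m x d compression matrix

a_full = io_auc(x0, x1, K0, K1)
a_comp = io_auc(x0 @ C.T, x1 @ C.T, C @ K0 @ C.T, C @ K1 @ C.T)
print(a_full, a_comp)  # compression can only maintain or lower the IO AUC
```

The point of the abstract is that the ordering above holds only for the IO: a small CNN trained on the m-dimensional compressed images can outperform the same CNN trained on the raw d-dimensional images when training data are scarce.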


2020 ◽  
Vol 2020 (16) ◽  
pp. 41-1-41-7
Author(s):  
Orit Skorka ◽  
Paul J. Kane

Many of the metrics developed for informational imaging are useful in automotive imaging, since many of the tasks (for example, object detection and identification) are similar. This work discusses sensor characterization parameters for the Ideal Observer SNR model and elaborates on the noise power spectrum. It presents cross-correlation analysis results for matched-filter detection of a tribar pattern in sets of resolution-target images captured with three image sensors over a range of illumination levels. Lastly, the work compares the cross-correlation data to predictions made by the Ideal Observer model and demonstrates good agreement between the two methods on relative evaluation of detection capabilities.
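Matched-filter detection by cross-correlation, as applied here to a tribar target, can be sketched in a few lines. The template geometry, scene size, and noise level below are invented for illustration; a real characterization pipeline would use the measured sensor noise power spectrum rather than white Gaussian noise.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical tribar template: three bright vertical bars on a dark field.
template = np.zeros((15, 15))
template[:, 2:4] = template[:, 7:9] = template[:, 12:14] = 1.0

# Embed the pattern in a larger noisy "capture".
scene = rng.normal(0.0, 0.5, (64, 64))
scene[20:35, 20:35] += template

def matched_filter_peak(img, tmpl):
    # Slide the zero-mean template over the image and return the maximum
    # cross-correlation value (valid-mode 2-D correlation, numpy only).
    t = tmpl - tmpl.mean()  # zero-mean: response ignores local DC offsets
    H, W = img.shape
    h, w = t.shape
    best = -np.inf
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            best = max(best, float(np.sum(img[i:i+h, j:j+w] * t)))
    return best

present = matched_filter_peak(scene, template)
absent = matched_filter_peak(rng.normal(0.0, 0.5, (64, 64)), template)
print(present > absent)  # peak is larger when the pattern is present
```

Thresholding the peak value turns this into a detector; sweeping the noise level (standing in for illumination) traces out the detection-capability curves compared against the Ideal Observer SNR model in the paper.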


2015 ◽  
Vol 114 (6) ◽  
pp. 3076-3096 ◽  
Author(s):  
Ryan M. Peters ◽  
Phillip Staibano ◽  
Daniel Goldreich

The ability to resolve the orientation of edges is crucial to daily tactile and sensorimotor function, yet the means by which edge perception occurs is not well understood. Primate cortical area 3b neurons have diverse receptive field (RF) spatial structures that may participate in edge orientation perception. We evaluated five candidate RF models for macaque area 3b neurons, previously recorded while an oriented bar contacted the monkey's fingertip. We used a Bayesian classifier to assign each neuron a best-fit RF structure. We generated predictions for human performance by implementing an ideal observer that optimally decoded stimulus-evoked spike counts in the model neurons. The ideal observer predicted a saturating reduction in bar orientation discrimination threshold with increasing bar length. We tested 24 humans on an automated, precision-controlled bar orientation discrimination task and observed performance consistent with that predicted. We next queried the ideal observer to discover the RF structure and number of cortical neurons that best matched each participant's performance. Human perception was matched with a median of 24 model neurons firing throughout a 1-s period. The 10 lowest-performing participants were fit with RFs lacking inhibitory sidebands, whereas 12 of the 14 higher-performing participants were fit with RFs containing inhibitory sidebands. Participants whose discrimination improved as bar length increased to 10 mm were fit with longer RFs; those who performed well on the 2-mm bar, with narrower RFs. These results suggest plausible RF features and computational strategies underlying tactile spatial perception and may have implications for perceptual learning.
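The decoding step described above — an ideal observer that optimally reads out stimulus-evoked spike counts from a model population — can be sketched with a toy Poisson population. The cosine-squared tuning curves, rates, and population size below are illustrative stand-ins, not the paper’s fitted area 3b RF models.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy orientation-tuned population: 24 neurons, Poisson counts over 1 s.
n_neurons = 24
pref = np.linspace(0, np.pi, n_neurons, endpoint=False)  # preferred angles

def rate(th):
    # Cosine-squared orientation tuning (period pi), 5 Hz baseline.
    return 5.0 + 20.0 * np.cos(2 * (th - pref))**2

def ideal_observer_accuracy(th0, th1, trials=2000):
    # Ideal observer for two known orientations and independent Poisson
    # counts k_i: pick the orientation with the larger log-likelihood
    # sum_i [k_i * log(f_i(th)) - f_i(th)].
    r0, r1 = rate(th0), rate(th1)
    correct = 0
    for label, true_r in ((0, r0), (1, r1)):
        k = rng.poisson(true_r, (trials, n_neurons))
        ll0 = (k * np.log(r0) - r0).sum(axis=1)
        ll1 = (k * np.log(r1) - r1).sum(axis=1)
        choice = (ll1 > ll0).astype(int)
        correct += (choice == label).sum()
    return correct / (2 * trials)

# Discrimination gets easier as the two bar orientations separate:
accs = []
for d_deg in (2, 5, 10):
    d = np.deg2rad(d_deg)
    accs.append(ideal_observer_accuracy(np.pi / 4 - d / 2, np.pi / 4 + d / 2))
print(accs)
```

Sweeping the angular separation until accuracy crosses a criterion (e.g., 75% correct) yields the discrimination threshold, which is the quantity the paper matches against each human participant to infer RF structure and neuron count.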


2001 ◽  
Author(s):  
Hongbin Zhang ◽  
Eric Clarkson ◽  
Harrison H. Barrett

2015 ◽  
Vol 15 (12) ◽  
pp. 1341
Author(s):  
Steven Shimozaki ◽  
Eleanor Swan ◽  
Claire Hutchinson ◽  
Jaspreet Mahal
