Local reliability weighting explains identification of partially masked objects in natural images

2020 ◽  
Vol 117 (47) ◽  
pp. 29363-29370 ◽  
Author(s):  
Stephen Sebastian ◽  
Eric S. Seemiller ◽  
Wilson S. Geisler

A fundamental natural visual task is the identification of specific target objects in the environments that surround us. It has long been known that some properties of the background have strong effects on target visibility. The most well-known properties are the luminance, contrast, and similarity of the background to the target. In previous studies, we found that these properties have highly lawful effects on detection in natural backgrounds. However, there is another important factor affecting detection in natural backgrounds that has received little or no attention in the masking literature, which has been concerned with detection in simpler backgrounds. Namely, in natural backgrounds the properties of the background often vary under the target, and hence some parts of the target are masked more than others. We began studying this factor, which we call the “partial masking factor,” by measuring detection thresholds in backgrounds of contrast-modulated white noise that was constructed so that the standard template-matching (TM) observer performs equally well whether or not the noise contrast modulates in the target region. If noise contrast is uniform in the target region, then this TM observer is the Bayesian optimal observer. However, when the noise contrast modulates then the Bayesian optimal observer weights the template at each pixel location by the estimated reliability at that location. We find that human performance for modulated noise backgrounds is predicted by this reliability-weighted TM (RTM) observer. More surprisingly, we find that human performance for natural backgrounds is also predicted by the RTM observer.
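The difference between the TM and RTM observers can be illustrated with a minimal sketch. It assumes independent, zero-mean Gaussian pixel noise whose standard deviation is known at every pixel, which is a simplification: the abstract describes an observer that uses the estimated reliability at each location. The function name `dprime` and the four-pixel stimuli are hypothetical and are not taken from the paper.

```python
import numpy as np


def dprime(template, noise_sd, weighted):
    """Analytic d' of a linear observer for a known additive target.

    Assumes independent zero-mean Gaussian noise with per-pixel standard
    deviation noise_sd (an illustrative assumption, not the paper's stimuli).
    """
    template = np.asarray(template, dtype=float)
    noise_sd = np.asarray(noise_sd, dtype=float)
    # RTM divides the template by the local noise variance; TM uses it as is.
    weights = template / noise_sd**2 if weighted else template
    # Shift in the mean response when the target is added to the background.
    mean_shift = np.sum(weights * template)
    # Standard deviation of the response caused by the pixel noise.
    response_sd = np.sqrt(np.sum((weights * noise_sd) ** 2))
    return float(mean_shift / response_sd)


# Hypothetical 4-pixel target; in the modulated case the last two pixels
# sit under stronger noise than the first two.
template = np.ones(4)
uniform_sd = np.full(4, 2.0)
modulated_sd = np.array([1.0, 1.0, 3.0, 3.0])

tm_uniform = dprime(template, uniform_sd, weighted=False)
rtm_uniform = dprime(template, uniform_sd, weighted=True)
tm_modulated = dprime(template, modulated_sd, weighted=False)
rtm_modulated = dprime(template, modulated_sd, weighted=True)
```

Under these assumptions the two observers coincide when the noise is uniform, and the reliability-weighted observer achieves the higher d' when the noise contrast varies across the target, because it discounts the pixels that are masked most heavily.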

1993 ◽  
Vol 71 (5) ◽  
pp. 926-932 ◽  
Author(s):  
S. D. Turnbull ◽  
J. M. Terhune

Pure-tone hearing thresholds of a harbour seal (Phoca vitulina) were measured in air and underwater using behavioural psychophysical techniques. A 50-ms sinusoidal pulse was presented in both white-noise masked and unmasked situations at pulse repetition rates of 1, 2, 4, and 10/s. Test frequencies were 0.5, 1.0, 2.0, 4.0, and 8.0 kHz in air and 2.0, 4.0, 8.0, and 16.0 kHz underwater. Relative to 1 pulse/s, mean threshold shifts were −1, −3, and −5 dB at 2, 4, and 10 pulses/s, respectively. The threshold shifts from 1 to 10 pulses/s were significant (F = 12.457, df = 2,36, p < 0.001) and there was no difference in the threshold shifts between the masked and unmasked situations (F = 2.585; df = 1,50; p > 0.10). Broadband masking caused by meteorological or industrial sources will closely resemble the white-noise situation. At high calling rates, the numerous overlapping calls of some species (e.g., harp seal, Phoca groenlandica) present virtually continuous "background noise" which also resembles the broadband white-noise masking situation. An implication of lower detection thresholds is that if a seal regularly repeats short vocalizations, the communication range of that call could be increased significantly (80% at 10 pulses/s). This could have important implications during the breeding season should storms or shipping noises occur or when some pinniped species become increasingly vocal and the background noise of conspecifics increases.
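The link between a 5 dB lower threshold and a range increase of roughly 80% can be checked with a short calculation. The abstract does not state the propagation model, so the sketch below assumes spherical spreading (transmission loss of 20 log10 of range) with no absorption; the function name `range_ratio` and the `spreading_coeff` parameter are illustrative only.

```python
def range_ratio(threshold_shift_db, spreading_coeff=20.0):
    """Ratio of new to old detection range after a threshold shift.

    Assumes transmission loss = spreading_coeff * log10(range), i.e.
    spherical spreading by default, and ignores absorption. A negative
    shift (lower threshold) gives a ratio above 1.
    """
    return 10 ** (-threshold_shift_db / spreading_coeff)


# Mean threshold shifts reported in the abstract, keyed by pulses per second.
shifts_db = {2: -1.0, 4: -3.0, 10: -5.0}
percent_increase = {
    rate: 100.0 * (range_ratio(shift) - 1.0) for rate, shift in shifts_db.items()
}
```

With these assumptions the 10 pulses/s case gives an increase of about 78%, close to the 80% quoted in the abstract. A different spreading law would change the figure.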


1991 ◽  
Vol 69 (8) ◽  
pp. 2059-2066 ◽  
Author(s):  
J. M. Terhune

In-air pure tone detection thresholds of a harbour seal (Phoca vitulina) were measured using behavioural psychophysical techniques. Thresholds dropped from about 70 dB re 20 μPa at 0.1 kHz to about 35 dB re 20 μPa at 4 kHz and then increased to about 45 dB re 20 μPa at 16 kHz. Increased sensitivities at 2 and 8 kHz, which have been reported in other pinnipeds, were not evident. In-air intensity detection thresholds averaged 32 dB above their underwater counterparts (1–16 kHz). Masking studies found the critical ratios at 0.25, 0.5, and 1 kHz to be 24, 15, and 21 dB, respectively (white noise masker). From 0.2 to 1.5 kHz, bandwidths 20 dB below the level of pure tone maskers were 0.16–0.18 kHz. Circumstantial evidence suggests the possibility that blood vascular changes associated with diving might also influence the sensitivity of the auditory systems of seals. Under optimal conditions, a pup's airborne cries may be detected by its mother at ranges of 1 km or more.


2010 ◽  
Vol 104 (4) ◽  
pp. 2291-2301 ◽  
Author(s):  
Kiyohiro Maeda ◽  
Hiroki Yamamoto ◽  
Masaki Fukunaga ◽  
Masahiro Umeda ◽  
Chuzo Tanaka ◽  
...  

Metacontrast is a visual illusion in which the visibility of a target stimulus is virtually lost when immediately followed by a nonoverlapping mask stimulus. For a colored target, metacontrast is color-selective, with target visibility markedly reduced when the mask and target are the same color, but only slightly reduced when the colors differ. This study investigated neural correlates of color-selective metacontrast for cone-opponent red and green stimuli in the human V1, V2, and V3 using functional magnetic resonance imaging. Neural activity was suppressed when the target was rendered less visible by the same-colored mask, and the suppression was localized in the cortical region retinotopically representing the target, correlating with the perceptual topography of visibility/invisibility rather than the physical topography of the stimulus. Retinotopy-based group analysis found that activity suppression was statistically significant for V2 and V3 and that its localization to the target region was statistically significant for V2. These results suggest that retinotopic color representations in early visual areas, especially in V2, are closely linked to the visibility of color.


2009 ◽  
Vol 26 (1) ◽  
pp. 109-121 ◽  
Author(s):  
WILSON S. GEISLER ◽  
JEFFREY S. PERRY

Correctly interpreting a natural image requires dealing properly with the effects of occlusion, and hence, contour grouping across occlusions is a major component of many natural visual tasks. To better understand the mechanisms of contour grouping across occlusions, we (a) measured the pair-wise statistics of edge elements from contours in natural images, as a function of edge element geometry and contrast polarity, (b) derived the ideal Bayesian observer for a contour occlusion task where the stimuli were extracted directly from natural images, and then (c) measured human performance in the same contour occlusion task. In addition to discovering new statistical properties of natural contours, we found that naïve human observers closely parallel ideal performance in our contour occlusion task. In fact, there was no region of the four-dimensional stimulus space (three geometry dimensions and one contrast dimension) where humans did not closely parallel the performance of the ideal observer (i.e., efficiency was approximately constant over the entire space). These results reject many other contour grouping hypotheses and strongly suggest that the neural mechanisms of contour grouping are tightly related to the statistical properties of contours in natural images.


1994 ◽  
Vol 72 (11) ◽  
pp. 1863-1866 ◽  
Author(s):  
S. D. Turnbull

The masked pure tone thresholds of a harbour seal (Phoca vitulina) were measured at various angles using a white noise masker. The white noise source was placed at 0°, 30°, 60°, and 90° relative to the midline of the seal's head (0°). The masked pure tone thresholds for each angle were determined at 2, 4, 8, and 16 kHz. As the angle separating the signal and noise sources increased from 0° to 90°, the critical ratios of the harbour seal decreased by 1–4 dB. This shift in masked thresholds from a reference point of 0° azimuth was significant (H = 10.374, df = 3,16, p < 0.05). No significant difference was found in masked thresholds between 0° and 30° or between 60° and 90°. This indicates that if a noise source is separated by more than 30° relative to the location of a vocalizing seal, signal detection thresholds will be enhanced and communication distances increased.


2009 ◽  
Vol 26 (1) ◽  
pp. 93-108 ◽  
Author(s):  
SHENG ZHANG ◽  
CRAIG K. ABBEY ◽  
MIGUEL P. ECKSTEIN

The neural mechanisms driving perception and saccades during search use information about the target but are also based on an inhibitory surround not present in the target luminance profile (e.g., Eckstein et al., 2007). Here, we ask whether these inhibitory surrounds might reflect a strategy that the brain has adapted to optimize the search for targets in natural scenes. To test this hypothesis, we sought to estimate the best linear template (behavioral receptive field), built from linear combinations of Gabor channels representing V1 simple cells in search for an additive Gaussian target embedded in natural images. Statistically nonstationary and non-Gaussian properties of natural scenes preclude calculation of the best linear template from analytic expressions and require an iterative optimization method such as a virtual evolution via a genetic algorithm. Evolved linear receptive fields built from linear combinations of Gabor functions include substantial inhibitory surround, larger than those found in humans performing target search in white noise. The inhibitory surrounds were robust to changes in the contrast of the signal, generalized to a larger calibrated natural image data set, and tasks in which the signal occluded other objects in the image. We show that channel nonlinearities can have strong effects on the observed linear behavioral receptive field but preserve the inhibitory surrounds. Together, the results suggest that the apparent suboptimality of inhibitory surrounds in human behavioral receptive fields when searching for a target in white noise might reflect a strategy to optimize detection of signals in natural scenes. Finally, we contend that optimized linear detection of spatially compact signals in natural images might be a new possible hypothesis, distinct from decorrelation of visual input and sparse representations (e.g., Graham et al., 2006), to explain the evolution of center–surround organization of receptive fields in early vision.


Perception ◽  
1997 ◽  
Vol 26 (1_suppl) ◽  
pp. 289-289
Author(s):  
R Näsänen ◽  
C O'Leary

Using a forced-choice method, we determined human contrast thresholds for recognising handwritten numerals. Digitised numerals were presented on a computer display with additive white static noise. The numerals were either unfiltered or were filtered to two-octave spatial-frequency bands of different centre frequencies varying from 1.2 to 17.7 cycles/object height. We had ten variations of each numeral representing the handwriting of different persons. Human performance was compared with the performance of an ideal ‘signals-known-exactly’ (template matching) observer, and the results were presented in terms of efficiency. The highest efficiency for the band-pass filtered numerals was about 11% at centre frequencies of 3–5 cycles/object. The efficiency declined towards lower and higher centre frequencies so that at 1.2 cycles/object and 18 cycles/object the efficiency was about 4%. The efficiencies for unfiltered numerals were about 10%–14%, being thus slightly higher than or equal to the highest efficiency of the band-pass filtered numerals. If only a two-octave band of spatial frequencies contributed to character recognition, as has been suggested previously, the unfiltered numerals would contain redundant low-frequency and high-frequency information. Band-pass filtered numerals of optimal centre frequency would contain less redundancy, and a larger proportion of contrast energy would be used. Therefore, efficiency for them should have been higher than for unfiltered numerals. Since this was not the case, it seems that human observers are able to use a relatively broad band of spatial frequencies in character recognition.
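The abstract reports efficiencies without giving the formula. A minimal sketch, assuming the conventional definition (the ratio of ideal to human contrast energy at threshold, which equals the squared ratio of contrast thresholds when the stimulus is otherwise identical), is shown below. The function name and the threshold values are hypothetical and are not data from the study.

```python
def efficiency_from_thresholds(ideal_threshold, human_threshold):
    """Efficiency as the ratio of threshold contrast energies.

    Contrast energy scales with contrast squared, so the energy ratio is
    the squared ratio of contrast thresholds. This is the conventional
    definition, assumed here rather than taken from the abstract.
    """
    return (ideal_threshold / human_threshold) ** 2


# Hypothetical thresholds: the human needs three times the ideal contrast.
example_efficiency = efficiency_from_thresholds(0.01, 0.03)
```

Under this definition, a human threshold three times the ideal threshold corresponds to an efficiency of about 11%, the order of magnitude reported for the band-pass filtered numerals.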


2002 ◽  
Vol 205 (4) ◽  
pp. 559-572 ◽  
Author(s):  
Raymond Campan ◽  
Miriam Lehrer

In the present study, the performance of two bee species, the honeybee Apis mellifera and the leaf-cutter bee Megachile rotundata, in discriminating among various closed (convex) shapes was examined systematically for the first time. Bees were trained to each of five different shapes, a disc, a square, a diamond and two different triangles, all of the same area, using fresh bees in each experiment. In subsequent tests, the trained bees were given a choice between the learned shape and each of the other four shapes. Two sets of experiments were conducted with both species. In the first, solid black shapes were presented against a white background, thus providing a high luminance contrast. In the second, the shapes carried a random black-and-white pattern and were presented 5 cm in front of a similar pattern, thus producing motion contrast, rather than luminance contrast, against the background. The results obtained with the solid shapes reveal that both bee species accomplish the discrimination, although the performance of the honeybee is significantly better than that of the leaf-cutter bee. Furthermore, the effectiveness of the various shapes differs between the two species. However, in neither species is the discrimination performance correlated with the amount of overlap of the black areas contained in the various pairs of shapes, suggesting that, in our experiments, shape discrimination is not based on a template-matching process. We propose that it is based on the use of local parameters situated at the outline of the shape, such as the position of angles or acute points and, in particular, the position and orientation of edges. This conclusion is supported by the finding that bees of both species accomplish the discrimination even with the patterned shapes. These shapes are visible only because of the discontinuity of the speed of image motion perceived at the edge between the shape and the background.


2020 ◽  
Vol 10 (10) ◽  
pp. 3423
Author(s):  
Hsiang-Chieh Chen

This article presents an automated vision-based algorithm for the die-scale inspection of wafer images captured using scanning acoustic tomography (SAT). This algorithm can find defective and abnormal die-scale patterns, and produce a wafer map to visualize the distribution of defects and anomalies on the wafer. The main procedures include standard template extraction, die detection through template matching, pattern candidate prediction through clustering, and pattern classification through deep learning. To conduct the template matching, we first introduce a two-step method to obtain a standard template from the original SAT image. Subsequently, a majority of the die patterns are detected through template matching. Thereafter, the columns and rows arranged from the detected dies are predicted using a clustering method; thus, an initial wafer map is produced. This map is composed of detected die patterns and predicted pattern candidates. In the final phase of the proposed algorithm, we implement a deep learning-based model to determine defective and abnormal patterns in the wafer map. The experimental results verified the effectiveness and efficiency of our proposed algorithm. In conclusion, the proposed method performs well in identifying defective and abnormal die patterns, and produces a wafer map that presents important information for solving wafer fabrication issues.
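The die-detection step relies on template matching, but the abstract does not say which similarity measure is used. As an illustration only, the sketch below uses normalized cross-correlation, a common choice, in a brute-force sliding window. The function name `match_template_ncc` and the synthetic image are hypothetical and do not reproduce the article's method or data.

```python
import numpy as np


def match_template_ncc(image, template):
    """Normalized cross-correlation score at every valid template position."""
    th, tw = template.shape
    ih, iw = image.shape
    t = template - template.mean()
    t_norm = np.sqrt(np.sum(t**2))
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for r in range(scores.shape[0]):
        for c in range(scores.shape[1]):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()
            denom = t_norm * np.sqrt(np.sum(p**2))
            # A flat patch or flat template has no defined correlation; score it 0.
            scores[r, c] = np.sum(t * p) / denom if denom > 0 else 0.0
    return scores


# Synthetic test image: random background with one copy of the template
# pasted at row 5, column 7 (hypothetical data, not SAT imagery).
rng = np.random.default_rng(0)
image = rng.normal(size=(20, 20))
template = rng.normal(size=(4, 4))
image[5:9, 7:11] = template

scores = match_template_ncc(image, template)
best = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
```

The position with the highest score is taken as the detection. A practical pipeline would threshold the score map to find many dies at once and would use a faster FFT-based or library implementation.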


2019 ◽  
Vol 2019 ◽  
pp. 1-11 ◽  
Author(s):  
Jianjiang Zhu ◽  
Siquan Yu ◽  
Zhi Han ◽  
Yandong Tang ◽  
Chengdong Wu

Underwater object recognition in sonar images, such as mine detection and wreckage detection of a submerged airplane, is a very challenging task. The main difficulties include but are not limited to object rotation, confusion from false targets and complex backgrounds, and extensibility of recognition ability on diverse types of objects. In this paper, we propose an underwater object detection and recognition method using a transformable template matching approach based on prior knowledge. Specifically, we first extract features and construct a template from sonar video sequences based on the analysis of acoustic shadows and highlight regions. Then, we identify the target region in the objective image by fast saliency detection techniques based on FFT, which can significantly improve efficiency by avoiding an exhaustive global search. After affine transformation of the template according to the orientation of the target, we extract normalized gradient features and calculate the similarity between the template and the target region, which can solve various difficulties mentioned above using only one template. Experimental results demonstrate that the proposed method can well recognize different underwater objects, such as mine-like objects and triangle-like objects and can satisfy the demands of real-time application.

