Stimulus Context and View Dependence in Object Recognition

Perception ◽  
1998 ◽  
Vol 27 (1) ◽  
pp. 47-68 ◽  
Author(s):  
Fiona N Newell

The effect of stimulus factors such as interobject similarity and stimulus density on the recognition of objects across changes in view was investigated in five experiments. The recognition of objects across views was found to depend on the degree of interobject similarity and on stimulus density: recognition was view dependent when both interobject similarity and stimulus density were high, irrespective of the familiarity of the target object. However, when stimulus density or interobject similarity was low, recognition was invariant to viewpoint. It was found that recognition was accomplished through view-dependent procedures when discriminability between objects was low. The findings are discussed in terms of an exemplar-based model in which the dimensions used for discriminating between objects are optimised to maximise the differences between the objects. This optimisation process is characterised as a perceptual ‘ruler’ which measures interobject similarity by stretching across objects in representational space. It is proposed that the ‘ruler’ optimises the feature differences between objects in such a way that recognition is view invariant, but that such a process incurs a cost in discriminating between small feature differences, which results in view-dependent recognition performance.
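
As a rough illustration of the exemplar-based account (not the model fitted in the paper), the sketch below reweights feature dimensions by how well they separate a set of stored exemplars and then scores similarity with a weighted distance; the feature values and the exponential similarity function are assumptions made for the example.

```python
import numpy as np
from itertools import combinations

# Illustrative exemplar feature vectors (rows: objects, columns: perceptual dimensions).
exemplars = np.array([
    [0.2, 0.8, 0.1],
    [0.3, 0.7, 0.9],
    [0.9, 0.2, 0.5],
])

# "Stretch the ruler": weight each dimension by how much it separates the stored
# exemplars, so that between-object differences are maximised.
spread = sum(np.abs(a - b) for a, b in combinations(exemplars, 2))
weights = spread / spread.sum()

def similarity(x, y, w=weights, sensitivity=3.0):
    """Exemplar-style similarity: exponential decay of a weighted city-block distance."""
    return np.exp(-sensitivity * np.sum(w * np.abs(x - y)))

# Dimensions with little between-object spread get low weight, so small feature
# differences along them are compressed -- the discrimination cost described above.
print(weights)
print(similarity(exemplars[0], exemplars[1]))
```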

Author(s):  
Michael S. Brickner ◽  
Amir Zvuloni

Thermal imaging (TI) systems transform the distribution of relative temperatures in a scene into a visible TV image. TI images differ significantly from regular TV images. Most TI systems allow their operators to select a preferred polarity, which determines the way in which gray shades represent different temperatures. Polarity may be set to either black hot (BH) or white hot (WH). The present experiments were designed to investigate the effects of polarity on object recognition performance in TI and to compare the object recognition performance of experts and novices. In the first experiment, twenty flight candidates were asked to recognize target objects in 60 dynamic TI recordings taken from two different TI systems. The targets included a variety of human-placed and natural objects. Each subject viewed half the targets in BH and the other half in WH polarity in a balanced experimental design. For 24 of the 60 targets, one direction of polarity produced better performance than the other. Although the direction of superior polarity (BH or WH better) was not consistent, the preferred representation of the target object was very consistent. For example, vegetation was more readily recognized when presented as dark objects on a brighter background. The results are discussed in terms of the importance of surface determinants versus edge determinants in the recognition of TI objects. In the second experiment, the performance of 10 expert TI users was found to be significantly more accurate, but not much faster, than the performance of 20 novice subjects.
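
Polarity selection itself is just a remapping of gray levels; the minimal sketch below inverts an 8-bit grayscale frame to switch between white-hot and black-hot display. The file names are placeholders, and the assumption that the source frame is delivered white-hot is mine, not the authors'.

```python
import cv2

# Load one 8-bit grayscale thermal frame (file name is hypothetical).
frame = cv2.imread("thermal_frame.png", cv2.IMREAD_GRAYSCALE)

# White hot (WH): hotter pixels are brighter -- assumed to be the native output here.
white_hot = frame

# Black hot (BH): invert the gray scale so hotter pixels appear darker.
black_hot = 255 - frame

cv2.imwrite("frame_wh.png", white_hot)
cv2.imwrite("frame_bh.png", black_hot)
```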


2020 ◽  
Author(s):  
Noor Seijdel ◽  
Jessica Loke ◽  
Ron van de Klundert ◽  
Matthew van der Meer ◽  
Eva Quispel ◽  
...  

While feed-forward activity may suffice for recognizing objects in isolation, additional visual operations that aid object recognition might be needed for real-world scenes. One such additional operation is figure-ground segmentation: extracting the relevant features and locations of the target object while ignoring irrelevant features. In this study of 60 participants, we show objects on backgrounds of increasing complexity to investigate whether recurrent computations become increasingly important for segmenting objects from more complex backgrounds. Three lines of evidence show that recurrent processing is critical for recognition of objects embedded in complex scenes. First, behavioral results indicated a greater reduction in performance after masking for objects presented on more complex backgrounds, with the degree of impairment increasing with background complexity. Second, electroencephalography (EEG) measurements showed clear differences in the event-related potentials (ERPs) between conditions around 200 ms, a time point beyond feed-forward activity, and object decoding based on the EEG signal indicated later decoding onsets for objects embedded in more complex backgrounds. Third, deep convolutional neural network performance confirmed this interpretation: feed-forward and shallower networks showed a greater impairment in recognizing objects in complex backgrounds than recurrent and deeper networks. Together, these results support the notion that recurrent computations drive figure-ground segmentation of objects in complex scenes.
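
As a rough stand-in for the network comparison (the paper's actual models and stimuli are not reproduced here), the sketch below runs two ImageNet-pretrained feed-forward CNNs of different depths on the same object composited onto progressively more complex backgrounds; the file names are hypothetical.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Illustrative depth comparison: two ImageNet-pretrained feed-forward CNNs.
shallow = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1).eval()
deep = models.resnet152(weights=models.ResNet152_Weights.IMAGENET1K_V1).eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def top1(model, image):
    """Return the top-1 ImageNet class index for a PIL image."""
    with torch.no_grad():
        logits = model(preprocess(image).unsqueeze(0))
    return logits.argmax(dim=1).item()

# Hypothetical files: the same segmented object on backgrounds of increasing complexity.
for path in ["object_plain.png", "object_medium.png", "object_complex.png"]:
    img = Image.open(path).convert("RGB")
    print(path, "shallow:", top1(shallow, img), "deep:", top1(deep, img))
```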


2019 ◽  
Vol 35 (05) ◽  
pp. 525-533
Author(s):  
Evrim Gülbetekin ◽  
Seda Bayraktar ◽  
Özlenen Özkan ◽  
Hilmi Uysal ◽  
Ömer Özkan

The authors tested face discrimination, face recognition, object discrimination, and object recognition in two face transplantation patients (FTPs) who had had facial injuries since infancy, a patient who had undergone facial surgery due to a recent wound, and two control subjects. In Experiment 1, the authors showed them original faces and morphed forms of those faces and asked them to rate the similarity between the two. In Experiment 2, they showed them old, new, and implicit faces and asked whether they recognized them or not. In Experiment 3, they showed them original objects and morphed forms of those objects and asked them to rate the similarity between the two. In Experiment 4, they showed them old, new, and implicit objects and asked whether they recognized them or not. Object discrimination and object recognition performance did not differ between the FTPs and the controls. However, the face discrimination performance of FTP2 and the face recognition performance of FTP1 were poorer than those of the controls. Therefore, the authors concluded that the structure of the face might affect face processing.


Electronics ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 210 ◽  
Author(s):  
Yi-Chun Du ◽  
Muslikhin Muslikhin ◽  
Tsung-Han Hsieh ◽  
Ming-Shyan Wang

This paper develops a hybrid algorithm combining an adaptive network-based fuzzy inference system (ANFIS) and a region-based convolutional neural network (R-CNN) for stereo vision-based object recognition and manipulation. A stereo camera in an eye-to-hand configuration first captures an image of the target object. The shape, features, and centroid of the object are then estimated. Similar pixels are grouped by image segmentation, and similar regions are merged through selective search. The eye-to-hand calibration is based on ANFIS to reduce the computing burden. A six-degree-of-freedom (6-DOF) robot arm with a gripper is used in experiments to demonstrate the effectiveness of the proposed system.
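
A minimal sketch of the region-proposal stage only, using OpenCV's selective search implementation (requires the opencv-contrib-python package); the input file name is a placeholder, and the R-CNN classifier and ANFIS calibration steps are merely indicated in comments rather than implemented.

```python
import cv2

# Region-proposal step: OpenCV's selective search merges similar segments
# into candidate object regions.
image = cv2.imread("left_camera_frame.png")  # hypothetical stereo frame

ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(image)
ss.switchToSelectiveSearchFast()       # trade proposal quality for speed
rects = ss.process()                   # (x, y, w, h) candidate regions

# In the full pipeline, each candidate would be classified by the R-CNN and the
# chosen region's centroid handed to the ANFIS-based eye-to-hand calibration.
for (x, y, w, h) in rects[:100]:       # draw the first 100 proposals
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 1)

cv2.imwrite("proposals.png", image)
```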


2013 ◽  
Vol 2 (2) ◽  
pp. 66-79 ◽  
Author(s):  
Onsy A. Abdel Alim ◽  
Amin Shoukry ◽  
Neamat A. Elboughdadly ◽  
Gehan Abouelseoud

In this paper, a pattern recognition module that makes use of 3-D images of objects is presented. The proposed module takes advantage of both the generalization capability of neural networks and the possibility of manipulating 3-D images to generate views of the to-be-recognized object at different poses. This allows the construction of a robust 3-D object recognition module that can find use in various domains, including military, biomedical, and mine detection applications. The paper proposes an efficient training procedure and decision-making strategy for the suggested neural network. Sample results of testing the module on 3-D images of several objects are also included, along with an insightful discussion of the implications of the results.
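
To make the idea of generating views at different poses concrete, the sketch below rotates a stand-in 3-D point cloud about the vertical axis and renders crude orthographic depth images that could serve as training views; the cloud, image size, and rendering are assumptions for illustration, not the paper's data or method.

```python
import numpy as np

def rotation_y(theta):
    """Rotation matrix about the vertical (y) axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def render_depth(points, size=32):
    """Very crude orthographic depth image of a 3-D point cloud (illustrative only)."""
    img = np.full((size, size), np.inf)
    xy = points[:, :2]
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    px = ((xy - lo) / (hi - lo + 1e-9) * (size - 1)).astype(int)
    for (u, v), z in zip(px, points[:, 2]):
        img[v, u] = min(img[v, u], z)      # keep the nearest surface point per pixel
    img[np.isinf(img)] = 0.0
    return img

# Hypothetical object: a random point cloud standing in for a 3-D scan.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(500, 3))

# Generate training views at different poses; each (depth image, label) pair
# would feed the neural-network classifier described in the paper.
views = [render_depth(cloud @ rotation_y(np.deg2rad(a)).T) for a in range(0, 360, 30)]
print(len(views), views[0].shape)
```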


Perception ◽  
1997 ◽  
Vol 26 (1_suppl) ◽  
pp. 33-33
Author(s):  
G M Wallis ◽  
H H Bülthoff

The view-based approach to object recognition supposes that objects are stored as a series of associated views. Although representation of these views as combinations of 2-D features allows generalisation to similar views, it remains unclear how very different views might be associated together to allow recognition from any viewpoint. One cue present in the real world, other than spatial similarity, is that we usually experience different objects in a temporally constrained, coherent order, and not as randomly ordered snapshots. In a series of recent neural-network simulations, Wallis and Baddeley (1997, Neural Computation, 9, 883-894) describe how the association of views on the basis of temporal as well as spatial correlations is both theoretically advantageous and biologically plausible. We describe an experiment aimed at testing their hypothesis in human object-recognition learning. We investigated recognition performance for faces previously presented in sequences. These sequences consisted of five views of five different people's faces, presented in orderly sequence from left to right profile in 45° steps. According to the temporal-association hypothesis, the visual system should associate the images together and represent them as different views of the same person's face, although in truth they are images of different people's faces. In a same/different task, subjects were asked to say whether two faces seen from different viewpoints were views of the same person or not. In accordance with the theory, discrimination errors increased for those faces seen earlier in the same sequence as compared with those faces which were not (p < 0.05).
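
The temporal-association mechanism can be illustrated with a Foldiak-style trace learning rule of the kind the Wallis and Baddeley simulations build on; the sketch below is schematic (one output unit, random vectors standing in for successive face views, made-up parameters), not their network.

```python
import numpy as np

# Schematic trace-rule learner: temporally adjacent inputs get bound to the
# same output unit because the Hebbian update uses a decaying activity trace.
rng = np.random.default_rng(1)
n_inputs, n_views = 50, 5
views = rng.random((n_views, n_inputs))       # successive views in one sequence

w = rng.normal(scale=0.01, size=n_inputs)     # weights of one output unit
eta, lr = 0.8, 0.05                           # trace persistence, learning rate
trace = 0.0

for x in views:                               # views arrive in temporal order
    y = max(0.0, w @ x)                       # rectified output
    trace = eta * trace + (1.0 - eta) * y     # decaying memory of recent activity
    w += lr * trace * x                       # update uses the trace, so adjacent
                                              # views strengthen the same unit
    w /= np.linalg.norm(w)                    # keep the weight vector bounded

print(w[:5])
```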


2011 ◽  
Vol 2 (2) ◽  
pp. 207-226
Author(s):  
LYDIA SÁNCHEZ ◽  
MANUEL CAMPOS

Puzzles concerning attitude reports are at the origin of traditional theories of content. According to most of these theories, content has to involve some sort of conceptual entities, like senses, which determine reference. Conceptual views, however, have been challenged by direct reference theories and informational perspectives on content. In this paper we lay out the central elements of the most relevant strategies for solving cognitive puzzles. We then argue that the best solution available to those who maintain a view of content as truth conditions is to abandon the idea that content is the only element of mental attitudes that can make a difference to the truth value of attitude reports. Finally, we appeal to means of recognizing objects as one obvious element that helps explain differences in attitudes.


2015 ◽  
Vol 27 (4) ◽  
pp. 787-797 ◽  
Author(s):  
Matthias Guggenmos ◽  
Marcus Rothkirch ◽  
Klaus Obermayer ◽  
John-Dylan Haynes ◽  
Philipp Sterzer

Perceptual learning is the improvement in perceptual performance through training or exposure. Here, we used fMRI before and after extensive behavioral training to investigate the effects of perceptual learning on the recognition of objects under challenging viewing conditions. Objects belonged to either trained or untrained categories. Trained categories were further subdivided into trained and untrained exemplars and were coupled with high or low monetary rewards during training. After three days of training, object recognition was markedly improved. Although there was considerable transfer of learning to untrained exemplars within categories, an enhancing effect of reward reinforcement was specific to trained exemplars. fMRI showed that hippocampal responses to both trained and untrained exemplars of trained categories were enhanced by perceptual learning and correlated with the effect of reward reinforcement. Our results suggest a key role of the hippocampus in object recognition after perceptual learning.

