Visual target-object search using Gabor transform

1999 ◽  
Author(s):  
Dili Zhang ◽  
Isamu Matsuura ◽  
Yoshihiko Nomura


2018 ◽
Vol 71 (6) ◽  
pp. 1457-1468
Author(s):  
Péter Pongrácz ◽  
András Péter ◽  
Ádám Miklósi

A central problem of behavioural studies providing artificial visual stimuli for non-human animals is to determine how subjects perceive and process these stimuli. Especially in the case of videos, it is important to ascertain that animals perceive the actual content of the images and are not just reacting to the motion cues in the presentation. In this study, we set out to investigate how dogs process life-sized videos. We aimed to find out whether dogs perceive the actual content of video images or whether they only react to the videos as a set of dynamic visual elements. For this purpose, dogs were presented with an object search task where a life-sized projected human was hiding a target object. The videos were either normally oriented or displayed upside down, and we analysed dogs’ reactions towards the projector screen after the video presentations, and their performance in the search task. Results indicated that in the case of the normally oriented videos, dogs spontaneously perceived the actual content of the images. However, the ‘Inverted’ videos were first processed as a set of unrelated visual elements, and only after some exposure to these videos did the dogs show signs of perceiving the unusual configuration of the depicted scene. Our most important conclusion was that dogs process the same type of artificial visual stimuli in different ways, depending on the familiarity of the depicted scene, and that the processing mode can change with exposure to unfamiliar stimuli.


2009 ◽  
Vol 101 (4) ◽  
pp. 1699-1704 ◽  
Author(s):  
Jeremiah Y. Cohen ◽  
Richard P. Heitz ◽  
Geoffrey F. Woodman ◽  
Jeffrey D. Schall

Visual search for a target object among distractors often takes longer when more distractors are present. To understand the neural basis of this capacity limitation, we recorded activity from visually responsive neurons in the frontal eye field (FEF) of macaque monkeys searching for a target among distractors defined by form (randomly oriented T or L). To test the hypothesis that the increase in response time with an increasing number of distractors originates in delayed allocation of attention by FEF neurons, we manipulated the number of distractors presented with the search target. When monkeys were presented with more distractors, visual target selection was delayed and neuronal activity was reduced in proportion to the longer response times. These findings indicate that the time taken by FEF neurons to select the target contributes to the variation in visual search efficiency.


Author(s):  
Пилип Олександрович Приставка ◽  
Дмитро Ігорович Гісь ◽  
Артем Валерійович Чирков

2018 ◽  
Vol 43 (1) ◽  
pp. 123-152 ◽  
Author(s):  
Mohsen Kaboli ◽  
Kunpeng Yao ◽  
Di Feng ◽  
Gordon Cheng

2018 ◽  
Author(s):  
Noam Roth ◽  
Nicole C. Rust

Searching for a specific visual object requires our brain to compare the items in view with a remembered representation of the sought target to determine whether a target match is present. This comparison is thought to be implemented, in part, via the combination of top-down modulations reflecting target identity with feed-forward visual representations. However, it remains unclear whether top-down signals are integrated at a single locus within the ventral visual pathway (e.g. V4) or at multiple stages (e.g. both V4 and inferotemporal cortex, IT). To investigate, we recorded neural responses in V4 and IT as rhesus monkeys performed a task that required them to identify when a target object appeared across variation in position, size and background context. We found non-visual, task-specific signals in both V4 and IT. To evaluate whether V4 was the only locus for the integration of top-down signals, we evaluated several feed-forward accounts of processing from V4 to IT, including a model in which IT preferentially sampled from the best V4 units and a model that allowed for nonlinear IT computation. IT task-specific modulation was not accounted for by any of these feed-forward descriptions, suggesting that during object search, top-down signals are integrated directly within IT. NEW & NOTEWORTHY To find specific objects, the brain must integrate top-down, target-specific signals with visual information about objects in view. However, the exact route of this integration in the ventral visual pathway is unclear. In the first study to systematically compare V4 and IT during an invariant object search task, we demonstrate that top-down signals found in IT cannot be described as being inherited from V4, but rather must be integrated directly within IT itself.


Author(s):  
Zhen Zeng ◽  
Adrian Röfer ◽  
Odest Chadwicke Jenkins

We aim for mobile robots to function in a variety of common human environments, which requires them to efficiently search for previously unseen target objects. We can exploit background knowledge about common spatial relations between landmark objects and target objects to narrow down the search space. In this paper, we propose an active visual object search strategy through the introduction of the Semantic Linking Maps (SLiM) model. SLiM simultaneously maintains the belief over a target object's location as well as landmark objects' locations, while accounting for probabilistic inter-object spatial relations. Based on SLiM, we describe a hybrid search strategy that selects the next-best-view pose for searching for the target object based on the maintained belief. We demonstrate the efficiency of our SLiM-based search strategy through comparative experiments in simulated environments. We further demonstrate the real-world applicability of SLiM-based search in scenarios with a Fetch mobile manipulation robot.
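As a rough illustration of the kind of belief-driven, next-best-view loop such a strategy implies, the sketch below maintains a probability map over a discretized grid of candidate target locations, ties it to a landmark through a simple "near" relation, and greedily picks the view covering the most target probability. The grid size, Gaussian relation, candidate-view sampler, and noisy-detector update are illustrative assumptions, not details taken from the SLiM paper.

```python
import numpy as np

# Hypothetical, simplified illustration of a belief-driven next-best-view loop.
# Grid cells index candidate target locations; beliefs are probability maps.
GRID = 20
rng = np.random.default_rng(0)

# Made-up landmark location and a Gaussian "near the landmark" prior on the target.
landmark = (12, 7)
rows, cols = np.indices((GRID, GRID))
dist2 = (rows - landmark[0]) ** 2 + (cols - landmark[1]) ** 2
target_belief = np.exp(-dist2 / (2 * 3.0 ** 2))
target_belief /= target_belief.sum()

def candidate_views(n=8, fov=5):
    """Random candidate view poses, each observing a square region of the grid."""
    corners = rng.integers(0, GRID - fov, size=(n, 2))
    return [(int(r), int(c), fov) for r, c in corners]

def expected_mass(belief, pose):
    """Target probability mass covered by a candidate view."""
    r, c, fov = pose
    return belief[r:r + fov, c:c + fov].sum()

def next_best_view(belief, poses):
    """Greedy criterion: pick the view covering the most target probability."""
    return max(poses, key=lambda p: expected_mass(belief, p))

def update_belief(belief, pose, detected, miss_rate=0.1):
    """Bayesian update of the target belief after a noisy observation."""
    r, c, fov = pose
    likelihood = np.full_like(belief, miss_rate if detected else 1.0)
    likelihood[r:r + fov, c:c + fov] = (1.0 - miss_rate) if detected else miss_rate
    posterior = belief * likelihood
    return posterior / posterior.sum()

# One step of the search loop: choose a view, fail to detect, update the belief.
pose = next_best_view(target_belief, candidate_views())
target_belief = update_belief(target_belief, pose, detected=False)
```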


2019 ◽  
Vol 122 (6) ◽  
pp. 2522-2540 ◽  
Author(s):  
Noam Roth ◽  
Nicole C. Rust

Searching for a specific visual object requires our brain to compare the items in view with a remembered representation of the sought target to determine whether a target match is present. This comparison is thought to be implemented, in part, via the combination of top-down modulations reflecting target identity with feed-forward visual representations. However, it remains unclear whether top-down signals are integrated at a single locus within the ventral visual pathway (e.g., V4) or at multiple stages [e.g., both V4 and inferotemporal cortex (IT)]. To investigate, we recorded neural responses in V4 and IT as rhesus monkeys performed a task that required them to identify when a target object appeared across variation in position, size, and background context. We found nonvisual, task-specific signals in both V4 and IT. To evaluate whether V4 was the only locus for the integration of top-down signals, we evaluated several feed-forward accounts of processing from V4 to IT, including a model in which IT preferentially sampled from the best V4 units and a model that allowed for nonlinear IT computation. IT task-specific modulation was not accounted for by any of these feed-forward descriptions, suggesting that during object search, top-down signals are integrated directly within IT. NEW & NOTEWORTHY To find specific objects, the brain must integrate top-down, target-specific signals with visual information about objects in view. However, the exact route of this integration in the ventral visual pathway is unclear. In the first study to systematically compare V4 and inferotemporal cortex (IT) during an invariant object search task, we demonstrate that top-down signals found in IT cannot be described as being inherited from V4 but rather must be integrated directly within IT itself.
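To make the logic of such a feed-forward "inheritance" test concrete, here is a minimal sketch: fit a weighted sum of V4 responses to an IT unit and ask whether the V4-based prediction reproduces the IT unit's target-match modulation on held-out trials. The simulated data, unit counts, plain least-squares fit, and modulation measure are assumptions for illustration only; they are not the analysis pipeline used in the paper.

```python
import numpy as np
from numpy.linalg import lstsq

# Hypothetical sketch of one feed-forward "inheritance" test: can an IT unit's
# response be predicted as a weighted sum of V4 responses, and does that
# prediction carry the same target-match modulation as the IT unit itself?
rng = np.random.default_rng(1)
n_trials, n_v4 = 400, 60
v4 = rng.normal(size=(n_trials, n_v4))            # simulated V4 population responses
is_match = rng.integers(0, 2, size=n_trials)      # 1 = target match, 0 = distractor

# Simulated IT unit: partly V4-driven, plus an extra match signal absent from V4.
w_true = rng.normal(size=n_v4)
it = v4 @ w_true + 1.5 * is_match + rng.normal(scale=0.5, size=n_trials)

# Fit the feed-forward model on half the trials, evaluate on the other half.
train = np.arange(n_trials) < n_trials // 2
test = ~train
w_hat, *_ = lstsq(v4[train], it[train], rcond=None)
it_pred = v4[test] @ w_hat

def match_modulation(responses, labels):
    """Mean response difference between target-match and distractor trials."""
    return responses[labels == 1].mean() - responses[labels == 0].mean()

observed = match_modulation(it[test], is_match[test])
predicted = match_modulation(it_pred, is_match[test])
print(f"IT match modulation: {observed:.2f}; V4-predicted: {predicted:.2f}")
# A large gap suggests IT task modulation that is not inherited from V4.
```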


Author(s):  
Tan Yu ◽  
Jingjing Meng ◽  
Junsong Yuan

This paper addresses the problem of video-level object instance search, which aims to retrieve the videos in a database that contain a given query object instance. Without prior knowledge about "when" and "where" an object of interest may appear in a video, determining "whether" a video contains the target object is computationally prohibitive, as it requires exhaustively matching the query against all possible spatial-temporal locations in each video where an object may appear. To alleviate the computational and memory cost, we propose the Reconstruction-based Object SEarch (ROSE) method. It encodes the huge corpus of features from all possible spatial-temporal locations in a video into the parameters of a reconstruction model. Since the memory cost of storing the reconstruction model is much less than that of storing the features of all possible spatial-temporal locations in the video, the efficiency of the search is significantly boosted. Comprehensive experiments on three benchmark datasets demonstrate the promising performance of the proposed ROSE method.
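The sketch below illustrates the general reconstruction-based idea under simplifying assumptions (synthetic features and a linear, SVD-based reconstruction basis); it is not the paper's actual ROSE model. Each video's many location features are compressed into a small set of basis parameters, and a query is scored by how well that compact model reconstructs it.

```python
import numpy as np

# Illustrative sketch only: compress a video's many spatial-temporal features
# into a small linear reconstruction basis, then score a query by how well the
# compact model reconstructs it. Dimensions and the SVD basis are assumptions.
rng = np.random.default_rng(2)
feat_dim, n_locations, n_components = 128, 5000, 16

# Features of all candidate spatial-temporal locations in one video (too large
# to keep around for every video at search time).
video_feats = rng.normal(size=(n_locations, feat_dim))

# "Train" the reconstruction model: keep only the top principal directions.
# Storing (n_components x feat_dim) is far cheaper than (n_locations x feat_dim).
mean = video_feats.mean(axis=0)
_, _, vt = np.linalg.svd(video_feats - mean, full_matrices=False)
basis = vt[:n_components]                  # the reconstruction model's parameters

def reconstruction_error(query, basis, mean):
    """How poorly the video's compact model explains the query feature."""
    centered = query - mean
    recon = (centered @ basis.T) @ basis
    return float(np.linalg.norm(centered - recon))

# At search time, rank videos by how well their models reconstruct the query:
# lower error suggests the video is more likely to contain the query object.
query_feat = rng.normal(size=feat_dim)
score = reconstruction_error(query_feat, basis, mean)
```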

