Online Planning for Target Object Search in Clutter under Partial Observability

Author(s):  
Yuchen Xiao ◽  
Sammie Katt ◽  
Andreas ten Pas ◽  
Shengjian Chen ◽  
Christopher Amato

2018 ◽
Vol 71 (6) ◽  
pp. 1457-1468
Author(s):  
Péter Pongrácz ◽  
András Péter ◽  
Ádám Miklósi

A central problem of behavioural studies providing artificial visual stimuli for non-human animals is to determine how subjects perceive and process these stimuli. Especially in the case of videos, it is important to ascertain that animals perceive the actual content of the images and are not just reacting to the motion cues in the presentation. In this study, we set out to investigate how dogs process life-sized videos. We aimed to find out whether dogs perceive the actual content of video images or whether they only react to the videos as a set of dynamic visual elements. For this purpose, dogs were presented with an object search task where a life-sized projected human was hiding a target object. The videos were either normally oriented or displayed upside down, and we analysed dogs’ reactions towards the projector screen after the video presentations, and their performance in the search task. Results indicated that in the case of the normally oriented videos, dogs spontaneously perceived the actual content of the images. However, the ‘Inverted’ videos were first processed as a set of unrelated visual elements, and only after some exposure to these videos did the dogs show signs of perceiving the unusual configuration of the depicted scene. Our most important conclusion was that dogs process the same type of artificial visual stimuli in different ways, depending on the familiarity of the depicted scene, and that the processing mode can change with exposure to unfamiliar stimuli.


Author(s):  
Пилип Олександрович Приставка ◽  
Дмитро Ігорович Гісь ◽  
Артем Валерійович Чирков

2018 ◽  
Vol 43 (1) ◽  
pp. 123-152 ◽  
Author(s):  
Mohsen Kaboli ◽  
Kunpeng Yao ◽  
Di Feng ◽  
Gordon Cheng

2018 ◽  
Author(s):  
Noam Roth ◽  
Nicole C. Rust

Searching for a specific visual object requires our brain to compare the items in view with a remembered representation of the sought target to determine whether a target match is present. This comparison is thought to be implemented, in part, via the combination of top-down modulations reflecting target identity with feed-forward visual representations. However, it remains unclear whether top-down signals are integrated at a single locus within the ventral visual pathway (e.g. V4) or at multiple stages (e.g. both V4 and inferotemporal cortex, IT). To investigate, we recorded neural responses in V4 and IT as rhesus monkeys performed a task that required them to identify when a target object appeared across variation in position, size and background context. We found non-visual, task-specific signals in both V4 and IT. To evaluate whether V4 was the only locus for the integration of top-down signals, we evaluated several feed-forward accounts of processing from V4 to IT, including a model in which IT preferentially sampled from the best V4 units and a model that allowed for nonlinear IT computation. IT task-specific modulation was not accounted for by any of these feed-forward descriptions, suggesting that during object search, top-down signals are integrated directly within IT. NEW & NOTEWORTHY To find specific objects, the brain must integrate top-down, target-specific signals with visual information about objects in view. However, the exact route of this integration in the ventral visual pathway is unclear. In the first study to systematically compare V4 and IT during an invariant object search task, we demonstrate that top-down signals found in IT cannot be described as being inherited from V4, but rather must be integrated directly within IT itself.


Author(s):  
Zhen Zeng ◽  
Adrian Röfer ◽  
Odest Chadwicke Jenkins

We aim for mobile robots to function in a variety of common human environments, which requires them to efficiently search for previously unseen target objects. We can exploit background knowledge about common spatial relations between landmark objects and target objects to narrow down the search space. In this paper, we propose an active visual object search strategy through the introduction of the Semantic Linking Maps (SLiM) model. SLiM simultaneously maintains the belief over a target object's location as well as landmark objects' locations, while accounting for probabilistic inter-object spatial relations. Based on SLiM, we describe a hybrid search strategy that selects the next-best view pose for searching for the target object based on the maintained belief. We demonstrate the efficiency of our SLiM-based search strategy through comparative experiments in simulated environments, and we further demonstrate its real-world applicability in scenarios with a Fetch mobile manipulation robot.
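
The SLiM paper itself provides no code here, but the loop it describes, maintaining a coupled belief over landmark and target locations, pulling the target belief toward likely landmark locations through a spatial relation, and greedily picking the next view pose from that belief, can be sketched on a toy grid. The grid size, detector rates, Gaussian "near" kernel, and greedy view criterion below are illustrative assumptions, not the published SLiM model.

import numpy as np

# Minimal sketch of a SLiM-style coupled belief (hypothetical parameters and
# names; not the authors' implementation). A landmark observation shifts the
# landmark belief, which in turn shifts the target belief through a "near"
# spatial-relation kernel; the next view is chosen greedily from that belief.
GRID = 20                       # candidate cells along a corridor
NEAR_SIGMA = 2.0                # spread of the "target is near landmark" relation
P_DETECT, P_FALSE = 0.8, 0.05   # assumed detector true/false positive rates

def observe_landmark(belief, cell, seen):
    """Bayes update of the landmark belief after inspecting one cell."""
    hit = P_DETECT if seen else 1 - P_DETECT
    miss = P_FALSE if seen else 1 - P_FALSE
    likelihood = np.where(np.arange(GRID) == cell, hit, miss)
    posterior = belief * likelihood
    return posterior / posterior.sum()

def target_from_landmark(landmark_belief):
    """Fold the landmark belief through the 'near' relation into a target prior."""
    cells = np.arange(GRID)
    prior = np.zeros(GRID)
    for c in range(GRID):
        kernel = np.exp(-0.5 * ((cells - c) / NEAR_SIGMA) ** 2)
        prior += landmark_belief[c] * kernel / kernel.sum()
    return prior / prior.sum()

def next_best_view(target_belief):
    """Greedy criterion: look where the target is currently most probable."""
    return int(np.argmax(target_belief))

# Example episode: spotting the landmark (e.g., a table) at cell 12 pulls the
# target belief toward nearby cells, and the next view pose follows.
landmark_belief = np.full(GRID, 1.0 / GRID)
landmark_belief = observe_landmark(landmark_belief, cell=12, seen=True)
target_belief = target_from_landmark(landmark_belief)
print("next view pose:", next_best_view(target_belief))

A full system would of course reason over continuous robot poses, several relation types, and travel cost as well as belief, but the coupling between the landmark and target beliefs is the mechanism the sketch is meant to convey.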


2019 ◽  
Vol 122 (6) ◽  
pp. 2522-2540 ◽  
Author(s):  
Noam Roth ◽  
Nicole C. Rust

Searching for a specific visual object requires our brain to compare the items in view with a remembered representation of the sought target to determine whether a target match is present. This comparison is thought to be implemented, in part, via the combination of top-down modulations reflecting target identity with feed-forward visual representations. However, it remains unclear whether top-down signals are integrated at a single locus within the ventral visual pathway (e.g., V4) or at multiple stages [e.g., both V4 and inferotemporal cortex (IT)]. To investigate, we recorded neural responses in V4 and IT as rhesus monkeys performed a task that required them to identify when a target object appeared across variation in position, size, and background context. We found nonvisual, task-specific signals in both V4 and IT. To evaluate whether V4 was the only locus for the integration of top-down signals, we evaluated several feed-forward accounts of processing from V4 to IT, including a model in which IT preferentially sampled from the best V4 units and a model that allowed for nonlinear IT computation. IT task-specific modulation was not accounted for by any of these feed-forward descriptions, suggesting that during object search, top-down signals are integrated directly within IT. NEW & NOTEWORTHY To find specific objects, the brain must integrate top-down, target-specific signals with visual information about objects in view. However, the exact route of this integration in the ventral visual pathway is unclear. In the first study to systematically compare V4 and inferotemporal cortex (IT) during an invariant object search task, we demonstrate that top-down signals found in IT cannot be described as being inherited from V4 but rather must be integrated directly within IT itself.
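
The logic of the feed-forward test, asking whether IT's task-specific (target-match) modulation could be inherited from a readout of V4, can be illustrated with a cross-validated regression sketch. The simulated responses, the ridge readout, and the match-modulation index below are stand-in assumptions for illustration, not the authors' recordings or analysis pipeline.

import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

# Toy stand-in for a feed-forward V4 -> IT account (illustrative only; not the
# authors' pipeline). If IT were just a regularized linear readout of V4,
# would the predicted IT responses show the same target-match modulation as
# the "recorded" ones?
rng = np.random.default_rng(0)
n_trials, n_v4, n_it = 400, 60, 40

v4 = rng.normal(size=(n_trials, n_v4))         # V4 population responses
is_match = rng.integers(0, 2, size=n_trials)   # 1 = target-match trial
# Simulated IT: visual drive inherited from V4 plus an extra, IT-specific
# match signal that a purely feed-forward model should fail to capture.
visual_drive = v4 @ rng.normal(size=(n_v4, n_it)) * 0.2
it = visual_drive + 0.8 * is_match[:, None] + rng.normal(size=(n_trials, n_it))

def match_modulation(responses):
    """Mean (match - distractor) response difference, averaged over units."""
    return responses[is_match == 1].mean() - responses[is_match == 0].mean()

# Cross-validated feed-forward prediction of each IT unit from V4.
model = RidgeCV(alphas=np.logspace(-2, 3, 10))
it_pred = np.column_stack([
    cross_val_predict(model, v4, it[:, j], cv=5) for j in range(n_it)
])

print("match modulation, recorded IT:", round(match_modulation(it), 3))
print("match modulation, V4 readout :", round(match_modulation(it_pred), 3))
# A large gap indicates IT modulation that is not inherited from V4 alone.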


1999 ◽  
Author(s):  
Dili Zhang ◽  
Isamu Matsuura ◽  
Yoshihiko Nomura

Author(s):  
Tan Yu ◽  
Jingjing Meng ◽  
Junsong Yuan

This paper addresses the problem of video-level object instance search, which aims to retrieve the videos in a database that contain a given query object instance. Without prior knowledge about "when" and "where" an object of interest may appear in a video, determining "whether" a video contains the target object is computationally prohibitive, as it requires exhaustively matching the query against all possible spatial-temporal locations in each video where an object may appear. To alleviate the computational and memory cost, we propose the Reconstruction-based Object SEarch (ROSE) method. It encodes the huge corpus of features from possible spatial-temporal locations in a video into the parameters of a reconstruction model. Since the memory cost of storing the reconstruction model is much lower than that of storing the features of all possible spatial-temporal locations in the video, the efficiency of the search is significantly boosted. Comprehensive experiments on three benchmark datasets demonstrate the promising performance of the proposed ROSE method.
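
The reconstruction idea can be illustrated with a minimal low-rank sketch: each video's candidate-location features are compressed into a small basis (the "reconstruction model"), and a query object feature is scored by how well that basis reconstructs it, so videos likely to contain the object rank first. The PCA-style basis, feature dimensions, and error-based ranking below are assumptions made for illustration, not the published ROSE formulation.

import numpy as np

# Illustrative sketch of reconstruction-based ranking in the spirit of ROSE
# (assumed low-rank model; not the published method). Each video's
# candidate-location features are compressed into a small orthonormal basis;
# a query feature is scored by its reconstruction error under that basis.
rng = np.random.default_rng(1)
FEATURE_DIM, BASIS_SIZE = 128, 16

def fit_reconstruction_model(features):
    """Compress an (n_locations x FEATURE_DIM) matrix into mean + top components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return features.mean(axis=0), vt[:BASIS_SIZE]

def reconstruction_error(model, query):
    """How poorly the video's stored model reconstructs the query feature."""
    mean, basis = model
    coeffs = basis @ (query - mean)
    recon = mean + basis.T @ coeffs
    return float(np.linalg.norm(query - recon))

# Index two toy "videos": one whose candidate locations include a cluster of
# features close to the query object, and one that does not.
query = rng.normal(size=FEATURE_DIM)
video_with_object = np.vstack([rng.normal(size=(500, FEATURE_DIM)),
                               query + 0.1 * rng.normal(size=(20, FEATURE_DIM))])
video_without_object = rng.normal(size=(520, FEATURE_DIM))

index = {name: fit_reconstruction_model(feats)
         for name, feats in [("video_A", video_with_object),
                             ("video_B", video_without_object)]}

# Rank videos by ascending reconstruction error of the query feature.
ranking = sorted(index, key=lambda name: reconstruction_error(index[name], query))
print("ranking:", ranking)

Only the model parameters (here a mean vector and a few basis vectors per video) need to be stored at search time, which is the memory saving the abstract refers to.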


Perception ◽  
1996 ◽  
Vol 25 (1_suppl) ◽  
pp. 110-110 ◽  
Author(s):  
F N Newell

Previous studies have found that the recognition of familiar objects is dependent on the orientation of the object in the picture plane. Here the time taken to locate rotated objects in the periphery was examined, and eye movements were also recorded. In all experiments, familiar objects were arranged in a clock-face display. In experiment 1, subjects were instructed to locate a match to a central, upright object from amongst a set of randomly rotated objects. The target object was rotated in the frontoparallel plane. Search performance was dependent on rotation, yielding the classic ‘M’ function found in recognition tasks. When matching a single object in the periphery, match times were dependent on the angular deviation between the central and target objects and showed no advantage for the upright (experiment 2). In experiment 3 the central object was shown either upright or rotated by 120° from the upright. The target object was similarly rotated, giving four different match conditions. Distractor objects were aligned with the target object. Search times were faster when the central and target objects were aligned, and also when the central object was rotated and the target was upright. Search times were slower when matching a central upright object to a rotated target object. These results suggest that in simple tasks matching is based on image characteristics. However, in complex search tasks a contribution from the object's representation is made, which gives an advantage to the canonical, upright view in peripheral vision.

