scholarly journals Semantic Linking Maps for Active Visual Object Search (Extended Abstract)

Author(s):  
Zhen Zeng ◽  
Adrian Röfer ◽  
Odest Chadwicke Jenkins

We aim for mobile robots to function in a variety of common human environments, which requires them to efficiently search previously unseen target objects. We can exploit background knowledge about common spatial relations between landmark objects and target objects to narrow down search space. In this paper, we propose an active visual object search strategy method through our introduction of the Semantic Linking Maps (SLiM) model. SLiM simultaneously maintains the belief over a target object's location as well as landmark objects' locations, while accounting for probabilistic inter-object spatial relations. Based on SLiM, we describe a hybrid search strategy that selects the next best view pose for searching for the target object based on the maintained belief. We demonstrate the efficiency of our SLiM-based search strategy through comparative experiments in simulated environments. We further demonstrate the real-world applicability of SLiM-based search in scenarios with a Fetch mobile manipulation robot.

2008 ◽  
Vol 17 (02) ◽  
pp. 303-320 ◽  
Author(s):  
WEI SONG ◽  
BINGRU YANG ◽  
ZHANGYAN XU

Because of the inherent computational complexity, mining the complete frequent item-set in dense datasets remains to be a challenging task. Mining Maximal Frequent Item-set (MFI) is an alternative to address the problem. Set-Enumeration Tree (SET) is a common data structure used in several MFI mining algorithms. For this kind of algorithm, the process of mining MFI's can also be viewed as the process of searching in set-enumeration tree. To reduce the search space, in this paper, a new algorithm, Index-MaxMiner, for mining MFI is proposed by employing a hybrid search strategy blending breadth-first and depth-first. Firstly, the index array is proposed, and based on bitmap, an algorithm for computing index array is presented. By adding subsume index to frequent items, Index-MaxMiner discovers the candidate MFI's using breadth-first search at one time, which avoids first-level nodes that would not participate in the answer set and reduces drastically the number of candidate itemsets. Then, for candidate MFI's, depth-first search strategy is used to generate all MFI's. Thus, the jumping search in SET is implemented, and the search space is reduced greatly. The experimental results show that the proposed algorithm is efficient especially for dense datasets.


2018 ◽  
Author(s):  
Noam Roth ◽  
Nicole C. Rust

ABSTRACTSearching for a specific visual object requires our brain to compare the items in view with a remembered representation of the sought target to determine whether a target match is present. This comparison is thought to be implemented, in part, via the combination of top-down modulations reflecting target identity with feed-forward visual representations. However, it remains unclear whether top-down signals are integrated at a single locus within the ventral visual pathway (e.g. V4) or at multiple stages (e.g. both V4 and inferotemporal cortex, IT). To investigate, we recorded neural responses in V4 and IT as rhesus monkeys performed a task that required them to identify when a target object appeared across variation in position, size and background context. We found non-visual, task-specific signals in both V4 and IT. To evaluate whether V4 was the only locus for the integration of top-down signals, we evaluated several feed-forward accounts of processing from V4 to IT, including a model in which IT preferentially sampled from the best V4 units and a model that allowed for nonlinear IT computation. IT task-specific modulation was not accounted for by any of these feed-forward descriptions, suggesting that during object search, top-down signals are integrated directly within IT.NEW & NOTEWORTHYTo find specific objects, the brain must integrate top-down, target-specific signals with visual information about objects in view. However, the exact route of this integration in the ventral visual pathway is unclear. In the first study to systematically compare V4 and IT during an invariant object search task, we demonstrate that top-down signals found in IT cannot be described as being inherited from V4, but rather must be integrated directly within IT itself.


2019 ◽  
Vol 122 (6) ◽  
pp. 2522-2540 ◽  
Author(s):  
Noam Roth ◽  
Nicole C. Rust

Searching for a specific visual object requires our brain to compare the items in view with a remembered representation of the sought target to determine whether a target match is present. This comparison is thought to be implemented, in part, via the combination of top-down modulations reflecting target identity with feed-forward visual representations. However, it remains unclear whether top-down signals are integrated at a single locus within the ventral visual pathway (e.g., V4) or at multiple stages [e.g., both V4 and inferotemporal cortex (IT)]. To investigate, we recorded neural responses in V4 and IT as rhesus monkeys performed a task that required them to identify when a target object appeared across variation in position, size, and background context. We found nonvisual, task-specific signals in both V4 and IT. To evaluate whether V4 was the only locus for the integration of top-down signals, we evaluated several feed-forward accounts of processing from V4 to IT, including a model in which IT preferentially sampled from the best V4 units and a model that allowed for nonlinear IT computation. IT task-specific modulation was not accounted for by any of these feed-forward descriptions, suggesting that during object search, top-down signals are integrated directly within IT. NEW & NOTEWORTHY To find specific objects, the brain must integrate top-down, target-specific signals with visual information about objects in view. However, the exact route of this integration in the ventral visual pathway is unclear. In the first study to systematically compare V4 and inferotemporal cortex (IT) during an invariant object search task, we demonstrate that top-down signals found in IT cannot be described as being inherited from V4 but rather must be integrated directly within IT itself.


Author(s):  
Gwendolyn Rehrig ◽  
Reese A. Cullimore ◽  
John M. Henderson ◽  
Fernanda Ferreira

Abstract According to the Gricean Maxim of Quantity, speakers provide the amount of information listeners require to correctly interpret an utterance, and no more (Grice in Logic and conversation, 1975). However, speakers do tend to violate the Maxim of Quantity often, especially when the redundant information improves reference precision (Degen et al. in Psychol Rev 127(4):591–621, 2020). Redundant (non-contrastive) information may facilitate real-world search if it narrows the spatial scope under consideration, or improves target template specificity. The current study investigated whether non-contrastive modifiers that improve reference precision facilitate visual search in real-world scenes. In two visual search experiments, we compared search performance when perceptually relevant, but non-contrastive modifiers were included in the search instruction. Participants (NExp. 1 = 48, NExp. 2 = 48) searched for a unique target object following a search instruction that contained either no modifier, a location modifier (Experiment 1: on the top left, Experiment 2: on the shelf), or a color modifier (the black lamp). In Experiment 1 only, the target was located faster when the verbal instruction included either modifier, and there was an overall benefit of color modifiers in a combined analysis for scenes and conditions common to both experiments. The results suggest that violations of the Maxim of Quantity can facilitate search when the violations include task-relevant information that either augments the target template or constrains the search space, and when at least one modifier provides a highly reliable cue. Consistent with Degen et al. (2020), we conclude that listeners benefit from non-contrastive information that improves reference precision, and engage in rational reference comprehension. Significance statement This study investigated whether providing more information than someone needs to find an object in a photograph helps them to find that object more easily, even though it means they need to interpret a more complicated sentence. Before searching a scene, participants were either given information about where the object would be located in the scene, what color the object was, or were only told what object to search for. The results showed that providing additional information helped participants locate an object in an image more easily only when at least one piece of information communicated what part of the scene the object was in, which suggests that more information can be beneficial as long as that information is specific and helps the recipient achieve a goal. We conclude that people will pay attention to redundant information when it supports their task. In practice, our results suggest that instructions in other contexts (e.g., real-world navigation, using a smartphone app, prescription instructions, etc.) can benefit from the inclusion of what appears to be redundant information.


Author(s):  
Fergal McGrath ◽  
Rebecca Purcell

This chapter introduces external knowledge search strategy as a central element of an organizations overall knowledge management strategy. The argument cites how knowledge management has developed around a myopic internal focus and has thus far failed to take full account of the many sources of knowledge external to the organization. The chapter offers external knowledge search strategy as a means of integrating this external focus into knowledge management understanding, by providing a conceptual framework for organizations involved in the external knowledge management activity of external knowledge search. The framework identifies 10 search paths organizations may follow into the search space, four of which relate exclusively to external knowledge search. The authors hope that establishing an external element within knowledge management strategy will inform knowledge management’s recognition of the value of the extended enterprise.


Perception ◽  
2018 ◽  
Vol 47 (9) ◽  
pp. 966-975 ◽  
Author(s):  
Shinyoung Jung ◽  
Yosun Yoon ◽  
Suk Won Han

People’s attention is well attracted to stimuli matching their working memory. This memory-driven attentional capture has been demonstrated in simplified and controlled laboratory settings. The present study investigated whether working memory contents capture attention in a setting that closely resembles real-world environment. In the experiment, participants performed a task of searching for a target object in real-world indoor scenes, while maintaining a visual object in working memory. To create a setting similar to real-world environment, images taken from IKEA®’s online catalogue were used. The results showed that participants’ attention was biased toward a working memory-matching object, interfering with the target search. This was so even when participants did not expect that a memory-matching stimulus would appear in the search array. These results suggest that working memory can bias attention in complex, natural environment and this memory-driven attentional capture in real-world setting takes place in an automatic manner.


Proceedings ◽  
2020 ◽  
Vol 39 (1) ◽  
pp. 18
Author(s):  
Nenchoo ◽  
Tantrairatn

This paper presents an estimation of 3D UAV position in real-time condition by using Intel RealSense Depth camera D435i with visual object detection technique as a local positioning system for indoor environment. Nowadays, global positioning system or GPS is able to specify UAV position for outdoor environment. However, for indoor environment GPS hasn’t a capability to determine UAV position. Therefore, Depth stereo camera D435i is proposed to observe on ground to specify UAV position for indoor environment instead of GPS. Using deep learning for object detection to identify target object with depth camera to specifies 2D position of target object. In addition, depth position is estimated by stereo camera and target size. For experiment, Parrot Bebop2 as a target object is detected by using YOLOv3 as a real-time object detection system. However, trained Fully Convolutional Neural Networks (FCNNs) model is considerably significant for object detection, thus the model has been trained for bebop2 only. To conclude, this proposed system is able to specifies 3D position of bebop2 for indoor environment. For future work, this research will be developed and apply for visualized navigation control of drone swarm.


Sign in / Sign up

Export Citation Format

Share Document