Look and Think Twice: Capturing Top-Down Visual Attention with Feedback Convolutional Neural Networks

Author(s):  
Chunshui Cao
Xianming Liu
Yi Yang
Yinan Yu
Jiang Wang
...  
Author(s):  
Abraham Montoya Obeso
Jenny Benois-Pineau
Mireya Sarai Garcia Vazquez
Alejandro A. Ramirez Acosta

2021
Vol 15
Author(s):  
Taicheng Huang
Zonglei Zhen
Jia Liu

Humans can not only effortlessly recognize objects, but also characterize object categories into semantic concepts with a nested hierarchical structure. One dominant view is that top-down conceptual guidance is necessary to form such a hierarchy. Here we challenged this idea by examining whether deep convolutional neural networks (DCNNs) could learn relations among objects purely from bottom-up perceptual experience of objects through training for object categorization. Specifically, we explored representational similarity among objects in a typical DCNN (AlexNet) and found that representations of object categories were organized in a hierarchical fashion, suggesting that the relatedness among objects emerged automatically when learning to recognize them. Critically, the relatedness of objects that emerged in the DCNN was highly similar to WordNet in humans, implying that top-down conceptual guidance may not be a prerequisite for humans to learn the relatedness among objects. In addition, the developmental trajectory of the relatedness among objects during training revealed that the hierarchical structure was constructed in a coarse-to-fine fashion and matured before object recognition ability was established. Finally, the fineness of the relatedness was strongly shaped by the demands of the task the DCNN performed: the more superordinate the level of object classification, the coarser the hierarchical structure of relatedness that emerged. Taken together, our study provides the first empirical evidence that semantic relatedness among objects emerged as a by-product of object recognition in DCNNs, implying that humans may acquire semantic knowledge about objects without explicit top-down conceptual guidance.
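As an illustration of the kind of analysis this abstract describes, the sketch below (not the authors' code; the input format, layer choice, and clustering settings are assumptions) extracts penultimate-layer AlexNet features per category, averages them into category prototypes, and clusters the prototypes hierarchically:

```python
# A minimal sketch, assuming preprocessed ImageNet-style inputs, of probing
# category relatedness in a pretrained AlexNet via hierarchical clustering
# of category-prototype representations.
import torch
import torchvision.models as models
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import pdist

model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()

def category_embeddings(images_by_category):
    """images_by_category: dict mapping category name -> tensor of shape
    (N, 3, 224, 224) of preprocessed images (hypothetical input format)."""
    names, vecs = [], []
    with torch.no_grad():
        for name, imgs in images_by_category.items():
            feats = model.features(imgs)               # conv features
            feats = torch.flatten(model.avgpool(feats), 1)
            feats = model.classifier[:-1](feats)       # penultimate (fc7) layer
            names.append(name)
            vecs.append(feats.mean(dim=0))             # category prototype
    return names, torch.stack(vecs)

# Agglomerative clustering of the prototypes: the resulting dendrogram
# approximates the emergent object hierarchy described in the abstract.
# names, vecs = category_embeddings(my_images)   # my_images is hypothetical
# Z = linkage(pdist(vecs.numpy(), metric="correlation"), method="average")
# dendrogram(Z, labels=names)
```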


2018
Vol 77 (22)
pp. 29231-29244
Author(s):  
Meijun Sun
Ziqi Zhou
Dong Zhang
Zheng Wang

Author(s):  
Ziyu Liu
Alexander McClung
Henry W. F. Yeung
Yuk Ying Chung
Seid Miad Zandavi

Author(s):  
L. Hashemi-Beni
A. Gebrehiwot

Abstract. This research examines the ability of deep learning methods to classify remote sensing images for agricultural applications. U-net and FCN-8s convolutional neural network models are fine-tuned and tested for crop/weed classification. The dataset for this study comprises 60 top-down images of an organic carrot field, collected by an autonomous vehicle and labeled by experts. Trained on the 60 images, the FCN-8s model achieved 75.1% accuracy in detecting weeds, compared to 66.72% for U-net. However, the U-net model performed better at detecting crops, with 60.48% accuracy compared to 47.86% for FCN-8s.
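As a rough illustration of this setup, the sketch below fine-tunes a segmentation head for three assumed classes (background, crop, weed) and scores per-class detection accuracy as pixel recall. Note the assumptions: torchvision's FCN-ResNet50 stands in for the paper's FCN-8s, the label scheme is hypothetical, and the recall metric is one plausible reading of the reported "detection accuracy":

```python
# A minimal sketch, not the paper's pipeline: fine-tune a segmentation model
# for crop/weed classification and compute per-class pixel recall.
import torch
import torch.nn as nn
from torchvision.models.segmentation import fcn_resnet50, FCN_ResNet50_Weights

NUM_CLASSES = 3  # 0 = background, 1 = crop, 2 = weed (assumed label scheme)

model = fcn_resnet50(weights=FCN_ResNet50_Weights.DEFAULT)
# Replace the final 1x1 conv so the head predicts our 3 classes.
model.classifier[4] = nn.Conv2d(512, NUM_CLASSES, kernel_size=1)

def per_class_recall(pred, target, cls):
    """Fraction of ground-truth pixels of class `cls` predicted correctly."""
    mask = target == cls
    return ((pred == cls) & mask).sum().float() / mask.sum().clamp(min=1)

# Fine-tuning step (images: (B,3,H,W) float, labels: (B,H,W) long; hypothetical)
# out = model(images)["out"]
# loss = nn.functional.cross_entropy(out, labels)
# loss.backward(); optimizer.step()
# weed_recall = per_class_recall(out.argmax(1), labels, cls=2)
```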


2020
Author(s):  
Taicheng Huang
Zonglei Zhen
Jia Liu

Abstract
Humans can not only effortlessly recognize objects, but also characterize object categories into semantic concepts and construct nested hierarchical structures. Similarly, deep convolutional neural networks (DCNNs) can learn to recognize objects as well as humans can; yet it is unclear whether they can learn semantic relatedness among objects that is not provided in the training dataset. This question is important because it may shed light on how humans acquire semantic knowledge about objects without top-down conceptual guidance. To address it, we explored the relations among object categories, indexed by representational similarity, in two typical DCNNs (AlexNet and VGG11). We found that representations of object categories were organized in a hierarchical fashion, suggesting that the relatedness among objects emerged automatically when learning to recognize them. Critically, the relatedness of objects that emerged in the DCNNs was highly similar to WordNet in humans, implying that top-down conceptual guidance may not be a prerequisite for humans to learn the relatedness among objects. Finally, the developmental trajectory of the relatedness among objects during training revealed that the hierarchical structure was constructed in a coarse-to-fine fashion and matured before object recognition ability was established. Taken together, our study provides the first empirical evidence that semantic relatedness among objects emerged as a by-product of object recognition, implying that humans may acquire semantic knowledge about objects without explicit top-down conceptual guidance.

Significance Statement
The origin of semantic concepts is the subject of a long-standing debate, in which top-down conceptual guidance is thought necessary to form the hierarchical structure of objects. Here we challenged this hypothesis by examining whether semantic relatedness among objects can emerge in deep convolutional neural networks (DCNNs) trained for object recognition, with no relational information in the training datasets. We found that in the DCNNs, representations of objects were organized in a hierarchical fashion that was highly similar to WordNet in humans. This finding suggests that top-down conceptual guidance may not be a prerequisite for humans to learn the relatedness among objects; rather, semantic relatedness among objects may emerge as a by-product of object recognition.
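One plausible way to quantify the reported similarity between a DCNN's category structure and WordNet (a sketch under assumptions, not the authors' pipeline) is to build a WordNet path-similarity matrix over the same categories and correlate its off-diagonal entries with the DCNN's representational similarity matrix:

```python
# A minimal sketch comparing a DCNN similarity matrix against WordNet.
# Assumes the categories map to WordNet synset names such as "dog.n.01".
import numpy as np
from nltk.corpus import wordnet as wn   # requires nltk.download("wordnet")
from scipy.stats import spearmanr

def wordnet_similarity_matrix(synset_names):
    """Pairwise WordNet path similarity over the given synset names."""
    syns = [wn.synset(n) for n in synset_names]
    k = len(syns)
    sim = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            sim[i, j] = syns[i].path_similarity(syns[j]) or 0.0
    return sim

def structure_correlation(dcnn_sim, wn_sim):
    """Spearman correlation between the two similarity structures,
    computed over off-diagonal entries only."""
    iu = np.triu_indices_from(dcnn_sim, k=1)
    return spearmanr(dcnn_sim[iu], wn_sim[iu])

# dcnn_sim: e.g. 1 - correlation distance among category prototypes
# rho, p = structure_correlation(dcnn_sim, wordnet_similarity_matrix(cats))
```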


Author(s):  
Xiaoliang Luo
Brett D. Roads
Bradley C. Love

Abstract
People deploy top-down, goal-directed attention to accomplish tasks, such as finding lost keys. By tuning the visual system to relevant information sources, object recognition can become more efficient (a benefit) and more biased toward the target (a potential cost). Motivated by selective attention in categorisation models, we developed a goal-directed attention mechanism that can process naturalistic (photographic) stimuli. Our attention mechanism can be incorporated into any existing deep convolutional neural network (DCNN). The processing stages in DCNNs have been related to the ventral visual stream; in that light, our attention mechanism incorporates top-down influences from the prefrontal cortex (PFC) to support goal-directed behaviour. Akin to how attention weights in categorisation models warp representational spaces, we introduce a layer of attention weights at the mid-level of a DCNN that amplify or attenuate activity to further a goal. We evaluated the attention mechanism using photographic stimuli, varying the attentional target. We found that increasing goal-directed attention has benefits (increasing hit rates) and costs (increasing false alarm rates). At a moderate level, attention improves sensitivity (i.e., increases $d'$) with only a moderate increase in bias for tasks involving standard images, blended images, and natural adversarial images chosen to fool DCNNs. These results suggest that goal-directed attention can reconfigure general-purpose DCNNs to better suit the current task goal, much as PFC modulates activity along the ventral stream. In addition to being more parsimonious and brain-consistent, the mid-level attention approach outperformed a standard machine-learning approach to transfer learning, namely retraining the final network layer to accommodate the new task.
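A minimal sketch of the two ingredients this abstract describes, assuming a channel-wise multiplicative formulation of the attention weights and standard signal-detection formulas (this is not the authors' released implementation):

```python
# Sketch: a learnable mid-level attention layer for a frozen DCNN, plus
# d' (sensitivity) and criterion (bias) from hit and false-alarm rates.
import torch
import torch.nn as nn
from scipy.stats import norm

class ChannelAttention(nn.Module):
    """Per-channel multiplicative weights that amplify or attenuate
    mid-level activity, analogous to attention weights in categorisation
    models (channel-wise form is an assumption)."""
    def __init__(self, n_channels):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_channels))   # start at identity
    def forward(self, x):                               # x: (B, C, H, W)
        return x * self.w.relu().view(1, -1, 1, 1)      # non-negative gains

def d_prime_and_bias(hit_rate, fa_rate):
    """Signal-detection sensitivity d' = z(H) - z(FA) and
    criterion c = -(z(H) + z(FA)) / 2."""
    zh, zf = norm.ppf(hit_rate), norm.ppf(fa_rate)
    return zh - zf, -(zh + zf) / 2

# Usage idea: splice the layer after an assumed mid-level block of a frozen
# network and train only `attn.w` on the target vs. non-target goal task.
# attn = ChannelAttention(n_channels=256)
# mid = attn(backbone_lower(x)); logits = backbone_upper(mid)
print(d_prime_and_bias(0.85, 0.20))   # e.g. a moderate-attention regime
```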

