SPANet: Spatial Pyramid Attention Network for Enhanced Image Recognition

Author(s):  
Jingda Guo ◽  
Xu Ma ◽  
Andrew Sansom ◽  
Mara McGuire ◽  
Andrew Kalaani ◽  
...  
2019 ◽  
Vol 31 (12) ◽  
pp. 9295-9305 ◽  
Author(s):  
Jiaxu Leng ◽  
Ying Liu ◽  
Shang Chen

2017 ◽  
Vol 15 (3) ◽  
pp. 617-629 ◽  
Author(s):  
Xiaoning Zhu ◽  
Qingyue Meng ◽  
Lize Gu

Author(s):  
Xuefeng Hu ◽  
Zhihan Zhang ◽  
Zhenye Jiang ◽  
Syomantak Chaudhuri ◽  
Zhenheng Yang ◽  
...  

2021 ◽  
Vol 13 (7) ◽  
pp. 1312
Author(s):  
Wei Cui ◽  
Xin He ◽  
Meng Yao ◽  
Ziwei Wang ◽  
Yuanjie Hao ◽  
...  

Pixel-based semantic segmentation methods take pixels as recognition units and are restricted by the limited range of their receptive fields, so they cannot carry richer, higher-level semantics. This reduces the accuracy of remote sensing (RS) semantic segmentation to a certain extent. Compared with pixel-based methods, graph neural networks (GNNs) usually use objects as input nodes, so they not only have relatively low computational complexity but can also carry richer semantic information. However, traditional GNNs rely more on the context information of individual samples and lack geographic prior knowledge that reflects the overall situation of the research area. Therefore, these methods may be disturbed by the confusion of “different objects with the same spectrum” or by “violating the first law of geography” in some areas. To address these problems, we propose a remote sensing semantic segmentation model called the knowledge and spatial pyramid distance-based gated graph attention network (KSPGAT), which is based on prior knowledge, spatial pyramid distance, and a graph attention network (GAT) with a gating mechanism. The model first uses superpixels (geographical objects) to form the nodes of a graph neural network and then uses a novel spatial pyramid distance recognition algorithm to recognize the spatial relationships. Finally, based on the integration of feature similarity and the spatial relationships of geographic objects, a multi-source attention mechanism and a gating mechanism are designed to control the process of node aggregation. As a result, high-level semantics, spatial relationships, and prior knowledge can be introduced into a remote sensing semantic segmentation network. The experimental results show that our model improves the overall accuracy by 4.43% compared with the U-Net network and by 3.80% compared with the baseline GAT network.
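The aggregation idea in this abstract (attention that combines feature similarity with spatial relationships, followed by a gate on node updates) can be illustrated with a minimal sketch. This is not the KSPGAT implementation: the scoring rule (dot-product similarity minus a distance penalty), the scalar sigmoid gate, and the names `attention_weights`, `gated_attention_step`, `alpha`, and `gate_bias` are all assumptions made for illustration, and a plain non-negative distance stands in for the paper's spatial pyramid distance.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(h_i, nbr_feats, nbr_dists, alpha=1.0):
    """Attention over neighbors from two sources (assumed form):
    feature similarity raises a score, spatial distance lowers it."""
    scores = [dot(h_i, h_j) - alpha * d for h_j, d in zip(nbr_feats, nbr_dists)]
    return softmax(scores)


def gated_attention_step(features, neighbors, distances, alpha=1.0, gate_bias=0.0):
    """One aggregation step over a graph whose nodes are superpixels.

    features:  list of feature vectors, one per node
    neighbors: dict mapping node index -> list of neighbor indices
    distances: dict mapping (i, j) -> non-negative spatial distance
               (hypothetical stand-in for the spatial pyramid distance)
    """
    updated = []
    for i, h_i in enumerate(features):
        nbrs = neighbors.get(i, [])
        if not nbrs:
            # Isolated node: nothing to aggregate, keep its features.
            updated.append(list(h_i))
            continue
        nbr_feats = [features[j] for j in nbrs]
        nbr_dists = [distances[(i, j)] for j in nbrs]
        weights = attention_weights(h_i, nbr_feats, nbr_dists, alpha)
        # Attention-weighted sum of neighbor features.
        agg = [sum(w * h_j[k] for w, h_j in zip(weights, nbr_feats))
               for k in range(len(h_i))]
        # Scalar gate (assumed form) deciding how much of the
        # aggregated message replaces the node's own features.
        gate = 1.0 / (1.0 + math.exp(-(dot(h_i, agg) + gate_bias)))
        updated.append([gate * a + (1.0 - gate) * h for a, h in zip(agg, h_i)])
    return updated


if __name__ == "__main__":
    feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    nbrs = {0: [1, 2], 1: [0], 2: []}
    dists = {(0, 1): 0.5, (0, 2): 0.5, (1, 0): 0.5}
    print(gated_attention_step(feats, nbrs, dists))
```

In this sketch, a neighbor with similar features and a small distance receives a larger attention weight, and the gate lets a node keep more of its own representation when the aggregated message disagrees with it.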


2020 ◽  
Vol 34 (07) ◽  
pp. 11815-11822 ◽  
Author(s):  
Boxiao Pan ◽  
Zhangjie Cao ◽  
Ehsan Adeli ◽  
Juan Carlos Niebles

Action recognition has been a widely studied topic with a heavy focus on supervised learning involving sufficient labeled videos. However, the problem of cross-domain action recognition, where training and testing videos are drawn from different underlying distributions, remains largely under-explored. Previous methods directly employ techniques for cross-domain image recognition, which tend to suffer from the severe temporal misalignment problem. This paper proposes a Temporal Co-attention Network (TCoN), which matches the distributions of temporally aligned action features between source and target domains using a novel cross-domain co-attention mechanism. Experimental results on three cross-domain action recognition datasets demonstrate that TCoN significantly outperforms both previous single-domain and cross-domain methods under the cross-domain setting.


2021 ◽  
Author(s):  
Zan Gao ◽  
Yanbo Liu ◽  
Guangpin Xu ◽  
Xianbin Wen

Author(s):  
Adu Asare Baffour ◽  
Zhen Qin ◽  
Yong Wang ◽  
Zhiguang Qin ◽  
Kim-Kwang Raymond Choo
