Semantic segmentation using stride spatial pyramid pooling and dual attention decoder

2020 ◽ Vol 107 ◽ pp. 107498 ◽ Author(s): Chengli Peng, Jiayi Ma

2020 ◽ Vol 2020 (10) ◽ pp. 27-1-27-7 ◽ Author(s): Congcong Wang, Faouzi Alaya Cheikh, Azeddine Beghdadi, Ole Jakob Elle

Object sizes in images are diverse; capturing multi-scale context information is therefore essential for semantic segmentation. Existing context aggregation methods such as the pyramid pooling module (PPM) and atrous spatial pyramid pooling (ASPP) employ different pooling sizes or atrous rates so that multi-scale information is captured. However, these pooling sizes and atrous rates are chosen empirically. Rethinking ASPP leads to our observation that learnable sampling locations of the convolution operation can endow the network with a learnable field of view, and thus the ability to capture object context information adaptively. Following this observation, in this paper we propose an adaptive context encoding (ACE) module based on the deformable convolution operation, whose sampling locations are learnable. Our ACE module can easily be embedded into other convolutional neural networks (CNNs) for context aggregation. The effectiveness of the proposed module is demonstrated on the Pascal-Context and ADE20K datasets. Although our proposed ACE consists of only three deformable convolution blocks, it outperforms PPM and ASPP in terms of mean Intersection over Union (mIoU) on both datasets. All the experimental studies confirm that our proposed module is effective compared with state-of-the-art methods.
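The core mechanism behind such a module is deformable convolution: each kernel tap carries a learned (dy, dx) offset, and the feature map is read at the shifted, fractional locations via bilinear interpolation. The following numpy sketch illustrates this sampling idea for a single 3x3 tap set; the function names and the single-channel simplification are ours, not the ACE paper's implementation, and in practice the offsets would be predicted by a separate convolution branch.

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinearly interpolate a 2-D feature map at a fractional (y, x)."""
    h, w = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    y0, x0 = max(y0, 0), max(x0, 0)
    wy, wx = y - np.floor(y), x - np.floor(x)
    return ((1 - wy) * (1 - wx) * feat[y0, x0] + (1 - wy) * wx * feat[y0, x1]
            + wy * (1 - wx) * feat[y1, x0] + wy * wx * feat[y1, x1])

def deformable_conv_point(feat, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).
    offsets has shape (3, 3, 2): a learned (dy, dx) shift per kernel tap,
    which is what makes the field of view adaptive."""
    out = 0.0
    for i, dy in enumerate((-1, 0, 1)):
        for j, dx in enumerate((-1, 0, 1)):
            oy, ox = offsets[i, j]
            out += kernel[i, j] * bilinear_sample(feat, cy + dy + oy, cx + dx + ox)
    return out
```

With all offsets zero this reduces to an ordinary 3x3 convolution; non-zero offsets let the network enlarge or reshape its receptive field per location.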


2020 ◽ Vol 91 ◽ pp. 106209 ◽ Author(s): Zhengyu Xia, Joohee Kim

2019 ◽ Vol 11 (9) ◽ pp. 1015 ◽ Author(s): Hao He, Dongfang Yang, Shicheng Wang, Shuyang Wang, Yongfei Li

The technology used for road extraction from remote sensing images plays an important role in urban planning, traffic management, navigation, and other geographic applications. Although deep learning methods have greatly advanced road extraction in recent years, the technology is still in its infancy. Because road targets have complex characteristics, extraction accuracy is still limited. In addition, the ambiguous predictions of semantic segmentation methods make the extracted roads blurry. In this study, we improved the performance of the road extraction network by integrating atrous spatial pyramid pooling (ASPP) with an encoder-decoder network. The proposed approach takes advantage of ASPP's ability to extract multi-scale features and the encoder-decoder network's ability to extract detailed features, and can therefore achieve accurate and detailed road extraction results. For the first time, we utilized the structural similarity (SSIM) as a loss function for road extraction, which removes ambiguous predictions from the extraction results and improves the image quality of the extracted roads. Experimental results on the Massachusetts Roads dataset show that our method achieves an F1-score of 83.5% and an SSIM of 0.893. Compared with the standard U-Net, our method improves the F1-score by 2.6% and the SSIM by 0.18. It is therefore demonstrated that the proposed approach extracts roads from remote sensing images more effectively and clearly than the other compared methods.
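Using SSIM as a loss means minimizing 1 - SSIM, so that predictions are penalized for structural (not just per-pixel) disagreement with the ground truth. Below is a minimal numpy sketch of the standard global SSIM formula; the function name is ours, and practical implementations (including, presumably, the paper's) compute SSIM over local sliding windows and average, rather than over the whole image at once.

```python
import numpy as np

def ssim_loss(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - SSIM between two images with values in [0, 1].
    c1, c2 are the usual stabilizing constants for a unit dynamic range."""
    mu_p, mu_t = pred.mean(), target.mean()
    var_p, var_t = pred.var(), target.var()
    cov = ((pred - mu_p) * (target - mu_t)).mean()
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2))
    return 1.0 - ssim
```

A perfect prediction gives a loss of exactly 0, and blurry or structurally inconsistent predictions are penalized even when their mean intensity is correct, which is why this loss sharpens the extracted road masks.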


Road extraction from satellite images has several applications, such as geographic information systems (GIS). An accurate and up-to-date road network database facilitates transportation, disaster management, and GPS navigation. The most active field of research for automatic road network extraction is semantic segmentation using convolutional neural networks (CNNs). Although such models can produce accurate results, they typically trade performance for accuracy and vice versa. In this paper, we propose an architecture for semantic segmentation of road networks using atrous spatial pyramid pooling (ASPP). The network contains residual blocks for extracting low-level features. Atrous convolutions with different dilation rates are applied, and spatial pyramid pooling is performed on these features to extract spatial information. The low-level features from the residual blocks are added to the multi-scale context information to produce the final segmentation image. Our proposed model significantly reduces the number of parameters required for training. The proposed model was trained on the Massachusetts Roads dataset, and the results show that it outperforms popular state-of-the-art models.
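The key ASPP ingredient described above is running parallel atrous (dilated) convolutions at several rates and stacking their responses, so the same kernel sees context at several scales without extra parameters. The numpy sketch below illustrates this for a single channel; the function names and the rate choices (6, 12, 18) are illustrative defaults, not this paper's exact configuration, and a full ASPP additionally includes a 1x1 convolution branch and image-level pooling before fusion.

```python
import numpy as np

def atrous_conv2d(feat, kernel, rate):
    """'Same'-padded 3x3 atrous convolution on a single-channel map.
    The taps are spaced `rate` pixels apart, enlarging the receptive
    field to (2*rate + 1) without adding parameters."""
    h, w = feat.shape
    pad = rate
    padded = np.pad(feat, pad)
    out = np.zeros_like(feat)
    for i, dy in enumerate((-rate, 0, rate)):
        for j, dx in enumerate((-rate, 0, rate)):
            out += kernel[i, j] * padded[pad + dy:pad + dy + h, pad + dx:pad + dx + w]
    return out

def aspp(feat, kernels, rates=(6, 12, 18)):
    """Run parallel atrous branches at different rates and stack them;
    a real ASPP head then fuses the stack with a 1x1 convolution."""
    return np.stack([atrous_conv2d(feat, k, r) for k, r in zip(kernels, rates)])
```

Because every branch reuses a plain 3x3 kernel, widening the field of view this way adds no parameters relative to a single-rate convolution, which is consistent with the parameter savings the abstract claims.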


Author(s): Lixiang Ru, Bo Du, Chen Wu

Current weakly-supervised semantic segmentation (WSSS) methods with image-level labels mainly adopt class activation maps (CAM) to generate the initial pseudo labels. However, CAM usually identifies only the most discriminative object extents, because the network does not need to discover the entire object to recognize image-level labels. In this work, to tackle this problem, we propose to simultaneously learn image-level labels and local visual word labels. Specifically, in each forward propagation, the feature maps of the input image are encoded into visual words with a learnable codebook. By enforcing the network to classify the encoded fine-grained visual words, the generated CAM can cover more semantic regions. In addition, we propose a hybrid spatial pyramid pooling module that preserves the local maximum and global average values of the feature maps, so that more object details and less background are considered. Based on the proposed methods, we conducted experiments on the PASCAL VOC 2012 dataset. Our method achieves 67.2% mIoU on the val set and 67.3% mIoU on the test set, outperforming recent state-of-the-art methods.
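The hybrid pooling idea can be illustrated concretely: at each pyramid level, take the local maximum of every grid cell (keeping strong, localized object evidence) and append the global average (keeping overall context). The numpy sketch below is our own minimal reading of that description, with illustrative grid sizes, not the paper's exact module.

```python
import numpy as np

def hybrid_spp(feat, grid_sizes=(1, 2, 4)):
    """Hybrid spatial pyramid pooling sketch for one feature channel.
    For each pyramid level, max-pool every grid cell (local maxima
    preserve object detail); finally append the global average value
    (context that max pooling alone would discard)."""
    h, w = feat.shape
    pooled = []
    for g in grid_sizes:
        ys = np.linspace(0, h, g + 1).astype(int)
        xs = np.linspace(0, w, g + 1).astype(int)
        for i in range(g):
            for j in range(g):
                pooled.append(feat[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max())
    pooled.append(feat.mean())
    return np.array(pooled)
```

Mixing the two statistics is the point: max pooling alone over-emphasizes the most discriminative activations (the very CAM failure mode described above), while average pooling alone dilutes object detail with background.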


Author(s): Jiayi Yang, Tianshi Hu, Junli Yang, Zhaoxing Zhang, Yue Pan

2021 ◽ Vol 110 ◽ pp. 107622 ◽ Author(s): Xuhang Lian, Yanwei Pang, Jungong Han, Jing Pan
