PFMNet: Few-Shot Segmentation with Query Feature Enhancement and Multi-Scale Feature Matching

Information ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 406
Author(s):  
Jingyao Li ◽  
Lianglun Cheng ◽  
Zewen Zheng ◽  
Jiahong Chen ◽  
Genping Zhao ◽  
...  

The datasets used by the latest semantic segmentation models often need to be manually labeled pixel by pixel, which is time-consuming and labor-intensive. General models make poorer predictions for new, never-before-seen categories than the recently emerged few-shot segmentation methods. However, few-shot segmentation still faces two challenges. One is the inadequate exploration of the semantic information conveyed in high-level features, and the other is the inconsistency of segmenting objects at different scales. To address these two problems, we propose a prior feature matching network (PFMNet). It includes two novel modules: (1) the query feature enhancement module (QFEM), which makes full use of the high-level semantic information in the support set to enhance the query features, and (2) the multi-scale feature matching module (MSFMM), which increases the matching probability for objects at multiple scales. Our method achieves an average intersection-over-union score of 61.3% for one-shot segmentation and 63.4% for five-shot segmentation, surpassing the state-of-the-art results by 0.5% and 1.5%, respectively.
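
The query feature enhancement idea can be illustrated with a short sketch. The snippet below is a hedged, minimal interpretation (not the authors' code) of how a support-set foreground prototype might be used to build a similarity prior that enhances the query features; the tensor shapes and the function name query_feature_enhancement are illustrative assumptions.

```python
# Hypothetical sketch (not the authors' implementation) of a prior-based
# query feature enhancement step, assuming PyTorch tensors.
import torch
import torch.nn.functional as F

def query_feature_enhancement(query_feat, support_feat, support_mask):
    """Enhance query features with a similarity prior computed from the
    masked support features (a common few-shot segmentation idea).

    query_feat:   (B, C, H, W) high-level query features
    support_feat: (B, C, H, W) high-level support features
    support_mask: (B, 1, Hm, Wm) binary foreground mask of the support image
    """
    B, C, H, W = query_feat.shape
    # Downsample the support mask to the feature resolution.
    mask = F.interpolate(support_mask, size=(H, W), mode="bilinear", align_corners=False)
    # Masked average pooling -> one foreground prototype per support image.
    prototype = (support_feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)  # (B, C)
    # Cosine similarity between every query location and the prototype.
    q = F.normalize(query_feat.flatten(2), dim=1)   # (B, C, H*W)
    p = F.normalize(prototype, dim=1).unsqueeze(2)  # (B, C, 1)
    prior = (q * p).sum(dim=1).view(B, 1, H, W)     # (B, 1, H, W)
    # Min-max normalize the prior per image.
    prior = (prior - prior.amin(dim=(2, 3), keepdim=True)) / (
        prior.amax(dim=(2, 3), keepdim=True) - prior.amin(dim=(2, 3), keepdim=True) + 1e-6)
    # Concatenate the prior as an extra guidance channel for the query branch.
    return torch.cat([query_feat, prior], dim=1)
```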


Symmetry ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 950
Author(s):  
Hong Liang ◽  
Junlong Yang ◽  
Mingwen Shao

Because small targets have fewer pixels and carry fewer features, most target detection algorithms cannot effectively use the edge and semantic information of small targets in the feature map, resulting in low detection accuracy and occasional missed and false detections. To address the insufficient feature information of small targets in RetinaNet, this work introduces a parallel auxiliary multi-scale feature enhancement module (MFEM), which uses dilated convolutions with different dilation rates to avoid repeated downsampling. MFEM thus avoids the information loss caused by repeated downsampling and, at the same time, helps the shallow layers extract multi-scale context information. Additionally, this work adopts a backbone improvement plan designed specifically for target detection tasks, which effectively preserves small-target information in high-level feature maps. The traditional top-down pyramid structure focuses on transferring high-level semantics from top to bottom, and this one-way information flow is not conducive to the detection of small targets. In this work, the auxiliary MFEM branch is combined with RetinaNet to construct a model with a bidirectional feature pyramid network, which effectively integrates the strong semantic information of the high-level network with the high-resolution information of the low-level network. The bidirectional feature pyramid network designed in this work is a symmetric structure consisting of a top-down branch and a bottom-up branch, and it performs the transfer and fusion of strong semantic and high-resolution information. To prove the effectiveness of the algorithm, FE-RetinaNet (Feature Enhancement RetinaNet), this work conducts experiments on MS COCO. Compared with the original RetinaNet, the improved model achieves a 1.8% improvement in detection accuracy (mAP) on MS COCO, reaching a COCO AP of 36.2%; FE-RetinaNet also performs well on small targets, with AP_S increased by 3.2%.
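
As a rough illustration of the MFEM idea, the following sketch builds a block of parallel dilated convolutions with different dilation rates so that the receptive field grows without extra downsampling; the class name, channel widths, and rates are assumptions, not the paper's configuration.

```python
# A minimal sketch (an interpretation, not the paper's implementation) of a
# multi-scale feature enhancement block built from parallel dilated
# convolutions, preserving spatial resolution instead of downsampling.
import torch
import torch.nn as nn

class DilatedEnhanceBlock(nn.Module):
    def __init__(self, in_ch, branch_ch, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(branch_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        # Fuse the parallel branches back to the original channel width.
        self.fuse = nn.Conv2d(branch_ch * len(rates), in_ch, kernel_size=1)

    def forward(self, x):
        multi_scale = torch.cat([b(x) for b in self.branches], dim=1)
        # Residual connection keeps the original shallow detail.
        return x + self.fuse(multi_scale)

if __name__ == "__main__":
    feat = torch.randn(1, 256, 64, 64)
    print(DilatedEnhanceBlock(256, 64)(feat).shape)  # torch.Size([1, 256, 64, 64])
```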



Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1625
Author(s):  
Jing Du ◽  
Zuning Jiang ◽  
Shangfeng Huang ◽  
Zongyue Wang ◽  
Jinhe Su ◽  
...  

The semantic segmentation of small objects in point clouds is currently one of the most demanding tasks in photogrammetry and remote sensing applications. Multi-resolution feature extraction and fusion can significantly enhance object classification and segmentation, so it is widely used in the image domain. Motivated by this, we propose a point cloud semantic segmentation network based on multi-scale feature fusion (MSSCN) to aggregate features from point clouds of different densities and improve segmentation performance. In our method, random downsampling is first applied to obtain point clouds of different densities. A Spatial Aggregation Net (SAN) is then employed as the backbone network to extract local features from these point clouds, and the extracted feature descriptors at different scales are concatenated. Finally, a loss function combines the semantic information from the point clouds of different densities to optimize the network. Experiments were conducted on the S3DIS and ScanNet datasets, on which MSSCN achieved accuracies of 89.80% and 86.3%, respectively, outperforming the recent methods PointNet, PointNet++, PointCNN, PointSIFT, and SAN.
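
A minimal sketch of the multi-density aggregation step is given below: a point cloud is randomly downsampled to several densities, each subset is encoded with a shared backbone, and the resulting descriptors are concatenated. The per-point MLP here is only a placeholder for the paper's Spatial Aggregation Net, and the ratios and dimensions are illustrative assumptions.

```python
# Hedged sketch of multi-density feature aggregation; "backbone" is a
# PointNet-style stand-in, not the paper's Spatial Aggregation Net.
import torch
import torch.nn as nn

def random_downsample(points, keep_ratio):
    """points: (N, C) tensor; keep a random subset of the rows."""
    n_keep = max(1, int(points.shape[0] * keep_ratio))
    idx = torch.randperm(points.shape[0])[:n_keep]
    return points[idx]

class MultiDensityEncoder(nn.Module):
    def __init__(self, in_dim=6, feat_dim=64, ratios=(1.0, 0.5, 0.25)):
        super().__init__()
        self.ratios = ratios
        # Shared per-point MLP followed by max pooling over points.
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim))

    def forward(self, points):
        descriptors = []
        for r in self.ratios:
            sub = random_downsample(points, r)
            per_point = self.backbone(sub)                   # (n_r, feat_dim)
            descriptors.append(per_point.max(dim=0).values)  # one descriptor per density
        return torch.cat(descriptors, dim=0)                 # concatenated multi-scale descriptor
```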



Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3270
Author(s):  
Yong Liao ◽  
Qiong Liu

The main challenges of semantic segmentation in vehicle-mounted scenes are object scale variation and the trade-off between model accuracy and efficiency. Lightweight backbone networks for semantic segmentation usually extract single-scale features layer by layer with a fixed receptive field. Most modern real-time semantic segmentation networks heavily compromise spatial detail when encoding semantics, sacrificing accuracy for speed. Many improvement strategies adopt dilated convolution or add a sub-network, which introduces either intensive computation or redundant parameters. We propose a multi-level and multi-scale feature aggregation network (MMFANet). A spatial pyramid module is designed by cascading dilated convolutions with different receptive fields to extract multi-scale features layer by layer. Subsequently, a lightweight backbone network is built by reducing the feature channel capacity of the module. To improve accuracy, we design two additional modules that separately capture spatial details and high-level semantics from the backbone network without significantly increasing the computation cost. Comprehensive experimental results show that our model achieves 79.3% mIoU on the Cityscapes test dataset at 58.5 FPS, and it is more accurate than SwiftNet (75.5% mIoU). Furthermore, our model has at least 53.38% fewer parameters than other state-of-the-art models.
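
The cascaded spatial pyramid idea can be sketched as follows, assuming PyTorch: dilated convolutions are stacked in series so each stage's receptive field builds on the previous one, and the per-stage outputs serve as layer-by-layer multi-scale features. Channel widths and dilation rates here are illustrative, not the paper's settings.

```python
# A minimal sketch (assumed configuration) of a cascaded dilated-convolution
# spatial pyramid: each stage compounds the receptive field of the last.
import torch
import torch.nn as nn

class CascadedPyramid(nn.Module):
    def __init__(self, channels, rates=(1, 2, 4)):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True))
            for r in rates])
        self.fuse = nn.Conv2d(channels * len(rates), channels, kernel_size=1)

    def forward(self, x):
        outs, cur = [], x
        for stage in self.stages:
            cur = stage(cur)      # each stage sees the previous output, so receptive fields compound
            outs.append(cur)
        return self.fuse(torch.cat(outs, dim=1))
```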



Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1820
Author(s):  
Xiaotao Shao ◽  
Qing Wang ◽  
Wei Yang ◽  
Yun Chen ◽  
Yi Xie ◽  
...  

Existing pedestrian detection algorithms cannot effectively extract features of heavily occluded targets, which results in lower detection accuracy. To handle heavy occlusion in crowds, we propose a multi-scale feature pyramid network based on ResNet (MFPN) to enhance the features of occluded targets and improve detection accuracy. MFPN includes two modules, namely a double feature pyramid network (FPN) integrated with ResNet (DFR) and a repulsion loss of minimum (RLM). The double FPN improves the architecture to further enhance the semantic information and contours of occluded pedestrians, providing a new way to extract features of occluded targets. The features extracted by our network are more separable and clearer, especially for heavily occluded pedestrians. Repulsion loss is introduced to improve the loss function, keeping predicted boxes away from the ground truths of unrelated targets. In experiments on the public CrowdHuman dataset, we obtain 90.96% AP, the best performance, with a 5.16% AP gain over the FPN-ResNet50 baseline. Compared with state-of-the-art works, our method boosts the performance of the pedestrian detection system.
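
The repulsion idea can be illustrated with a simplified sketch: each predicted box is penalized for overlapping the ground-truth boxes of targets it is not assigned to. This is an interpretation of the general repulsion-loss concept, not the exact RLM formulation in the paper; box_iou and repulsion_term are hypothetical helper names.

```python
# Simplified, hedged sketch of a repulsion-style term that pushes predictions
# away from ground-truth boxes of unrelated targets.
import torch

def box_iou(a, b):
    """a: (N, 4), b: (M, 4) boxes as (x1, y1, x2, y2); returns (N, M) IoU."""
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    lt = torch.max(a[:, None, :2], b[None, :, :2])
    rb = torch.min(a[:, None, 2:], b[None, :, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area_a[:, None] + area_b[None, :] - inter + 1e-6)

def repulsion_term(pred_boxes, gt_boxes, assigned_idx):
    """pred_boxes: (N, 4); gt_boxes: (M, 4); assigned_idx: (N,) long tensor with
    the index of the ground truth each prediction is matched to. Penalize
    overlap with every ground truth except the assigned one."""
    iou = box_iou(pred_boxes, gt_boxes)                          # (N, M)
    mask = torch.ones_like(iou, dtype=torch.bool)
    mask[torch.arange(pred_boxes.shape[0]), assigned_idx] = False
    repel = iou[mask].view(pred_boxes.shape[0], -1)              # overlaps with unrelated GTs
    if repel.numel() == 0:
        return pred_boxes.new_zeros(())
    # Penalize the largest overlap with an unrelated ground truth.
    return -torch.log(1.0 - repel.max(dim=1).values + 1e-6).mean()
```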



IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Jingfang Yang ◽  
Bochang Zou ◽  
Huadong Qiu ◽  
Zhi Li


2021 ◽  
Author(s):  
Liang Chao ◽  
Wang Xiaoyu ◽  
Song Yu ◽  
Jiang Changhong


2017 ◽  
Vol 9 (6) ◽  
pp. 576 ◽  
Author(s):  
Dan Zeng ◽  
Lidan Wu ◽  
Boyang Chen ◽  
Wei Shen


Author(s):  
Yong Yi Lee ◽  
Min Ki Park ◽  
Jae Doug Yoo ◽  
Kwan H. Lee

