Real-Time Semantic Segmentation Algorithm Based on Feature Fusion Technology

In the near future, the combination of unmanned aerial vehicles (UAVs) and computer vision will play a vital role in periodically monitoring railroad condition to ensure passenger safety. The most significant module in railroad visual processing is obstacle detection, where the concern is an obstacle that has fallen near the track, inside or outside the gage. This makes it important to detect and segment the railroad into three key regions: gage inside, rails, and background. Traditional railroad segmentation methods depend on either manual feature selection or expensive dedicated devices such as Lidar, an approach that is typically less reliable for railroad semantic segmentation. In addition, although cameras mounted on moving vehicles such as drones can produce high-resolution images, segmenting precise pixel information from those aerial images is challenging because of the cluttered railroad surroundings. RSNet is a multi-level feature fusion algorithm for segmenting railroad aerial images captured by UAV. It proposes an attention-based efficient convolutional encoder for feature extraction, which is robust and computationally efficient, and a modified residual decoder for segmentation, which considers only essential features and adds less overhead while delivering higher performance, even on real-time railroad drone imagery. The network is trained and tested on a railroad scenic view segmentation dataset (RSSD), which we built from real-time UAV images. It achieves a Dice coefficient of 0.973 and a Jaccard index of 0.94 on test data, better than existing approaches such as a residual unit and residual squeeze net.
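
The Dice coefficient and Jaccard index reported above are both overlap measures between a predicted mask and a ground-truth mask. The following is a minimal sketch of how they can be computed for one binary mask, assuming the masks are flattened to lists of 0/1 values. The function name `dice_and_jaccard` and the toy masks are illustrative assumptions and are not taken from the RSNet paper.

```python
def dice_and_jaccard(pred, target):
    """Return (dice, jaccard) for two flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty: treat this as perfect agreement.
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_area + target_area)
    jaccard = intersection / union
    return dice, jaccard


# Toy example: 3 predicted pixels, 3 true pixels, 2 in common.
pred = [1, 1, 1, 0, 0, 0]
target = [1, 1, 0, 1, 0, 0]
dice, jaccard = dice_and_jaccard(pred, target)
print(f"dice={dice:.3f} jaccard={jaccard:.3f}")
```

For a single mask the two scores are tied by J = D / (2 - D); scores averaged over many images or classes need not satisfy that relation exactly.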

A Multi-level Feature Fusion Network for Real-time Semantic Segmentation

2019 11th International Conference on Wireless Communications and Signal Processing (WCSP) ◽

10.1109/wcsp.2019.8927880 ◽

2019 ◽

Author(s):

Lu Wang ◽

Qinzhen Xu ◽

Zixiang Xiong ◽

Yongming Huang ◽

Luxi Yang

Keyword(s):

Real Time ◽

Feature Fusion ◽

Semantic Segmentation ◽

Multi Level

A Real-Time Semantic Segmentation Algorithm Based on Improved Lightweight Network

2020 International Symposium on Autonomous Systems (ISAS) ◽

10.1109/isas49493.2020.9378857 ◽

2020 ◽

Author(s):

Cheng Liu ◽

Hongxia Gao ◽

An Chen

Keyword(s):

Real Time ◽

Semantic Segmentation ◽

Segmentation Algorithm

EFRNet: A Lightweight Network with Efficient Feature Fusion and Refinement for Real-Time Semantic Segmentation

2021 IEEE International Conference on Multimedia and Expo (ICME) ◽

10.1109/icme51207.2021.9428371 ◽

2021 ◽

Author(s):

Kuayue Zhang ◽

Qingmin Liao ◽

Juncheng Zhang ◽

Shaojun Liu ◽

Haoyu Ma ◽

...

Keyword(s):

Real Time ◽

Feature Fusion ◽

Semantic Segmentation

Enet Semantic Segmentation Combined with Attention Mechanism

10.21203/rs.3.rs-425438/v1 ◽

2021 ◽

Author(s):

Wei Bai

Keyword(s):

Feature Fusion ◽

Receptive Fields ◽

Design Feature ◽

Semantic Segmentation ◽

Attention Mechanism ◽

Segmentation Algorithm ◽

Feature Maps ◽

Model Learning ◽

Simple Method ◽

Feature Map

Abstract: Image semantic segmentation is one of the core tasks of computer vision. It is widely used in fields such as unmanned driving, medical image processing, geographic information systems, and intelligent robots. Existing semantic segmentation algorithms ignore the differing channel and location features of the feature map and fuse feature maps with overly simple methods. To address this, this paper designs a semantic segmentation algorithm that incorporates an attention mechanism. First, dilated convolution and a smaller downsampling factor are used to maintain the resolution of the image and preserve its detailed information. Second, an attention mechanism module is introduced to assign weights to different parts of the feature map, which reduces the accuracy loss. A feature fusion module is then designed to assign weights to the feature maps of different receptive fields obtained by the two paths and merge them to obtain the final segmentation result. Finally, the method is verified experimentally on the CamVid, Cityscapes, and PASCAL VOC2012 datasets, with mean intersection over union (MIoU) and mean pixel accuracy (MPA) as metrics. The proposed method compensates for the accuracy loss caused by downsampling while preserving the receptive field and improving the resolution, which better guides model learning. The proposed feature fusion module also better integrates the features of different receptive fields. The proposed method therefore significantly improves segmentation performance compared with the traditional method.
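
The two metrics named in this abstract, MIoU and MPA, are commonly derived from a per-class confusion matrix. The sketch below shows one usual way to compute them, assuming flat lists of integer class labels. The helper names `confusion_matrix` and `miou_and_mpa`, the three-class toy labels, and the choice to skip classes absent from the ground truth are illustrative assumptions, not details from the paper.

```python
def confusion_matrix(pred, target, num_classes):
    """cm[t][p] counts pixels with true class t that were predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def miou_and_mpa(cm):
    """Return (mean IoU, mean pixel accuracy) averaged over classes present in the ground truth."""
    n = len(cm)
    ious, accs = [], []
    for c in range(n):
        true_positive = cm[c][c]
        ground_truth = sum(cm[c])                    # pixels whose true class is c
        predicted = sum(cm[r][c] for r in range(n))  # pixels predicted as class c
        if ground_truth == 0:
            continue  # assumption: classes absent from the ground truth are skipped
        union = ground_truth + predicted - true_positive
        ious.append(true_positive / union)
        accs.append(true_positive / ground_truth)
    return sum(ious) / len(ious), sum(accs) / len(accs)


# Toy example with three classes and six pixels.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
miou, mpa = miou_and_mpa(confusion_matrix(pred, target, num_classes=3))
print(f"MIoU={miou:.3f} MPA={mpa:.3f}")
```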

Asynchronous Semantic Background Subtraction

Journal of Imaging ◽

10.3390/jimaging6060050 ◽

2020 ◽

Vol 6 (6) ◽

pp. 50

Author(s):

Anthony Cioppa ◽

Marc Braham ◽

Marc Van Droogenbroeck

Keyword(s):

Real Time ◽

Background Subtraction ◽

Semantic Information ◽

Moving Objects ◽

Temporal Evolution ◽

Semantic Segmentation ◽

Feedback Mechanism ◽

Segmentation Algorithm ◽

Frame Rate ◽

The One

The method of Semantic Background Subtraction (SBS), which combines semantic segmentation and background subtraction, has recently emerged for the task of segmenting moving objects in video sequences. While SBS has been shown to improve background subtraction, a major difficulty is that it combines two streams generated at different frame rates. As a result, SBS operates at the slower frame rate of the two streams, usually that of the semantic segmentation algorithm. We present a method, referred to as “Asynchronous Semantic Background Subtraction” (ASBS), that can combine a semantic segmentation algorithm with any background subtraction algorithm asynchronously. It achieves performance close to that of SBS while operating at the fastest possible frame rate, that of the background subtraction algorithm. Our method analyzes the temporal evolution of pixel features to replicate, where possible, the decisions previously enforced by semantics when no semantic information is computed. We showcase ASBS with several background subtraction algorithms and also add a feedback mechanism that feeds the background model of the background subtraction algorithm to upgrade its updating strategy and, consequently, enhance the decision. Experiments show that we systematically improve the performance, even when the semantic stream has a much slower frame rate than the background subtraction algorithm. In addition, we establish that, with the help of ASBS, a real-time background subtraction algorithm such as ViBe stays real time and competes with some of the best non-real-time unsupervised background subtraction algorithms, such as SuBSENSE.
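
The idea of replicating earlier semantic decisions can be illustrated with a small per-pixel sketch. It assumes a single gray-level intensity as the pixel feature and one tolerance `tol`; the function `asbs_pixel`, the `memory` dictionary, and the threshold value are hypothetical simplifications for illustration and do not reproduce the rules or parameters of the published method.

```python
BG, FG = 0, 1  # background / foreground labels


def asbs_pixel(bgs_label, intensity, memory, has_semantics, semantic_label=None, tol=10):
    """Decide one pixel for one frame.

    bgs_label      -- decision of the background subtraction algorithm
    intensity      -- current pixel feature (here: a gray level)
    memory         -- dict holding the last semantic decision and the intensity seen then
    has_semantics  -- True when a semantic frame is available for this video frame
    semantic_label -- BG, FG, or None when semantics abstains
    """
    if has_semantics:
        # Semantic frame available: store its decision together with the pixel feature.
        memory["label"] = semantic_label
        memory["intensity"] = intensity
        return bgs_label if semantic_label is None else semantic_label
    # No semantic frame: replicate the stored decision only if the pixel still looks similar.
    stored = memory.get("label")
    if stored is not None and abs(intensity - memory["intensity"]) <= tol:
        return stored
    return bgs_label


memory = {}
# Frame 0: semantics says background and overrides a false foreground detection.
out0 = asbs_pixel(FG, 100, memory, has_semantics=True, semantic_label=BG)
# Frame 1: no semantics, pixel barely changed, so the earlier decision is replicated.
out1 = asbs_pixel(FG, 104, memory, has_semantics=False)
# Frame 2: no semantics, pixel changed a lot, so the background subtraction output is kept.
out2 = asbs_pixel(FG, 180, memory, has_semantics=False)
print(out0, out1, out2)
```

In this sketch the background subtraction stream runs on every frame, while the semantic stream only needs to refresh `memory` whenever a semantic frame happens to be available.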

Implementation of a Lightweight Semantic Segmentation Algorithm in Road Obstacle Detection

Sensors ◽

10.3390/s20247089 ◽

2020 ◽

Vol 20 (24) ◽

pp. 7089

Author(s):

Bushi Liu ◽

Yongbo Lv ◽

Yang Gu ◽

Wanjun Lv

Keyword(s):

Real Time ◽

Spatial Information ◽

Feature Fusion ◽

Semantic Segmentation ◽

Spatial Location ◽

Autonomous Driving ◽

Obstacle Detection ◽

Depth Information ◽

Long Time ◽

Deep Learning Network

Thanks to deep learning’s accurate perception of the street environment, convolutional neural networks have developed dramatically in street-scene applications. To meet the needs of autonomous and assisted driving, computer vision is generally used to find obstacles and avoid collisions, which has made semantic segmentation a research priority in recent years. However, semantic segmentation continues to face new challenges. Complex network depth information, large datasets, and real-time requirements are typical problems that must be solved to realize autonomous driving technology. To address these problems, we propose an improved lightweight real-time semantic segmentation network based on the efficient image cascade network (ICNet) architecture, using multi-scale branches and a cascaded feature fusion unit to extract rich multi-level features. A spatial information network is designed to transmit more prior knowledge of spatial location and edge information. During training, we also append an external loss function to enhance the learning process of the network. This lightweight network can quickly perceive obstacles and detect roads in the drivable area from images, as autonomous driving requires. The proposed model, SP-ICNet, performs strongly on the Cityscapes dataset. While maintaining real-time performance, several sets of experimental comparisons show that SP-ICNet improves the accuracy of road obstacle detection and provides nearly ideal prediction outputs. Comparison with currently popular semantic segmentation networks also demonstrates the effectiveness of our lightweight network for road obstacle detection in autonomous driving.
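
The core of a cascaded feature fusion step is to bring a coarse feature map up to the resolution of a finer one and combine the two. The sketch below is a heavily simplified, single-channel illustration: it assumes nearest-neighbor 2x upsampling, element-wise addition, and a ReLU, and omits the bilinear interpolation, dilated convolution, and normalization layers that ICNet's fusion unit uses. The function names and toy values are illustrative assumptions and not SP-ICNet's implementation.

```python
def upsample2x(feature_map):
    """Nearest-neighbor 2x upsampling of a 2D feature map given as a list of rows."""
    out = []
    for row in feature_map:
        wide = [value for value in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                             # repeat each row twice
    return out


def fuse(low_res, high_res):
    """Upsample the coarse map, add it to the fine map, and apply ReLU."""
    up = upsample2x(low_res)
    if len(up) != len(high_res) or len(up[0]) != len(high_res[0]):
        raise ValueError("high_res must be exactly twice the size of low_res")
    return [
        [max(0.0, a + b) for a, b in zip(row_up, row_hi)]
        for row_up, row_hi in zip(up, high_res)
    ]


low = [[1.0, -2.0]]                 # coarse branch, 1 x 2
high = [[0.5, 0.5, 0.5, 0.5],       # fine branch, 2 x 4
        [-3.0, 0.0, 1.0, 1.0]]
fused = fuse(low, high)
print(fused)
```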

MBFFNet: Multi-Branch Feature Fusion Network for Colonoscopy

Frontiers in Bioengineering and Biotechnology ◽

10.3389/fbioe.2021.696251 ◽

2021 ◽

Vol 9 ◽

Author(s):

Houcheng Su ◽

Bin Lin ◽

Xiaoshuang Huang ◽

Jiao Li ◽

Kailin Jiang ◽

...

Keyword(s):

Real Time ◽

Feature Fusion ◽

Rapid Development ◽

Semantic Segmentation ◽

Medical Image Segmentation ◽

Feature Maps ◽

Improve Model ◽

Segmentation Methods ◽

Rectal Polyps ◽

Good Potential

Colonoscopy is currently one of the main methods for detecting rectal polyps, rectal cancer, and other diseases. With the rapid development of computer vision, deep learning–based semantic segmentation methods can be applied to the detection of medical lesions. However, it is challenging for current methods to detect polyps with both high accuracy and real-time performance. To solve this problem, we propose a multi-branch feature fusion network (MBFFNet), an accurate real-time segmentation method for colonoscopy. First, we use UNet as the basis of our model architecture and adopt stepwise sampling with channel multiplication to integrate features, which reduces the number of FLOPs caused by stacking channels in UNet. Second, to improve model accuracy, we extract features from multiple layers and resize the feature maps to the same size in different ways, such as up-sampling and pooling, to supplement information lost in multiplication-based up-sampling. Using mIoU and Dice loss with cross entropy (CE), we conduct experiments in both CPU and GPU environments to verify the effectiveness of our model. The experimental results show that the proposed MBFFNet is superior to the selected baselines in terms of accuracy, model size, and FLOPs. mIoU, F score, and Dice loss with CE reached 0.8952, 0.9450, and 0.1602, respectively, which were better than those of UNet, UNet++, and other networks. Compared with UNet, the FLOP count decreased by 73.2%, and the number of parameters also decreased. In actual segmentation quality, MBFFNet is second only to PraNet, while its parameter count is 78.27% of PraNet's and its FLOP count is 0.23% of PraNet's. In addition, experiments on other types of medical tasks show that MBFFNet has good potential for general application in medical image segmentation.
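
Why multiplying feature maps can be cheaper than stacking them follows from how convolution cost scales with input channels. The sketch below counts multiply-accumulate operations for the convolution that follows a fusion step, assuming a standard dense convolution and no padding or stride effects. The helper `conv_macs` and the layer sizes are hypothetical and are not MBFFNet's actual configuration, so the result is not meant to reproduce the 73.2% figure.

```python
def conv_macs(height, width, c_in, c_out, kernel=3):
    """Multiply-accumulate count of a dense kernel x kernel convolution on an H x W map."""
    return height * width * c_in * c_out * kernel * kernel


h, w, c = 64, 64, 32  # hypothetical feature-map size and channel count

# UNet-style skip connection: concatenating two C-channel maps gives 2C input channels.
concat_cost = conv_macs(h, w, 2 * c, c)

# Element-wise multiplication keeps the channel count at C.
multiply_cost = conv_macs(h, w, c, c)

saving = 1 - multiply_cost / concat_cost
print(f"concat={concat_cost} multiply={multiply_cost} saving={saving:.0%}")
```

Under these assumptions the convolution after a multiplicative fusion costs half as much as the one after a concatenation, since cost is linear in the number of input channels.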

A Lightweight Multi-scale Feature Fusion Network for Real-Time Semantic Segmentation

10.1007/978-3-030-92270-2_17 ◽

2021 ◽

pp. 193-205

Author(s):

Tanmay Singha ◽

Duc-Son Pham ◽

Aneesh Krishna ◽

Tom Gedeon

Keyword(s):

Real Time ◽

Feature Fusion ◽

Semantic Segmentation ◽

Scale Feature ◽

Multi Scale

Reconstruction of High-Precision Semantic Map

Sensors ◽

10.3390/s20216264 ◽

2020 ◽

Vol 20 (21) ◽

pp. 6264

Author(s):

Xinyuan Tu ◽

Jian Zhang ◽

Runhao Luo ◽

Kai Wang ◽

Qingji Zeng ◽

...

Keyword(s):

Real Time ◽

Point Cloud ◽

Semantic Information ◽

Feature Fusion ◽

Three Dimensional ◽

Semantic Segmentation ◽

Line Of Sight ◽

Lidar Data ◽

Implicit Surface ◽

Robotic Mapping

We present a real-time Truncated Signed Distance Field (TSDF)-based three-dimensional (3D) semantic reconstruction method for Light Detection and Ranging (LiDAR) point clouds, which achieves incremental surface reconstruction and highly accurate semantic segmentation. High-precision 3D semantic reconstruction in real time on LiDAR data is important but challenging, because high-accuracy LiDAR data is massive for 3D reconstruction. We therefore propose a line-of-sight algorithm to update the implicit surface incrementally. Meanwhile, to use more semantic information effectively, we propose an online attention-based spatial and temporal feature fusion method that is well integrated into the reconstruction system. We parallelize the reconstruction and semantic fusion processes, which achieves real-time performance. We demonstrate our approach on the CARLA dataset, the Apollo dataset, and our own dataset. Compared with state-of-the-art mapping methods, our method has a great advantage in both quality and speed, which meets the needs of robotic mapping and navigation.
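
An incremental TSDF update along a single sensor ray can be sketched as follows. This uses the common running weighted-average update from volumetric fusion, with voxels laid out in one dimension along the ray; it is an assumption-laden illustration and not the line-of-sight algorithm of the paper. The function name, the unit observation weight, and the `max_weight` cap are hypothetical.

```python
def update_tsdf_along_ray(tsdf, weights, voxel_depths, measured_depth,
                          truncation, max_weight=100.0):
    """Fuse one range measurement into the voxels lying on its ray (in place).

    tsdf, weights  -- per-voxel truncated signed distance and accumulated weight
    voxel_depths   -- distance of each voxel center from the sensor along the ray
    measured_depth -- range returned by the sensor for this ray
    truncation     -- truncation distance of the signed distance field
    """
    for i, depth in enumerate(voxel_depths):
        sdf = measured_depth - depth  # positive in front of the surface
        if sdf < -truncation:
            continue                  # far behind the surface: occluded, leave unchanged
        d = min(1.0, sdf / truncation)  # normalize and truncate to [-1, 1]
        w_old = weights[i]
        # Running weighted average with a unit weight for the new observation.
        tsdf[i] = (w_old * tsdf[i] + d) / (w_old + 1.0)
        weights[i] = min(max_weight, w_old + 1.0)


voxel_depths = [1.0, 2.0, 3.0, 4.0]
tsdf = [0.0] * 4
weights = [0.0] * 4
update_tsdf_along_ray(tsdf, weights, voxel_depths, measured_depth=2.5, truncation=1.0)
print(tsdf, weights)
```

The surface lies where the stored values change sign, here between the voxels at depths 2.0 and 3.0, and repeated measurements refine that crossing without rebuilding the volume.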
