Deep Ensembles for Semantic Segmentation on Road Detection

Author(s):  
Deniz Uzun ◽  
Dewei Yi
2021 ◽  
Vol 13 (11) ◽  
pp. 2080
Author(s):  
Xiaochen Wei ◽  
Xiaolei Lv ◽  
Kaiyu Zhang

The road extraction task is mainly composed of two subtasks, namely, road detection and road centerline extraction. As the two subtasks are strongly correlated, in this paper we introduce a multitask learning framework to detect roads and extract road centerlines simultaneously. For the road centerline extraction problem, existing works rely on either regression-based or classification-based methods. The regression-based methods suffer from slow convergence and unsatisfactory local solutions. The classification-based methods ignore the fact that the closer a pixel is to the centerline, the higher our tolerance for its misclassification. To overcome these problems, we first convert the road centerline extraction problem into the problem of discrete normalized distance label prediction, which can be solved by training an ordinal regressor. For the road extraction task, most previous studies apply a pixel-wise loss function, for example, cross-entropy loss, which is not sufficient because roads have special topological characteristics such as connectivity. Therefore, we propose a road-topology loss function to improve the connectivity and completeness of the extracted roads. The road-topology loss function has two key characteristics: (i) it combines the road detection prediction and the road centerline extraction prediction so that the two subtasks reinforce each other through their correlation; (ii) it emphatically penalizes gaps that often appear in road detection results and spurious segments that easily appear in centerline extraction results. We select the AdamW optimizer to minimize the road-topology loss. Since there is no public dataset, we build a road extraction dataset to evaluate our method. State-of-the-art semantic segmentation networks (LinkNet34, DLinkNet34, DeeplabV3plus) are used as baselines for comparison with two kinds of methods. The first kind modifies the baseline by adding a road centerline extraction branch based on ordinal regression. The second kind uses the road-topology loss and has the same network architecture as the first. For the road detection task, the two kinds of methods improve on the baselines by up to 3.51% and 11.98%, respectively, in the IoU metric on our test dataset. For the road centerline extraction task, the two kinds of methods improve on the baselines by up to 8.22% and 10.9%, respectively, in the Quality metric on our test dataset.
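The conversion of centerline extraction into discrete normalized distance label prediction can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything below is an assumption: the function names, the clipping distance `max_distance`, the number of levels `num_levels`, and the common "K-1 binary targets" encoding used for ordinal regression are illustrative choices, not the authors' implementation.

```python
def distance_to_ordinal_label(distance, max_distance, num_levels):
    """Map a pixel's distance to the centerline onto a level 0..num_levels-1.

    Assumption: distances are clipped at max_distance and normalized to [0, 1].
    Level 0 means "on the centerline"; the last level means "far from it".
    """
    normalized = min(max(distance / max_distance, 0.0), 1.0)
    return min(int(normalized * num_levels), num_levels - 1)


def encode_ordinal(label, num_levels):
    """Encode a level as num_levels-1 binary targets: target[k] = 1 if label > k."""
    return [1 if label > k else 0 for k in range(num_levels - 1)]


def decode_ordinal(probabilities, threshold=0.5):
    """Predicted level = number of binary outputs above the threshold."""
    return sum(1 for p in probabilities if p > threshold)


if __name__ == "__main__":
    level = distance_to_ordinal_label(distance=5.0, max_distance=10.0, num_levels=5)
    print(level, encode_ordinal(level, 5))
```

With this encoding, confusing a level with a neighboring level flips only one binary target, whereas confusing it with a distant level flips several, which is one way to express that errors near the true distance are more tolerable.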


2021 ◽  
Vol 13 (16) ◽  
pp. 3149
Author(s):  
Xiaochen Wei ◽  
Xikai Fu ◽  
Ye Yun ◽  
Xiaolei Lv

Road detection from images has emerged as an important way to obtain road information and has gained much attention in recent years. However, most existing methods focus only on extracting road information from single-temporal intensity images, which may reduce image resolution because spatial filtering is used to suppress coherent speckle noise. Some newly developed methods take multitemporal information into account in the preprocessing stage to filter the coherent speckle noise in SAR imagery. However, they ignore the temporal characteristics of road objects, such as their temporal consistency across multitemporal SAR images that cover the same area and are taken at adjacent times, which limits detection performance. In this paper, we propose a multiscale and multitemporal network (MSMTHRNet) for road detection from SAR imagery, which contains a temporal consistency enhancement module (TCEM) and a multiscale fusion module (MSFM), both based on the attention mechanism. In particular, we propose the TCEM to make full use of multitemporal information; it contains a temporal attention submodule that applies the attention mechanism to capture temporal contextual information. We enforce a temporal consistency constraint through the TCEM to obtain enhanced feature representations of SAR imagery that help to distinguish real roads. Since road widths vary, incorporating multiscale features is a promising way to improve road detection results. We propose the MSFM, which applies learned weights to combine the predictions of features at different scales. Since there is no public dataset, we build a multitemporal road detection dataset to evaluate our methods. The state-of-the-art semantic segmentation network HRNetV2 is used as a baseline for comparison with MSHRNet, which has only the MSFM, and with MSMTHRNet. MSHRNet(TAF), whose input is the SAR image after the temporal filter, is also compared with our proposed MSMTHRNet. On our test dataset, MSHRNet and MSMTHRNet improve over HRNetV2 by 2.1% and 14.19%, respectively, in the IoU metric and by 3.25% and 17.08%, respectively, in the APLS metric. MSMTHRNet improves over MSHRNet(TAF) by 8.23% and 8.81% in the IoU metric and APLS metric, respectively.
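The idea of combining predictions from different scales with learned weights can be sketched as follows. This is a minimal illustration under stated assumptions, not the MSFM itself: the abstract does not say how the weights are produced or normalized, so the single score per scale, the softmax normalization, and the names `softmax` and `fuse_predictions` are hypothetical. In a trained network the scores would be learned parameters or attention outputs.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(scale_predictions, scale_scores):
    """Weighted sum of per-scale road probabilities for each pixel.

    scale_predictions: one flat list of road probabilities per scale,
                       all resampled to the same number of pixels.
    scale_scores:      one score per scale (assumed to be learned).
    """
    weights = softmax(scale_scores)
    num_pixels = len(scale_predictions[0])
    return [
        sum(w * pred[i] for w, pred in zip(weights, scale_predictions))
        for i in range(num_pixels)
    ]


if __name__ == "__main__":
    coarse = [0.2, 0.9, 0.6]
    fine = [0.4, 0.7, 0.1]
    print(fuse_predictions([coarse, fine], scale_scores=[0.0, 0.0]))
```

Because the weights are normalized, the fused value for each pixel stays within the range spanned by the per-scale predictions, so it can still be read as a probability.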


2018 ◽  
Vol 11 (6) ◽  
pp. 304
Author(s):  
Javier Pinzon-Arenas ◽  
Robinson Jimenez-Moreno ◽  
Ruben Hernandez-Beleno

Impact ◽  
2020 ◽  
Vol 2020 (2) ◽  
pp. 9-11
Author(s):  
Tomohiro Fukuda

Mixed reality (MR) is rapidly becoming a vital tool, not just in gaming, but also in education, medicine, construction and environmental management. The term refers to systems in which computer-generated content is superimposed over objects in a real-world environment across one or more sensory modalities. Although most of us have heard of the use of MR in computer games, it also has applications in military and aviation training, as well as tourism, healthcare and more. In addition, it has potential for use in architecture and design, where planned buildings can be superimposed on existing locations to produce 3D renderings of plans. However, one major challenge that remains in MR development is real-time occlusion, that is, hiding 3D virtual objects behind real objects. Dr Tomohiro Fukuda, who is based at the Division of Sustainable Energy and Environmental Engineering, Graduate School of Engineering at Osaka University in Japan, is an expert in this field. Researchers led by Dr Fukuda are tackling the issue of occlusion in MR. They are currently developing an MR system that realises real-time occlusion by harnessing deep learning, using a semantic segmentation technique to achieve an outdoor landscape design simulation. This methodology can be used to automatically estimate the visual environment before and after construction projects.
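How a semantic segmentation result could drive occlusion can be sketched in a few lines. The article does not describe the actual system, so this is only an assumed, simplified scheme: each camera pixel carries a class label from a segmentation model, and the virtual object is hidden wherever that label belongs to a set of classes treated as foreground occluders. The function name, the flat pixel lists, and the class names are hypothetical.

```python
def composite_with_occlusion(real_pixels, virtual_pixels, class_labels, occluder_classes):
    """Overlay virtual pixels on a camera frame, hiding them behind occluders.

    real_pixels:      camera frame as a flat list of pixel values.
    virtual_pixels:   rendered virtual object, None where it covers nothing.
    class_labels:     per-pixel class from a segmentation model (assumed input).
    occluder_classes: classes assumed to lie in front of the virtual object.
    """
    output = []
    for real, virtual, label in zip(real_pixels, virtual_pixels, class_labels):
        if virtual is None or label in occluder_classes:
            output.append(real)  # keep the real scene visible
        else:
            output.append(virtual)  # draw the virtual object
    return output


if __name__ == "__main__":
    frame = ["r0", "r1", "r2"]
    building = ["v0", None, "v2"]
    labels = ["sky", "tree", "tree"]
    print(composite_with_occlusion(frame, building, labels, {"tree"}))
```

Treating whole classes as occluders ignores actual depth ordering, so this sketch only fits scenes where the chosen classes are known to lie in front of the virtual content.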

