Effective Use of Synthetic Data for Urban Scene Semantic Segmentation

Author(s):  
Fatemeh Sadat Saleh ◽  
Mohammad Sadegh Aliakbarian ◽  
Mathieu Salzmann ◽  
Lars Petersson ◽  
Jose M. Alvarez
2021 ◽  
Vol 11 (10) ◽  
pp. 4554
Author(s):  
João F. Teixeira ◽  
Mariana Dias ◽  
Eva Batista ◽  
Joana Costa ◽  
Luís F. Teixeira ◽  
...  

The scarcity of balanced and annotated datasets has been a recurring problem in medical image analysis. Several researchers have tried to fill this gap by synthesizing datasets with generative adversarial networks (GANs). Breast magnetic resonance imaging (MRI) provides complex, texture-rich medical images that suffer from the same annotation shortage and for which, to the best of our knowledge, no previous work has tried to synthesize data. Within this context, our work addresses the problem of synthesizing breast MRI images from corresponding annotations and evaluates the impact of this data augmentation strategy on a semantic segmentation task. We explored variations of image-to-image translation using conditional GANs, namely fitting the generator’s architecture with residual blocks and experimenting with cycle consistency approaches. We studied the impact of these changes on visual verisimilitude and how a U-Net segmentation model is affected by the use of synthetic data. We achieved sufficiently realistic-looking breast MRI images and maintained a stable segmentation score even when completely replacing the dataset with the synthetic set. Our results were promising, especially for the Pix2PixHD and Residual CycleGAN architectures.


2021 ◽  
Vol 13 (16) ◽  
pp. 3065
Author(s):  
Libo Wang ◽  
Rui Li ◽  
Dongzhi Wang ◽  
Chenxi Duan ◽  
Teng Wang ◽  
...  

Semantic segmentation of very fine resolution (VFR) urban scene images plays a significant role in several application scenarios, including autonomous driving, land cover classification, and urban planning. However, the tremendous detail contained in VFR images, especially the considerable variations in scale and appearance of objects, severely limits the potential of existing deep learning approaches. Addressing such issues represents a promising research field in the remote sensing community, which paves the way for scene-level landscape pattern analysis and decision making. In this paper, we propose a Bilateral Awareness Network (BANet), which contains a dependency path and a texture path to fully capture the long-range relationships and fine-grained details in VFR images. Specifically, the dependency path is built on ResT, a novel Transformer backbone with memory-efficient multi-head self-attention, while the texture path is built on stacked convolution operations. In addition, a feature aggregation module based on the linear attention mechanism is designed to effectively fuse the dependency features and texture features. Extensive experiments conducted on three large-scale urban scene image segmentation datasets, i.e., the ISPRS Vaihingen dataset, the ISPRS Potsdam dataset, and the UAVid dataset, demonstrate the effectiveness of our BANet. Specifically, a 64.6% mIoU is achieved on the UAVid dataset.
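
For readers unfamiliar with the metric quoted above, mean intersection over union (mIoU) averages the per-class IoU between predicted and ground-truth labels. The following is a minimal sketch of that standard definition, not the evaluation code used by the BANet authors. The function name `mean_iou` and the toy labels are illustrative assumptions, and skipping classes that appear in neither the prediction nor the ground truth is only one common convention.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes present in the prediction or ground truth.

    pred and target are flat sequences of integer class labels.
    """
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumed convention: skip classes that never occur
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes (hypothetical labels).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5.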


2021 ◽  
Vol 13 (15) ◽  
pp. 3021
Author(s):  
Bufan Zhao ◽  
Xianghong Hua ◽  
Kegen Yu ◽  
Xiaoxing He ◽  
Weixing Xue ◽  
...  

Urban object segmentation and classification are critical data processing steps in scene understanding, intelligent vehicles and high-precision 3D maps. Semantic segmentation of 3D point clouds is the foundational step in object recognition. To identify intersecting objects and improve classification accuracy, this paper proposes a segment-based classification method for 3D point clouds. The method first divides points into multi-scale supervoxels and groups them using the proposed inverse node graph (IN-Graph) construction, which requires no prior information about the nodes; instead, it divides supervoxels by judging the connection state of the edges between them. The method reaches minimum global energy by graph cutting, obtains structural segments that are as complete as possible, and retains boundaries at the same time. Then, a random forest classifier is used for supervised classification. To deal with the mislabeling of scattered fragments, a higher-order CRF with small-label cluster optimization is proposed to refine the classification results. Experiments were carried out on a mobile laser scan (MLS) point dataset and a terrestrial laser scan (TLS) point dataset, and the results show that overall accuracies of 97.57% and 96.39% were obtained on the two datasets. The boundaries of objects were retained well, and the method achieved good results in the classification of cars and motorcycles. Further experimental analyses verified the advantages of the proposed method and demonstrated its practicability and versatility.


Author(s):  
X.-F. Xing ◽  
M. A. Mostafavi ◽  
G. Edwards ◽  
N. Sabo

Automatic semantic segmentation of point clouds observed in a complex 3D urban scene is a challenging issue. Semantic segmentation of urban scenes based on machine learning algorithms requires appropriate features to distinguish objects in mobile terrestrial and airborne LiDAR point clouds at the point level. In this paper, we propose a pointwise semantic segmentation method based on our proposed features derived from the Difference of Normals and on “directional height above” features, which compare the height difference between a given point and its neighbors in eight directions, in addition to features based on normal estimation. A random forest classifier is chosen to classify points in mobile terrestrial and airborne LiDAR point clouds. The results of our experiments show that the proposed features are effective for semantic segmentation of mobile terrestrial and airborne LiDAR point clouds, especially for the vegetation, building and ground classes in airborne LiDAR point clouds of urban areas.
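
The abstract does not define the “directional height above” features precisely, so the following is only a sketch of one plausible reading. It assumes that neighbors within a horizontal radius are binned into eight 45-degree azimuth sectors and that each sector contributes the largest height difference between a neighbor and the query point. The function name, the `radius` parameter and the choice of 0.0 for empty sectors are all hypothetical.

```python
import math


def directional_height_above(point, neighbors, radius=1.0):
    """Return eight height differences, one per 45-degree azimuth sector.

    Assumption: each entry is the largest (neighbor z - point z) among the
    neighbors within `radius` in the horizontal plane; empty sectors give 0.0.
    """
    px, py, pz = point
    features = [None] * 8
    for nx, ny, nz in neighbors:
        dx, dy = nx - px, ny - py
        dist = math.hypot(dx, dy)
        if dist == 0.0 or dist > radius:
            continue  # skip points directly above/below and far-away neighbors
        azimuth = math.atan2(dy, dx) % (2 * math.pi)
        sector = int(azimuth // (math.pi / 4)) % 8
        diff = nz - pz
        if features[sector] is None or diff > features[sector]:
            features[sector] = diff
    return [0.0 if f is None else f for f in features]


# Hypothetical query point and neighbors (x, y, z).
neighbors = [(0.5, 0.1, 2.0), (-0.5, 0.1, 1.0), (0.1, -0.5, -1.0), (5.0, 5.0, 9.0)]
print(directional_height_above((0.0, 0.0, 0.0), neighbors))
```

A feature vector of this kind could then be concatenated with normal-based features before being passed to a classifier such as a random forest.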


Author(s):  
Swami Sankaranarayanan ◽  
Yogesh Balaji ◽  
Arpit Jain ◽  
Ser Nam Lim ◽  
Rama Chellappa

Author(s):  
D. Tosic ◽  
S. Tuttas ◽  
L. Hoegner ◽  
U. Stilla

This work proposes an approach for semantic classification of an outdoor-scene point cloud acquired with a high-precision Mobile Mapping System (MMS), with the main goal of contributing to the automatic creation of High Definition (HD) Maps. The automatic point labeling is achieved by combining a feature-based approach for semantic classification of point clouds with a deep learning approach for semantic segmentation of images. Both the point cloud data and the data from a multi-camera system are used to gain spatial information in an urban scene. Two types of classification are applied for this task: 1) A feature-based approach, in which the point cloud is organized into a supervoxel structure to capture the geometric characteristics of points. Several geometric features are then extracted to represent the local geometry, followed by removing the effect of local tendency for each supervoxel to enhance the distinction between similar structures. Lastly, the Random Forests (RF) algorithm is applied in the classification phase to assign labels to supervoxels and therefore to the points within them. 2) A deep learning approach, employed for semantic segmentation of MMS images of the same scene. To achieve this, an implementation of the Pyramid Scene Parsing Network is used. The resulting segmented images, in which each pixel carries a class label, are then projected onto the point cloud, enabling label assignment for each point. Finally, experimental results are presented for a complex urban scene, and the performance of the method is evaluated on a manually labeled dataset, for the deep learning and feature-based classification individually as well as for the result of the label fusion. The overall accuracy achieved with the fused output is 0.87 on the final test set, which significantly outperforms the results of the individual methods on the same point cloud.
The labeled data is published on the TUM-PF Semantic-Labeling-Benchmark.
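
The step of transferring per-pixel class labels from a segmented image to 3D points can be illustrated with a simple pinhole projection. This is a minimal sketch under stated assumptions, not the authors' pipeline: points are assumed to be already expressed in the camera frame (z pointing forward), lens distortion and occlusion are ignored, and the function name `project_labels` and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative.

```python
def project_labels(points_cam, label_image, fx, fy, cx, cy):
    """Assign each 3D point the label of the pixel it projects to.

    points_cam: iterable of (x, y, z) in the camera frame, z forward.
    label_image: list of rows, each a list of per-pixel class labels.
    Points behind the camera or outside the image receive None.
    """
    height = len(label_image)
    width = len(label_image[0])
    labels = []
    for x, y, z in points_cam:
        if z <= 0:
            labels.append(None)  # behind the camera
            continue
        u = int(round(fx * x / z + cx))  # pixel column
        v = int(round(fy * y / z + cy))  # pixel row
        if 0 <= u < width and 0 <= v < height:
            labels.append(label_image[v][u])
        else:
            labels.append(None)  # projects outside the image
    return labels


# Hypothetical 4x4 segmentation result and four points.
label_image = [
    ["road", "road", "road", "road"],
    ["road", "road", "road", "road"],
    ["road", "car", "building", "road"],
    ["road", "road", "road", "road"],
]
points = [(0.0, 0.0, 1.0), (-1.0, 0.0, 2.0), (0.0, 0.0, -1.0), (10.0, 0.0, 1.0)]
print(project_labels(points, label_image, fx=2.0, fy=2.0, cx=2.0, cy=2.0))
```

Labels obtained this way could then be fused with the output of a feature-based classifier, for example by a per-point vote; the abstract does not specify the fusion rule.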


2021 ◽  
Vol 11 (17) ◽  
pp. 8047
Author(s):  
Dongkyu Lee ◽  
Wee Peng Tay ◽  
Seok-Cheol Kee

In this work, a study was carried out to estimate a look-up table (LUT) that converts a camera image plane to a bird's-eye view (BEV) plane using a single camera. Traditional camera pose estimation approaches incur high costs in the research and manufacturing of future autonomous vehicles and may require pre-configured infrastructure. This paper proposes a camera calibration system for autonomous driving that is low in cost and requires little infrastructure. We studied a network that estimates the camera pose under urban road driving conditions using a single camera and outputs an image in the form of an LUT that converts the input image into a BEV. We propose a network that predicts human-like poses from a single image. We collected synthetic data using a simulator, generated BEV images and LUTs as ground truth, and used the proposed network and ground truth to train the pose estimation function. In the process, the network predicts the pose by deciphering semantic segmentation features and improves its performance through an added layer that handles the overall direction of the network. The network also outputs the camera angles (roll/pitch/yaw) in the 3D coordinate system so that the user can monitor learning. Since the network's output is an LUT, no additional calculation is needed, and real-time performance is improved.
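
To illustrate why an LUT output needs no further computation, the sketch below applies a table to an image with pure lookups. The abstract does not describe the LUT layout, so this assumes that each BEV pixel stores the (row, column) of its source pixel in the camera image, or `None` when no source pixel exists. The function name `apply_lut` and the `fill` value are hypothetical.

```python
def apply_lut(image, lut, fill=0):
    """Build a bird's-eye-view image by looking up source pixels.

    image: list of rows of pixel values from the camera.
    lut: list of rows; each entry is (src_row, src_col) or None.
    """
    bev = []
    for lut_row in lut:
        bev_row = []
        for src in lut_row:
            if src is None:
                bev_row.append(fill)  # no camera pixel maps to this BEV cell
            else:
                src_r, src_c = src
                bev_row.append(image[src_r][src_c])
        bev.append(bev_row)
    return bev


# Hypothetical 2x3 camera image and 2x2 look-up table.
image = [[1, 2, 3],
         [4, 5, 6]]
lut = [[(1, 0), (1, 2)],
       [(0, 1), None]]
print(apply_lut(image, lut))
```

Because the mapping is fixed once the table is known, each frame costs only one memory read per BEV pixel, which is consistent with the real-time claim in the abstract.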


Author(s):  
Romain Cazorla ◽  
Line Poinel ◽  
Panagiotis Papadakis ◽  
Cédric Buche

Point cloud acquisition techniques are an essential tool for the digitization of industrial plants, yet the bulk of a designer's work remains manual. A first step toward automating drawing generation is to extract the semantics of the point cloud. Towards this goal, we investigate the use of deep learning to semantically segment oil and gas industrial scenes. We focus on domain characteristics such as high variation in object size, increased concavity and a lack of annotated data, which hamper the use of conventional approaches. To address these issues, we advocate the use of synthetic data, adaptive downsampling and context sharing.


Author(s):  
Wujie Zhou ◽  
Jinfu Liu ◽  
Jingsheng Lei ◽  
Lu Yu ◽  
Jenq-Neng Hwang
