GMNet: Graded-Feature Multilabel-Learning Network for RGB-Thermal Urban Scene Semantic Segmentation

Transformer Meets Convolution: A Bilateral Awareness Network for Semantic Segmentation of Very Fine Resolution Urban Scene Images

Remote Sensing ◽

10.3390/rs13163065 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3065

Author(s):

Libo Wang ◽

Rui Li ◽

Dongzhi Wang ◽

Chenxi Duan ◽

Teng Wang ◽

...

Keyword(s):

Large Scale ◽

Texture Features ◽

Semantic Segmentation ◽

Autonomous Driving ◽

Research Field ◽

Learning Approaches ◽

Fine Grained ◽

Urban Scene ◽

Fine Resolution ◽

With Memory

Semantic segmentation of very fine resolution (VFR) urban scene images plays a significant role in several application scenarios, including autonomous driving, land cover classification, and urban planning. However, the tremendous detail contained in VFR images, especially the considerable variations in the scale and appearance of objects, severely limits the potential of existing deep learning approaches. Addressing such issues represents a promising research field in the remote sensing community, which paves the way for scene-level landscape pattern analysis and decision making. In this paper, we propose a Bilateral Awareness Network (BANet), which contains a dependency path and a texture path to fully capture the long-range relationships and fine-grained details in VFR images. Specifically, the dependency path is built on ResT, a novel Transformer backbone with memory-efficient multi-head self-attention, while the texture path is built on stacked convolution operations. In addition, a feature aggregation module based on the linear attention mechanism is designed to effectively fuse the dependency features and texture features. Extensive experiments conducted on three large-scale urban scene image segmentation datasets, i.e., the ISPRS Vaihingen dataset, the ISPRS Potsdam dataset, and the UAVid dataset, demonstrate the effectiveness of our BANet. Specifically, a 64.6% mIoU is achieved on the UAVid dataset.
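The mIoU figure quoted above is the mean intersection-over-union across classes. As a rough illustration of how such a score is typically computed, here is a minimal sketch in plain Python. The function name `mean_iou` and the choice to skip classes absent from both labelings are assumptions made for this example, not details taken from the paper or its evaluation protocol.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union over classes present in either labeling.

    y_true and y_pred are flat sequences of integer class ids, one per pixel.
    """
    # confusion[t][p] counts pixels whose true class is t and predicted class is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                    # true c, predicted otherwise
        fp = sum(row[c] for row in confusion) - tp     # predicted c, true otherwise
        union = tp + fp + fn
        if union > 0:  # assumption: ignore classes that never occur
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Benchmarks differ in how they treat ignored or absent classes, so a score computed this way is only comparable to published numbers if the same convention is used.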

An Inverse Node Graph-Based Method for the Urban Scene Segmentation of 3D Point Clouds

Remote Sensing ◽

10.3390/rs13153021 ◽

2021 ◽

Vol 13 (15) ◽

pp. 3021

Author(s):

Bufan Zhao ◽

Xianghong Hua ◽

Kegen Yu ◽

Xiaoxing He ◽

Weixing Xue ◽

...

Keyword(s):

Semantic Segmentation ◽

Point Clouds ◽

Intelligent Vehicles ◽

Critical Data ◽

Multi Scale ◽

3D Point Clouds ◽

Cluster Optimization ◽

Urban Scene ◽

Processing Steps

Urban object segmentation and classification are critical data processing steps in scene understanding, intelligent vehicles, and 3D high-precision maps. Semantic segmentation of 3D point clouds is the foundational step in object recognition. To identify intersecting objects and improve classification accuracy, this paper proposes a segment-based classification method for 3D point clouds. The method first divides points into multi-scale supervoxels and groups them using the proposed inverse node graph (IN-Graph) construction, which does not require prior information about the nodes to be defined; instead, it divides supervoxels by judging the connection state of the edges between them. The method reaches minimum global energy by graph cutting, obtains structural segments that are as complete as possible, and retains boundaries at the same time. Then, a random forest classifier is used for supervised classification. To deal with the mislabeling of scattered fragments, a higher-order CRF with small-label cluster optimization is proposed to refine the classification results. Experiments were carried out on a mobile laser scan (MLS) point dataset and a terrestrial laser scan (TLS) point dataset, and the results show that overall accuracies of 97.57% and 96.39% were obtained on the two datasets. The boundaries of objects were retained well, and the method achieved good results in the classification of cars and motorcycles. Further experimental analyses verified the advantages of the proposed method and demonstrated its practicability and versatility.
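The idea of grouping supervoxels according to the connection state of the edges between them can be illustrated with a small union-find sketch. This is a deliberate simplification and not the IN-Graph method itself: it assumes the connected/disconnected decision for each edge is already available as a boolean, and it performs no energy minimization or graph cut. The function name `group_supervoxels` and the edge format are hypothetical.

```python
def group_supervoxels(num_supervoxels, edges):
    """Merge supervoxels joined by edges flagged as connected.

    edges is an iterable of (i, j, connected) triples, where i and j are
    supervoxel indices and connected is a bool. Returns one segment id per
    supervoxel, numbered in order of first appearance.
    """
    parent = list(range(num_supervoxels))

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, connected in edges:
        if connected:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    # Relabel roots as consecutive segment ids.
    segment_of_root = {}
    segments = []
    for v in range(num_supervoxels):
        root = find(v)
        segments.append(segment_of_root.setdefault(root, len(segment_of_root)))
    return segments


if __name__ == "__main__":
    edges = [(0, 1, True), (1, 2, False), (2, 3, True), (3, 4, True)]
    print(group_supervoxels(5, edges))
```

In this toy input the disconnected edge between supervoxels 1 and 2 is what separates the two resulting segments.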

Effective Use of Synthetic Data for Urban Scene Semantic Segmentation

Computer Vision – ECCV 2018 - Lecture Notes in Computer Science ◽

10.1007/978-3-030-01216-8_6 ◽

2018 ◽

pp. 86-103 ◽

Cited By ~ 22

Author(s):

Fatemeh Sadat Saleh ◽

Mohammad Sadegh Aliakbarian ◽

Mathieu Salzmann ◽

Lars Petersson ◽

Jose M. Alvarez

Keyword(s):

Synthetic Data ◽

Semantic Segmentation ◽

Urban Scene ◽

Effective Use

AN IMPROVED AUTOMATIC POINTWISE SEMANTIC SEGMENTATION OF A 3D URBAN SCENE FROM MOBILE TERRESTRIAL AND AIRBORNE LIDAR POINT CLOUDS: A MACHINE LEARNING APPROACH

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-iv-4-w8-139-2019 ◽

2019 ◽

Vol IV-4/W8 ◽

pp. 139-146

Author(s):

X.-F. Xing ◽

M. A. Mostafavi ◽

G. Edwards ◽

N. Sabo

Keyword(s):

Machine Learning ◽

Urban Areas ◽

Learning Algorithm ◽

Semantic Segmentation ◽

Point Clouds ◽

Airborne Lidar ◽

Urban Scenes ◽

Machine Learning Approach ◽

Urban Scene ◽

Point Level

Abstract. Automatic semantic segmentation of point clouds observed in a complex 3D urban scene is a challenging issue. Semantic segmentation of urban scenes based on a machine learning algorithm requires appropriate features to distinguish objects in mobile terrestrial and airborne LiDAR point clouds at the point level. In this paper, we propose a pointwise semantic segmentation method based on our proposed features derived from Difference of Normals and the “directional height above” features, which compare the height difference between a given point and its neighbors in eight directions, in addition to features based on normal estimation. A random forest classifier is chosen to classify points in mobile terrestrial and airborne LiDAR point clouds. The results obtained from our experiments show that the proposed features are effective for semantic segmentation of mobile terrestrial and airborne LiDAR point clouds, especially for the vegetation, building, and ground classes in airborne LiDAR point clouds of urban areas.
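To make the idea of comparing a point's height with its neighbors in eight directions concrete, the following is a minimal sketch. The abstract does not give the exact definition of the feature, so everything specific here is an assumption: the neighbors are split into eight equal azimuth sectors in the XY plane, the feature for a sector is the point's height minus the lowest neighbor height in that sector, and empty sectors yield 0.0. The function name `directional_height_above` is hypothetical.

```python
import math


def directional_height_above(point, neighbors, num_directions=8):
    """Height of `point` above the lowest neighbor in each azimuth sector.

    point is an (x, y, z) tuple and neighbors is a list of (x, y, z) tuples.
    Sector k covers azimuths [k * 2*pi / n, (k + 1) * 2*pi / n) measured
    counter-clockwise from the +x axis. Empty sectors yield 0.0 (assumption).
    """
    x0, y0, z0 = point
    sector_width = 2 * math.pi / num_directions
    lowest = [None] * num_directions

    for x, y, z in neighbors:
        dx, dy = x - x0, y - y0
        if dx == 0 and dy == 0:
            continue  # directly above or below: no horizontal direction
        angle = math.atan2(dy, dx) % (2 * math.pi)
        sector = int(angle // sector_width) % num_directions
        if lowest[sector] is None or z < lowest[sector]:
            lowest[sector] = z

    return [0.0 if low is None else z0 - low for low in lowest]


if __name__ == "__main__":
    neighbors = [(1, 0.5, 4), (2, 1, 7), (-1, 0.5, 12), (0.5, -1, 10)]
    print(directional_height_above((0, 0, 10), neighbors))
```

A vector like this could be concatenated with other per-point features before training a classifier; a real pipeline would also restrict the neighbors to a search radius.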

CRU-Net: A Deep Learning Network for Semantic Segmentation of Pathological Tissue Slices

2021 IEEE International Conference on Artificial Intelligence and Industrial Design (AIID) ◽

10.1109/aiid51893.2021.9456469 ◽

2021 ◽

Author(s):

Yang Li

Keyword(s):

Deep Learning ◽

Semantic Segmentation ◽

Tissue Slices ◽

Learning Network ◽

Deep Learning Network

3SP-Net: Semantic Segmentation Network with Stereo Image Pairs for Urban Scene Parsing

Lecture Notes in Computer Science - PRICAI 2018: Trends in Artificial Intelligence ◽

10.1007/978-3-319-97304-3_39 ◽

2018 ◽

pp. 503-517

Author(s):

Lingli Zhou ◽

Haofeng Zhang

Keyword(s):

Semantic Segmentation ◽

Stereo Image ◽

Scene Parsing ◽

Urban Scene ◽

Image Pairs

A Novel Point Cloud Encoding Method Based on Local Information for 3D Classification and Segmentation

Sensors ◽

10.3390/s20092501 ◽

2020 ◽

Vol 20 (9) ◽

pp. 2501 ◽

Cited By ~ 2

Author(s):

Yanan Song ◽

Liang Gao ◽

Xinyu Li ◽

Weiming Shen

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Semantic Segmentation ◽

Local Information ◽

Local Region ◽

Feature Representation ◽

Learning Network ◽

Feature Representations ◽

Encoding Method ◽

Deep Learning Network

Deep learning is robust to the perturbation of a point cloud, which is an important data form in the Internet of Things. However, it cannot effectively capture the local information of the point cloud or recognize the fine-grained features of an object. Different levels of features in the deep learning network can be integrated to obtain local information, but this strategy increases network complexity. This paper proposes an effective point cloud encoding method that helps the deep learning network utilize the local information. An axis-aligned cube is used to search for a local region that represents the local information. All of the points in the local region are available to construct the feature representation of each point. These feature representations are then input to a deep learning network. Two well-known datasets, the ModelNet40 shape classification benchmark and the Stanford 3D Indoor Semantics Dataset, are used to test the performance of the proposed method. Compared with other methods with complicated structures, the proposed method, which uses only a simple deep learning network, can achieve higher accuracy in 3D object classification and semantic segmentation.
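The axis-aligned cube search described above can be sketched in a few lines. This is a brute-force illustration under stated assumptions, not the paper's encoding: the cube is centered on the query point, its half side length `half_size` is a free parameter, and the per-point representation is simply the list of neighbor offsets relative to the center. The names `cube_neighbors` and `local_offsets` are hypothetical.

```python
def cube_neighbors(points, center, half_size):
    """Indices of points inside the axis-aligned cube around `center`.

    A point is inside when each coordinate differs from the center by at
    most half_size, so no distance computation is needed.
    """
    cx, cy, cz = center
    return [
        i
        for i, (x, y, z) in enumerate(points)
        if abs(x - cx) <= half_size
        and abs(y - cy) <= half_size
        and abs(z - cz) <= half_size
    ]


def local_offsets(points, index, half_size):
    """Offsets of all points in the local cube relative to points[index]."""
    cx, cy, cz = points[index]
    return [
        (points[i][0] - cx, points[i][1] - cy, points[i][2] - cz)
        for i in cube_neighbors(points, points[index], half_size)
    ]


if __name__ == "__main__":
    cloud = [(0, 0, 0), (0.5, 0.5, 0.5), (1.5, 0, 0), (-1, 1, -1)]
    print(cube_neighbors(cloud, (0, 0, 0), 1.0))
    print(local_offsets(cloud, 1, 0.5))
```

For large clouds the linear scan would normally be replaced by a spatial index such as a voxel grid, which pairs naturally with cube-shaped queries.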

FUSION OF FEATURE BASED AND DEEP LEARNING METHODS FOR CLASSIFICATION OF MMS POINT CLOUDS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-2-w16-235-2019 ◽

2019 ◽

Vol XLII-2/W16 ◽

pp. 235-242 ◽

Cited By ~ 1

Author(s):

D. Tosic ◽

S. Tuttas ◽

L. Hoegner ◽

U. Stilla

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Semantic Segmentation ◽

Point Clouds ◽

Local Geometry ◽

Learning Approach ◽

Semantic Classification ◽

Feature Based ◽

Urban Scene

Abstract. This work proposes an approach for semantic classification of an outdoor-scene point cloud acquired with a high-precision Mobile Mapping System (MMS), with the major goal of contributing to the automatic creation of High Definition (HD) Maps. Automatic point labeling is achieved by combining a feature-based approach for semantic classification of point clouds with a deep learning approach for semantic segmentation of images. Both point cloud data and data from a multi-camera system are used to gain spatial information in an urban scene. Two types of classification are applied for this task: 1) A feature-based approach, in which the point cloud is organized into a supervoxel structure to capture the geometric characteristics of points. Several geometric features are then extracted to represent the local geometry, followed by removing the effect of local tendency for each supervoxel to enhance the distinction between similar structures. Lastly, the Random Forests (RF) algorithm is applied in the classification phase to assign labels to supervoxels and therefore to the points within them. 2) A deep learning approach, which is employed for semantic segmentation of MMS images of the same scene. To achieve this, an implementation of the Pyramid Scene Parsing Network is used. The resulting segmented images, in which each pixel contains a class label, are then projected onto the point cloud, enabling label assignment for each point. Finally, experimental results are presented for a complex urban scene, and the performance of the method is evaluated on a manually labeled dataset, for the deep learning and feature-based classification individually as well as for the result of the label fusion. The overall accuracy achieved with the fused output is 0.87 on the final test set, which significantly outperforms the results of the individual methods on the same point cloud.
The labeled data is published on the TUM-PF Semantic-Labeling-Benchmark.
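The step in which per-pixel class labels are transferred from a segmented image to the point cloud can be sketched with a simple pinhole camera model. This is an illustration under assumptions, not the authors' pipeline: the points are taken to be already expressed in the camera frame, the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical parameters, lens distortion and occlusion are ignored, and points that fall outside the image receive the placeholder label -1.

```python
import math


def project_labels(points_cam, label_image, fx, fy, cx, cy, unlabeled=-1):
    """Assign each 3D point the class label of the pixel it projects to.

    points_cam: list of (x, y, z) in the camera frame, z pointing forward.
    label_image: 2D list indexed as label_image[row][col] with class ids.
    """
    height = len(label_image)
    width = len(label_image[0]) if height else 0
    labels = []
    for x, y, z in points_cam:
        if z <= 0:
            labels.append(unlabeled)  # behind the camera
            continue
        # Pinhole projection to pixel coordinates (column u, row v).
        u = math.floor(fx * x / z + cx)
        v = math.floor(fy * y / z + cy)
        if 0 <= u < width and 0 <= v < height:
            labels.append(label_image[v][u])
        else:
            labels.append(unlabeled)  # outside the field of view
    return labels


if __name__ == "__main__":
    image = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [2, 2, 3, 3],
        [2, 2, 3, 3],
    ]
    pts = [(0, 0, 1), (-0.5, -0.5, 1), (0.5, -0.5, 1), (5, 0, 1), (0, 0, -1)]
    print(project_labels(pts, image, fx=2, fy=2, cx=2, cy=2))
```

Because occlusion is not handled, a point hidden behind a closer object would inherit that object's label; a depth test per pixel is a common way to address this.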

Automatic Semantic Segmentation with DeepLab Dilated Learning Network for Change Detection in Remote Sensing Images

Neural Processing Letters ◽

10.1007/s11063-019-10174-x ◽

2020 ◽

Vol 51 (3) ◽

pp. 2355-2377 ◽

Cited By ~ 1

Author(s):

N. Venugopal

Keyword(s):

Remote Sensing ◽

Change Detection ◽

Semantic Segmentation ◽

Remote Sensing Images ◽

Learning Network

JMLNet: Joint Multi-Label Learning Network for Weakly Supervised Semantic Segmentation in Aerial Images

Remote Sensing ◽

10.3390/rs12193169 ◽

2020 ◽

Vol 12 (19) ◽

pp. 3169

Author(s):

Rongxin Guo ◽

Xian Sun ◽

Kaiqiang Chen ◽

Xiao Zhou ◽

Zhiyuan Yan ◽

...

Keyword(s):

Common Knowledge ◽

Ground Truth ◽

Semantic Segmentation ◽

Aerial Images ◽

Combination Strategy ◽

Research Attention ◽

Significant Saving ◽

Learning Network ◽

Weakly Supervised ◽

Segmentation Task

Weakly supervised semantic segmentation in aerial images has attracted growing research attention due to the significant savings in annotation cost. Most current approaches are based on one specific pseudo label. These methods easily overfit the wrongly labeled pixels in noisy labels, which limits the performance and generalization of the segmentation model. To tackle these problems, we propose a novel joint multi-label learning network (JMLNet) to help the model learn common knowledge from multiple noisy labels and prevent it from overfitting one specific label. Our strategy for combining multiple proposals is to regard them all as ground truth, and we propose three new multi-label losses that use the multiple labels to guide the segmentation model during training. JMLNet also contains two methods to generate high-quality proposals, which further improve the performance of the segmentation task. First, we propose a detection-based GradCAM (GradCAMD) to generate segmentation proposals from object detectors. Then, we use GradCAMD to adjust the GrabCut algorithm and generate segmentation proposals (GrabCutC). We report state-of-the-art results on the semantic segmentation task of the iSAID and mapping challenge datasets when training with bounding-box annotations.
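To illustrate what treating several noisy proposals as ground truth can look like for a single pixel, here is a minimal sketch that averages the cross-entropy over all proposal labels. It is not one of the three multi-label losses from the paper, whose definitions the abstract does not give; it is a generic stand-in chosen for this example, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multi_label_cross_entropy(logits, proposal_labels):
    """Mean cross-entropy of one pixel's prediction against several labels.

    logits: class scores predicted for the pixel.
    proposal_labels: class ids assigned to the pixel by each pseudo-label
    source; every one of them is treated as ground truth (assumption).
    """
    probs = softmax(logits)
    losses = [-math.log(probs[c]) for c in proposal_labels]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Two proposals agree on class 0 versus disagreeing between 0 and 1.
    print(multi_label_cross_entropy([2.0, 0.0], [0, 0]))
    print(multi_label_cross_entropy([2.0, 0.0], [0, 1]))
```

With this averaging, pixels on which the proposals disagree cannot be fit perfectly by any single class, which limits how strongly the model is pulled toward one possibly wrong label.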