3SP-Net: Semantic Segmentation Network with Stereo Image Pairs for Urban Scene Parsing

Author(s):  
Lingli Zhou ◽  
Haofeng Zhang
2021 ◽  
Vol 13 (16) ◽  
pp. 3065
Author(s):  
Libo Wang ◽  
Rui Li ◽  
Dongzhi Wang ◽  
Chenxi Duan ◽  
Teng Wang ◽  
...  

Semantic segmentation of very fine resolution (VFR) urban scene images plays a significant role in application scenarios including autonomous driving, land cover classification, and urban planning. However, the tremendous detail contained in VFR images, especially the considerable variations in the scale and appearance of objects, severely limits the potential of existing deep learning approaches. Addressing such issues represents a promising research field in the remote sensing community and paves the way for scene-level landscape pattern analysis and decision making. In this paper, we propose a Bilateral Awareness Network (BANet), which contains a dependency path and a texture path to fully capture the long-range relationships and fine-grained details in VFR images. Specifically, the dependency path is built on ResT, a novel Transformer backbone with memory-efficient multi-head self-attention, while the texture path is built on stacked convolution operations. In addition, a feature aggregation module based on the linear attention mechanism is designed to effectively fuse the dependency features and texture features. Extensive experiments conducted on three large-scale urban scene image segmentation datasets, i.e., the ISPRS Vaihingen dataset, the ISPRS Potsdam dataset, and the UAVid dataset, demonstrate the effectiveness of BANet. Specifically, a 64.6% mIoU is achieved on the UAVid dataset.
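The mIoU figure quoted above is a mean intersection-over-union score. The sketch below shows one common way to compute it from predicted and ground-truth label maps; the function name `mean_iou` is hypothetical, and the choice to average only over classes that occur in the prediction or the ground truth is an assumption, since the abstract does not describe the evaluation protocol.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in pred or target (illustrative convention)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predicted classes.
    cm = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    valid = union > 0  # skip classes absent from both maps
    return float((intersection[valid] / union[valid]).mean())

# Toy example with two classes and four pixels.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(f"mIoU = {score:.3f}")
```

Benchmarks often exclude particular classes (for example clutter or background) from the average, so a score computed this way may not be directly comparable to a published number.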


2021 ◽  
Vol 13 (15) ◽  
pp. 3021
Author(s):  
Bufan Zhao ◽  
Xianghong Hua ◽  
Kegen Yu ◽  
Xiaoxing He ◽  
Weixing Xue ◽  
...  

Urban object segmentation and classification are critical data processing steps in scene understanding, intelligent vehicles and high-precision 3D maps. Semantic segmentation of 3D point clouds is the foundational step in object recognition. To identify intersecting objects and improve classification accuracy, this paper proposes a segment-based classification method for 3D point clouds. The method first divides points into multi-scale supervoxels and groups them using the proposed inverse node graph (IN-Graph) construction, which does not require prior information about the nodes and instead divides supervoxels by judging the connection state of the edges between them. The method reaches minimum global energy by graph cutting, obtains structural segments that are as complete as possible, and retains boundaries at the same time. A random forest classifier is then used for supervised classification. To deal with the mislabeling of scattered fragments, a higher-order CRF with small-label cluster optimization is proposed to refine the classification results. Experiments were carried out on a mobile laser scanning (MLS) point dataset and a terrestrial laser scanning (TLS) point dataset, and the results show that overall accuracies of 97.57% and 96.39% were obtained on the two datasets. The boundaries of objects were retained well, and the method achieved good results in the classification of cars and motorcycles. Further experimental analyses verified the advantages of the proposed method and demonstrated its practicability and versatility.


Author(s):  
Fatemeh Sadat Saleh ◽  
Mohammad Sadegh Aliakbarian ◽  
Mathieu Salzmann ◽  
Lars Petersson ◽  
Jose M. Alvarez

2006 ◽  
Vol 18 (6) ◽  
pp. 1441-1471 ◽  
Author(s):  
Christian Eckes ◽  
Jochen Triesch ◽  
Christoph von der Malsburg

We present a system for the automatic interpretation of cluttered scenes containing multiple partly occluded objects in front of unknown, complex backgrounds. The system is based on an extended elastic graph matching algorithm that allows the explicit modeling of partial occlusions. Our approach extends an earlier system in two ways. First, we use elastic graph matching in stereo image pairs to increase matching robustness and disambiguate occlusion relations. Second, we use richer feature descriptions in the object models by integrating shape and texture with color features. We demonstrate that the combination of both extensions substantially increases recognition performance. The system learns about new objects in a simple one-shot learning approach. Despite the lack of statistical information in the object models and the lack of an explicit background model, our system performs surprisingly well for this very difficult task. Our results underscore the advantages of view-based feature constellation representations for difficult object recognition problems.


Author(s):  
X.-F. Xing ◽  
M. A. Mostafavi ◽  
G. Edwards ◽  
N. Sabo

Automatic semantic segmentation of point clouds observed in a complex 3D urban scene is a challenging issue. Semantic segmentation of urban scenes based on machine learning algorithms requires appropriate features to distinguish objects in mobile terrestrial and airborne LiDAR point clouds at the point level. In this paper, we propose a pointwise semantic segmentation method based on our proposed features derived from Difference of Normal and on the "directional height above" features, which compare the height difference between a given point and its neighbors in eight directions, in addition to features based on normal estimation. A random forest classifier is chosen to classify points in mobile terrestrial and airborne LiDAR point clouds. The results obtained from our experiments show that the proposed features are effective for semantic segmentation of mobile terrestrial and airborne LiDAR point clouds, especially for the vegetation, building and ground classes in airborne LiDAR point clouds in urban areas.
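To illustrate the idea of comparing a point's height with its neighbors in eight directions, here is a minimal sketch. It is not the authors' definition of "directional height above": the function name `directional_height_features`, the search radius, the 45° horizontal sectors, and the use of the mean neighbor height are all assumptions made for illustration.

```python
import numpy as np

def directional_height_features(points, radius=1.0):
    """Height difference between each point and its neighbors in eight sectors.

    points: (N, 3) array of x, y, z coordinates.
    Returns an (N, 8) array; entry [i, k] is z_i minus the mean z of the
    neighbors of point i that fall in horizontal sector k, or NaN if that
    sector is empty. Radius and the mean are illustrative assumptions.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    feats = np.full((n, 8), np.nan)
    for i in range(n):
        d = points[:, :2] - points[i, :2]            # horizontal offsets
        dist = np.hypot(d[:, 0], d[:, 1])
        # Neighbors within the radius, excluding the point itself
        # (and any point sharing exactly the same x, y).
        mask = (dist > 0) & (dist <= radius)
        angle = np.arctan2(d[mask, 1], d[mask, 0])   # range (-pi, pi]
        sector = np.floor((angle % (2 * np.pi)) / (np.pi / 4)).astype(int) % 8
        z = points[mask, 2]
        for k in range(8):
            in_k = sector == k
            if in_k.any():
                feats[i, k] = points[i, 2] - z[in_k].mean()
    return feats

# Three points: one neighbor to the east (sector 0), one to the north (sector 2).
cloud = [[0.0, 0.0, 1.0], [1.0, 0.0, 3.0], [0.0, 1.0, 0.0]]
feats = directional_height_features(cloud, radius=1.5)
```

The loop is a brute-force O(N²) search chosen for readability; a real LiDAR tile would need a spatial index such as a k-d tree for the neighbor query.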


2020 ◽  
Vol 12 (17) ◽  
pp. 2669
Author(s):  
Junhao Qian ◽  
Min Xia ◽  
Yonghong Zhang ◽  
Jia Liu ◽  
Yiqing Xu

Change detection is a very important technique for remote sensing data analysis. Its mainstream solutions are either supervised or unsupervised. Among supervised methods, most existing deep learning approaches to change detection are related to semantic segmentation. However, these methods only use deep learning models to process the global information of an image and do not train specifically on changed and unchanged areas. As a result, many details of local changes cannot be detected. In this work, a trilateral change detection network is proposed. The proposed network has three branches (a main module and two auxiliary modules, all composed of convolutional neural networks (CNNs)), which focus on the overall information of bitemporal Google Earth image pairs, the changed areas and the unchanged areas, respectively. The proposed method is end-to-end trainable, and the components of the network do not need to be trained separately.


2015 ◽  
Vol 738-739 ◽  
pp. 613-617 ◽  
Author(s):  
Guo Yin Cai ◽  
Jie Huan ◽  
Yang Liu ◽  
Ming Yi Du

A Digital Elevation Model (DEM) is an important data source for topographic analysis, 3D visualization and satellite image ortho-rectification. This paper focused on DEM extraction and accuracy assessment from the three stereo images of the ZY-3 satellite. DEMs were extracted using three different stereo pair groups composed of forward and nadir view images, nadir and backward view images, and forward and backward view images. The accuracy of each DEM was indicated by its root-mean-square error (RMSE). The results showed that the stereo pair of nadir and forward view images achieved the best accuracy, while the pair of forward and backward view images obtained the worst. This might be useful for selecting stereo pair images when extracting DEMs from ZY-3 satellite images.
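The RMSE used as the accuracy indicator above can be computed as in the sketch below. The function name `dem_rmse` and the check-point heights are hypothetical; the abstract does not say which reference data the authors compared against.

```python
import numpy as np

def dem_rmse(dem_heights, reference_heights):
    """Root-mean-square error between DEM heights and reference heights (same units)."""
    dem = np.asarray(dem_heights, dtype=float)
    ref = np.asarray(reference_heights, dtype=float)
    return float(np.sqrt(np.mean((dem - ref) ** 2)))

# Made-up heights in meters at three check points, for illustration only.
error = dem_rmse([10.0, 12.0, 14.0], [11.0, 12.0, 16.0])
print(f"RMSE = {error:.2f} m")
```

Computing this value once per stereo pair, against the same check points, gives directly comparable scores, which is the kind of comparison the abstract reports.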


Water ◽  
2018 ◽  
Vol 10 (12) ◽  
pp. 1749 ◽  
Author(s):  
Junli Xu ◽  
Donghui Shangguan ◽  
Jian Wang

In this study, contour lines from topographic maps at a 1:100,000 scale (mapped in 1968), Landsat MSS/TM/OLI images, ASTER images and SPOT 6-7 stereo image pairs were used to study changes in glacier length, area and surface elevation. We summarize the results in the following three conclusions: (1) During the period from 1973 to 2013, glaciers retreated by 412 ± 32 m at a mean retreat rate of 10.3 ± 0.8 m·year−1, and the relative retreat was 5.6 ± 0.4%. The glacier area shrank by 7.5 ± 3.4%, a larger relative change than that in glacier length. In the periods of 1968–2000, 2000–2005 and 2000–2013, the glacier surface elevation changes were −7.7 ± 1.4 m (−0.24 ± 0.04 m·year−1), −1.9 ± 1.5 m (−0.38 ± 0.25 m·year−1) and −5.0 ± 1.4 m (−0.38 ± 0.11 m·year−1), respectively. The changes in glacier area and thickness exhibited similar trends, both showing a markedly faster reduction after 2000. (2) Eleven glaciers were identified as surging glaciers. Changes in mass balance were stronger in surging glaciers than in non-surging glaciers between 1968 and 2013, whereas changes in area were weaker in surging glaciers than in non-surging glaciers. (3) Increasing temperature was the major cause of glacier thickness reduction and area shrinkage. The increase in precipitation inhibited glacial ablation to a certain extent, but it did not reverse the shrinkage in glacier area or the reduction in glacier thickness.
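The annual rates quoted in conclusion (1) follow from dividing each total change by the length of its period. The sketch below reproduces two of them from the figures in the abstract; scaling the uncertainty by the same divisor is an assumption for illustration, not a statement of how the authors propagated errors, and `mean_rate` is a hypothetical helper.

```python
def mean_rate(change, uncertainty, years):
    """Return (rate, rate_uncertainty), assuming both scale linearly with the period."""
    return change / years, uncertainty / years

# Length retreat of 412 ± 32 m over 1973-2013.
retreat_rate, retreat_err = mean_rate(412.0, 32.0, 2013 - 1973)
print(f"{retreat_rate:.1f} ± {retreat_err:.1f} m/year")

# Surface elevation change of -7.7 ± 1.4 m over 1968-2000.
thinning_rate, thinning_err = mean_rate(-7.7, 1.4, 2000 - 1968)
print(f"{thinning_rate:.2f} ± {thinning_err:.2f} m/year")
```

Under this assumption the two calls give 10.3 ± 0.8 m/year and −0.24 ± 0.04 m/year, matching the values reported for those periods.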

