Extracting Rectified Building Footprints from Traditional Orthophotos: A New Workflow

Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 207
Author(s):  
Qi Chen ◽  
Yuanyi Zhang ◽  
Xinyuan Li ◽  
Pengjie Tao

Deep learning techniques such as convolutional neural networks have greatly improved the performance of building segmentation from remote sensing images. However, the images used for building segmentation are often traditional orthophotos, in which relief displacement causes non-negligible misalignment between the roof outline and the footprint of a building; such misalignment poses considerable challenges for extracting accurate building footprints, especially for high-rise buildings. To alleviate this problem, we propose a new workflow for generating rectified building footprints from traditional orthophotos. We first use facade labels, which can be prepared efficiently at low cost, along with roof labels to train a semantic segmentation network. The trained network, which employs a state-of-the-art version of EfficientNet as its backbone, then extracts the roof and facade segments of buildings from the input image. Finally, after clustering the classified pixels into instance-level building objects and tracing out the roof outlines, an energy function drives each roof outline to maximally align with the corresponding building footprint, yielding the rectified footprints. Experiments on aerial orthophotos covering a high-density residential area in Shanghai demonstrate that the proposed workflow generates markedly more accurate building footprints than the baseline methods, especially for high-rise buildings.
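
The rectification step can be pictured as a search for the displacement that best explains the detected facade evidence. Below is a minimal sketch of that idea, assuming binary roof and facade masks from the segmentation network; the exhaustive shift search and the band-coverage score are illustrative assumptions, not the paper's exact energy function.

```python
import numpy as np

def rectify_footprint(roof_mask, facade_mask, max_shift=40):
    """Translate the roof mask toward the footprint by searching for the
    displacement whose swept band best covers the detected facade pixels."""
    best_shift, best_energy = (0, 0), -np.inf
    for dy in range(-max_shift, max_shift + 1, 2):
        for dx in range(-max_shift, max_shift + 1, 2):
            shifted = np.roll(np.roll(roof_mask, dy, axis=0), dx, axis=1)
            # Band swept by the roof under this shift; for the correct
            # displacement it should coincide with the visible facade.
            band = shifted & ~roof_mask
            if band.sum() == 0:
                continue
            # Energy: fraction of the swept band explained by facade pixels.
            energy = (band & facade_mask).sum() / band.sum()
            if energy > best_energy:
                best_energy, best_shift = energy, (dy, dx)
    dy, dx = best_shift
    # The rectified footprint is the roof mask translated by the best shift.
    return np.roll(np.roll(roof_mask, dy, axis=0), dx, axis=1)
```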

2020 ◽  
Vol 12 (22) ◽  
pp. 3836
Author(s):  
Carlos García Rodríguez ◽  
Jordi Vitrià ◽  
Oscar Mora

In recent years, various deep learning techniques have been applied to segment aerial and satellite images. Nevertheless, state-of-the-art techniques for land cover segmentation still do not provide results accurate enough for real applications. This is a problem for institutions and companies that want to replace time-consuming and exhausting human work with AI technology. In this work, we propose a method that combines deep learning with a human-in-the-loop strategy to achieve expert-level results at low cost. We use one neural network to segment the images; in parallel, a second network measures the uncertainty of the predicted pixels. Finally, we combine these networks with a human-in-the-loop approach to produce predictions as correct as those developed by human photointerpreters. Applying this methodology, we show that the accuracy of land cover segmentation can be increased while human intervention is decreased.
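
A minimal sketch of the routing logic implied above, assuming a trained segmentation model and a separate trained uncertainty model; the predict methods and the tile-level threshold are hypothetical placeholders, not the authors' API.

```python
def segment_with_review(image_tiles, seg_model, unc_model, threshold=0.2):
    """Accept confident tiles automatically; queue uncertain tiles for a
    human photointerpreter, concentrating manual effort where it matters."""
    accepted, review_queue = {}, []
    for tile_id, tile in image_tiles.items():
        labels = seg_model.predict(tile)        # per-pixel class map
        uncertainty = unc_model.predict(tile)   # per-pixel uncertainty map
        if uncertainty.mean() <= threshold:
            accepted[tile_id] = labels          # no human cost incurred
        else:
            review_queue.append((tile_id, labels, uncertainty))
    return accepted, review_queue
```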


Sensors ◽  
2019 ◽  
Vol 19 (8) ◽  
pp. 1795
Author(s):  
Xiao Lin ◽  
Dalila Sánchez-Escobedo ◽  
Josep R. Casas ◽  
Montse Pardàs

Semantic segmentation and depth estimation are two important tasks in computer vision, and many methods have been developed to tackle them. Commonly, these two tasks are addressed independently, but recently the idea of merging them into a single framework has been studied, under the assumption that two highly correlated tasks can benefit each other and improve estimation accuracy. In this paper, depth estimation and semantic segmentation are jointly addressed from a single RGB input image with a unified convolutional neural network. We analyze two different architectures to evaluate which features are most relevant when shared by the two tasks and which should be kept separate to achieve mutual improvement. Our approaches are evaluated under two scenarios designed to compare our results against single-task and multi-task methods. Qualitative and quantitative experiments demonstrate that our methodology outperforms state-of-the-art single-task approaches while obtaining competitive results compared with other multi-task methods.
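
A minimal PyTorch sketch of the shared-encoder pattern discussed above: one encoder feeds two task-specific decoders, one per task. The layer sizes and the choice of which layers to share are illustrative assumptions, not the paper's exact architectures.

```python
import torch
import torch.nn as nn

class JointSegDepthNet(nn.Module):
    def __init__(self, num_classes=21):
        super().__init__()
        # Shared trunk: both tasks reuse the same low-level features.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Separate heads keep task-specific features apart.
        self.seg_head = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, num_classes, 4, stride=2, padding=1),
        )
        self.depth_head = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        feats = self.encoder(x)                  # shared representation
        return self.seg_head(feats), self.depth_head(feats)
```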


2021 ◽  
Vol 13 (14) ◽  
pp. 2743
Author(s):  
Kun Sun ◽  
Yi Liang ◽  
Xiaorui Ma ◽  
Yuanyuan Huai ◽  
Mengdao Xing

Traditional constant false alarm rate (CFAR) based ship target detection methods do not work well in complex conditions, such as multi-scale situations or inshore ship detection. With the development of deep learning, methods based on convolutional neural networks (CNN) have been applied to such problems and have demonstrated good performance. However, compared with optical datasets, SAR datasets contain far fewer samples, which limits detection performance. Moreover, most state-of-the-art CNN-based ship target detectors focus on detection performance and ignore computational complexity. To address these issues, this paper proposes a lightweight densely connected sparsely activated detector (DSDet) for ship target detection. First, a style-embedded ship sample data augmentation network (SEA) is constructed to augment the dataset. Then, a lightweight backbone based on a densely connected sparsely activated network (DSNet) is constructed, which balances performance against computational complexity. Furthermore, based on the proposed backbone, a low-cost one-stage anchor-free detector is presented. Extensive experiments demonstrate that the proposed data augmentation approach can create hard SAR samples artificially and that using it effectively improves detection accuracy. The experiments also show that the proposed detector outperforms state-of-the-art methods with the fewest parameters (0.7 M) and the lowest computational complexity (3.7 GFLOPs).
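
As a rough illustration of the densely connected design behind such a lightweight backbone, here is a minimal PyTorch sketch of a DenseNet-style block; the paper's sparse-activation mechanism is not reproduced here, and the growth rate and depth are assumptions.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer sees the concatenation of all earlier feature maps,
    which reuses features and keeps the parameter count low."""
    def __init__(self, in_ch, growth=16, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, growth, 3, padding=1, bias=False),
            ))
            ch += growth

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)  # in_ch + n_layers * growth channels
```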


2021 ◽  
Vol 13 (19) ◽  
pp. 3836
Author(s):  
Clément Dechesne ◽  
Pierre Lassalle ◽  
Sébastien Lefèvre

In recent years, numerous deep learning techniques have been proposed to tackle the semantic segmentation of aerial and satellite images; they top the leaderboards of the main scientific contests and represent the current state of the art. Nevertheless, despite their promising results, these techniques are still unable to provide results with the level of accuracy sought in real applications, i.e., in operational settings. It is therefore necessary to qualify these segmentation results and to estimate the uncertainty introduced by a deep network. In this work, we address uncertainty estimation in semantic segmentation. To do so, we rely on a Bayesian deep learning method based on Monte Carlo Dropout, which allows us to derive uncertainty metrics along with the semantic segmentation. Built on the widespread U-Net architecture, our model achieves semantic segmentation with high accuracy on several state-of-the-art datasets. More importantly, uncertainty maps are also derived from our model. They allow for a sounder qualitative evaluation of the segmentation results and also provide valuable information for improving the reference databases.
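
Monte Carlo Dropout amounts to keeping dropout active at inference and averaging several stochastic forward passes. A minimal PyTorch sketch, assuming a segmentation model that contains nn.Dropout layers; the sample count and the use of predictive entropy as the uncertainty metric are common choices, not necessarily the paper's exact ones.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def mc_dropout_predict(model, image, n_samples=20):
    model.eval()
    # Re-enable dropout layers only, keeping batch norm in eval mode.
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()
    probs = torch.stack([
        torch.softmax(model(image), dim=1) for _ in range(n_samples)
    ]).mean(dim=0)                               # mean class probabilities
    # Predictive entropy: high where the sampled networks disagree.
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1)
    return probs.argmax(dim=1), entropy          # segmentation, uncertainty map
```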


Author(s):  
Aparna

A naturalist is someone who studies the patterns of nature and identifies the different kingdoms of flora and fauna within it. Being able to identify the flora and fauna around us often leads to an interest in protecting wild species, and collecting and sharing information about the species we see on our travels is very useful for conservation groups such as the NCC. Deep-learning-based techniques and methods are becoming popular in digital naturalist studies, as their performance is superior in image analysis fields such as object detection, image classification, and semantic segmentation. Deep-learning techniques have achieved state-of-the-art performance for automatic segmentation in digital naturalist work through multi-model image sensing. Our task as naturalists has grown widely within the field of natural history: it has expanded from identification to preservation. Beyond identifying flora and fauna, learning about their habits, habitats, and groupings helps provide services for their protection.


Atmosphere ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 772
Author(s):  
Alexandra Duminil ◽  
Jean-Philippe Tarel ◽  
Roland Brémond

From an analysis of the priors used in state-of-the-art algorithms for single-image defogging, a new prior is proposed to obtain better atmospheric veil removal. Our hypothesis is based on a physical model and the observation that fog appears denser near the horizon than close to the camera. This leads to stronger restoration where the fog is deeper, for a more natural rendering. For this purpose, the Naka–Rushton function is used to modulate the atmospheric veil according to empirical observations on synthetic foggy images, with its parameters set from features of the input image. The method also prevents over-restoration and thus keeps the sky free of artifacts and noise. The algorithm generalizes to different kinds of fog, airborne particles, and illumination conditions, and it is extended to nighttime and underwater images by computing the atmospheric veil on each color channel. Qualitative and quantitative evaluations show the benefit of the proposed algorithm. The quantitative evaluation demonstrates its efficiency on four databases with different types of fog, evidencing a breadth of generalization that contrasts with most currently available deep learning techniques.
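
The Naka–Rushton function is a saturating response of the form R(v) = v^n / (v^n + sigma^n). Below is a minimal NumPy sketch of how it might modulate a veil estimate before inverting the standard fog model; the crude dark-channel-style veil estimate and the parameter values are assumptions, not the paper's feature-driven settings.

```python
import numpy as np

def naka_rushton(v, n=2.0, sigma=0.5):
    """Saturating response in [0, 1): thin veils are restored gently,
    dense veils near the horizon are restored strongly."""
    return v**n / (v**n + sigma**n)

def defog(image, airlight=1.0):
    """image: float array in [0, 1] with shape (H, W, 3)."""
    # Crude per-pixel veil estimate (minimum over color channels),
    # standing in for the image-driven estimate of the paper.
    veil = airlight * naka_rushton(image.min(axis=2, keepdims=True))
    # Invert the fog model I = J * (1 - V/A) + V for the scene radiance J.
    transmission = np.clip(1.0 - veil / airlight, 0.1, 1.0)
    return np.clip((image - veil) / transmission, 0.0, 1.0)
```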


2021 ◽  
Vol 13 (13) ◽  
pp. 2578
Author(s):  
Samir Touzani ◽  
Jessica Granderson

Advances in machine learning and computer vision, combined with increased access to unstructured data (e.g., images and text), have created an opportunity for the automated extraction of building characteristics, cost-effectively and at scale. These characteristics are relevant to a variety of urban and energy applications, yet they are time-consuming and costly to acquire with today’s manual methods. Several recent studies have shown that, in comparison with more traditional methods based on feature engineering, an end-to-end learning approach based on deep learning algorithms significantly improves the accuracy of automatic building footprint extraction from remote sensing images. However, these studies used limited benchmark datasets that had been carefully curated and labeled. How well the accuracy of these deep learning approaches holds up when using less curated training data has not received enough attention. The aim of this work is to leverage openly available data to automatically generate a larger training dataset with more variability in terms of regions and types of cities, which can be used to build more accurate deep learning models. In contrast to most benchmark datasets, the gathered data have not been manually curated, so the training dataset is not perfectly clean: the remote sensing images do not always exactly match the ground-truth building footprints. A workflow that includes data pre-processing, deep learning semantic segmentation modeling, and results post-processing is introduced and applied to a dataset that includes remote sensing images of 15 cities and five counties from various regions of the USA, covering 8,607,677 buildings. The accuracy of the proposed approach was measured on an out-of-sample testing dataset corresponding to 364,000 buildings from three USA cities. The results compare favorably to those obtained from Microsoft’s recently released US building footprint dataset.
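
As a small illustration of the post-processing stage of such a workflow, here is a sketch that turns a predicted building mask into simplified footprint polygons with OpenCV; the area and simplification thresholds are illustrative assumptions.

```python
import cv2
import numpy as np

def mask_to_footprints(mask, min_area=30.0, eps_frac=0.01):
    """mask: uint8 array, 1 where the model predicts 'building'."""
    contours, _ = cv2.findContours(
        mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    footprints = []
    for c in contours:
        if cv2.contourArea(c) < min_area:
            continue  # drop specks, e.g., noise from uncurated labels
        # Douglas-Peucker simplification keeps footprints polygon-like.
        eps = eps_frac * cv2.arcLength(c, True)
        footprints.append(cv2.approxPolyDP(c, eps, True))
    return footprints
```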


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5460
Author(s):  
Lei Lang ◽  
Ke Xu ◽  
Qian Zhang ◽  
Dong Wang

Deep learning-based object detection in remote sensing images is an important yet challenging task due to a series of difficulties, such as complex geometric scenes, dense targets, and large variance in object distributions and scales. Moreover, algorithm designers have to trade off model complexity against accuracy to meet real-world deployment requirements. To deal with these challenges, we propose a lightweight YOLO-like object detector that can detect objects in remote sensing images with high speed and high accuracy. The detector is constructed with efficient channel attention layers to improve channel information sensitivity. Differential evolution is also employed to automatically find optimal anchor configurations, addressing the large variance in object scales. Comprehensive experimental results show that the proposed network outperforms state-of-the-art lightweight models by 5.13% and 3.58% in accuracy on the RSOD and DIOR datasets, respectively. The model deployed on an NVIDIA Jetson Xavier NX embedded board achieves a detection speed of 58 FPS with less than 10 W of power consumption, which makes the proposed detector well suited to low-cost, low-power remote sensing applications.
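
Efficient channel attention re-weights channels with a cheap 1D convolution over globally pooled descriptors instead of a fully connected bottleneck. A minimal PyTorch sketch of an ECA-style layer follows; the kernel size is an illustrative assumption.

```python
import torch
import torch.nn as nn

class ECALayer(nn.Module):
    """Channel attention via a 1D convolution across channel descriptors."""
    def __init__(self, k_size=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size,
                              padding=k_size // 2, bias=False)

    def forward(self, x):                         # x: (B, C, H, W)
        y = x.mean(dim=(2, 3))                    # global average pool -> (B, C)
        # Local cross-channel interaction, far cheaper than an FC bottleneck.
        y = self.conv(y.unsqueeze(1)).squeeze(1)  # (B, C)
        return x * torch.sigmoid(y)[:, :, None, None]
```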


Author(s):  
Lixiang Ru ◽  
Bo Du ◽  
Chen Wu

Current weakly-supervised semantic segmentation (WSSS) methods with image-level labels mainly adopt class activation maps (CAM) to generate the initial pseudo-labels. However, CAM usually identifies only the most discriminative object extents, because the network does not need to discover the whole object to recognize image-level labels. To tackle this problem, we propose to simultaneously learn image-level labels and local visual word labels. Specifically, in each forward pass, the feature maps of the input image are encoded into visual words with a learnable codebook. By forcing the network to classify the encoded fine-grained visual words, the generated CAM covers more semantic regions. We also propose a hybrid spatial pyramid pooling module that preserves both the local maximum and global average values of the feature maps, so that more object details and less background are considered. Based on the proposed methods, we conducted experiments on the PASCAL VOC 2012 dataset. Our method achieved 67.2% mIoU on the val set and 67.3% mIoU on the test set, outperforming recent state-of-the-art methods.
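
A minimal PyTorch sketch of the hybrid pooling idea above: mix local maxima (object details) with the global average (context) when collapsing feature maps for classification. The grid size and the equal mixing weight are illustrative assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

class HybridPyramidPool(nn.Module):
    def __init__(self, grid=2):
        super().__init__()
        self.local_max = nn.AdaptiveMaxPool2d(grid)  # keeps local peaks
        self.global_avg = nn.AdaptiveAvgPool2d(1)    # keeps global context

    def forward(self, x):                            # x: (B, C, H, W)
        mx = self.local_max(x).mean(dim=(2, 3))      # average of local maxima
        avg = self.global_avg(x).flatten(1)          # global average values
        return 0.5 * (mx + avg)                      # (B, C) pooled descriptor
```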

