SVA-SSD: saliency visual attention single shot detector for building detection in low contrast high-resolution satellite images

2021 ◽  
Vol 7 ◽  
pp. e772
Author(s):  
Ahmed I. Shahin ◽  
Sultan Almotairi

Building detection in high-resolution satellite images has received great attention, as it is important for increasing the accuracy of urban planning. Building boundary detection in desert environments is a real challenge because of the low contrast of images captured over such scenes. Traditional computer vision algorithms for building boundary detection lack scalability, robustness, and accuracy. On the other hand, deep learning detection algorithms have not been applied to such low-contrast satellite images, so there is a real need to employ deep learning algorithms for building detection in low-contrast high-resolution images. In this paper, we propose a novel building detection method based on a single-shot multi-box detector (SSD). We extend the state-of-the-art SSD detection algorithm in three ways. First, we propose data-augmentation techniques to overcome the low-contrast appearance of the images. Second, we develop the SSD backbone using a novel saliency visual attention mechanism; moreover, we investigate the performance of several pre-trained networks and several fusion functions to strengthen the SSD backbone. Third, we optimize the anchor-box sizes used in the detection stage to increase the performance of the SSD head. For our experiments, we prepared a new dataset of 3878 buildings inside Riyadh City, Saudi Arabia. We compared our proposed approach with other approaches in the literature. The proposed system achieved the highest average precision, recall, F1-score, and IoU performance, and it achieved a fast average prediction time with the lowest variance on our testing set. Our experimental results are very promising and can be generalized to other object detection tasks in low-contrast images.
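The abstract names three concrete ingredients: augmentation for low contrast, a saliency visual attention mechanism fused into the SSD backbone, and anchor-box size optimization. As a rough illustration of the second ingredient only, the PyTorch sketch below gates a backbone feature map with a learned single-channel saliency map; the module name, the fusion choices, and the shapes are assumptions for illustration and do not reproduce the authors' SVA-SSD implementation.

```python
# Illustrative sketch only: one way a saliency map could gate an SSD backbone
# feature map. Names and fusion choices are assumptions, not the authors' code.
import torch
import torch.nn as nn

class SaliencyAttention(nn.Module):
    """Predicts a single-channel saliency map and fuses it with the features."""
    def __init__(self, in_channels, fusion="multiply"):
        super().__init__()
        self.saliency_head = nn.Sequential(
            nn.Conv2d(in_channels, in_channels // 4, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels // 4, 1, kernel_size=1),
            nn.Sigmoid(),                          # saliency values in [0, 1]
        )
        self.fusion = fusion

    def forward(self, feats):
        sal = self.saliency_head(feats)            # (N, 1, H, W)
        if self.fusion == "multiply":              # emphasize salient regions
            return feats * sal
        if self.fusion == "add":                   # residual-style fusion
            return feats + feats * sal
        raise ValueError(f"unknown fusion: {self.fusion}")

if __name__ == "__main__":
    feats = torch.randn(2, 512, 38, 38)            # e.g. a VGG16 conv4_3 map
    fused = SaliencyAttention(512, fusion="multiply")(feats)
    print(fused.shape)                             # torch.Size([2, 512, 38, 38])
```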

Symmetry ◽  
2018 ◽  
Vol 11 (1) ◽  
pp. 3 ◽  
Author(s):  
Muhammad Aamir ◽  
Yi-Fei Pu ◽  
Ziaur Rahman ◽  
Muhammad Tahir ◽  
Hamad Naeem ◽  
...  

Building detection in satellite images has been considered an essential field of research in remote sensing and computer vision. Numerous techniques and algorithms are currently used to perform building detection, and different algorithms have been proposed to extract building objects from high-resolution satellite images with standard contrast. However, detecting buildings in low-contrast satellite images, with findings symmetrical to those of past studies on normal-contrast images, is a challenging task that may play an integral role in a wide range of applications. As this problem has received significant attention in recent years, this manuscript proposes a methodology to detect buildings from low-contrast satellite images. To enhance the visualization of the satellite images, the contrast of an image is first optimized to represent all of its information using singular value decomposition (SVD) based on the discrete wavelet transform (DWT). Second, a line-segment detection scheme is applied to accurately detect building line segments. Third, the detected line segments are hierarchically grouped to recognize the relationships among them, and the complete contours of the buildings are obtained to form candidate rectangular buildings. In this paper, the results of the method above are compared with existing approaches based on high-resolution images with reasonable contrast. The proposed method achieves high performance and thus yields more diversified and insightful results than conventional techniques.
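The contrast-optimization step combines SVD with the DWT. A commonly used formulation of that idea is sketched below, assuming the usual recipe of equalizing the image, decomposing both versions with a one-level DWT, and rescaling the singular values of the low-frequency (LL) subband; the paper's exact variant may differ.

```python
# Hedged sketch of DWT + SVD contrast enhancement: equalize the image, rescale
# the LL subband's singular values, and reconstruct. A common formulation,
# assumed here; not necessarily the paper's exact variant.
import cv2
import numpy as np
import pywt

def dwt_svd_enhance(gray, wavelet="db1"):
    gray = gray.astype(np.float64)
    LL, (LH, HL, HH) = pywt.dwt2(gray, wavelet)          # one-level DWT

    # Histogram-equalized reference for the low-frequency band.
    eq = cv2.equalizeHist(gray.astype(np.uint8)).astype(np.float64)
    LL_eq, _ = pywt.dwt2(eq, wavelet)

    # Scale LL singular values by the ratio of the largest singular values.
    U, s, Vt = np.linalg.svd(LL, full_matrices=False)
    _, s_eq, _ = np.linalg.svd(LL_eq, full_matrices=False)
    xi = s_eq.max() / s.max()
    LL_new = U @ np.diag(xi * s) @ Vt

    out = pywt.idwt2((LL_new, (LH, HL, HH)), wavelet)    # reconstruct image
    return np.clip(out, 0, 255).astype(np.uint8)

# usage: enhanced = dwt_svd_enhance(cv2.imread("tile.png", cv2.IMREAD_GRAYSCALE))
```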


2020 ◽  
Vol 12 (3) ◽  
pp. 458 ◽  
Author(s):  
Ugur Alganci ◽  
Mehmet Soydas ◽  
Elif Sertel

Object detection from satellite images has been a challenging problem for many years. With the development of effective deep learning algorithms and advances in hardware systems, higher accuracies have been achieved in the detection of various objects from very high-resolution (VHR) satellite images. This article provides a comparative evaluation of state-of-the-art convolutional neural network (CNN)-based object detection models, namely Faster R-CNN, the Single Shot Multi-box Detector (SSD), and You Only Look Once-v3 (YOLO-v3), to cope with the limited number of labeled data and to automatically detect airplanes in VHR satellite images. Data augmentation with rotation, rescaling, and cropping was applied to the training images to artificially increase the amount of training data from satellite images. Moreover, a non-maximum suppression (NMS) algorithm was introduced at the end of the SSD and YOLO-v3 flows to eliminate multiple detections of the same object in overlapping areas. The trained networks were applied to five independent VHR test images covering airports and their surroundings to evaluate their performance objectively. Accuracy assessment results for the test regions showed that the Faster R-CNN architecture provided the highest accuracy according to the F1-score, average precision (AP) metrics, and visual inspection of the results. YOLO-v3 ranked second, with slightly lower performance but a balanced trade-off between accuracy and speed. The SSD provided the lowest detection performance but was better at object localization. The results were also evaluated with respect to object size, which showed that large and medium-sized airplanes were detected with higher accuracy.
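The NMS step appended to the SSD and YOLO-v3 outputs is a standard greedy procedure; a generic NumPy version is sketched below (the box format and threshold are illustrative, and this is not the authors' code).

```python
# Standard greedy non-maximum suppression (NMS); a generic implementation,
# not the study's code. Boxes are [x1, y1, x2, y2].
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Returns indices of boxes kept after suppressing overlapping detections."""
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]                 # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the top-scoring box with the remaining boxes.
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_threshold]    # drop overlapping detections
    return keep
```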


Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4938
Author(s):  
Min Li ◽  
Zhijie Zhang ◽  
Liping Lei ◽  
Xiaofan Wang ◽  
Xudong Guo

Agricultural greenhouses (AGs) are an important facility for the development of modern agriculture, and detecting AGs accurately and effectively is a necessity for its strategic planning. With the advent of deep learning algorithms, various convolutional neural network (CNN)-based models have been proposed for object detection in high-spatial-resolution images. In this paper, we conducted a comparative assessment of three well-established CNN-based models, Faster R-CNN, You Only Look Once-v3 (YOLO v3), and the Single Shot Multi-Box Detector (SSD), for detecting AGs. Transfer learning and fine-tuning approaches were implemented to train the models. Accuracy and efficiency evaluation results show that YOLO v3 achieved the best performance according to the mean average precision (mAP) and frames per second (FPS) metrics and visual inspection. The SSD demonstrated an advantage in detection speed, with an FPS twice that of Faster R-CNN, although their mAP values were close on the test set. The trained models were also applied to two independent test sets, which showed that these models have a degree of transferability and that higher-resolution images are significant for accuracy improvement. Our study suggests that YOLO v3, with its superiority in both accuracy and computational efficiency, can be applied operationally to detect AGs using high-resolution satellite images.
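The transfer learning and fine-tuning strategy mentioned here follows a routine recipe in modern detection frameworks. The sketch below shows a typical torchvision setup for fine-tuning a COCO-pretrained Faster R-CNN on a single greenhouse class; the learning rate and class choices are assumptions, not the study's exact configuration.

```python
# Illustrative transfer-learning setup: fine-tune a COCO-pretrained Faster R-CNN
# on one "greenhouse" class with torchvision. Hyperparameters are assumptions,
# not the study's exact configuration.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 2                                    # background + greenhouse
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# Replace the COCO box predictor with a fresh head for our classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Fine-tune all weights with a small learning rate (a typical recipe).
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9,
                            weight_decay=5e-4)
# Per training batch: losses = model(images, targets)
#                     sum(losses.values()).backward(); optimizer.step()
```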


2019 ◽  
Vol 11 (4) ◽  
pp. 403 ◽  
Author(s):  
Weijia Li ◽  
Conghui He ◽  
Jiarui Fang ◽  
Juepeng Zheng ◽  
Haohuan Fu ◽  
...  

Automatic extraction of building footprints from high-resolution satellite imagery has become an important and challenging research issue that is receiving growing attention. Many recent studies have explored different deep learning-based semantic segmentation methods for improving the accuracy of building extraction. Although they record substantial land cover and land use information (e.g., buildings, roads, and water), public geographic information system (GIS) map datasets have rarely been utilized to improve building extraction results in existing studies. In this research, we propose a U-Net-based semantic segmentation method for the extraction of building footprints from high-resolution multispectral satellite images using the SpaceNet building dataset provided in the DeepGlobe Satellite Challenge of the IEEE Conference on Computer Vision and Pattern Recognition 2018 (CVPR 2018). We explore the potential of multiple public GIS map datasets (OpenStreetMap, Google Maps, and MapWorld) through integration with WorldView-3 satellite datasets in four cities (Las Vegas, Paris, Shanghai, and Khartoum). Several strategies are designed and combined with the U-Net-based semantic segmentation model, including data augmentation, post-processing, and integration of the GIS map data and satellite images. The proposed method achieves a total F1-score of 0.704, which is an improvement of 1.1% to 12.5% over the top three solutions in the SpaceNet Building Detection Competition and of 3.0% to 9.2% over the standard U-Net-based method. Moreover, the effect of each proposed strategy and the possible reasons for the building footprint extraction results are analyzed in detail, considering the actual conditions of the four cities.
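One of the integration strategies is feeding GIS map data and satellite images to the model together. Assuming the simplest form of that idea, the sketch below stacks a rasterized GIS layer with the multispectral bands as an extra input channel and computes a binary F1-score; note that the competition's F1 is evaluated per building footprint with an IoU threshold, whereas this helper is pixel-wise, and all array names are illustrative.

```python
# Sketch of two pieces the abstract mentions: stacking a rasterized GIS layer
# with the multispectral bands as an extra input channel, and a binary F1-score.
# Array shapes and names are illustrative assumptions.
import numpy as np

def stack_inputs(ms_bands, gis_mask):
    """ms_bands: (H, W, C) multispectral image; gis_mask: (H, W) rasterized
    GIS building/road layer in {0, 1}. Returns an (H, W, C+1) model input."""
    return np.concatenate([ms_bands, gis_mask[..., None]], axis=-1)

def f1_score(pred_mask, true_mask, eps=1e-8):
    """Pixel-wise F1 between binary prediction and ground-truth masks."""
    tp = np.sum((pred_mask == 1) & (true_mask == 1))
    fp = np.sum((pred_mask == 1) & (true_mask == 0))
    fn = np.sum((pred_mask == 0) & (true_mask == 1))
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return 2 * precision * recall / (precision + recall + eps)
```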

