Model Specialization for the Use of ESRGAN on Satellite and Airborne Imagery

2021 ◽  
Vol 13 (20) ◽  
pp. 4044
Author(s):  
Étienne Clabaut ◽  
Myriam Lemelin ◽  
Mickaël Germain ◽  
Yacine Bouroubi ◽  
Tony St-Pierre

Training a deep learning model requires highly variable data to permit reasonable generalization. If the variability in the data about to be processed is low, the interest in obtaining this generalization seems limited. Yet, it could prove interesting to specialize the model with respect to a particular theme. The use of enhanced super-resolution generative adversarial networks (ESRGAN), a specific type of deep learning architecture, allows the spatial resolution of remote sensing images to be increased by “hallucinating” non-existent details. In this study, we show that ESRGAN creates better-quality images when trained on thematically classified images than when trained on a wide variety of examples. All things being equal, we further show that the algorithm performs better on some themes than it does on others. Texture analysis shows that these performances are correlated with the inverse difference moment and entropy of the images.
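
The two texture measures named above are usually computed from a gray-level co-occurrence matrix (GLCM). The following is a minimal sketch in plain Python, assuming Haralick-style definitions, a single horizontal pixel offset, and integer gray levels that are already quantized. The abstract does not state which offsets or quantization the authors used, and the function names are illustrative.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Normalized co-occurrence matrix as a dict {(i, j): probability}.

    `image` is a list of rows of integer gray levels; (dx, dy) is the
    pixel offset (an assumption here: one step to the right).
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image is too small for the chosen offset")
    return {pair: n / total for pair, n in counts.items()}


def inverse_difference_moment(p):
    """High when neighboring pixels have similar gray levels."""
    return sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())


def entropy(p):
    """Shannon entropy (bits) of the co-occurrence distribution."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)
```

A perfectly flat patch gives an inverse difference moment of 1 and an entropy of 0, while a patch with many different neighboring gray-level pairs pushes the entropy up and the inverse difference moment down.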

2021 ◽  
Vol 13 (6) ◽  
pp. 1104
Author(s):  
Yuanfu Gong ◽  
Puyun Liao ◽  
Xiaodong Zhang ◽  
Lifei Zhang ◽  
Guanzhou Chen ◽  
...  

Generative adversarial networks (GANs) have been widely applied to super-resolution reconstruction (SRR), which turns low-resolution (LR) images into high-resolution (HR) ones. However, because these methods recover high-frequency information from what they have observed in other images, they tend to produce artifacts when processing unfamiliar images. Optical satellite remote sensing images depict far more complicated scenes than natural images. Therefore, applying previous networks to remote sensing images, especially mid-resolution ones, leads to unstable convergence and thus unpleasing artifacts. In this paper, we propose Enlighten-GAN for SRR tasks on large-size optical mid-resolution remote sensing images. Specifically, we design the enlighten blocks to guide the network to converge to a reliable point, and introduce the Self-Supervised Hierarchical Perceptual Loss to attain a performance improvement surpassing that of other loss functions. Furthermore, because memory is limited, large-scale images need to be cropped into patches that pass through the network separately. To merge the reconstructed patches into a whole, we employ the internal inconsistency loss and a cropping-and-clipping strategy to avoid seam lines. Experimental results confirm that Enlighten-GAN outperforms the state-of-the-art methods in terms of the gradient similarity metric (GSM) on mid-resolution Sentinel-2 remote sensing images.
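
The abstract does not describe the cropping-and-clipping strategy in detail. One common way to avoid seam lines, sketched here as an assumption rather than as the authors' method, is to crop overlapping patches, run each through the network, and keep only the central part of each result. In this sketch `fn` is a hypothetical stand-in for the network and returns a patch of the same size; for a real super-resolution model the output coordinates would be multiplied by the scale factor.

```python
def tile_starts(length, patch, margin):
    """Start offsets of patches that cover [0, length) with overlap."""
    if patch >= length:
        return [0]
    step = patch - 2 * margin
    if step <= 0:
        raise ValueError("patch must be larger than 2 * margin")
    starts = list(range(0, length - patch, step))
    starts.append(length - patch)  # last patch sits flush with the border
    return starts


def process_tiled(image, fn, patch=6, margin=1):
    """Apply `fn` patch by patch and merge the clipped results.

    `image` is a list of rows; `fn` maps a patch to a same-size patch.
    """
    rows, cols = len(image), len(image[0])
    out = [[None] * cols for _ in range(rows)]
    for r0 in tile_starts(rows, patch, margin):
        for c0 in tile_starts(cols, patch, margin):
            r1, c1 = min(r0 + patch, rows), min(c0 + patch, cols)
            tile = [row[c0:c1] for row in image[r0:r1]]
            result = fn(tile)
            # Clip the margin, except on sides that touch the image border.
            top = 0 if r0 == 0 else margin
            left = 0 if c0 == 0 else margin
            bottom = (r1 - r0) if r1 == rows else (r1 - r0) - margin
            right = (c1 - c0) if c1 == cols else (c1 - c0) - margin
            for r in range(top, bottom):
                for c in range(left, right):
                    out[r0 + r][c0 + c] = result[r][c]
    return out
```

Because only patch interiors are kept, values that the network computes near a patch edge, where it lacks context, never reach the merged image except along the true image border.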


Author(s):  
Wangyao Shen ◽  
Yunping Chen ◽  
Yuanlei Cheng ◽  
Kangzhuo Yang ◽  
Xiang Guo ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Bo Pan ◽  
Wei Zheng

Emotion recognition plays an important role in the field of human-computer interaction (HCI). Automatic emotion recognition based on EEG is an important topic in brain-computer interface (BCI) applications. Currently, deep learning has been widely used in the field of EEG emotion recognition and has achieved remarkable results. However, due to the cost of data collection, most EEG datasets contain only a small amount of EEG data, and the sample categories in these datasets are unbalanced. These problems make it difficult for a deep learning model to predict the emotional state. In this paper, we propose a new sample generation method using generative adversarial networks to solve the problem of EEG sample shortage and sample category imbalance. In experiments, we explore the performance of emotion recognition with the frequency band correlation and frequency band separation computational models before and after data augmentation on standard EEG-based emotion datasets. Our experimental results show that data augmentation with generative adversarial networks can effectively improve the performance of emotion recognition based on the deep learning model. We also find that the frequency band correlation deep learning model is more conducive to emotion recognition.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Jian Huang ◽  
Shanhui Liu ◽  
Yutian Tang ◽  
Xiushan Zhang

With the continuous development of deep learning in computer vision, semantic segmentation technology is constantly employed for processing remote sensing images. For instance, it is a key technology for automatically marking important objects such as ships or port land in remote sensing images of port areas. However, existing supervised semantic segmentation models based on deep learning require a large number of training samples. Otherwise, they cannot correctly learn the characteristics of the target objects, which results in poor performance or even failure of the semantic segmentation task. Since target objects such as ships may move from time to time, it is nontrivial to collect enough samples to achieve satisfactory segmentation performance. This severely hinders the performance improvement of most existing augmentation methods. To tackle this problem, in this paper, we propose an object-level remote sensing image augmentation approach based on U-Net-based generative adversarial networks. Specifically, our proposed approach consists of two components: the semantic tag image generator and the U-Net GAN-based translator. To evaluate the effectiveness of the proposed approach, comprehensive experiments are conducted on the public dataset HRSC2016. State-of-the-art generative models, DCGAN, WGAN, and CycleGAN, are selected as baselines. According to the experimental results, our proposed approach significantly outperforms the baselines in terms of not only drawing the outlines of target objects but also capturing their meaningful details.


2020 ◽  
Vol 12 (15) ◽  
pp. 2424
Author(s):  
Luis Salgueiro Romero ◽  
Javier Marcello ◽  
Verónica Vilaplana

Sentinel-2 satellites provide multi-spectral optical remote sensing images with four bands at 10 m of spatial resolution. These images, due to the open data distribution policy, are becoming an important resource for several applications. However, for small scale studies, the spatial detail of these images might not be sufficient. On the other hand, WorldView commercial satellites offer multi-spectral images with a very high spatial resolution, typically less than 2 m, but their use can be impractical for large areas or multi-temporal analysis due to their high cost. To exploit the free availability of Sentinel imagery, it is worth considering deep learning techniques for single-image super-resolution tasks, allowing the spatial enhancement of low-resolution (LR) images by recovering high-frequency details to produce high-resolution (HR) super-resolved images. In this work, we implement and train a model based on the Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) with pairs of WorldView-Sentinel images to generate a super-resolved multispectral Sentinel-2 output with a scaling factor of 5. Our model, named RS-ESRGAN, removes the upsampling layers of the network to make it feasible to train with co-registered remote sensing images. Results obtained outperform state-of-the-art models using standard metrics like PSNR, SSIM, ERGAS, SAM and CC. Moreover, qualitative visual analysis shows spatial improvements as well as the preservation of the spectral information, allowing the super-resolved Sentinel-2 imagery to be used in studies requiring very high spatial resolution.
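
Two of the metrics listed above, PSNR and SAM, can be illustrated in a few lines. This is a minimal sketch in plain Python using the standard textbook definitions, not the evaluation code of the paper; the function names, the flat-list image layout, and the default `max_value` of 255 are assumptions made for the example.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def spectral_angle(ref_pixel, est_pixel):
    """Angle in radians between the band vectors of one pixel.

    Assumes neither vector is all zeros.
    """
    dot = sum(r * e for r, e in zip(ref_pixel, est_pixel))
    norm = (math.sqrt(sum(r * r for r in ref_pixel))
            * math.sqrt(sum(e * e for e in est_pixel)))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))
```

An image-level SAM score is typically the mean of `spectral_angle` over all pixels. A small angle means the spectral shape is preserved even when overall brightness differs, which is why SAM complements PSNR when judging the preservation of spectral information.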


2021 ◽  
Vol 13 (19) ◽  
pp. 3898
Author(s):  
Duanguang Cao ◽  
Hanfa Xing ◽  
Man Sing Wong ◽  
Mei-Po Kwan ◽  
Huaqiao Xing ◽  
...  

Automatically extracting buildings from remote sensing images with deep learning is of great significance to urban planning, disaster prevention, change detection, and other applications. Various deep learning models have been proposed to extract building information, showing both strengths and weaknesses in capturing the complex spectral and spatial characteristics of buildings in remote sensing images. To integrate the strengths of individual models and obtain fine-scale spatial and spectral building information, this study proposed a stacking ensemble deep learning model. First, an optimization method for the prediction results of the basic model is proposed based on fully connected conditional random fields (CRFs). On this basis, a stacking ensemble model (SENet) based on a sparse autoencoder integrating U-NET, SegNet, and FCN-8s models is proposed to combine the features of the optimized basic model prediction results. Utilizing several cities in Hebei Province, China as a case study, a building dataset containing attribute labels is established to assess the performance of the proposed model. The proposed SENet is compared with three individual models (U-NET, SegNet and FCN-8s), and the results show that the accuracy of SENet is 0.954, approximately 6.7%, 6.1%, and 9.8% higher than U-NET, SegNet, and FCN-8s models, respectively. The identification of building features, including colors, sizes, shapes, and shadows, is also evaluated, showing that the accuracy, recall, F1 score, and intersection over union (IoU) of the SENet model are higher than those of the three individual models. This suggests that the proposed ensemble model can effectively depict the different features of buildings and provides an alternative approach to building extraction with higher accuracy.
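
The scores reported above (accuracy, recall, F1, and IoU) all derive from the same four pixel counts. The following is a minimal sketch in plain Python, assuming binary building masks flattened to lists of 0 and 1; the function name and the convention of returning 0.0 for an undefined ratio are choices made for this example, not details from the paper.

```python
def binary_scores(truth, pred):
    """Accuracy, precision, recall, F1, and IoU for two binary masks."""
    tp = sum(1 for t, p in zip(truth, pred) if t and p)
    tn = sum(1 for t, p in zip(truth, pred) if not t and not p)
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as 0.0 here.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "iou": ratio(tp, tp + fp + fn),
    }
```

For example, `binary_scores([1, 1, 0, 0], [1, 0, 1, 0])` has one pixel in each of the four categories, so accuracy, precision, recall, and F1 are all 0.5, while IoU is 1/3 because true negatives do not count toward it.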

