Image Region Prediction from Thermal Videos Based on Image Prediction Generative Adversarial Network

Various studies have been conducted on object detection, tracking, and action recognition based on thermal images. However, errors occur during object detection, tracking, and action recognition when a moving object leaves the field of view (FOV) of a camera and part of the object becomes invisible. However, no studies have examined this issue so far. Therefore, this article proposes a method for widening the FOV of the current image by predicting images outside the FOV of the camera using the current image and previous sequential images. In the proposed method, the original one-channel thermal image is converted into a three-channel thermal image to perform image prediction using an image prediction generative adversarial network. When image prediction and object detection experiments were conducted using the marathon sub-dataset of the Boston University-thermal infrared video (BU-TIV) benchmark open dataset, we confirmed that the proposed method showed the higher accuracies of image prediction (structural similarity index measure (SSIM) of 0.9839) and object detection (F1 score (F1) of 0.882, accuracy (ACC) of 0.983, and intersection over union (IoU) of 0.791) than the state-of-the-art methods.

Download Full-text

Enlargement of the Field of View Based on Image Region Prediction Using Thermal Videos

Mathematics ◽

10.3390/math9192379 ◽

2021 ◽

Vol 9 (19) ◽

pp. 2379

Author(s):

Ganbayar Batchuluun ◽

Na Rae Baek ◽

Kang Ryoung Park

Keyword(s):

Similarity Index ◽

Structural Similarity ◽

Input Image ◽

The Body ◽

Human Detection ◽

Field Of View ◽

Image Region ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Image Prediction

Various studies have been conducted for detecting humans in images. However, there are the cases where a part of human body disappears in the input image and leaves the camera field of view (FOV). Moreover, there are the cases where a pedestrian comes into the FOV as a part of the body slowly appears. In these cases, human detection and tracking fail by existing methods. Therefore, we propose the method for predicting a wider region than the FOV of a thermal camera based on the image prediction generative adversarial network version 2 (IPGAN-2). When an experiment was conducted using the marathon subdataset of the Boston University-thermal infrared video benchmark open dataset, the proposed method showed higher image prediction (structural similarity index measure (SSIM) of 0.9437) and object detection (F1 score of 0.866, accuracy of 0.914, and intersection over union (IoU) of 0.730) accuracies than state-of-the-art methods.

Download Full-text

Single Image Super-Resolution: Depthwise Separable Convolution Super-Resolution Generative Adversarial Network

Applied Sciences ◽

10.3390/app10010375 ◽

2020 ◽

Vol 10 (1) ◽

pp. 375 ◽

Cited By ~ 1

Author(s):

Zetao Jiang ◽

Yongsong Huang ◽

Lirui Hu

Keyword(s):

Similarity Index ◽

Super Resolution ◽

Structural Similarity ◽

Image Features ◽

Single Image ◽

Generative Adversarial Network ◽

Visual Evaluation ◽

Adversarial Network ◽

Image Super Resolution ◽

Single Image Super Resolution

The super-resolution generative adversarial network (SRGAN) is a seminal work that is capable of generating realistic textures during single image super-resolution. However, the hallucinated details are often accompanied by unpleasant artifacts. To further enhance the visual quality, we propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The method is based on depthwise separable convolution super-resolution generative adversarial network (DSCSRGAN). A new depthwise separable convolution dense block (DSC Dense Block) was designed for the generator network, which improved the ability to represent and extract image features, while greatly reducing the total amount of parameters. For the discriminator network, the batch normalization (BN) layer was discarded, and the problem of artifacts was reduced. A frequency energy similarity loss function was designed to constrain the generator network to generate better super-resolution images. Experiments on several different datasets showed that the peak signal-to-noise ratio (PSNR) was improved by more than 3 dB, structural similarity index (SSIM) was increased by 16%, and the total parameter was reduced to 42.8% compared with the original model. Combining various objective indicators and subjective visual evaluation, the algorithm was shown to generate richer image details, clearer texture, and lower complexity.

Download Full-text

Improving the Visual Quality of Generative Adversarial Network (GAN)-Generated Images Using the Multi-Scale Structural Similarity Index

2018 25th IEEE International Conference on Image Processing (ICIP) ◽

10.1109/icip.2018.8451296 ◽

2018 ◽

Cited By ~ 2

Author(s):

Parimala Kancharla ◽

Sumohana S. Channappayya

Keyword(s):

Similarity Index ◽

Visual Quality ◽

Structural Similarity ◽

Structural Similarity Index ◽

Generative Adversarial Network ◽

Multi Scale ◽

Adversarial Network

Download Full-text

An Enhanced pix2pix Dehazing Network with Guided Filter Layer

Applied Sciences ◽

10.3390/app10175898 ◽

2020 ◽

Vol 10 (17) ◽

pp. 5898

Author(s):

Qirong Bu ◽

Jie Luo ◽

Kuan Ma ◽

Hongwei Feng ◽

Jun Feng

Keyword(s):

Signal To Noise Ratio ◽

Similarity Index ◽

Structural Similarity ◽

Comparison Method ◽

Guided Filter ◽

Generative Adversarial Network ◽

Test Dataset ◽

Adversarial Network ◽

Filter Layer

In this paper, we propose an enhanced pix2pix dehazing network, which generates clear images without relying on a physical scattering model. This network is a generative adversarial network (GAN) which combines multiple guided filter layers. First, the input of hazy images is smoothed to obtain high-frequency features according to different smoothing kernels of the guided filter layer. Then, these features are embedded in higher dimensions of the network and connected with the output of the generator’s encoder. Finally, Visual Geometry Group (VGG) features are introduced to serve as a loss function to improve the quality of the texture information restoration and generate better hazy-free images. We conduct experiments on NYU-Depth, I-HAZE and O-HAZE datasets. The enhanced pix2pix dehazing network we propose produces increases of 1.22 dB in the Peak Signal-to-Noise Ratio (PSNR) and 0.01 in the Structural Similarity Index Metric (SSIM) compared with a second successful comparison method using the indoor test dataset. Extensive experiments demonstrate that the proposed method has good performance for image dehazing.

Download Full-text

A cross-scanner and cross-tracer deep learning method for the recovery of standard-dose imaging quality from low-dose PET

European Journal of Nuclear Medicine and Molecular Imaging ◽

10.1007/s00259-021-05644-1 ◽

2021 ◽

Author(s):

Song Xue ◽

Rui Guo ◽

Karl Peter Bohn ◽

Jared Matzke ◽

Marco Viscione ◽

...

Keyword(s):

Deep Learning ◽

Fdg Pet ◽

Low Dose ◽

Standard Dose ◽

Similarity Index ◽

Structural Similarity ◽

Clinical Image ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Learning Development

Abstract Purpose A critical bottleneck for the credibility of artificial intelligence (AI) is replicating the results in the diversity of clinical practice. We aimed to develop an AI that can be independently applied to recover high-quality imaging from low-dose scans on different scanners and tracers. Methods Brain [18F]FDG PET imaging of 237 patients scanned with one scanner was used for the development of AI technology. The developed algorithm was then tested on [18F]FDG PET images of 45 patients scanned with three different scanners, [18F]FET PET images of 18 patients scanned with two different scanners, as well as [18F]Florbetapir images of 10 patients. A conditional generative adversarial network (GAN) was customized for cross-scanner and cross-tracer optimization. Three nuclear medicine physicians independently assessed the utility of the results in a clinical setting. Results The improvement achieved by AI recovery significantly correlated with the baseline image quality indicated by structural similarity index measurement (SSIM) (r = −0.71, p < 0.05) and normalized dose acquisition (r = −0.60, p < 0.05). Our cross-scanner and cross-tracer AI methodology showed utility based on both physical and clinical image assessment (p < 0.05). Conclusion The deep learning development for extensible application on unknown scanners and tracers may improve the trustworthiness and clinical acceptability of AI-based dose reduction.

Download Full-text

Image Text Deblurring Method Based on Generative Adversarial Network

Electronics ◽

10.3390/electronics9020220 ◽

2020 ◽

Vol 9 (2) ◽

pp. 220

Author(s):

Chunxue Wu ◽

Haiyan Du ◽

Qunhui Wu ◽

Sheng Zhang

Keyword(s):

Similarity Index ◽

Human Perception ◽

Specific Area ◽

Structural Similarity ◽

Wasserstein Distance ◽

Generative Adversarial Networks ◽

Natural Image ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Adversarial Networks

In the automatic sorting process of express delivery, a three-segment code is used to represent a specific area assigned by a specific delivery person. In the process of obtaining the courier order information, the camera is affected by factors such as light, noise, and subject shake, which will cause the information on the courier order to be blurred, and some information will be lost. Therefore, this paper proposes an image text deblurring method based on a generative adversarial network. The model of the algorithm consists of two generative adversarial networks, combined with Wasserstein distance, using a combination of adversarial loss and perceptual loss on unpaired datasets to train the network model to restore the captured blurred images into clear and natural image. Compared with the traditional method, the advantage of this method is that the loss function between the input and output images can be calculated indirectly through the positive and negative generative adversarial networks. The Wasserstein distance can achieve a more stable training process and a more realistic generation effect. The constraints of adversarial loss and perceptual loss make the model capable of training on unpaired datasets. The experimental results on the GOPRO test dataset and the self-built unpaired dataset showed that the two indicators, peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), increased by 13.3% and 3%, respectively. The human perception test results demonstrated that the algorithm proposed in this paper was better than the traditional blur algorithm as the deblurring effect was better.

Download Full-text

IDENTIFYING EPIPHYTES IN DRONES PHOTOS WITH A CONDITIONAL GENERATIVE ADVERSARIAL NETWORK (C-GAN)

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliv-m-2-2020-99-2020 ◽

2020 ◽

Vol XLIV-M-2-2020 ◽

pp. 99-104

Author(s):

A. Shashank ◽

V. V. Sajithvariyar ◽

V. Sowmya ◽

K. P. Soman ◽

R. Sivanpillai ◽

...

Keyword(s):

Similarity Index ◽

Structural Similarity ◽

Generative Adversarial Networks ◽

Generative Adversarial Network ◽

Validation Data ◽

Network Algorithms ◽

Adversarial Network ◽

Adversarial Networks ◽

Aerial Vehicle ◽

Index Measure

Abstract. Unmanned Aerial Vehicle (UAV) missions often collect large volumes of imagery data. However, not all images will have useful information, or be of sufficient quality. Manually sorting these images and selecting useful data are both time consuming and prone to interpreter bias. Deep neural network algorithms are capable of processing large image datasets and can be trained to identify specific targets. Generative Adversarial Networks (GANs) consist of two competing networks, Generator and Discriminator that can analyze, capture, and copy the variations within a given dataset. In this study, we selected a variant of GAN called Conditional-GAN that incorporates an additional label parameter, for identifying epiphytes in photos acquired by a UAV in forests within Costa Rica. We trained the network with 70%, 80%, and 90% of 119 photos containing the target epiphyte, Werauhia kupperiana (Bromeliaceae) and validated the algorithm’s performance using a validation data that were not used for training. The accuracy of the output was measured using structural similarity index measure (SSIM) index and histogram correlation (HC) coefficient. Results obtained in this study indicated that the output images generated by C-GAN were similar (average SSIM = 0.89–0.91 and average HC 0.97–0.99) to the analyst annotated images. However, C-GAN had difficulty to identify when the target plant was away from the camera, was not well lit, or covered by other plants. Results obtained in this study demonstrate the potential of C-GAN to reduce the time spent by botanists to identity epiphytes in images acquired by UAVs.

Download Full-text

GPR B-Scan Image Denoising via Multi-Scale Convolutional Autoencoder with Data Augmentation

Electronics ◽

10.3390/electronics10111269 ◽

2021 ◽

Vol 10 (11) ◽

pp. 1269

Author(s):

Jiabin Luo ◽

Wentai Lei ◽

Feifei Hou ◽

Chenghao Wang ◽

Qiang Ren ◽

...

Keyword(s):

Image Denoising ◽

Data Augmentation ◽

Noise Suppression ◽

Random Noise ◽

Similarity Index ◽

Structural Similarity ◽

Training Dataset ◽

Generative Adversarial Network ◽

Multi Scale ◽

Convolutional Autoencoder

Ground-penetrating radar (GPR), as a non-invasive instrument, has been widely used in civil engineering. In GPR B-scan images, there may exist random noise due to the influence of the environment and equipment hardware, which complicates the interpretability of the useful information. Many methods have been proposed to eliminate or suppress the random noise. However, the existing methods have an unsatisfactory denoising effect when the image is severely contaminated by random noise. This paper proposes a multi-scale convolutional autoencoder (MCAE) to denoise GPR data. At the same time, to solve the problem of training dataset insufficiency, we designed the data augmentation strategy, Wasserstein generative adversarial network (WGAN), to increase the training dataset of MCAE. Experimental results conducted on both simulated, generated, and field datasets demonstrated that the proposed scheme has promising performance for image denoising. In terms of three indexes: the peak signal-to-noise ratio (PSNR), the time cost, and the structural similarity index (SSIM), the proposed scheme can achieve better performance of random noise suppression compared with the state-of-the-art competing methods (e.g., CAE, BM3D, WNNM).

Download Full-text

A GENERATIVE ADVERSARIAL NETWORK APPROACH FOR SUPER-RESOLUTION OF SENTINEL-2 SATELLITE IMAGES

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b1-2020-9-2020 ◽

2020 ◽

Vol XLIII-B1-2020 ◽

pp. 9-14

Author(s):

F. Pineda ◽

V. Ayma ◽

C. Beltran

Keyword(s):

High Resolution ◽

Satellite Images ◽

Resolution Enhancement ◽

Super Resolution ◽

Structural Similarity ◽

Generative Adversarial Network ◽

Acceptable Result ◽

Adversarial Network ◽

High Resolution Satellite Images ◽

Sentinel 2

Abstract. High-resolution satellite images have always been in high demand due to the greater detail and precision they offer, as well as the wide scope of the fields in which they could be applied; however, satellites in operation offering very high-resolution (VHR) images has experienced an important increase, but they remain as a smaller proportion against existing lower resolution (HR) satellites. Recent models of convolutional neural networks (CNN) are very suitable for applications with image processing, like resolution enhancement of images; but in order to obtain an acceptable result, it is important, not only to define the kind of CNN architecture but the reference set of images to train the model. Our work proposes an alternative to improve the spatial resolution of HR images obtained by Sentinel-2 satellite by using the VHR images from PeruSat1, a peruvian satellite, which serve as the reference for the super-resolution approach implementation based on a Generative Adversarial Network (GAN) model, as an alternative for obtaining VHR images. The VHR PeruSat-1 image dataset is used for the training process of the network. The results obtained were analyzed considering the Peak Signal to Noise Ratios (PSNR) and the Structural Similarity (SSIM). Finally, some visual outcomes, over a given testing dataset, are presented so the performance of the model could be analyzed as well.

Download Full-text