Infer Thermal Information from Visual Information: A Cross Imaging Modality Edge Learning (CIMEL) Framework

Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7471
Author(s):  
Shuozhi Wang ◽  
Jianqiang Mei ◽  
Lichao Yang ◽  
Yifan Zhao

The measurement accuracy and reliability of thermography are largely limited by the relatively low spatial resolution of infrared (IR) cameras in comparison to digital cameras. Using a high-end IR camera to achieve high spatial resolution can be costly, or sometimes infeasible due to the high sample rate required. There is therefore a strong demand to improve the quality of IR images, particularly at edges, without upgrading the hardware, in the context of surveillance and industrial inspection systems. This paper proposes a novel Conditional Generative Adversarial Network (CGAN)-based framework to enhance IR edges by learning high-frequency features from corresponding visual images. A dual discriminator, focusing on edges and content/background, is introduced to guide the cross imaging modality learning procedure of the U-Net generator in high and low frequencies respectively. Results demonstrate that the proposed framework can effectively enhance barely visible edges in IR images without introducing artefacts, while content information is well preserved. Unlike most similar studies, this method requires only IR images at test time, which increases its applicability in scenarios where only one imaging modality is available, such as active thermography.
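The edge/content split that the dual discriminator supervises rests on decomposing an image into low-frequency content and a high-frequency residual carrying the edges. A minimal NumPy sketch of that decomposition (a box blur stands in for the low-pass filter; function names are hypothetical, not the paper's implementation):

```python
import numpy as np

def box_blur(img, k=3):
    """Simple box blur as a stand-in low-pass filter (edge-padded)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def split_frequencies(img, k=3):
    """Decompose an image into low-frequency content and a high-frequency residual."""
    low = box_blur(img, k)
    high = img - low          # residual carries the edge information
    return low, high

img = np.random.rand(16, 16)
low, high = split_frequencies(img)
assert np.allclose(low + high, img)  # decomposition is lossless
```

In this reading, the content discriminator would judge the `low` component while the edge discriminator judges `high`.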

2020 ◽  
Vol 10 (5) ◽  
pp. 1729 ◽  
Author(s):  
Yuning Jiang ◽  
Jinhua Li

Objective: Super-resolution reconstruction is an increasingly important area in computer vision. To alleviate the problems that super-resolution reconstruction models based on generative adversarial networks are difficult to train and contain artifacts in their results, we propose a novel and improved algorithm. Methods: This paper presented the TSRGAN (Super-Resolution Generative Adversarial Network Combining Texture Loss) model, which was also based on generative adversarial networks. We redefined the generator and discriminator networks. Firstly, on the network structure, residual dense blocks without excess batch normalization layers were used to form the generator network, and the Visual Geometry Group (VGG)19 network was adopted as the basic framework of the discriminator network. Secondly, in the loss function, a weighted sum of four losses (texture loss, perceptual loss, adversarial loss and content loss) was used as the generator's objective function. Texture loss was proposed to encourage local information matching. Perceptual loss was enhanced by computing it on features before the activation layer. Adversarial loss was optimized based on WGAN-GP (Wasserstein GAN with Gradient Penalty) theory. Content loss was used to ensure the accuracy of low-frequency information. During optimization, the target image information was reconstructed at both high and low frequencies. Results: The experimental results showed that our method raised the average Peak Signal-to-Noise Ratio of reconstructed images to 27.99 dB and the average Structural Similarity Index to 0.778 without losing too much speed, which was superior to the comparison algorithms on objective evaluation indices. Moreover, TSRGAN significantly improved subjective visual quality such as brightness information and texture details. We found that it could generate images with more realistic textures and more accurate brightness, more in line with human visual evaluation. Conclusions: Our improvements to the network structure reduce the model's computation and stabilize training. In addition, the loss function we present for the generator provides stronger supervision for restoring realistic textures and achieving brightness consistency. Experimental results prove the effectiveness and superiority of the TSRGAN algorithm.
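The four-term generator objective described above can be sketched as a weighted sum. A minimal NumPy illustration (weights, feature shapes and the Gram-matrix texture term are assumptions in the spirit of the abstract, not TSRGAN's actual implementation):

```python
import numpy as np

def gram_matrix(feat):
    """Gram matrix of a (channels, height, width) feature map; basis of texture loss."""
    f = feat.reshape(feat.shape[0], -1)
    return f @ f.T / f.shape[1]

def generator_loss(feat_sr, feat_hr, img_sr, img_hr, d_score_sr,
                   w_tex=1.0, w_perc=1.0, w_adv=1e-3, w_cont=1.0):
    """Weighted sum of texture, perceptual, adversarial and content terms."""
    texture = np.mean((gram_matrix(feat_sr) - gram_matrix(feat_hr)) ** 2)
    perceptual = np.mean((feat_sr - feat_hr) ** 2)   # pre-activation features
    adversarial = -np.mean(d_score_sr)               # WGAN-style critic score
    content = np.mean(np.abs(img_sr - img_hr))       # low-frequency fidelity
    return (w_tex * texture + w_perc * perceptual
            + w_adv * adversarial + w_cont * content)
```

With identical super-resolved and ground-truth inputs and a zero critic score, the loss is zero, as expected of a fidelity-style objective.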


2021 ◽  
Vol 13 (20) ◽  
pp. 4044
Author(s):  
Étienne Clabaut ◽  
Myriam Lemelin ◽  
Mickaël Germain ◽  
Yacine Bouroubi ◽  
Tony St-Pierre

Training a deep learning model requires highly variable data to permit reasonable generalization. If the variability in the data to be processed is low, the interest in obtaining this generalization seems limited. Yet, it could prove interesting to specialize the model with respect to a particular theme. The use of enhanced super-resolution generative adversarial networks (ESRGAN), a specific type of deep learning architecture, allows the spatial resolution of remote sensing images to be increased by “hallucinating” non-existent details. In this study, we show that ESRGAN creates better-quality images when trained on thematically classified images than when trained on a wide variety of examples. All things being equal, we further show that the algorithm performs better on some themes than on others. Texture analysis shows that these performances are correlated with the inverse difference moment and entropy of the images.
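The two texture measures named at the end, inverse difference moment and entropy, are standard Haralick features computed from a gray-level co-occurrence matrix (GLCM). A minimal NumPy sketch (single horizontal offset, 8 gray levels; the abstract does not specify these parameters):

```python
import numpy as np

def glcm(img, levels=8):
    """Normalized gray-level co-occurrence matrix for a horizontal offset of 1."""
    q = np.minimum((img * levels).astype(int), levels - 1)
    m = np.zeros((levels, levels))
    for i, j in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        m[i, j] += 1
    return m / m.sum()

def idm(p):
    """Inverse difference moment: high for locally homogeneous textures."""
    i, j = np.indices(p.shape)
    return np.sum(p / (1.0 + (i - j) ** 2))

def entropy(p):
    """Texture entropy in bits: high for disordered textures."""
    nz = p[p > 0]
    return -np.sum(nz * np.log2(nz))

flat = np.zeros((8, 8))           # uniform image: maximally homogeneous
p = glcm(flat)
assert np.isclose(idm(p), 1.0)
assert np.isclose(entropy(p), 0.0)
```

A uniform image gives the extreme values (IDM of 1, entropy of 0), which matches the intuition that the reported performance correlates with image homogeneity and disorder.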


Author(s):  
Li Yuan ◽  
Francis EH Tay ◽  
Ping Li ◽  
Li Zhou ◽  
Jiashi Feng

In this paper, we present a novel unsupervised video summarization model that requires no manual annotation. The proposed model, termed Cycle-SUM, adopts a new cycle-consistent adversarial LSTM architecture that can effectively maximize the information preservation and compactness of the summary video. It consists of a frame selector and a cycle-consistent learning based evaluator. The selector is a bidirectional LSTM network that learns video representations embedding the long-range relationships among video frames. The evaluator defines a learnable information preserving metric between the original video and the summary video and “supervises” the selector to identify the most informative frames to form the summary video. In particular, the evaluator is composed of two generative adversarial networks (GANs), in which the forward GAN learns to reconstruct the original video from the summary video while the backward GAN learns to invert the process. The consistency between the outputs of such cycle learning is adopted as the information preserving metric for video summarization. We demonstrate the close relation between mutual information maximization and this cycle learning procedure. Experiments on two video summarization benchmark datasets validate the state-of-the-art performance and superiority of the Cycle-SUM model over previous baselines.
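The cycle-consistency idea can be sketched numerically: summarize, reconstruct, and re-summarize, then score how much the round trip distorted the data. A hypothetical NumPy sketch (frames as feature vectors, generators as plain callables; this is an illustration of the metric's shape, not the Cycle-SUM architecture):

```python
import numpy as np

def cycle_consistency(video, keep, g_forward, g_backward):
    """Hypothetical cycle-consistency score for a frame selection.

    `video` is (frames, features); `keep` is the selector's boolean mask.
    `g_forward` maps a (zero-padded) summary back to a full video;
    `g_backward` maps a video back to a summary. Lower is better."""
    summary = video * keep[:, None]          # unselected frames zeroed out
    reconstructed = g_forward(summary)       # forward GAN: summary -> video
    re_summary = g_backward(reconstructed) * keep[:, None]
    forward_err = np.mean((reconstructed - video) ** 2)
    backward_err = np.mean((re_summary - summary) ** 2)
    return forward_err + backward_err

video = np.random.rand(6, 4)
identity = lambda x: x
# Keeping every frame with identity generators is perfectly cycle-consistent.
assert cycle_consistency(video, np.ones(6, dtype=bool), identity, identity) == 0.0
```

In the actual model both generators are learned adversarially, so a good selector is one whose summary lets the forward GAN reconstruct the original video with low error.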


Author(s):  
Dr. Naveena C ◽  
Thanush M ◽  
Vinay N.B ◽  
Yaser Ahmed N

The project entitled “Super-Resolution: A simplified approach using GANs” aims to construct a higher-resolution image from a lower-resolution image without losing much detail. In other words, it is a process of up-sampling an under-sampled image. In the current scenario, there is a high reliance on hardware improvements to capture better high-resolution images. Although many digital cameras provide enough HR imagery, the cost to construct and purchase such a high-end camera is high. Also, many computer vision applications, such as medical imaging and forensics, still have a strong demand for higher-resolution imagery, a demand that is likely to exceed the capabilities of today's HR digital cameras. To cope with this demand, a method to generate an HR image is shown using Generative Adversarial Networks (GANs).
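Up-sampling an under-sampled image has a trivial baseline: replicate each pixel. A GAN-based approach is interesting precisely because it tries to go beyond this by synthesizing plausible high-frequency detail. A minimal NumPy sketch of the baseline (nearest-neighbor via `np.kron`; not the project's method):

```python
import numpy as np

def nearest_upsample(img, scale=2):
    """Naive nearest-neighbor up-sampling: each pixel becomes a scale x scale block.
    This is the baseline a GAN generator tries to beat by hallucinating detail."""
    return np.kron(img, np.ones((scale, scale)))

lr = np.arange(4.0).reshape(2, 2)
hr = nearest_upsample(lr, 2)
assert hr.shape == (4, 4)
assert np.all(hr[:2, :2] == lr[0, 0])  # top-left block replicates one pixel
```

The blocky output makes clear why learned super-resolution is needed: simple interpolation adds pixels but no new information.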

