Asymmetric Encoder-Decoder Structured FCN Based LiDAR to Color Image Generation

Sensors ◽  
2019 ◽  
Vol 19 (21) ◽  
pp. 4818 ◽  
Author(s):  
Hyun-Koo Kim ◽  
Kook-Yeol Yoo ◽  
Ju H. Park ◽  
Ho-Youl Jung

In this paper, we propose a method of generating a color image from light detection and ranging (LiDAR) 3D reflection intensity. The proposed method is composed of two steps: projection of the LiDAR 3D reflection intensity into a 2D intensity image, and color image generation from the projected intensity by using a fully convolutional network (FCN). The color image must be generated from a very sparse projected intensity image. For this reason, the FCN is designed to have an asymmetric network structure, i.e., the layer depth of the decoder in the FCN is deeper than that of the encoder. The well-known KITTI dataset, covering various scenarios, is used for training and performance evaluation of the proposed FCN. The performance of the asymmetric network structure is empirically analyzed for various depth combinations of the encoder and decoder. Simulations show that the proposed method generates images of fairly good visual quality while maintaining almost the same color as the ground-truth image. Moreover, the proposed FCN achieves much higher performance than conventional interpolation methods and the generative adversarial network-based Pix2Pix. One interesting result is that the proposed FCN produces shadow-free, daylight color images. This is because the LiDAR sensor data are produced by light reflection and are therefore not affected by sunlight and shadow.
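To illustrate the asymmetric design, here is a minimal PyTorch sketch in which the decoder is deliberately deeper than the encoder so that a dense color image can be synthesized from sparse projected intensity. The layer counts, channel widths, and kernel sizes are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of an asymmetric encoder-decoder FCN: a shallow
# encoder and a deeper decoder with extra refinement convolutions.
import torch
import torch.nn as nn

class AsymmetricEDFCN(nn.Module):
    def __init__(self):
        super().__init__()
        # Shallow encoder: two downsampling stages (H -> H/4).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Deeper decoder: refinement convolutions between upsampling stages.
        self.decoder = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1),  # 3-channel color output
        )

    def forward(self, sparse_intensity):                      # (B, 1, H, W)
        return self.decoder(self.encoder(sparse_intensity))  # (B, 3, H, W)
```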

2021 ◽  
Author(s):  
Tham Vo

Abstract In the abstractive summarization task, most proposed models adopt a deep recurrent neural network (RNN)-based encoder-decoder architecture to learn and generate a meaningful summary for a given input document. However, most recent RNN-based models tend to capture high-frequency/repetitive phrases in long documents during training, which leads to trivial and generic summaries. Moreover, the lack of thorough analysis of the sequential and long-range dependency relationships between words in different contexts while learning the textual representation also makes the generated summaries unnatural and incoherent. To deal with these challenges, in this paper we propose a novel semantic-enhanced generative adversarial network (GAN)-based approach for abstractive text summarization, called SGAN4AbSum. We use an adversarial training strategy in which the generator and discriminator are trained simultaneously: the generator handles summary generation while the discriminator distinguishes generated summaries from ground-truth ones. The generator's input is a joint rich-semantic and global-structural latent representation of the training documents, obtained by a combined BERT and graph convolutional network (GCN) textual embedding mechanism. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed SGAN4AbSum, which achieves competitive ROUGE scores compared with state-of-the-art abstractive text summarization baselines.
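A schematic sketch of the alternating adversarial update described above follows. Here `generator` and `discriminator` are placeholder modules (the BERT+GCN document encoding is abstracted into `doc_repr`), and real sequence generation over discrete tokens would additionally need something like policy gradients or Gumbel-softmax; this is a hedged illustration of the training loop, not the paper's implementation.

```python
# One alternating GAN step: the discriminator learns to separate
# gold summaries from generated ones, then the generator is updated
# to fool the discriminator.
import torch
import torch.nn.functional as F

def adversarial_step(generator, discriminator, g_opt, d_opt,
                     doc_repr, gold_summary):
    # --- Discriminator update: real vs. generated summaries. ---
    with torch.no_grad():
        fake = generator(doc_repr)
    d_real = discriminator(gold_summary)
    d_fake = discriminator(fake)
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # --- Generator update: maximize discriminator's "real" score. ---
    fake = generator(doc_repr)
    score = discriminator(fake)
    g_loss = F.binary_cross_entropy_with_logits(score, torch.ones_like(score))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```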


Author(s):  
B. Jafrasteh ◽  
I. Manighetti ◽  
J. Zerubia

Abstract. We develop a novel method based on Deep Convolutional Networks (DCN) to automate the identification and mapping of fracture and fault traces in optical images. The method employs two DCNs in a two-player game: a first network, called the Generator, learns to segment images so that they resemble the ground truth; a second network, called the Discriminator, measures the differences between the ground-truth image and each segmented image and sends its score as feedback to the Generator; based on these scores, the Generator progressively improves its segmentation. As we condition both networks on the ground-truth images, the method is called a Conditional Generative Adversarial Network (CGAN). We propose a new loss function for both the Generator and the Discriminator networks to improve their accuracy. Using two criteria and a manually annotated optical image, we compare the generalization performance of the proposed method to that of a classical DCN architecture, U-net. The comparison demonstrates the suitability of the proposed CGAN architecture. Further work is, however, needed to improve its efficiency.
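The conditional setup can be sketched as follows: the Discriminator scores (optical image, fracture mask) pairs, so both players are conditioned on the input image. The paper's custom loss functions are replaced here by a generic segmentation + adversarial combination; the weighting and the channel-concatenation conditioning are assumptions for illustration.

```python
# Hedged sketch of a conditional GAN generator loss for segmentation:
# a pixel-wise segmentation term plus an adversarial term from a
# discriminator that sees the image and the predicted mask together.
import torch
import torch.nn.functional as F

def cgan_generator_loss(generator, discriminator, image, gt_mask,
                        adv_weight=0.01):
    pred_logits = generator(image)                 # (B, 1, H, W) mask logits
    seg_loss = F.binary_cross_entropy_with_logits(pred_logits, gt_mask)
    # Condition the discriminator on the input image by concatenation.
    pair = torch.cat([image, torch.sigmoid(pred_logits)], dim=1)
    score = discriminator(pair)
    adv_loss = F.binary_cross_entropy_with_logits(score, torch.ones_like(score))
    return seg_loss + adv_weight * adv_loss
```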


2021 ◽  
Vol 11 (4) ◽  
pp. 1380
Author(s):  
Yingbo Zhou ◽  
Pengcheng Zhao ◽  
Weiqin Tong ◽  
Yongxin Zhu

While Generative Adversarial Networks (GANs) have shown promising performance in image generation, they suffer from numerous issues such as mode collapse and training instability. To stabilize GAN training and improve image synthesis quality and diversity, we propose a simple yet effective approach called Contrastive Distance Learning GAN (CDL-GAN). Specifically, we add Consistent Contrastive Distance (CoCD) and Characteristic Contrastive Distance (ChCD) terms into a principled framework to improve GAN performance. The CoCD explicitly maximizes the ratio of the distance between generated images to the increment between their noise vectors, strengthening image feature learning for the generator. The ChCD measures the sampling distance of the encoded images in Euler space to boost feature representations for the discriminator. We model the framework by employing a Siamese Network as a module in GANs, without any modification to the backbone. Both qualitative and quantitative experiments conducted on three public datasets demonstrate the effectiveness of our method.
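The CoCD term as described above admits a compact sketch: maximize the ratio between the distance of two generated images and the distance of the noise vectors that produced them, which discourages mode collapse by penalizing the generator for mapping distinct noise vectors to near-identical images. The epsilon term and choice of L2 norm are assumptions.

```python
# Minimal sketch of the Consistent Contrastive Distance (CoCD) idea.
# Minimizing the returned loss maximizes ||G(z1)-G(z2)|| / ||z1-z2||.
import torch

def cocd_penalty(generator, z1, z2, eps=1e-8):
    img_dist = (generator(z1) - generator(z2)).flatten(1).norm(dim=1)
    z_dist = (z1 - z2).flatten(1).norm(dim=1)
    return -(img_dist / (z_dist + eps)).mean()  # negated ratio
```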


2021 ◽  
Author(s):  
Jialu Huang ◽  
Ying Huang ◽  
Yan-ting Lin ◽  
Zi-yang Liu ◽  
Yang Lin ◽  
...  

2021 ◽  
Author(s):  
Kazutake Uehira ◽  
Hiroshi Unno

A technique for removing unnecessary patterns from captured images by using a generative network is studied. The patterns, composed of lines and spaces, are superimposed onto the blue component of an RGB color image when the image is captured, for the purpose of acquiring a depth map. The superimposed patterns become unnecessary after the depth map is acquired. We tried to remove these unnecessary patterns by using a generative adversarial network (GAN) and an autoencoder (AE). The experimental results show that the patterns can be removed by a GAN or an AE to the point of being invisible. They also show that the performance of the GAN is much higher than that of the AE, with a PSNR over 45 dB and an SSIM of about 0.99. From these results, we demonstrate the effectiveness of the technique with a GAN.
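For reference, the PSNR figure reported above is the standard peak signal-to-noise ratio between the pattern-removed image and its clean reference. A minimal sketch, assuming 8-bit images:

```python
# PSNR = 10 * log10(MAX^2 / MSE), in decibels.
import numpy as np

def psnr(reference, restored, max_val=255.0):
    mse = np.mean((reference.astype(np.float64)
                   - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```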


Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1810
Author(s):  
Dat Tien Nguyen ◽  
Tuyen Danh Pham ◽  
Ganbayar Batchuluun ◽  
Kyoung Jun Noh ◽  
Kang Ryoung Park

Although face-based biometric recognition systems have been widely used in many applications, this type of recognition method is still vulnerable to presentation attacks, which use fake samples to deceive the recognition system. To overcome this problem, presentation attack detection (PAD) methods for face recognition systems (face-PAD), which aim to classify real and presentation attack face images before performing a recognition task, have been developed. However, the performance of PAD systems is limited and biased due to the lack of presentation attack images for training. In this paper, we propose a method for artificially generating presentation attack face images by learning the characteristics of real and presentation attack images from a few captured images. As a result, our proposed method saves time in collecting presentation attack samples for training PAD systems and can possibly enhance their performance. Our study is the first attempt to generate PA face images for PAD systems based on the CycleGAN network, a deep-learning-based framework for image generation. In addition, we propose a new measurement method to evaluate the quality of generated PA images based on a face-PAD system. Through experiments with two public datasets (CASIA and Replay-mobile), we show that the generated face images can capture the characteristics of presentation attack images, making them usable as captured presentation attack samples for PAD system training.
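A hedged sketch of the cycle-consistency constraint that underlies CycleGAN-style real-to-attack image generation: mapping a real face to the attack domain and back should reproduce the original. The generator names and the weight of 10 are illustrative assumptions, not the paper's settings.

```python
# Cycle-consistency loss between the real-face and attack-face domains.
import torch
import torch.nn.functional as F

def cycle_loss(G_real2attack, G_attack2real,
               real_face, attack_face, weight=10.0):
    fwd = G_attack2real(G_real2attack(real_face))    # real -> attack -> real
    bwd = G_real2attack(G_attack2real(attack_face))  # attack -> real -> attack
    return weight * (F.l1_loss(fwd, real_face)
                     + F.l1_loss(bwd, attack_face))
```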


Sensors ◽  
2019 ◽  
Vol 19 (13) ◽  
pp. 2919 ◽  
Author(s):  
Wangyong He ◽  
Zhongzhao Xie ◽  
Yongbo Li ◽  
Xinmei Wang ◽  
Wendi Cai

Hand pose estimation is a critical technology in computer vision and human-computer interaction. Deep-learning methods require a considerable amount of labeled training data. This paper therefore aims to generate depth hand images: given a ground-truth 3D hand pose, the developed method generates the corresponding depth hand image. To be specific, the ground truth is a 3D hand pose that encodes the hand structure, while the synthesized image has the same size as the training images and a similar visual appearance to the training set. The developed method, inspired by progress in generative adversarial networks (GANs) and image-style transfer, models the latent statistical relationship between the ground-truth hand pose and the corresponding depth hand image. The images synthesized using the developed method are demonstrated to be feasible for enhancing performance. Comprehensive experiments on public hand pose datasets (NYU, MSRA, ICVL) show that the developed method outperforms existing works.
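As an illustration of the pose-to-image direction, here is a minimal sketch of a pose-conditioned generator: a 3D hand pose vector (an assumed layout of 21 joints with 3 coordinates each) is mapped to a single-channel depth image. The joint count, spatial sizes, and channel widths are placeholders, not the paper's architecture.

```python
# Pose-conditioned generator: flatten pose -> dense feature map ->
# transposed convolutions up to a 64x64 depth image.
import torch
import torch.nn as nn

class PoseToDepthGenerator(nn.Module):
    def __init__(self, num_joints=21, base=64):
        super().__init__()
        self.base = base
        self.fc = nn.Linear(num_joints * 3, base * 8 * 8)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(base, base // 2, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(base // 2, base // 4, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(base // 4, 1, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, pose):                          # (B, num_joints * 3)
        x = self.fc(pose).view(-1, self.base, 8, 8)
        return self.up(x)                             # (B, 1, 64, 64)
```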


Sensors ◽  
2020 ◽  
Vol 20 (12) ◽  
pp. 3387 ◽  
Author(s):  
Hyun-Koo Kim ◽  
Kook-Yeol Yoo ◽  
Ho-Youl Jung

In this paper, a modified encoder-decoder structured fully convolutional network (ED-FCN) is proposed to generate a camera-like color image from the light detection and ranging (LiDAR) reflection image. Previously, we showed the possibility of generating a color image from a heterogeneous source using the asymmetric ED-FCN. In addition, modified ED-FCNs, i.e., UNET and selected connection UNET (SC-UNET), have been successfully applied to biomedical image segmentation and concealed-object detection for military purposes, respectively. In this paper, we apply the SC-UNET to generate a color image from a heterogeneous image and analyze various connections between the encoder and decoder. The LiDAR reflection image has only 5.28% valid values, i.e., its data are extremely sparse. This severe sparseness limits the generation performance when the UNET is applied directly to this heterogeneous image generation. We present a methodology for network connection in the SC-UNET that considers the sparseness of each level in the encoder network and the similarity between the same levels of the encoder and decoder networks. The simulation results show that the proposed SC-UNET with connections between the encoder and decoder at the two lowest levels yields improvements of 3.87 dB in peak signal-to-noise ratio and 0.17 in structural similarity over the conventional asymmetric ED-FCN. The methodology presented in this paper would be a powerful tool for generating data from heterogeneous sources.
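A minimal sketch of the "selected connection" idea follows: a UNET-like network whose encoder-decoder skip connections are kept only at the two lowest (deepest) levels, where the downsampled LiDAR features are densest, and dropped at the top level, where the input is too sparse to be useful. The depth of three levels plus a bottleneck and the channel widths are illustrative assumptions.

```python
# SC-UNET-style network: skips at the two deepest decoder levels only.
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

class SCUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2, self.enc3 = block(1, 16), block(16, 32), block(32, 64)
        self.bott = block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.dec3 = block(128 + 64, 64)   # lowest level: skip kept
        self.dec2 = block(64 + 32, 32)    # second lowest: skip kept
        self.dec1 = block(32, 16)         # top level: skip dropped
        self.out = nn.Conv2d(16, 3, 1)

    def forward(self, x):
        e1 = self.enc1(x)                  # H   (skip unused)
        e2 = self.enc2(self.pool(e1))      # H/2
        e3 = self.enc3(self.pool(e2))      # H/4
        b = self.bott(self.pool(e3))       # H/8
        d3 = self.dec3(torch.cat([self.up(b), e3], dim=1))
        d2 = self.dec2(torch.cat([self.up(d3), e2], dim=1))
        return self.out(self.dec1(self.up(d2)))
```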


2019 ◽  
Vol 2019 ◽  
pp. 1-8
Author(s):  
Zishu Gao ◽  
Guodong Yang ◽  
En Li ◽  
Tianyu Shen ◽  
Zhe Wang ◽  
...  

There are a large number of insulators on transmission lines, and insulator damage has a major impact on power-supply security. Image-based segmentation of the insulators in power transmission lines is a prerequisite and a critical task for power line inspection. In this paper, a modified conditional generative adversarial network for insulator pixel-level segmentation is proposed. The generator is reconstructed from encoder-decoder layers with asymmetric convolution kernels, which simplify the network complexity and extract more kinds of feature information. The discriminator is a fully convolutional network based on PatchGAN and learns the loss used to train the generator. Experiments verify that the proposed method outperforms Pix2pix, SegNet, and other state-of-the-art networks in mIoU and computational efficiency.
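The asymmetric-kernel idea can be sketched as factorizing a k x k convolution into a 1 x k followed by a k x 1 convolution, cutting the per-filter parameter count from k*k to 2k while mixing horizontal and vertical receptive fields. The choice of k=3 and the interleaved ReLU are assumptions for illustration.

```python
# Factorized (asymmetric) convolution block: 1 x k then k x 1.
import torch.nn as nn

def asymmetric_conv(cin, cout, k=3):
    pad = k // 2
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size=(1, k), padding=(0, pad)),
        nn.ReLU(),
        nn.Conv2d(cout, cout, kernel_size=(k, 1), padding=(pad, 0)),
        nn.ReLU(),
    )
```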

