Throwaway Shadows Using Parallel Encoders Generative Adversarial Network

2022 ◽  
Vol 12 (2) ◽  
pp. 824
Author(s):  
Kamran Javed ◽  
Nizam Ud Din ◽  
Ghulam Hussain ◽  
Tahir Farooq

Face photographs taken on a bright sunny day or under floodlights often contain unwanted shadows cast by objects onto the face. Most previous works deal with removing shadows from scene images and struggle to do so for facial images. Faces have a complex semantic structure, which makes shadow removal challenging. The aim of this research is to remove the shadow of an object in facial images. We propose a novel generative adversarial network (GAN) based image-to-image translation approach for shadow removal in face images. The first stage of our model automatically produces a binary segmentation mask for the shadow region. The second stage, a GAN-based network, then removes the object shadow and synthesizes the affected region. The generator network of our GAN has two parallel encoders: one is a standard convolution path and the other is a partial convolution path. We find that this combination in the generator results not only in learning an incorporated semantic structure but also in disentangling visual discrepancies under the shadow area. In addition to the GAN loss, we exploit a low-level L1 loss, a structural-level SSIM loss, and a perceptual loss from a pre-trained loss network for better texture and perceptual quality. Since there is no paired dataset for the shadow removal problem, we created a synthetic shadow dataset for training our network in a supervised manner. The proposed approach effectively removes shadows from real and synthetic test samples, while retaining complex facial semantics. Experimental evaluations consistently show the advantages of the proposed method over several representative state-of-the-art approaches.
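To make the partial convolution path mentioned above more concrete, here is a minimal single-channel sketch in plain Python. It follows the commonly used formulation of partial convolution (mask the input, renormalize by the fraction of valid pixels in the window, then update the mask); it is not taken from the paper. The function name `partial_conv2d`, the valid-padding choice, and the convention that the mask is 1 for usable pixels and 0 for the region to ignore are all assumptions made for illustration.

```python
def partial_conv2d(image, mask, kernel, bias=0.0):
    """Single-channel partial convolution (valid padding, stride 1).

    image, mask: 2D lists of the same shape; mask is 1 for valid pixels
                 and 0 for pixels to ignore (assumed convention).
    kernel:      k x k 2D list of weights.
    Returns (output, updated_mask).
    """
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    window_size = k * k
    output = [[0.0] * cols for _ in range(rows)]
    new_mask = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            valid = 0   # number of valid pixels under the window
            acc = 0.0   # weighted sum over valid pixels only
            for di in range(k):
                for dj in range(k):
                    m = mask[i + di][j + dj]
                    valid += m
                    acc += kernel[di][dj] * image[i + di][j + dj] * m
            if valid > 0:
                # Rescale so the response does not shrink with missing pixels.
                output[i][j] = acc * (window_size / valid) + bias
                new_mask[i][j] = 1  # window saw at least one valid pixel
    return output, new_mask
```

With a constant image and one masked pixel, the renormalization gives the same response as an unmasked convolution, which is the property that lets the masked region be filled in from its surroundings layer by layer.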

Author(s):  
Khaled ELKarazle ◽  
Valliappan Raman ◽  
Patrick Then

Age estimation models can be employed in many applications, including soft biometrics, content access control, and targeted advertising. However, because some facial images are taken in unconstrained conditions, their quality degrades, which results in the loss of several essential ageing features. This study investigates how introducing a new layer of data processing based on a super-resolution generative adversarial network (SRGAN) model can influence the accuracy of age estimation by enhancing the quality of both the training and testing samples. Additionally, we introduce a novel convolutional neural network (CNN) classifier to distinguish between several age classes. We train one of our classifiers on a reconstructed version of the original dataset and compare its performance with an identical classifier trained on the original version of the same dataset. Our findings reveal that the classifier trained on the reconstructed dataset produces better classification accuracy, opening the door for more research into building data-centric machine learning systems.


Symmetry ◽  
2018 ◽  
Vol 10 (9) ◽  
pp. 414 ◽  
Author(s):  
Traian Caramihale ◽  
Dan Popescu ◽  
Loretta Ichim

The detection of human emotions has applicability in various domains such as assisted living, health monitoring, domestic appliance control, real-time crowd behavior tracking, and emotional security. The paper proposes a new system for emotion classification based on a generative adversarial network (GAN) classifier. Generative adversarial networks have been widely used for generating realistic images, but their classification capabilities have been little exploited. One of the main advantages is that by using the generator, we can extend our testing dataset and add more variety to each of the seven emotion classes we try to identify. Thus, the novelty of our study consists in increasing the number of classes from N to 2N (in the learning phase) by considering real and fake emotions. Facial key points are obtained from real and generated facial images, and vectors connecting them with the facial center of gravity are used by the discriminator to classify the image as one of the 14 classes of interest (real and fake for seven emotions). As another contribution, real images from different emotional classes are used in the generation process, unlike the classical GAN approach, which generates images from simple noise arrays. By using the proposed method, our system can classify emotions in facial images regardless of gender, race, ethnicity, age and face rotation. An accuracy of 75.2% was obtained on 7000 real images (14,000, also considering the generated images) from multiple combined facial datasets.
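The N-to-2N labelling and the key-point vectors described above can be sketched as follows. This is an illustration only: the label encoding (real classes first, generated classes offset by N), the use of the mean of the key points as the "center of gravity", and the names `class_label` and `centroid_vectors` are assumptions, not details given in the abstract.

```python
NUM_EMOTIONS = 7  # N in the abstract; 2N = 14 classes during learning


def class_label(emotion_index, is_generated):
    """Map (emotion, real/generated) to one of 2N labels (assumed encoding)."""
    if not 0 <= emotion_index < NUM_EMOTIONS:
        raise ValueError("emotion_index out of range")
    return emotion_index + (NUM_EMOTIONS if is_generated else 0)


def centroid_vectors(keypoints):
    """Return vectors from the key-point centroid to each (x, y) key point.

    The centroid (mean of the key points) stands in for the facial
    center of gravity; this is an assumption made for the sketch.
    """
    n = len(keypoints)
    cx = sum(x for x, _ in keypoints) / n
    cy = sum(y for _, y in keypoints) / n
    return [(x - cx, y - cy) for x, y in keypoints]
```

Because the vectors are expressed relative to the centroid, they do not change when the whole face is translated within the image.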


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Erick Costa de Farias ◽  
Christian di Noia ◽  
Changhee Han ◽  
Evis Sala ◽  
Mauro Castelli ◽  
...  

Robust machine learning models based on radiomic features might allow for accurate diagnosis, prognosis, and medical decision-making. Unfortunately, the lack of standardized radiomic feature extraction has hampered their clinical use. Since the radiomic features tend to be affected by low voxel statistics in regions of interest, increasing the sample size would improve their robustness in clinical studies. Therefore, we propose a Generative Adversarial Network (GAN)-based lesion-focused framework for Computed Tomography (CT) image Super-Resolution (SR); for the lesion (i.e., cancer) patch-focused training, we incorporate Spatial Pyramid Pooling (SPP) into GAN-Constrained by the Identical, Residual, and Cycle Learning Ensemble (GAN-CIRCLE). At 2× SR, the proposed model achieved better perceptual quality with less blurring than the other considered state-of-the-art SR methods, while producing comparable results at 4× SR. We also evaluated the robustness of our model’s radiomic feature in terms of quantization on a different lung cancer CT dataset using Principal Component Analysis (PCA). Intriguingly, the most important radiomic features in our PCA-based analysis were the most robust features extracted on the GAN-super-resolved images. These achievements pave the way for the application of GAN-based image Super-Resolution techniques for studies of radiomics for robust biomarker discovery.


2020 ◽  
Vol 39 (7) ◽  
pp. 483-494
Author(s):  
Ling Zhang ◽  
Chengjiang Long ◽  
Qingan Yan ◽  
Xiaolong Zhang ◽  
Chunxia Xiao

2021 ◽  
Vol 14 (1) ◽  
pp. 24
Author(s):  
Yuan Hu ◽  
Lei Chen ◽  
Zhibin Wang ◽  
Xiang Pan ◽  
Hao Li

Deep-learning-based radar echo extrapolation methods have achieved remarkable progress in the precipitation nowcasting field. However, they suffer from a common notorious problem—they tend to produce blurry predictions. Although some efforts have been made in recent years, the blurring problem is still under-addressed. In this work, we propose three effective strategies to assist deep-learning-based radar echo extrapolation methods to achieve more realistic and detailed prediction. Specifically, we propose a spatial generative adversarial network (GAN) and a spectrum GAN to improve image fidelity. The spatial and spectrum GANs aim at penalizing the distribution discrepancy between generated and real images from the spatial domain and spectral domain, respectively. In addition, a masked style loss is devised to further enhance the details by transferring the detailed texture of ground truth radar sequences to extrapolated ones. We apply a foreground mask to prevent the background noise from transferring to the outputs. Moreover, we also design a new metric termed the power spectral density score (PSDS) to quantify the perceptual quality from a frequency perspective. The PSDS metric can be applied as a complement to other visual evaluation metrics (e.g., LPIPS) to achieve a comprehensive measurement of image sharpness. We test our approaches with both ConvLSTM baseline and U-Net baseline, and comprehensive ablation experiments on the SEVIR dataset show that the proposed approaches are able to produce much more realistic radar images than baselines. Most notably, our methods can be readily applied to any deep-learning-based spatiotemporal forecasting models to acquire more detailed results.


2020 ◽  
Vol 13 (6) ◽  
pp. 219-228
Author(s):  
Avin Maulana ◽  
Chastine Fatichah ◽  
Nanik Suciati ◽  
...  

Facial inpainting is the process of reconstructing missing or damaged pixels in a facial image. The reconstructed pixels should be realistic enough that an observer cannot differentiate between the reconstructed pixels and the original ones. However, a few problems may arise after the inpainting algorithm has been applied. In particular, when inpainting is performed on an unaligned face image, adjacent pixels can be inconsistent, which causes the reconstruction to fail. We propose an improved facial inpainting method using a Generative Adversarial Network (GAN) with additional losses based on the pre-trained VGG-Net network and face landmarks. The feature reconstruction loss helps to preserve deep features of an image, while the landmark loss increases the perceptual quality of the result. The training process uses a curriculum learning scenario. Qualitative results show that our inpainting method can reconstruct the missing area on unaligned face images. In the quantitative results, our proposed method achieves average scores of 21.528 and 0.665, and maximum scores of 29.922 and 0.908, on the PSNR (Peak Signal to Noise Ratio) and SSIM (Structure Similarity Index Measure) metrics, respectively.
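For readers unfamiliar with the PSNR metric reported above, the following sketch computes it from its standard definition, PSNR = 10 · log10(MAX² / MSE), for two grayscale images stored as 2D lists. The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions; the abstract does not state the evaluation settings the authors used.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak Signal to Noise Ratio in dB between two same-sized 2D images."""
    ref = [p for row in reference for p in row]
    rec = [p for row in reconstructed for p in row]
    if len(ref) != len(rec):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the reconstruction is closer to the reference; identical images give an infinite PSNR under this definition.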


Author(s):  
Kaizheng Chen ◽  
Yaping Dai ◽  
Zhiyang Jia ◽  
Kaoru Hirota

In this paper, the Spinning Detail Perceptual Generative Adversarial Network (SDP-GAN) is proposed for single image de-raining. The proposed method adopts the Generative Adversarial Network (GAN) framework and consists of the following two networks: the rain streaks generative network G and the discriminative network D. To reduce background interference, we propose a rain streaks generative network which not only focuses on the high-frequency detail map of the rainy image, but also directly reduces the mapping range from input to output. To further improve the perceptual quality of generated images, we modify the perceptual loss by extracting high-level features from the discriminative network D, rather than from pre-trained networks. Furthermore, we introduce a new training procedure based on the notion of self spinning to improve the final de-raining performance. Extensive experiments on synthetic and real-world datasets demonstrate that the proposed method achieves significant improvements over recent state-of-the-art methods.
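One common way to obtain a high-frequency detail map is to subtract a low-pass filtered copy of the image from the image itself. The sketch below does this with a simple box blur. The abstract does not say which low-pass filter the authors use, so the box filter, the border handling (the window is clipped at the image edges), and the names `box_blur` and `detail_map` are assumptions made purely for illustration.

```python
def box_blur(image, radius=1):
    """Mean filter over a (2*radius+1)^2 window, clipped at the borders."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            count = 0
            for y in range(max(0, i - radius), min(rows, i + radius + 1)):
                for x in range(max(0, j - radius), min(cols, j + radius + 1)):
                    total += image[y][x]
                    count += 1
            out[i][j] = total / count
    return out


def detail_map(image, radius=1):
    """High-frequency detail: the image minus its low-pass (blurred) version."""
    base = box_blur(image, radius)
    return [
        [image[i][j] - base[i][j] for j in range(len(image[0]))]
        for i in range(len(image))
    ]
```

Flat regions map to zero in the detail layer, while thin, high-contrast structures such as rain streaks remain, which narrows the range of values a network operating on this layer has to model.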


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Diqun Yan ◽  
Xiaowen Li ◽  
Li Dong ◽  
Rangding Wang

Adaptive multirate (AMR) audio compression has been exploited as effective forensic evidence for verifying audio authenticity. Little consideration has been given, however, to antiforensic techniques capable of fooling AMR compression forensic algorithms. In this paper, we present an antiforensic method based on a generative adversarial network (GAN) to attack AMR compression detectors. The GAN framework is utilized to modify double AMR compressed audio so that it has the underlying statistics of single compressed audio. Three state-of-the-art detectors of AMR compression are selected as the targets to be attacked. The experimental results demonstrate that the proposed method is capable of removing the forensically detectable artifacts of AMR compression under various ratios, with an average successful attack rate of about 94.75%, which means that the modified audio generated by our well-trained generator can effectively deceive the forensic detectors. Moreover, we show that the perceptual quality of the generated AMR audio is well preserved.


2020 ◽  
Vol 10 (6) ◽  
pp. 1995 ◽  
Author(s):  
Jeong gi Kwak ◽  
Hanseok Ko

The processing of facial images is an important task, because it is required for a large number of real-world applications. As deep-learning models evolve, they require a huge number of images for training. In reality, however, the number of images available is limited. Generative adversarial networks (GANs) have thus been utilized for database augmentation, but they suffer from unstable training, low visual quality, and a lack of diversity. In this paper, we propose an auto-encoder-based GAN with an enhanced network structure and training scheme for Database (DB) augmentation and image synthesis. Our generator and decoder are divided into two separate modules that each take input vectors for low-level and high-level features; these input vectors affect all layers within the generator and decoder. The effectiveness of the proposed method is demonstrated by comparing it with baseline methods. In addition, we introduce a new scheme that can combine two existing images without the need for extra networks based on the auto-encoder structure of the discriminator in our model. We add a novel double-constraint loss to make the encoded latent vectors equal to the input vectors.

