An Enhanced Adversarial Network with Combined Latent Features for Spatio-temporal Facial Affect Estimation in the Wild

Author(s):  
Decky Aspandi ◽  
Federico Sukno ◽  
Björn Schuller ◽  
Xavier Binefa
2021 ◽  
Vol 12 (6) ◽  
pp. 1-20
Author(s):  
Fayaz Ali Dharejo ◽  
Farah Deeba ◽  
Yuanchun Zhou ◽  
Bhagwan Das ◽  
Munsif Ali Jatoi ◽  
...  

Single image super-resolution (SISR) produces a high-resolution (HR) image with fine spatial detail from a remotely sensed image of low spatial resolution. Recently, deep learning and generative adversarial networks (GANs) have made breakthroughs in this challenging task. However, the generated images still suffer from undesirable artifacts such as missing texture-feature representation and high-frequency information. We propose TWIST-GAN, a frequency-domain, spatio-temporal remote sensing SISR technique that reconstructs the HR image with a GAN operating on multiple frequency bands. The method incorporates wavelet transform (WT) characteristics with a transferred GAN: the low-resolution (LR) image is split into frequency bands using the WT, the transferred GAN predicts the high-frequency components through the proposed architecture, and the inverse wavelet transform produces the reconstructed super-resolved image. The model is first trained on the external DIV2K dataset and validated on the UC Merced Landsat remote sensing dataset and Set14, with each image of size 256 × 256. Transferred GANs are then used to process spatio-temporal remote sensing images in order to reduce computational cost and improve texture information. The findings are compared qualitatively and quantitatively with current state-of-the-art approaches. In addition, our simplified version saves about 43% of GPU memory during training and runs faster by eliminating batch-normalization layers.
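The wavelet-domain pipeline described above can be illustrated with a minimal sketch (not the authors' code): a 2-D discrete wavelet transform splits the low-resolution image into sub-bands, a small generator refines the high-frequency bands, and the inverse transform reassembles the super-resolved output. The generator architecture and the Haar basis used here are illustrative assumptions.

import numpy as np
import pywt
import torch
import torch.nn as nn

class HighFreqGenerator(nn.Module):
    """Hypothetical generator: refines the three high-frequency sub-bands
    (LH, HL, HH); the real TWIST-GAN architecture is more elaborate."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, channels, 3, padding=1))
    def forward(self, x):
        return x + self.net(x)            # residual refinement

def super_resolve(lr_image, generator):
    # Wavelet decomposition of the low-resolution input (Haar basis assumed).
    ll, (lh, hl, hh) = pywt.dwt2(lr_image, 'haar')
    hf = np.stack([lh, hl, hh])           # 3 x H x W high-frequency bands
    with torch.no_grad():
        hf_t = torch.from_numpy(hf).float().unsqueeze(0)
        lh2, hl2, hh2 = generator(hf_t).squeeze(0).numpy()
    # Inverse wavelet transform reassembles the enhanced image.
    return pywt.idwt2((ll, (lh2, hl2, hh2)), 'haar')

sr = super_resolve(np.random.rand(256, 256), HighFreqGenerator(channels=3))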


2020 ◽  
Vol 31 ◽  
pp. GCFI31-GCFI41
Author(s):  
Carlos M. Zayas Santiago ◽  
Richard S. Appeldoorn ◽  
Michelle T. Schärer-Umpierre ◽  
Juan J. Cruz-Motta

Passive acoustic monitoring provides a method for studying grouper courtship-associated sounds (CAS). For Red Hind (Epinephelus guttatus), this approach has documented spatio-temporal patterns in their spawning aggregations. This study described vocalizations produced by E. guttatus and their respective behavioral contexts in field and laboratory studies. Five sound types were identified: four calls recorded in captivity and one sound recorded in the wild, labeled as Chorus. Additionally, the Grunt call type recorded was presumed to be produced by a female. Call types consisted of combinations of low-frequency (50-450 Hz) pulses, grunts, and tonal sounds. Common call types exhibited diel and lunar oscillations during the spawning season, with both field and captive recordings peaking daily at 1800 AST and at 8 days after the full moon.


Author(s):  
Md. Zasim Uddin ◽  
Daigo Muramatsu ◽  
Noriko Takemura ◽  
Md. Atiqur Rahman Ahad ◽  
Yasushi Yagi

Gait-based features provide the potential for a subject to be recognized even from a low-resolution image sequence, and they can be captured at a distance without the subject's cooperation. Person recognition using gait-based features (gait recognition) is a promising real-life application. However, several body parts of the subjects are often occluded by beams, pillars, cars, trees, or another walking person. Therefore, gait-based features are not applicable in approaches that require an unoccluded gait image sequence. Occlusion handling is a challenging but important issue for gait recognition. In this paper, we propose silhouette sequence reconstruction from an occluded sequence (sVideo) based on a conditional deep generative adversarial network (GAN). From the reconstructed sequence, we estimate the gait cycle and extract gait features from a single gait-cycle image sequence. To regularize the training of the proposed generative network, we use an adversarial loss based on a triplet hinge loss incorporating a Wasserstein GAN (WGAN-hinge). To the best of our knowledge, WGAN-hinge is the first adversarial loss that supervises the generator network during training by incorporating pairwise similarity ranking information. The proposed approach was evaluated on multiple challenging occlusion patterns. The experimental results demonstrate that the proposed approach outperforms the existing state-of-the-art benchmarks.
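As a rough illustration of how a triplet hinge term might be combined with a Wasserstein critic in the spirit of the WGAN-hinge loss mentioned above, the following PyTorch sketch uses critic feature distances to encode pairwise similarity ranking. The critic architecture, its features method, and the exact loss composition are assumptions for illustration, not the paper's formulation.

import torch
import torch.nn.functional as F

class Critic(torch.nn.Module):
    """Toy critic over flattened silhouette sequences (illustrative only)."""
    def __init__(self, in_dim=64 * 64 * 30, feat_dim=128):
        super().__init__()
        self.embed = torch.nn.Linear(in_dim, feat_dim)
        self.score = torch.nn.Linear(feat_dim, 1)
    def features(self, x):
        return self.embed(x.flatten(1))
    def forward(self, x):
        return self.score(self.features(x))

def critic_wgan_loss(critic, real_seq, fake_seq):
    # Standard WGAN critic objective: push D(real) up and D(fake) down.
    return critic(fake_seq).mean() - critic(real_seq).mean()

def generator_triplet_hinge_loss(critic, fake_seq, same_id_seq, diff_id_seq,
                                 margin=1.0):
    # Rank the reconstructed sequence closer (in critic feature space) to a
    # sequence of the same subject than to one of a different subject.
    anchor = critic.features(fake_seq)
    positive = critic.features(same_id_seq)
    negative = critic.features(diff_id_seq)
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    triplet = F.relu(d_pos - d_neg + margin).mean()
    adversarial = -critic(fake_seq).mean()   # fool the critic
    return adversarial + triplet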


Author(s):  
Takato Sasagawa ◽  
Shin Kawai ◽  
Hajime Nobuhara

A graph convolutional generative adversarial network (GCGAN) is proposed to provide recommendations for new users or items. To maintain scalability, the discriminator is improved to capture the latent features of users and items using graph convolution over a minibatch-sized bipartite graph. Experiments on MovieLens confirmed that the proposed GCGAN outperforms the conventional CFGAN when the MovieLens 1M dataset provides sufficient data. The proposed method can learn domain information of both users and items, and it does not require relearning the model for a new node; it can therefore be applied to any service in the information recommendation field that meets these conditions.
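The following is a rough sketch of a discriminator that applies one graph-convolution step over a minibatch-sized user-item bipartite graph, in the spirit of the description above; the embedding sizes, propagation rule, and scoring head are illustrative assumptions rather than the GCGAN reference implementation.

import torch
import torch.nn as nn

class BipartiteGCNDiscriminator(nn.Module):
    def __init__(self, n_users, n_items, dim=64):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)
        self.gc_user = nn.Linear(dim, dim)   # aggregates item neighbours
        self.gc_item = nn.Linear(dim, dim)   # aggregates user neighbours
        self.score = nn.Linear(2 * dim, 1)

    def forward(self, interactions):
        """interactions: minibatch adjacency (n_users x n_items), entries are
        observed feedback (0/1) or generator-produced soft values."""
        u, v = self.user_emb.weight, self.item_emb.weight
        deg_u = interactions.sum(1, keepdim=True).clamp(min=1)
        deg_v = interactions.sum(0, keepdim=True).clamp(min=1).t()
        # One propagation step across the bipartite graph.
        u_agg = torch.relu(self.gc_user(interactions @ v / deg_u) + u)
        v_agg = torch.relu(self.gc_item(interactions.t() @ u / deg_v) + v)
        # Score every (user, item) pair as real or generated.
        pair = torch.cat([u_agg.unsqueeze(1).expand(-1, v_agg.size(0), -1),
                          v_agg.unsqueeze(0).expand(u_agg.size(0), -1, -1)], -1)
        return torch.sigmoid(self.score(pair)).squeeze(-1)

d = BipartiteGCNDiscriminator(n_users=8, n_items=16)
probs = d(torch.randint(0, 2, (8, 16)).float())   # 8 x 16 realness scores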


Author(s):  
T. Shinohara ◽  
H. Xiu ◽  
M. Matsuoka

This study introduces a novel image-to-3D-point-cloud translation method using a conditional generative adversarial network that creates a large-scale 3D point cloud. The method can generate supervised point clouds, as observed via airborne LiDAR, from aerial images. The network is composed of an encoder that produces latent features of the input images, a generator that translates the latent features into fake point clouds, and a discriminator that classifies point clouds as real or fake. The encoder is a pre-trained ResNet; to overcome the difficulty of generating 3D point clouds of an outdoor scene, we use a FoldingNet with features from the ResNet. After a fixed number of iterations, our generator can produce fake point clouds that correspond to the input image. Experimental results show that our network can learn and generate certain point clouds using the data from the 2018 IEEE GRSS Data Fusion Contest.
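A simplified sketch of this encoder-generator pairing is given below: a ResNet encodes the aerial image into a latent code (pre-trained ImageNet weights would be loaded in practice, as the abstract states), and a FoldingNet-style decoder folds a fixed 2-D grid into a 3-D point cloud conditioned on that code. The grid resolution and layer widths are illustrative, not the paper's values, and the discriminator is omitted.

import torch
import torch.nn as nn
import torchvision

class FoldingGenerator(nn.Module):
    def __init__(self, latent_dim=512, grid_size=45):
        super().__init__()
        xs = torch.linspace(-1, 1, grid_size)
        grid = torch.stack(torch.meshgrid(xs, xs, indexing='ij'), -1)
        self.register_buffer('grid', grid.reshape(-1, 2))   # (N, 2) base grid
        self.fold = nn.Sequential(
            nn.Linear(latent_dim + 2, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 3))                               # fold to (x, y, z)

    def forward(self, z):
        n = self.grid.size(0)
        z_rep = z.unsqueeze(1).expand(-1, n, -1)             # (B, N, latent)
        grid = self.grid.unsqueeze(0).expand(z.size(0), -1, -1)
        return self.fold(torch.cat([z_rep, grid], -1))       # (B, N, 3) points

# ResNet-18 as the image encoder (classification head removed); load
# pre-trained weights here in practice.
resnet = torchvision.models.resnet18(weights=None)
encoder = nn.Sequential(*list(resnet.children())[:-1], nn.Flatten())

image = torch.randn(1, 3, 224, 224)                          # aerial image patch
points = FoldingGenerator()(encoder(image))                  # fake point cloud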


Author(s):  
Shuaitao Zhang ◽  
Yuliang Liu ◽  
Lianwen Jin ◽  
Yaoxiong Huang ◽  
Songxuan Lai

A new method is proposed for removing text from natural images. The challenge is to first accurately localize text at the stroke level and then replace it with a visually plausible background. Unlike previous methods that require image patches to erase scene text, our method, namely the ensconce network (EnsNet), can operate end-to-end on a single image without any prior knowledge. The overall structure is an end-to-end trainable FCN-ResNet-18 network with a conditional generative adversarial network (cGAN). The features of the former are first enhanced by a novel lateral connection structure and then refined by four carefully designed losses: a multiscale regression loss and a content loss, which capture the global discrepancy between features at different levels; and a texture loss and a total variation loss, which primarily target filling the text region and preserving the realism of the background. The latter is a novel local-sensitive GAN, which attentively assesses the local consistency of the text-erased regions. Both qualitative and quantitative sensitivity experiments on synthetic images and the ICDAR 2013 dataset demonstrate that each component of EnsNet is essential to achieving good performance. Moreover, EnsNet significantly outperforms previous state-of-the-art methods in terms of all metrics. In addition, a qualitative experiment on the SBMNet dataset further demonstrates that the proposed method can also perform well on general object removal tasks (such as pedestrians). EnsNet is extremely fast and can run at 333 fps on an i5-8600 CPU device.
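As an illustration of how the named loss terms could be combined when training the eraser network, the sketch below implements a total variation term and a simple multiscale L1 regression term and sums them with placeholder content, texture, and local-sensitive adversarial terms. The weights and the feature-based terms (typically computed on pre-trained VGG features) are assumptions, not the EnsNet authors' settings.

import torch
import torch.nn.functional as F

def total_variation_loss(img):
    # Penalize abrupt horizontal/vertical intensity changes so that the
    # in-painted background stays smooth.
    dh = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()
    dw = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()
    return dh + dw

def multiscale_regression_loss(pred, target, scales=(1, 2, 4)):
    # L1 discrepancy computed at several resolutions of the output.
    loss = 0.0
    for s in scales:
        loss += F.l1_loss(F.avg_pool2d(pred, s), F.avg_pool2d(target, s))
    return loss / len(scales)

def generator_objective(pred, target, d_local_fake,
                        content=0.0, texture=0.0,
                        w=(1.0, 0.05, 120.0, 2.0, 0.1)):
    # Weighted sum of reconstruction, perceptual, and adversarial terms;
    # d_local_fake holds the local-sensitive discriminator's scores on the
    # erased regions of the generated image.
    return (w[0] * multiscale_regression_loss(pred, target)
            + w[1] * content + w[2] * texture
            + w[3] * total_variation_loss(pred)
            + w[4] * -d_local_fake.mean())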

