Scene wheels: Measuring perception and memory of real-world scenes with a continuous stimulus space

2020
Author(s): Gaeun Son, Dirk B. Walther, Michael L. Mack

Abstract. Precisely characterizing mental representations of visual experiences requires careful control of experimental stimuli. Recent work leveraging such stimulus control in continuous report paradigms has led to important insights; however, these findings are constrained to simple visual properties like colour and line orientation. A critical methodological barrier remains to characterizing perceptual and mnemonic representations of realistic visual experiences. Here, we introduce a novel method to systematically control the visual properties of natural scene stimuli. Using generative adversarial networks (GANs), a state-of-the-art deep learning technique for creating highly realistic synthetic images, we generated scene wheels in which continuously changing visual properties smoothly transition between meaningful realistic scenes. To validate the efficacy of scene wheels, we conducted a memory experiment in which participants reconstructed to-be-remembered scenes from the scene wheels. Reconstruction errors for these scenes resemble the error distributions observed in prior studies using simple stimulus properties. Importantly, memory precision varied systematically with scene wheel radius. These findings suggest our novel approach offers a window into the mental representations of naturalistic visual experiences.
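To make the construction concrete, here is a minimal sketch of how latent codes for one scene wheel could be sampled, assuming a pretrained generator G and a Euclidean latent space; the wheel's centre, radius, and plane are illustrative parameters, not the authors' exact pipeline:

import numpy as np

def scene_wheel_latents(center, radius, n_points=360, seed=0):
    """Latent codes evenly spaced on a circle of the given radius.

    The circle lies in a random 2-D plane through `center`; larger radii
    should yield wheels whose neighbouring scenes differ more.
    """
    rng = np.random.default_rng(seed)
    dim = center.shape[0]
    # Two random orthonormal directions span the plane of the wheel.
    basis, _ = np.linalg.qr(rng.standard_normal((dim, 2)))
    angles = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    offsets = np.stack([np.cos(angles), np.sin(angles)], axis=1) @ basis.T
    return center + radius * offsets  # shape: (n_points, dim)

# Decoding each code with the (hypothetical) generator yields one wheel:
# wheel_images = [G(z) for z in scene_wheel_latents(center, radius=8.0)]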

2020 · Vol 34 (03) · pp. 2645-2652
Author(s): Yaman Kumar, Dhruva Sahrawat, Shubham Maheshwari, Debanjan Mahata, Amanda Stent, et al.

Visual Speech Recognition (VSR) is the process of recognizing or interpreting speech by watching the lip movements of the speaker. Recent machine-learning-based approaches model VSR as a classification problem; however, the scarcity of training data leads to error-prone systems with very low accuracy in predicting unseen classes. To solve this problem, we present a novel approach to zero-shot learning by generating new classes using Generative Adversarial Networks (GANs), and show how the addition of unseen class samples increases the accuracy of a VSR system by a significant margin of 27% and allows it to handle speaker-independent out-of-vocabulary phrases. We also show that our models are language agnostic and therefore capable of seamlessly generating, using English training data, videos for a new language (Hindi). To the best of our knowledge, this is the first work to show empirical evidence of the use of GANs for generating training samples of unseen classes in the domain of VSR, hence facilitating zero-shot learning. We make the added videos for new classes publicly available along with our code.
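A minimal sketch of the augmentation step described above, assuming a hypothetical class-conditional video GAN with a generate(z, y) method and a latent_dim attribute; it synthesizes clips for unseen word classes and appends them to the training set:

import torch

def augment_with_unseen(gan, videos, labels, unseen_class_ids, per_class=100):
    """Append GAN-synthesized clips for unseen classes to the data."""
    vids, labs = [videos], [labels]
    for c in unseen_class_ids:
        z = torch.randn(per_class, gan.latent_dim)
        y = torch.full((per_class,), c, dtype=torch.long)
        with torch.no_grad():
            vids.append(gan.generate(z, y))  # synthetic clips for class c
        labs.append(y)
    return torch.cat(vids), torch.cat(labs)

The classifier is then trained on the combined set, so the formerly unseen classes are no longer zero-sample at training time.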


Author(s): Yao Ni, Dandan Song, Xi Zhang, Hao Wu, Lejian Liao

Generative adversarial networks (GANs) have shown impressive results; however, the generator and the discriminator are optimized in a finite parameter space, which means their performance can still be improved. In this paper, we propose a novel approach of adversarial training between one generator and an exponential number of critics, which are sampled from the original discriminative neural network via dropout. Since the discrepancy between the outputs of different sub-networks for the same sample measures the consistency of these critics, we encourage the critics to be consistent on real samples and inconsistent on generated samples during training, while the generator is trained to generate samples that are consistent across critics. Experimental results demonstrate that our method obtains state-of-the-art Inception scores of 9.17 and 10.02 on the supervised CIFAR-10 and unsupervised STL-10 image generation tasks, respectively, and achieves competitive semi-supervised classification results on several benchmarks. Importantly, we demonstrate that our method maintains training stability and alleviates mode collapse.
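The mechanism can be sketched in a few lines of PyTorch: keeping dropout active at scoring time turns one discriminator into many sampled critics, and the variance of their scores serves as the consistency measure. The loss weighting and network definitions are illustrative assumptions:

import torch

def critic_scores(disc, x, n_critics=4):
    """Score x with several dropout-sampled sub-networks of disc."""
    disc.train()  # keep dropout stochastic: each pass is a new critic
    return torch.stack([disc(x) for _ in range(n_critics)])  # (n, batch)

def inconsistency(scores):
    """Per-sample variance across critics; low means consistent."""
    return scores.var(dim=0).mean()

# Discriminator step (sketch): consistent on real, inconsistent on fake.
#   d_loss = adv_loss + lam * (inconsistency(critic_scores(D, real))
#                              - inconsistency(critic_scores(D, fake)))
# Generator step (sketch): make the critics agree on generated samples.
#   g_loss = adv_loss + lam * inconsistency(critic_scores(D, G(z)))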


Author(s): P. J. Soto, J. D. Bermudez, P. N. Happ, R. Q. Feitosa

Abstract. This work investigates unsupervised and semi-supervised representation learning methods based on generative adversarial networks for remote sensing scene classification. The work introduces a novel approach, which consists of a semi-supervised extension of a prior unsupervised method known as MARTA-GAN. The proposed approach was compared experimentally with two baselines on two public datasets, UC-MERCED and NWPU-RESISC45. The experiments assessed the performance of each approach under different amounts of labeled data. The impact of fine-tuning was also investigated. In our analysis, the proposed method delivered the best overall accuracy when labeled samples were scarce, both in absolute value and in variability across multiple runs.
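For context, here is a minimal sketch of the generic semi-supervised GAN objective such an extension typically builds on: the discriminator predicts the K scene classes plus an extra (K+1)-th "fake" class, so labeled, unlabeled, and generated images all provide training signal. This is the standard formulation, not necessarily the exact MARTA-GAN extension:

import torch
import torch.nn.functional as F

def d_loss(disc, x_lab, y_lab, x_unlab, x_fake, num_classes):
    fake_cls = num_classes  # index of the extra "fake" class
    # Labeled images: ordinary classification loss.
    loss_sup = F.cross_entropy(disc(x_lab), y_lab)
    # Unlabeled images: any real class, i.e. NOT the fake class.
    p_unlab = F.softmax(disc(x_unlab), dim=1)
    loss_unlab = -torch.log(1.0 - p_unlab[:, fake_cls] + 1e-8).mean()
    # Generated images: should be assigned to the fake class.
    y_fake = torch.full((x_fake.size(0),), fake_cls,
                        dtype=torch.long, device=x_fake.device)
    loss_fake = F.cross_entropy(disc(x_fake), y_fake)
    return loss_sup + loss_unlab + loss_fake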


2019 · Vol 38 (9) · pp. 698-705
Author(s): Ping Lu, Yuan Xiao, Yanyan Zhang, Nikolaos Mitsakos

A deep-learning-based compressive-sensing technique for the reconstruction of missing seismic traces is introduced. The strength of the proposed approach lies in its ability to overcome the optimization limitations of conventional algorithms for solving inversion problems. It demonstrates how deep generative adversarial networks, equipped with an appropriate loss function that leverages the distribution of the entire survey, can serve as an alternative approach for tackling compressive-sensing problems with high precision and in a computationally efficient manner. The method can be applied to both prestack and poststack seismic data during the processing stage, allowing for superior imaging quality with well-preconditioned and well-sampled field data. To validate the robustness of the proposed approach on field data, we establish the extent to which amplitudes and phase variations in the original data are faithfully preserved, while subsurface consistency is also achieved. Several applications in acquisition and processing, such as decreasing bin size, increasing offset and azimuth sampling, or increasing the fold, can directly and immediately benefit from adopting the proposed technique. Furthermore, interpolation based on generative adversarial networks has been found to produce better-sampled data sets, with stronger regularization and attenuated aliasing, while providing greater fidelity on steep-dip events and in amplitude-variation-with-offset analysis with migration.
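One plausible reading of the reconstruction step, sketched below, treats missing traces as an inpainting problem: the generator fills masked traces, and the loss combines an adversarial term with data fidelity on the observed traces. The networks, mask convention, and weighting are assumptions for illustration, not the authors' exact formulation:

import torch

def generator_loss(gen, disc, gather, mask, lam=100.0):
    """gather: (B, 1, traces, samples); mask is 1 on observed traces."""
    filled = gen(gather * mask)                  # reconstruct the gather
    adv = -disc(filled).mean()                   # fool the discriminator
    fidelity = ((filled - gather) * mask).abs().mean()  # match known traces
    return adv + lam * fidelity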


Author(s): B. Jafrasteh, I. Manighetti, J. Zerubia

Abstract. We develop a novel method based on Deep Convolutional Networks (DCNs) to automate the identification and mapping of fracture and fault traces in optical images. The method employs two DCNs in a two-player game: a first network, called the Generator, learns to segment images so that they resemble the ground truth; a second network, called the Discriminator, measures the differences between the ground truth image and each segmented image and sends its score as feedback to the Generator; based on these scores, the Generator progressively improves its segmentation. As we condition both networks on the ground truth images, the method is called a Conditional Generative Adversarial Network (CGAN). We propose a new loss function for both the Generator and the Discriminator networks to improve their accuracy. Using two criteria and a manually annotated optical image, we compare the generalization performance of the proposed method to that of a classical DCN architecture, U-Net. The comparison demonstrates the suitability of the proposed CGAN architecture. Further work is, however, needed to improve its efficiency.
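A minimal sketch of this two-player game, using a standard conditional adversarial loss plus a segmentation term as a stand-in for the authors' custom loss function; the networks, optimizers, and weighting are illustrative assumptions, and G is assumed to end in a sigmoid:

import torch
import torch.nn.functional as F

def train_step(G, D, opt_g, opt_d, image, gt_mask, lam=10.0):
    pred = G(image)  # predicted fracture/fault segmentation
    # Discriminator: real (image, ground truth) vs fake (image, prediction).
    d_loss = F.softplus(-D(image, gt_mask)).mean() \
             + F.softplus(D(image, pred.detach())).mean()
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: fool the Discriminator and match the ground truth.
    g_loss = F.softplus(-D(image, pred)).mean() \
             + lam * F.binary_cross_entropy(pred, gt_mask)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()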


2020 · Vol 21 (1) · pp. 63-72
Author(s): P G Kuppusamy, K C Ramya, S Sheebha Rani, M Sivaram, Vigneswaran Dhasarathan

Image steganography aims at hiding information in a cover medium in an imperceptible way. While traditional steganography methods used invisible inks and microdots, the digital world has moved to hiding secret content in image and video files. Steganalysis is a closely related field concerned with detecting hidden information in such multimedia files. Many steganography algorithms have been implemented and tested, but most of them fail under steganalysis. To overcome this issue, in this paper we propose using generative adversarial networks for image steganography, with a discriminative model that identifies steganographic images during the training stage, helping to reduce the error rate later during steganalysis. The proposed modified cycle Generative Adversarial Network (Mod Cycle GAN) algorithm is tested on the USC-SIPI database, and the experimental results improve upon the algorithms in the literature. Because the discriminator block evaluates image authenticity, we can modify the embedding algorithm until the discriminator cannot identify the change, thereby increasing robustness.
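The final sentence suggests a simple loop, sketched below under assumptions: a hypothetical embed(cover, secret, strength) routine and a trained discriminator returning a "stego" logit; the embedding is weakened until the discriminator no longer flags the image:

import torch

def embed_until_undetected(embed, disc, cover, secret,
                           strength=1.0, step=0.9, threshold=0.5,
                           max_iters=20):
    """Attenuate the embedding until disc stops detecting it."""
    stego = embed(cover, secret, strength)
    for _ in range(max_iters):
        with torch.no_grad():
            p_stego = torch.sigmoid(disc(stego)).item()  # P(image is stego)
        if p_stego < threshold:  # change no longer identified
            break
        strength *= step         # weaken the embedding and retry
        stego = embed(cover, secret, strength)
    return stego, strength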


Author(s): Run Wang, Felix Juefei-Xu, Lei Ma, Xiaofei Xie, Yihao Huang, et al.

In recent years, generative adversarial networks (GANs) and their variants have achieved unprecedented success in image synthesis. They are widely adopted for synthesizing facial images, which raises security concerns as the fakes spread and fuel misinformation. However, robust detectors of these AI-synthesized fake faces are still in their infancy and are not ready to fully tackle this emerging challenge. In this work, we propose a novel approach, named FakeSpotter, based on monitoring neuron behaviors to spot AI-synthesized fake faces. Studies on neuron coverage and interactions have shown that they can serve as testing criteria for deep learning systems, especially under exposure to adversarial attacks. Here, we conjecture that monitoring neuron behavior can also serve as an asset in detecting fake faces, since layer-by-layer neuron activation patterns may capture subtle features that are important for the fake detector. Experimental results on detecting four types of fake faces synthesized with state-of-the-art GANs, and on evading four perturbation attacks, show the effectiveness and robustness of our approach.
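A minimal sketch of the monitoring idea, assuming a pretrained face-network backbone with convolutional layers and using mean channel activation as the recorded statistic; FakeSpotter's actual criteria may differ:

import torch
import torch.nn as nn

def neuron_pattern(backbone, x, layers):
    """Concatenate per-channel mean activations from the chosen layers."""
    feats = []
    hooks = [l.register_forward_hook(
        lambda m, inp, out: feats.append(out.mean(dim=(2, 3)))
    ) for l in layers]
    with torch.no_grad():
        backbone(x)  # hooks record layer-by-layer activation patterns
    for h in hooks:
        h.remove()
    return torch.cat(feats, dim=1)  # (batch, total_channels)

# A shallow binary classifier then separates real from fake patterns, e.g.
# clf = nn.Sequential(nn.Linear(total_channels, 128), nn.ReLU(),
#                     nn.Linear(128, 1))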


Author(s): David Keetae Park, Seungjoo Yoo, Hyojin Bahng, Jaegul Choo, Noseong Park

Recently, generative adversarial networks (GANs) have shown promising performance in generating realistic images. However, they often struggle to learn the complex underlying modalities in a given dataset, resulting in poor-quality generated images. To mitigate this problem, we present a novel approach called mixture of experts GAN (MEGAN), an ensemble approach of multiple generator networks. Each generator network in MEGAN specializes in generating images with a particular subset of modalities, e.g., an image class. Instead of incorporating a separate step of handcrafted clustering of multiple modalities, our proposed model is trained through end-to-end learning of multiple generators via gating networks, which are responsible for choosing the appropriate generator network for a given condition. We adopt the categorical reparameterization trick so that a categorical decision can be made in selecting a generator while maintaining the flow of gradients. We demonstrate that individual generators learn different and salient subparts of the data and achieve a multiscale structural similarity (MS-SSIM) score of 0.2470 on CelebA and a competitive unsupervised Inception score of 8.33 on CIFAR-10.
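A minimal sketch of the gating step, using the straight-through Gumbel-softmax as the categorical reparameterization trick; the gating network, generators, and image-shaped outputs are illustrative assumptions:

import torch
import torch.nn.functional as F

def generate(gating_net, generators, z, tau=1.0):
    logits = gating_net(z)                                 # (batch, n_gen)
    # Hard one-hot choice; gradients flow through the soft relaxation.
    gate = F.gumbel_softmax(logits, tau=tau, hard=True)    # (batch, n_gen)
    outs = torch.stack([g(z) for g in generators], dim=1)  # (batch, n_gen, C, H, W)
    # The one-hot gate selects exactly one generator's output per sample.
    return (gate.view(*gate.shape, 1, 1, 1) * outs).sum(dim=1)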

