One-shot generalization in humans revealed through a drawing task

2021
Author(s):  
Henning Tiedemann ◽  
Yaniv Morgenstern ◽  
Filipp Schmidt ◽  
Roland W. Fleming

Humans have the striking ability to learn and generalize new visual concepts from just a single exemplar. We suggest that when presented with a novel object, observers identify its significant features and infer a generative model of its shape, allowing them to mentally synthesize plausible variants. To test this, we showed participants abstract 2D shapes ("Exemplars") and asked them to draw new objects ("Variations") belonging to the same class. We show that this procedure created genuinely novel categories. In line with our hypothesis, particular features of each Exemplar were preserved in its Variations, and there was striking agreement between participants about which shape features were most distinctive. We also show that strategies for creating Variations were strongly driven by part structure: new objects typically modified individual parts (e.g., limbs) of the Exemplar, often preserving part order but sometimes altering it. Together, our findings suggest that sophisticated internal generative models are key to how humans analyze and generalize from single exemplars.

Author(s):  
Masoumeh Zareapoor ◽  
Jie Yang

Image-to-image translation aims to learn a mapping from a source domain to a target domain. However, three main challenges are associated with this problem and need to be addressed: lack of paired datasets, multimodality, and diversity. Despite their strong performance on many computer vision tasks, convolutional neural networks (CNNs) fail to capture the hierarchy of spatial relationships between different parts of an object and thus do not form the ideal representative model we are looking for. This article presents a new variant of generative models that aims to remedy this problem. We use a trainable transformer, which explicitly allows the spatial manipulation of data during training. This differentiable module can be augmented into the convolutional layers of the generative model, allowing the generated distributions to be altered freely for image-to-image translation. To reap the benefits of the proposed module, our architecture incorporates a new loss function to facilitate effective end-to-end generative learning for image-to-image translation. The proposed model is evaluated through comprehensive experiments on image synthesis and image-to-image translation, along with comparisons with several state-of-the-art algorithms.
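The "trainable transformer" described here matches the spirit of a spatial transformer module: a small localization network predicts a warp that is applied differentiably to the feature map. Below is a minimal sketch of such a module, assuming an affine warp; the layer sizes and module names are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialTransformer(nn.Module):
    """Differentiable module that predicts and applies an affine warp
    to its input feature map (a sketch; names are illustrative)."""
    def __init__(self, channels):
        super().__init__()
        # Small localization network that regresses 6 affine parameters.
        self.loc = nn.Sequential(
            nn.Conv2d(channels, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 6),
        )
        # Initialize to the identity transform so training starts stably.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.loc(x).view(-1, 2, 3)  # (N, 2, 3) affine matrices
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
```

Inserted between convolutional layers of a generator, such a module lets gradients from the translation losses reshape spatial structure rather than only per-pixel appearance.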


2019
Vol 2019 (4)
pp. 232-249
Author(s):  
Benjamin Hilprecht ◽  
Martin Härterich ◽  
Daniel Bernau

We present two information leakage attacks that outperform previous work on membership inference against generative models. The first attack allows membership inference without assumptions on the type of the generative model. Contrary to previous evaluation metrics for generative models, such as Kernel Density Estimation, it only considers samples of the model that are close to training data records. The second attack specifically targets Variational Autoencoders, achieving high membership inference accuracy. Furthermore, previous work mostly considers membership inference adversaries who perform single-record membership inference. We argue for considering regulatory actors who perform set membership inference to identify the use of specific datasets for training. The attacks are evaluated on two generative model architectures, Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), trained on standard image datasets. Our results show that the two attacks yield success rates superior to previous work on most datasets while requiring only very mild assumptions. We envision the two attacks, in combination with our formalization of membership inference attack types, as especially useful, for example to enforce data privacy standards and to automatically assess model quality in machine-learning-as-a-service setups. In practice, our work motivates the use of GANs, since they prove less vulnerable to information leakage attacks while producing detailed samples.
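The first attack's core idea, as described, can be sketched in a few lines: score a candidate record by how many generated samples fall unusually close to it, and aggregate those scores over a dataset for set membership inference. The Euclidean metric and the threshold are placeholders, not the paper's exact choices.

```python
import numpy as np

def membership_score(candidate, generated, eps):
    """Fraction of generated samples within distance eps of the candidate.

    candidate: flattened data record, shape (d,)
    generated: samples drawn from the model, shape (n, d)
    A higher score suggests the record influenced training.
    """
    dists = np.linalg.norm(generated - candidate, axis=1)
    return np.mean(dists < eps)

def set_membership_score(records, generated, eps):
    """Set membership inference: average per-record evidence that a
    whole dataset was used for training (sketch of the paper's setting)."""
    return np.mean([membership_score(r, generated, eps) for r in records])
```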


2020
Vol 34 (10)
pp. 13869-13870
Author(s):  
Yijing Liu ◽  
Shuyu Lin ◽  
Ronald Clark

Variational autoencoders (VAEs) have been a successful approach to learning meaningful representations of data in an unsupervised manner. However, suboptimal representations are often learned because the approximate inference model fails to match the true posterior of the generative model, i.e. an inconsistency exists between the learnt inference and generative models. In this paper, we introduce a novel consistency loss that directly requires the encoding of the reconstructed data point to match the encoding of the original data, leading to better representations. Through experiments on MNIST and Fashion MNIST, we demonstrate the existence of the inconsistency in VAE learning and that our method can effectively reduce such inconsistency.
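The proposed consistency loss is concrete enough to sketch: encode the input, decode it, re-encode the reconstruction, and penalize the mismatch between the two encodings. The assumption below that the encoder returns a mean and log-variance, and that an MSE penalty between posterior means is used, is illustrative; the paper may use a different divergence.

```python
import torch
import torch.nn.functional as F

def consistency_loss(encoder, decoder, x):
    """Penalize disagreement between the encoding of x and the encoding
    of its reconstruction (a sketch of the paper's consistency idea)."""
    mu, logvar = encoder(x)                                # q(z|x)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterized sample
    x_rec = decoder(z)                                     # reconstruction
    mu_rec, _ = encoder(x_rec)                             # encoding of reconstruction
    return F.mse_loss(mu_rec, mu)                          # match the two encodings
```

In training, this term would be added to the usual evidence lower bound with a weighting coefficient.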


2020
Vol 34 (07)
pp. 10494-10501
Author(s):  
Tingjia Cao ◽  
Ke Han ◽  
Xiaomei Wang ◽  
Lin Ma ◽  
Yanwei Fu ◽  
...  

This paper studies the task of image captioning with novel objects, which exist only in the testing images. Intrinsically, this task reflects the generalization ability of models in understanding and captioning the semantic meanings of visual concepts and objects unseen in the training set, making it similar in spirit to one/zero-shot learning. The critical difficulty thus comes from the fact that no paired images and sentences of the novel objects are available to help train the captioning model. Inspired by recent work (Chen et al. 2019b) that boosts one-shot learning by learning to generate various image deformations, we propose learning meta-networks that deform features for novel object captioning. To this end, we introduce feature deformation meta-networks (FDM-net), which are trained on source data and learn to adapt to the novel object features detected by an auxiliary detection model. FDM-net includes two sub-nets, feature deformation and scene graph sentence reconstruction, which produce the augmented image features and corresponding sentences, respectively. Thus, rather than directly deforming images, FDM-net can efficiently and dynamically enlarge the set of paired images and texts by learning to deform image features. Extensive experiments are conducted on the widely used novel object captioning dataset, and the results show the effectiveness of our FDM-net. An ablation study and qualitative visualization give further insight into our model.
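A minimal sketch of the feature-deformation idea: a small network perturbs detected object features with noise to synthesize extra paired training data in feature space. Everything here (network shapes, the additive residual deformation) is an illustrative assumption, not FDM-net's actual architecture.

```python
import torch
import torch.nn as nn

class FeatureDeformer(nn.Module):
    """Produces plausible deformations of image features, enlarging the
    paired (feature, sentence) training set (illustrative sketch)."""
    def __init__(self, dim, noise_dim=32):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(dim + noise_dim, dim), nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, feat):
        # Random noise makes each deformation different, so one detected
        # feature yields many augmented training examples.
        noise = torch.randn(feat.size(0), self.noise_dim, device=feat.device)
        # Additive residual keeps the deformed feature near the original.
        return feat + self.net(torch.cat([feat, noise], dim=1))
```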


2020
Vol 34 (04)
pp. 3397-3404
Author(s):  
Oishik Chatterjee ◽  
Ganesh Ramakrishnan ◽  
Sunita Sarawagi

Scarcity of labeled data is a bottleneck for supervised learning models. A paradigm that has evolved to deal with this problem is data programming. The existing data programming paradigm allows human supervision to be provided as a set of discrete labeling functions (LFs) that output possibly noisy labels for input instances, together with a generative model that consolidates the weak labels. We enhance and generalize this paradigm by supporting functions that output a continuous score (instead of a hard label) that noisily correlates with the true label. We show across five applications that continuous LFs are more natural to program and lead to improved recall. We also show that the accuracy of existing generative models is unstable with respect to initialization, training epochs, and learning rates. We give the data programmer control over the training process by allowing an intuitive quality guide to accompany each LF, and we propose an elegant method of incorporating these guides into the generative model. Our overall method, called CAGE, makes the data programming paradigm more reliable than tricks based on initialization, sign penalties, or soft-accuracy constraints.
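To make the discrete-versus-continuous distinction concrete, here is a sketch of both kinds of labeling function for a hypothetical spam task; the trigger words and the quality-guide value are invented for illustration and are not from the paper.

```python
def lf_discrete(text):
    """Classic discrete LF: a hard (possibly noisy) label, or 0 to abstain."""
    return 1 if "free money" in text.lower() else 0  # 1 = spam

def lf_continuous(text):
    """Continuous LF: a score in [0, 1] that noisily correlates with spam-ness."""
    triggers = ["free", "winner", "urgent", "money"]
    hits = sum(word in text.lower() for word in triggers)
    return hits / len(triggers)

# Quality guide: the programmer's rough guess of this LF's accuracy, which
# (per the abstract) CAGE uses to stabilize training of the generative model.
lf_continuous.quality_guide = 0.8
```

The generative model then consolidates these scores across all LFs into a single probabilistic label per instance.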


2021
Vol 118 (16)
pp. e2020324118
Author(s):  
Biwei Dai ◽  
Uroš Seljak

The goal of generative models is to learn the intricate relations within the data in order to create new simulated data, but current approaches fail in very high dimensions. When the true data-generating process is based on physical processes, these impose symmetries and constraints, and the generative model can be created by learning an effective description of the underlying physics, which enables scaling to very high dimensions. In this work, we propose Lagrangian deep learning (LDL) for this purpose, applying it to learn the outputs of cosmological hydrodynamical simulations. The model uses layers of Lagrangian displacements of the particles describing the observables to learn the effective physical laws. The displacements are modeled as the gradient of an effective potential, which explicitly satisfies translational and rotational invariance. The total number of learned parameters is only of order 10, and they can be viewed as effective theory parameters. We combine the fast particle mesh (FastPM) N-body solver with LDL and apply the combination to a wide range of cosmological outputs, from the dark matter to the stellar maps, gas density, and temperature. The computational cost of LDL is nearly four orders of magnitude lower than that of the full hydrodynamical simulations, yet it outperforms them at the same resolution. We achieve this with only of order 10 layers from the initial conditions to the final output, in contrast to the thousands of time steps of typical cosmological simulations. This opens up the possibility of analyzing cosmological observations entirely within this framework, without the need for large dark-matter simulations.
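A schematic of one LDL layer as described: particles are displaced along the gradient of an effective potential built from an isotropically smoothed function of the density field, which keeps the layer translationally and rotationally invariant. The Gaussian smoothing and the learnable exponent and strength below are stand-ins for the paper's learned filtering, not its actual code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def ldl_layer(positions, density, alpha, gamma, smoothing, step):
    """One Lagrangian displacement layer (illustrative sketch).

    positions: (N, 3) particle positions in grid units
    density:   3D density field painted from the particles
    alpha, gamma, smoothing, step: a handful of learnable scalars,
    mirroring the paper's order-10 effective-theory parameters
    """
    # Effective potential: isotropically smoothed power of the density.
    # Smoothing and pointwise powers depend only on relative configuration,
    # so translational and rotational invariance hold by construction.
    potential = gaussian_filter(density ** gamma, smoothing)
    grad = np.stack(np.gradient(potential), axis=-1)  # potential gradient field
    idx = np.floor(positions).astype(int) % density.shape[0]
    # Displace each particle along the local gradient of the potential.
    return positions + step * alpha * grad[idx[:, 0], idx[:, 1], idx[:, 2]]
```

Stacking a handful of such layers on top of a FastPM run plays the role of the thousands of hydrodynamical time steps it replaces.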


2018
Author(s):  
Jan H. Jensen

This paper presents a comparison of a graph-based genetic algorithm (GB-GA) and machine learning (ML) results for the optimisation of logP values with a constraint on synthetic accessibility, and shows that the GA is as good as or better than the ML approaches for this particular property. The molecules found by the GB-GA bear little resemblance to the molecules used to construct the initial mating pool, indicating that the GB-GA approach can traverse a relatively large distance in chemical space using relatively few (50) generations. The paper also introduces a new non-ML graph-based generative model (GB-GM) that can be parameterized using very small datasets and combined with a Monte Carlo tree search (MCTS) algorithm. The results are comparable to previously published results (Sci. Technol. Adv. Mater. 2017, 18, 972-976) using a recurrent neural network (RNN) generative model, while the GB-GM-based method is orders of magnitude faster. The MCTS results appear more dependent on the composition of the training set than the GA approach for this particular property. Our results suggest that the performance of new ML-based generative models should be compared to more traditional, and often simpler, approaches such as GAs.
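A compressed sketch of the GA loop being compared: score a population of molecular graphs, select parents, and apply graph crossover and mutation for a fixed number of generations. The RDKit logP call is real; `gb_crossover` and `gb_mutate` are placeholders for the paper's graph-based operators, and the synthetic-accessibility penalty is omitted for brevity.

```python
import random
from rdkit.Chem import Descriptors

def score(mol):
    # Fitness: logP (the paper additionally penalizes poor synthetic accessibility).
    return Descriptors.MolLogP(mol)

def run_ga(initial_pool, gb_crossover, gb_mutate, generations=50, pop_size=20):
    """Graph-based GA sketch: initial_pool is a list of RDKit Mol objects;
    the two operators act directly on molecular graphs."""
    population = list(initial_pool)
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[:pop_size]
        children = []
        for _ in range(pop_size):
            a, b = random.sample(parents, 2)
            child = gb_mutate(gb_crossover(a, b))  # graph-based operators
            if child is not None:                  # operators may fail on some pairs
                children.append(child)
        population = parents + children
    return max(population, key=score)
```

The 50-generation budget above mirrors the figure quoted in the abstract.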


Algorithms
2020
Vol 13 (12)
pp. 319
Author(s):  
Wang Xi ◽  
Guillaume Devineau ◽  
Fabien Moutarde ◽  
Jie Yang

Generative models for images, audio, text, and other low-dimensional data have achieved great success in recent years. Generating artificial human movements can also be useful for many applications, including the improvement of data augmentation methods for human gesture recognition. The objective of this research is to develop a generative model for skeletal human movement that allows the action type of the generated motion to be controlled while preserving the authenticity of the result and the natural style variability of gesture execution. We propose to use a conditional Deep Convolutional Generative Adversarial Network (DC-GAN) applied to pseudo-images representing skeletal pose sequences in the tree-structure skeleton image format. We evaluate our approach on the 3D skeletal data provided in the large NTU_RGB+D public dataset. Our generative model can output qualitatively correct skeletal human movements for any of the 60 action classes. We also quantitatively evaluate the performance of our model by computing Fréchet inception distances, which show strong correlation with human judgement. To the best of our knowledge, our work is the first successful class-conditioned generative model for human skeletal motions based on a pseudo-image representation of skeletal pose sequences.
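A minimal sketch of the class-conditioning mechanism described: the generator consumes noise concatenated with a one-hot action label and upsamples to a pseudo-image whose pixel channels encode joint coordinates over time. Layer sizes and the concatenation scheme are illustrative assumptions, not the paper's exact network.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Maps (noise, one-hot action class) to a pseudo-image encoding a
    skeletal pose sequence (illustrative dimensions)."""
    def __init__(self, noise_dim=100, n_classes=60):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(noise_dim + n_classes, 256, 4, 1, 0),
            nn.BatchNorm2d(256), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, 4, 2, 1),
            nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 3, 4, 2, 1),
            nn.Tanh(),  # 3 output channels = normalized x, y, z joint coordinates
        )

    def forward(self, z, labels):
        # Condition by concatenating the one-hot action label onto the noise.
        x = torch.cat([z, labels], dim=1).unsqueeze(-1).unsqueeze(-1)
        return self.net(x)
```

The discriminator would receive the same label so that both networks learn the per-class motion distribution.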


Author(s):  
CHAN-SU LEE ◽  
DIMITRIS SAMARAS

Facial expressions convey personal characteristics and subtle emotional states. This paper presents a new framework for modeling the subtle facial motions of different people with different types of expressions, using high-resolution facial expression tracking data, in order to synthesize new stylized subtle facial expressions. A conceptual facial motion manifold is used as a unified representation of facial motion dynamics from three-dimensional (3D) high-resolution facial motions as well as from two-dimensional (2D) low-resolution facial motions. Variations in subtle facial motion across different people and expressions are modeled by nonlinear mappings, using empirical kernel maps, from the embedded conceptual manifold to the input facial motions. We represent facial expressions by a factorized nonlinear generative model, which decomposes expression style factors and expression type factors across different people with multiple expressions. We also provide a mechanism to control the high-resolution facial motion model from low-resolution facial video sequence tracking and analysis. Using the decomposable generative model with a common motion manifold embedding, we can estimate parameters from 2D tracking results to control 3D high-resolution facial expressions, which allows performance-driven control of high-resolution facial expressions.
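The factorized generative model can be sketched as a multilinear contraction: a learned core tensor is combined with a style vector, an expression-type vector, and an empirical kernel map of the manifold coordinate to produce a motion frame. The RBF kernel map and the tensor shapes below are assumptions made for illustration.

```python
import numpy as np

def rbf_features(t, centers, width):
    """Empirical kernel map: embed a manifold coordinate t via RBFs."""
    return np.exp(-((t - centers) ** 2) / (2 * width ** 2))

def synthesize_motion(core, style, expr, t, centers, width=0.1):
    """Factorized nonlinear generative model (illustrative shapes).

    core:  (n_style, n_expr, n_rbf, d_motion) learned weight tensor
    style: (n_style,) person-style coefficients
    expr:  (n_expr,)  expression-type coefficients
    t:     scalar position on the conceptual motion manifold
    """
    psi = rbf_features(t, centers, width)  # nonlinear kernel features, (n_rbf,)
    # Contract the core tensor with the style, expression, and kernel factors.
    return np.einsum('sekd,s,e,k->d', core, style, expr, psi)
```

Performance-driven control then amounts to estimating `style`, `expr`, and `t` from 2D tracking and feeding them back through this map at high resolution.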


2019
Vol 17 (1)
pp. 88-102
Author(s):  
Pierre Cutellic

This article focuses on abstracting and generalising a well-studied paradigm in visual, event-related potential based brain–computer interfaces, from the spelling of characters forming words to the visually encoded discrimination of shape features forming design aggregates. It first identifies typical technologies in neuroscience and neuropsychology of high interest for integrating fast cognitive responses into generative design, and proposes an ensemble of linear classifiers as the machine learning model to tackle the challenging characteristics of electroencephalography data. It then presents experiments in encoding shape features for generative models through a mechanism of visual context updating and a computational implementation of vision as inverse graphics, suggesting that discriminative neural phenomena such as the P300 event-related potential may be used in a visual articulation strategy for modelling in generative design.
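The proposed ensemble of linear classifiers can be sketched as bagged linear discriminants voting on whether an EEG epoch contains a P300 response to an attended shape feature; scikit-learn's BaggingClassifier and LinearDiscriminantAnalysis are used here as stand-ins for the article's implementation, and the feature layout is assumed.

```python
from sklearn.ensemble import BaggingClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def make_p300_ensemble(n_members=25):
    """Ensemble of linear classifiers for noisy EEG epochs (a sketch).

    Expected input: X of shape (n_trials, n_channels * n_samples), with
    y = 1 when the attended shape feature was shown (P300 expected), else 0.
    """
    return BaggingClassifier(
        estimator=LinearDiscriminantAnalysis(),
        n_estimators=n_members,  # many weak linear views of the noisy signal
        max_samples=0.8,         # each member trains on a different subset
    )

# Usage sketch:
#   clf = make_p300_ensemble()
#   clf.fit(X_train, y_train)
#   evidence = clf.predict_proba(X_new)[:, 1]  # per-feature attention evidence
```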

