A mixture of generative models strategy helps humans generalize across tasks

2021 ◽  
Author(s):  
Santiago Herce Castañón ◽  
Pedro Cardoso-Leite ◽  
Irene Altarelli ◽  
C. Shawn Green ◽  
Paul Schrater ◽  
...  

What role do generative models play in the generalization of learning in humans? Our novel multi-task prediction paradigm—where participants complete four sequence learning tasks, each a different instance of a common generative family—allows the separate study of within-task learning (i.e., finding the solution to each task) and across-task learning (i.e., learning a task differently because of past experiences). The very first responses participants make in each task are not yet affected by within-task learning and thus reflect their priors. Our results show that these priors change across successive tasks, increasingly resembling the underlying generative family. We conceptualize multi-task learning as arising from a mixture-of-generative-models learning strategy, whereby participants simultaneously entertain multiple candidate models that compete to explain the experienced sequences. This framework predicts specific error patterns, as well as a gating mechanism for learning, both of which are observed in the data.
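
The sketch below illustrates the mixture-of-generative-models idea in miniature: a learner entertains two candidate Markov models of a binary sequence, lets them compete via accumulated Bayesian evidence, and predicts with the posterior-weighted mixture. The class, the two transition matrices, and the toy sequence are all invented for illustration; this is not the authors' experimental paradigm or fitted model.

```python
import numpy as np

class MixtureOfMarkovModels:
    """Toy learner that entertains several candidate generative models."""

    def __init__(self, transition_matrices, prior=None):
        self.models = [np.asarray(T) for T in transition_matrices]
        k = len(self.models)
        prior = np.full(k, 1.0 / k) if prior is None else np.asarray(prior)
        self.log_post = np.log(prior)

    def predict(self, prev_symbol):
        """Mixture prediction: posterior-weighted average of model predictions."""
        w = np.exp(self.log_post - self.log_post.max())
        w /= w.sum()
        preds = np.stack([T[prev_symbol] for T in self.models])
        return w @ preds

    def update(self, prev_symbol, next_symbol):
        """Competition: each model gains log-evidence for the observed transition."""
        for i, T in enumerate(self.models):
            self.log_post[i] += np.log(T[prev_symbol, next_symbol] + 1e-12)

# Two candidate generative models over the symbols {0, 1}.
repeat = [[0.9, 0.1], [0.1, 0.9]]     # tends to repeat the last symbol
alternate = [[0.1, 0.9], [0.9, 0.1]]  # tends to alternate
learner = MixtureOfMarkovModels([repeat, alternate])

sequence = [0, 1, 0, 1, 0, 1]
for prev, nxt in zip(sequence, sequence[1:]):
    print(f"P(next | {prev}) = {learner.predict(prev).round(3)}, observed {nxt}")
    learner.update(prev, nxt)
```

As evidence accumulates, the posterior concentrates on the alternating model, so the mixture's early predictions play the role of a prior that is gradually reshaped by experience.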

IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Giovanni Buroni ◽  
Bertrand Lebichot ◽  
Gianluca Bontempi

Author(s):  
Julissa Villanueva Llerena

Tractable Deep Probabilistic Models (TPMs) are generative models based on arithmetic circuits that allow for exact marginal inference in linear time. These models have obtained promising results in several machine learning tasks. Like many other models, however, TPMs can produce over-confident, incorrect inferences, especially in regions with little statistical support. In this work, we develop efficient estimators of predictive uncertainty that are robust to data scarcity and outliers. We investigate two approaches: the first measures the variability of the output under perturbations of the model weights; the second captures the variability of the prediction under changes to the model architecture. We evaluate both approaches on challenging tasks such as image completion and multi-label classification.
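
A minimal sketch of the first approach (uncertainty from weight perturbations), using a toy logistic model as a stand-in for a TPM. The function `model_predict`, the noise scale, and the sample count are all assumptions for illustration, not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

def model_predict(weights, x):
    """Hypothetical stand-in model: logistic score of a linear function."""
    return 1.0 / (1.0 + np.exp(-(weights @ x)))

def perturbation_uncertainty(weights, x, sigma=0.05, n_samples=100):
    """Mean and spread of predictions under Gaussian weight perturbations."""
    preds = [
        model_predict(weights + rng.normal(0.0, sigma, size=weights.shape), x)
        for _ in range(n_samples)
    ]
    return float(np.mean(preds)), float(np.std(preds))

w = np.array([1.5, -2.0, 0.5])
x = np.array([1.0, 0.5, -0.2])
mean, spread = perturbation_uncertainty(w, x)
print(f"prediction {mean:.3f} +/- {spread:.3f}")
```

The intuition matches the abstract: where the data give the weights little support, small weight perturbations swing the output widely, and the spread flags the prediction as unreliable.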


2020 ◽  
Vol 34 (02) ◽  
pp. 1316-1323
Author(s):  
Zuozhu Liu ◽  
Thiparat Chotibut ◽  
Christopher Hillar ◽  
Shaowei Lin

Motivated by the celebrated discrete-time model of nervous activity outlined by McCulloch and Pitts in 1943, we propose a novel continuous-time model, the McCulloch-Pitts network (MPN), for sequence learning in spiking neural networks. Our model has a local learning rule, such that synaptic weight updates depend only on information directly accessible to the synapse. By exploiting asymmetry in the connections between binary neurons, we show that the MPN can be trained to robustly memorize multiple spatiotemporal patterns of binary vectors, generalizing the ability of the symmetric Hopfield network to memorize static spatial patterns. In addition, we demonstrate that the model can efficiently learn sequences of binary pictures as well as generative models for experimental neural spike-train data. Our learning rule is consistent with spike-timing-dependent plasticity (STDP), thus providing theoretical grounding for the systematic design of biologically inspired networks with large and robust long-range sequence storage capacity.
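
The abstract's core move generalizes the Hopfield network's symmetric Hebbian rule to an asymmetric one that links each pattern to its successor. Below is a minimal discrete-time sketch of that classic idea (not the paper's continuous-time MPN or its STDP-consistent rule); the pattern count and size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 64, 5
patterns = rng.choice([-1.0, 1.0], size=(T, n))  # a cycle of +/-1 patterns

# Asymmetric Hebbian rule: W accumulates outer(successor, current),
# so updating with sign(W @ x) steps the network through the sequence.
W = sum(np.outer(patterns[(t + 1) % T], patterns[t]) for t in range(T)) / n

state = patterns[0].copy()
for step in range(1, T + 1):
    state = np.sign(W @ state)             # one synchronous update
    target = patterns[step % T]
    overlap = float(state @ target) / n    # 1.0 means perfect recall
    print(f"step {step}: overlap with expected pattern = {overlap:.2f}")
```

With random patterns and modest loading, the overlaps stay near 1.0, showing how asymmetry turns a static attractor memory into a sequence memory.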


PLoS ONE ◽  
2019 ◽  
Vol 14 (9) ◽  
pp. e0221966 ◽  
Author(s):  
Emese Szegedi-Hallgató ◽  
Karolina Janacsek ◽  
Dezso Nemeth

2020 ◽  
Vol 34 (01) ◽  
pp. 979-988
Author(s):  
Wenlin Wang ◽  
Hongteng Xu ◽  
Zhe Gan ◽  
Bai Li ◽  
Guoyin Wang ◽  
...  

We propose a novel graph-driven generative model that unifies multiple heterogeneous learning tasks within the same framework. The proposed model is based on the observation that heterogeneous learning tasks, which correspond to different generative processes, often rely on data with a shared graph structure. Accordingly, our model combines a graph convolutional network (GCN) with multiple variational autoencoders, embedding the nodes of the graph (i.e., samples for the tasks) in a uniform manner while specializing their organization and usage to different tasks. Focusing on healthcare applications (tasks), including clinical topic modeling, procedure recommendation, and admission-type prediction, we demonstrate that our method successfully leverages information across tasks, boosting performance on all of them and outperforming existing state-of-the-art approaches.
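
To make the architectural idea concrete, here is a hedged sketch: one shared graph encoder embeds all nodes, and several task-specific decoder heads specialize the shared latent codes. The layer choices, sizes, and names are invented simplifications, not the paper's model.

```python
import torch
import torch.nn as nn

class GraphVAEHeads(nn.Module):
    """Shared graph encoder feeding multiple task-specific VAE decoders."""

    def __init__(self, n_feats, n_hidden, n_latent, n_tasks):
        super().__init__()
        self.gcn = nn.Linear(n_feats, n_hidden)      # one GCN-style layer: A_hat @ X @ W
        self.to_mu = nn.Linear(n_hidden, n_latent)
        self.to_logvar = nn.Linear(n_hidden, n_latent)
        # one decoder per heterogeneous task, all sharing the same latent nodes
        self.decoders = nn.ModuleList(
            nn.Linear(n_latent, n_feats) for _ in range(n_tasks)
        )

    def forward(self, a_hat, x):
        h = torch.relu(self.gcn(a_hat @ x))          # propagate features over the graph
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return [dec(z) for dec in self.decoders], mu, logvar

# Toy usage: 4 nodes, identity-normalized adjacency, 2 tasks.
a_hat = torch.eye(4)
x = torch.randn(4, 8)
model = GraphVAEHeads(n_feats=8, n_hidden=16, n_latent=4, n_tasks=2)
recons, mu, logvar = model(a_hat, x)
print([r.shape for r in recons])
```

The design choice this illustrates is the one the abstract emphasizes: embedding is shared (so tasks inform one another through the graph), while decoding is specialized per task.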


2021 ◽  
Vol 6 ◽  
Author(s):  
Melina Klepsch ◽  
Tina Seufert

In cognitive load theory (CLT), the role of different types of cognitive load is still under debate. Intrinsic cognitive load (ICL) and germane cognitive load (GCL) are assumed to be highly interlinked but provide different perspectives: while ICL mirrors the externally given task affordances that learners experience passively, germane resources are invested by the learner actively. Extraneous load (ECL) is likewise experienced passively. The distinction between passively experienced load and actively invested resources was inspired by an investigation in which we found differential effects of a learning-strategy training, which reduced passive load and increased actively invested effort. This distinction is also mirrored in the active and passive forms for effort in the German language: "es war anstrengend" (it was strenuous) vs. "ich habe mich angestrengt" (I exerted myself). In two instructional design studies, we analyzed whether these phrases can distinguish the active and passive aspects of load, and how this distinction relates to the tripartite structure of CLT, by including the active and passive items in a differentiated cognitive load questionnaire. We found the factor structure to be stable, with the passive item loading on the ICL factor and the active item loading on the GCL factor. We conclude that it is possible to distinguish between active and passive aspects of load and that further research on this topic could be constructive, especially for learning tasks where learners act in a more self-regulated way and learner characteristics are taken into account.


Author(s):  
Tarek Iraki ◽  
Norbert Link

Variations of dedicated process conditions (such as workpiece and tool properties) yield different process state evolutions, which are reflected in different time series of the observable quantities (process curves). A novel method is presented that, first, extracts the statistical influence of these conditions on the process curves and represents it via generative models, and, second, represents their influence on the ensemble of curves by transformations of the representation space. A latent variable space is derived from sampled process data, which represents the curves with only a few features. Generative models are formed from conditional probability functions estimated in this space. Furthermore, the influence of conditions on the ensemble of process curves is represented by estimated transformations of the feature space, which map the process curve densities under different conditions onto each other. The latent space is formed via multi-task learning of an autoencoder and condition detectors; the latter classify the latent space representations of the process curves into the considered conditions. The Bayes framework and the multi-task learning models are used to obtain the process curve probability densities from the latent space densities. The method is shown to reveal and represent the influence of combinations of workpiece and tool properties on resistance spot welding process curves.
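
To make the multi-task latent space concrete, here is a simplified sketch in which an autoencoder and a condition detector are trained jointly on the same latent codes. The dimensions, loss weight, and toy data are assumptions, and the density transformations described above are omitted.

```python
import torch
import torch.nn as nn

class CurveAutoencoder(nn.Module):
    """Autoencoder for process curves with a condition-detector head."""

    def __init__(self, curve_len=100, n_latent=4, n_conditions=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(curve_len, 32), nn.ReLU(),
                                     nn.Linear(32, n_latent))
        self.decoder = nn.Sequential(nn.Linear(n_latent, 32), nn.ReLU(),
                                     nn.Linear(32, curve_len))
        self.condition_head = nn.Linear(n_latent, n_conditions)

    def forward(self, curves):
        z = self.encoder(curves)
        return self.decoder(z), self.condition_head(z), z

model = CurveAutoencoder()
curves = torch.randn(16, 100)        # batch of toy process curves
labels = torch.randint(0, 3, (16,))  # toy condition labels

recon, logits, z = model(curves)
# Joint objective: reconstruct the curves AND make conditions separable in z,
# so the latent space both compresses curves and organizes them by condition.
loss = nn.functional.mse_loss(recon, curves) \
       + 0.1 * nn.functional.cross_entropy(logits, labels)
loss.backward()
print(float(loss))
```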


Author(s):  
Hiram Calvo ◽  
Kentaro Inui ◽  
Yuji Matsumoto

Learning verb argument preferences has been approached as a verb-argument problem, or at most as a ternary relationship between subject, verb, and object. However, the simultaneous correlation of all arguments in a sentence has not been explored thoroughly for measuring sentence plausibility, because of the increased number of potential combinations and data sparseness. In this work the authors review some common methods for learning argument preferences, beginning with the simplest case of binary co-relations, then comparing with ternary co-relations, and finally considering all arguments. For the latter, the authors use an ensemble machine learning model that combines discriminative and generative models, using co-occurrence and semantic features in different arrangements. They seek to answer questions about the optimal number of topics required for PLSI and LDA models, as well as the number of co-occurrences that should be required to improve performance. They explore the implications of different ways of projecting co-relations, i.e., into a word space or directly into a co-occurrence feature space. The authors conducted tests using a pseudo-disambiguation task, learning from large corpora extracted from the Internet.
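
A toy sketch of the pseudo-disambiguation evaluation mentioned above: verb-argument plausibility is scored with add-alpha smoothed co-occurrence counts, and a real argument is tested against a random confounder. The miniature corpus, smoothing constant, and vocabulary size are invented; the authors' actual features and models (PLSI, LDA, ensembles) are far richer.

```python
from collections import Counter

# Tiny invented corpus of (verb, argument) pairs.
corpus = [
    ("eat", "pasta"), ("eat", "bread"), ("drink", "water"),
    ("drink", "coffee"), ("read", "book"), ("read", "article"),
]
counts = Counter(corpus)
verb_counts = Counter(v for v, _ in corpus)

def plausibility(verb, arg, alpha=0.1, vocab_size=6):
    """Add-alpha smoothed conditional probability P(arg | verb)."""
    return (counts[(verb, arg)] + alpha) / (verb_counts[verb] + alpha * vocab_size)

# Pseudo-disambiguation: does the attested pair outscore a confounded one?
test_pairs = [(("eat", "bread"), ("eat", "water")),
              (("read", "book"), ("read", "coffee"))]
correct = sum(plausibility(*real) > plausibility(*fake)
              for real, fake in test_pairs)
print(f"accuracy: {correct}/{len(test_pairs)}")
```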

