Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning

Author(s):  
Mitchell Wortsman ◽  
Kiana Ehsani ◽  
Mohammad Rastegari ◽  
Ali Farhadi ◽  
Roozbeh Mottaghi
Entropy ◽  
2021 ◽  
Vol 23 (1) ◽  
pp. 126
Author(s):  
Sharu Theresa Jose ◽  
Osvaldo Simeone

Meta-learning, or “learning to learn”, refers to techniques that infer an inductive bias from data corresponding to multiple related tasks with the goal of improving sample efficiency on new, previously unobserved tasks. A key performance measure for meta-learning is the meta-generalization gap, that is, the difference between the average loss measured on the meta-training data and on a new, randomly selected task. This paper presents novel information-theoretic upper bounds on the meta-generalization gap. Two broad classes of meta-learning algorithms are considered that use either separate within-task training and test sets, like model-agnostic meta-learning (MAML), or joint within-task training and test sets, like Reptile. Extending existing work on conventional learning, an upper bound on the meta-generalization gap is derived for the former class that depends on the mutual information (MI) between the output of the meta-learning algorithm and its input meta-training data. For the latter, the derived bound includes an additional MI between the output of the per-task learning procedure and the corresponding dataset to capture within-task uncertainty. Tighter bounds are then developed for the two classes via novel individual-task MI (ITMI) bounds. Applications of the derived bounds are finally discussed, including a broad class of noisy iterative algorithms for meta-learning.
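For orientation, the conventional-learning bound that this analysis extends (the Xu–Raginsky mutual-information bound, restated here from memory rather than taken from the paper) has the following form:

```latex
% Conventional-learning MI bound that the meta-learning results extend;
% it assumes the loss \ell(w, Z) is \sigma-sub-Gaussian for every w.
\[
  \Bigl|\, \mathbb{E}\bigl[ L_\mu(W) - L_S(W) \bigr] \,\Bigr|
  \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(W; S)},
\]
% where W is the algorithm output, S = (Z_1, \dots, Z_n) is the training set,
% L_\mu is the population risk, and L_S is the empirical risk.
```

The paper's bounds replace S with the meta-training data and, for the class with joint within-task training and test sets, add the per-task MI term described above.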


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Yihui Quek ◽  
Stanislav Fort ◽  
Hui Khoon Ng

Current algorithms for quantum state tomography (QST) are costly both on the experimental front, requiring measurement of many copies of the state, and on the classical computational front, needing a long time to analyze the gathered data. Here, we introduce neural adaptive quantum state tomography (NAQT), a fast, flexible machine-learning-based algorithm for QST that adapts measurements and provides orders of magnitude faster processing while retaining state-of-the-art reconstruction accuracy. As in other adaptive QST schemes, measurement adaptation makes use of the information gathered from previously measured copies of the state to perform a targeted sensing of the next copy, maximizing the information gathered from that next copy. Our NAQT approach allows for a rapid and seamless integration of measurement adaptation and statistical inference, using a neural-network replacement of the standard Bayes’ update, to obtain the best estimate of the state. Our algorithm, which falls into the machine learning subfield of “meta-learning” (in effect “learning to learn” about quantum states), does not require any ansatz about the form of the state to be estimated. Despite this generality, it can be retrained within hours on a single laptop for a two-qubit situation, which suggests a feasible time-cost when extended to larger systems and potential speed-ups if provided with additional structure, such as a state ansatz.
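As a sketch of the core mechanism (module names, dimensions, measurement encoding, and the residual update below are our assumptions, not taken from the paper), the neural-network replacement of the Bayes’ update can be thought of as a map from (current estimate, measurement setting, outcome) to an updated estimate:

```python
# Hypothetical sketch: a network standing in for the explicit Bayes' update in
# adaptive quantum state tomography. All names, sizes, and encodings are ours.
import torch
import torch.nn as nn

DIM = 4                      # two-qubit Hilbert space
P = 2 * DIM * DIM            # real parameters of an unconstrained complex matrix
N_SETTINGS = 9               # assumed number of available measurement settings
N_OUTCOMES = 4               # assumed number of outcomes per setting


def params_to_rho(p: torch.Tensor) -> torch.Tensor:
    """Map an unconstrained real vector to a valid density matrix A A^dag / tr."""
    a = torch.complex(p[:P // 2], p[P // 2:]).reshape(DIM, DIM)
    rho = a @ a.conj().T
    return rho / torch.trace(rho).real


class NeuralBayesUpdate(nn.Module):
    """One update step: (current estimate, measurement setting, outcome) -> new estimate."""

    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(P + N_SETTINGS + N_OUTCOMES, hidden),
            nn.ReLU(),
            nn.Linear(hidden, P),
        )

    def forward(self, estimate, setting, outcome):
        # Residual update keeps the new estimate close to the old one by default.
        return estimate + self.net(torch.cat([estimate, setting, outcome], dim=-1))


# Usage: run one simulated update and convert the result to a density matrix.
update = NeuralBayesUpdate()
estimate = torch.zeros(P)
setting = nn.functional.one_hot(torch.tensor(2), N_SETTINGS).float()
outcome = nn.functional.one_hot(torch.tensor(0), N_OUTCOMES).float()
rho = params_to_rho(update(estimate, setting, outcome).detach())
print(rho.shape, torch.trace(rho).real.item())   # (4, 4), trace ~ 1.0
```

In the actual scheme the network is trained so that repeated updates track the Bayesian posterior estimate; the training objective is omitted here.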


Author(s):  
Lin Lan ◽  
Zhenguo Li ◽  
Xiaohong Guan ◽  
Pinghui Wang

Despite significant progress, deep reinforcement learning (RL) suffers from data inefficiency and limited generalization. Recent efforts apply meta-learning to learn a meta-learner from a set of RL tasks such that a novel but related task can be solved quickly. Although tasks in meta-RL differ in their specifics, they are generally similar at a high level. However, most meta-RL methods do not explicitly and adequately model the task-specific and shared information among different tasks, which limits their ability both to learn the training tasks and to generalize to novel tasks. In this paper, we propose to capture the shared information on the one hand and to meta-learn how to quickly abstract the task-specific information on the other. Methodologically, we train an SGD meta-learner to quickly optimize a task encoder for each task, which generates a task embedding based on past experience. Meanwhile, we learn a policy that is shared across all tasks and conditioned on the task embeddings. Empirical results on four simulated tasks demonstrate that our method has better learning capacity on both training and novel tasks and attains up to 3 to 4 times higher returns than the baselines.
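A rough sketch of this architecture, with all module names, sizes, and the inner-loop objective being our assumptions (the outer update of the SGD meta-learner itself is omitted):

```python
# Rough sketch: a per-task-adapted task encoder plus a shared, embedding-
# conditioned policy. Sizes, the inner objective, and the data are placeholders.
import torch
import torch.nn as nn

OBS, ACT, EMB = 8, 2, 16


class TaskEncoder(nn.Module):
    """Aggregate past (obs, act, reward, next_obs) transitions into a task embedding."""

    def __init__(self):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(2 * OBS + ACT + 1, 64), nn.ReLU(), nn.Linear(64, EMB))

    def forward(self, transitions):
        # transitions: (T, 2*OBS + ACT + 1); mean-pool over time for a fixed-size embedding.
        return self.phi(transitions).mean(dim=0)


class ConditionedPolicy(nn.Module):
    """Policy shared across tasks, conditioned on the task embedding."""

    def __init__(self):
        super().__init__()
        self.pi = nn.Sequential(nn.Linear(OBS + EMB, 64), nn.Tanh(), nn.Linear(64, ACT))

    def forward(self, obs, z):
        return self.pi(torch.cat([obs, z], dim=-1))


encoder, policy = TaskEncoder(), ConditionedPolicy()
inner_opt = torch.optim.SGD(encoder.parameters(), lr=1e-2)

# Inner loop: a few SGD steps adapt the encoder to one task from its past experience.
transitions = torch.randn(32, 2 * OBS + ACT + 1)               # placeholder experience
obs, target_act = torch.randn(32, OBS), torch.randn(32, ACT)   # placeholder supervision
for _ in range(3):
    z = encoder(transitions)
    loss = ((policy(obs, z.expand(32, EMB)) - target_act) ** 2).mean()  # stand-in RL objective
    inner_opt.zero_grad()
    loss.backward()
    inner_opt.step()
```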


2021 ◽  
Author(s):  
Francesco Poli ◽  
Tommaso Ghilardi ◽  
Rogier B. Mars ◽  
Max Hinne ◽  
Sabine Hunnius

Infants learn to navigate the complexity of the physical and social world at an outstanding pace, but how they accomplish this learning is still unknown. Recent advances in human and artificial intelligence research propose that a key feature to achieve quick and efficient learning is meta-learning, the ability to make use of prior experiences to optimize how future information is acquired. Here we show that 8-month-old infants successfully engage in meta-learning within very short timespans. We developed a Bayesian model that captures how infants attribute informativity to incoming events, and how this process is optimized by the meta-parameters of their hierarchical models over the task structure. We fitted the model using infants’ gaze behaviour during a learning task. Our results reveal that infants do not simply accumulate experiences, but actively use them to generate new inductive biases that allow learning to proceed faster in the future.
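As a toy illustration only (the paper's hierarchical model and its meta-parameters are considerably richer), one simple way to attribute informativity to an incoming event is the information gain it yields under a Dirichlet-categorical belief over where events occur:

```python
# Toy illustration, not the paper's model: informativity of an event scored as
# the KL divergence between the updated and prior predictive beliefs.
import numpy as np


def informativity(alpha: np.ndarray, location: int) -> float:
    """Information gain of observing an event at `location` under a Dirichlet-categorical belief."""
    prior = alpha / alpha.sum()
    post_alpha = alpha.copy()
    post_alpha[location] += 1.0
    posterior = post_alpha / post_alpha.sum()
    return float(np.sum(posterior * np.log(posterior / prior)))  # KL(posterior || prior)


# An event at an unexpected location is more informative than a repeat of a familiar one.
belief = np.array([10.0, 1.0, 1.0, 1.0])
print(informativity(belief, 2))   # surprising location -> larger gain
print(informativity(belief, 0))   # expected location  -> smaller gain
```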


2020 ◽  
Vol 6 (9) ◽  
pp. eaaw2140 ◽  
Author(s):  
Alex Luedtke ◽  
Marco Carone ◽  
Noah Simon ◽  
Oleg Sofrygin

Traditionally, statistical procedures have been derived via analytic calculations whose validity often relies on sample size growing to infinity. We use tools from deep learning to develop a new approach, adversarial Monte Carlo meta-learning, for constructing optimal statistical procedures. Statistical problems are framed as two-player games in which Nature adversarially selects a distribution that makes it difficult for a statistician to answer the scientific question using data drawn from this distribution. The players’ strategies are parameterized via neural networks, and optimal play is learned by modifying the network weights over many repetitions of the game. Given sufficient computing time, the statistician’s strategy is (nearly) optimal at the finite observed sample size, rather than in the hypothetical scenario where sample size grows to infinity. In numerical experiments and data examples, this approach performs favorably compared to standard practice in point estimation, individual-level predictions, and interval estimation.
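A toy sketch of the two-player game, where the estimation problem, network sizes, and the candidate grid are our choices rather than the paper's: Nature maintains a mixed strategy over candidate distributions, the statistician's estimator is a neural network, and the two are updated adversarially over many repetitions of the simulated game.

```python
# Toy adversarial Monte Carlo meta-learning: a minimax point estimator of a
# Gaussian mean restricted to [-1, 1], learned from n = 10 observations.
import torch
import torch.nn as nn

n, grid = 10, torch.linspace(-1.0, 1.0, 21)            # sample size; Nature's candidate means
nature_logits = nn.Parameter(torch.zeros(len(grid)))   # Nature's mixed strategy over the grid
statistician = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, 1))

opt_s = torch.optim.Adam(statistician.parameters(), lr=1e-3)
opt_n = torch.optim.Adam([nature_logits], lr=1e-2)

for step in range(2000):
    probs = torch.softmax(nature_logits, dim=0)
    data = grid.view(-1, 1, 1) + torch.randn(len(grid), 64, n)   # simulated datasets per mean
    est = statistician(data).squeeze(-1)                          # (grid, batch) estimates
    risk = ((est - grid.view(-1, 1)) ** 2).mean(dim=1)            # Monte Carlo risk per mean
    expected_risk = (probs * risk).sum()

    opt_s.zero_grad()
    opt_n.zero_grad()
    expected_risk.backward()
    opt_s.step()                  # the statistician descends the expected risk ...
    nature_logits.grad.neg_()     # ... while Nature ascends it (adversarial step)
    opt_n.step()
```

After enough repetitions, `statistician` approximates a minimax estimator at this finite sample size, mirroring the game described above on a deliberately simple problem.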


Author(s):  
Hadi S. Jomaa ◽  
Lars Schmidt-Thieme ◽  
Josif Grabocka

Meta-learning, or learning to learn, is a machine learning approach that utilizes prior learning experiences to expedite the learning process on unseen tasks. As a data-driven approach, meta-learning requires meta-features that represent the primary learning tasks or datasets; these have traditionally been estimated as engineered dataset statistics that require expert domain knowledge tailored to every meta-task. In this paper, first, we propose a meta-feature extractor called Dataset2Vec that combines the versatility of engineered dataset meta-features with the expressivity of meta-features learned by deep neural networks. Primary learning tasks or datasets are represented as hierarchical sets, i.e., as a set of sets, specifically as a set of predictor/target pairs, and a DeepSet architecture is then employed to regress meta-features on them. Second, we propose a novel auxiliary meta-learning task with abundant data, called dataset similarity learning, that aims to predict whether two batches stem from the same dataset or from different ones. In an experiment using large-scale hyperparameter optimization across 120 UCI datasets with varying schemas as the meta-learning task, we show that the meta-features of Dataset2Vec outperform expert-engineered meta-features, thus demonstrating the usefulness of learned meta-features for datasets with varying schemas for the first time.
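An illustrative sketch of the set-of-sets idea (layer sizes, the pooling scheme, and the similarity kernel are our assumptions): each (predictor, target) pair is embedded, pooled over rows and then over columns into a fixed-size meta-feature vector, and two batches can then be scored as coming from the same dataset or not.

```python
# Illustrative sketch of a hierarchical DeepSet meta-feature extractor for
# tabular datasets with arbitrary schemas, plus the auxiliary similarity loss.
import torch
import torch.nn as nn


class Dataset2VecSketch(nn.Module):
    def __init__(self, hidden: int = 64, out: int = 32):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, hidden))  # per (x_ij, y_i) pair
        self.g = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())                        # per feature column
        self.h = nn.Linear(hidden, out)                                                     # whole dataset

    def forward(self, X: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # X: (n_rows, n_cols) with any n_rows / n_cols; y: (n_rows,)
        pairs = torch.stack([X, y.unsqueeze(1).expand_as(X)], dim=-1)   # (rows, cols, 2)
        per_pair = self.f(pairs)                                        # (rows, cols, hidden)
        per_feature = self.g(per_pair.mean(dim=0))                      # pool over rows
        return self.h(per_feature.mean(dim=0))                          # pool over columns


model = Dataset2VecSketch()
Xa, ya = torch.randn(50, 7), torch.randn(50)     # batch from dataset A (7 features)
Xb, yb = torch.randn(80, 3), torch.randn(80)     # batch from dataset B (3 features)
za, zb = model(Xa, ya), model(Xb, yb)

# Auxiliary dataset-similarity objective: batches from the same dataset should
# receive close meta-features; here similarity is turned into a probability.
p_same = torch.exp(-torch.norm(za - zb)).clamp(1e-6, 1 - 1e-6)
loss = nn.functional.binary_cross_entropy(p_same, torch.tensor(0.0))  # label 0: different datasets
```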


Author(s):  
Wenqian Liang ◽  
Ji Wang ◽  
Weidong Bao ◽  
Xiaomin Zhu ◽  
Qingyong Wang ◽  
...  

Multi-agent reinforcement learning (MARL) methods have shown superior performance in solving a variety of real-world problems by learning distinct policies for individual tasks. These approaches face problems when applied to the non-stationary real world: agents trained on specialized tasks cannot achieve satisfactory generalization performance across multiple tasks, they have to learn and store a specialized policy for each individual task, and reliable task identities are hardly observable in practice. To address the challenge of continuously adapting to multiple tasks in MARL, we formalize the problem as a two-stage curriculum. Single-task policies are first learned with MARL approaches; we then develop a gradient-based Self-Adaptive Meta-Learning algorithm, SAML, that not only distills single-task policies into a unified policy but also enables the unified policy to continuously adapt to new incoming tasks. In addition, to validate continuous-adaptation performance on complex tasks, we extend the widely adopted StarCraft benchmark SMAC and develop a new multi-task multi-agent StarCraft environment, Meta-SMAC, for testing various aspects of continuous adaptation. Our experiments with a population of agents show that our method enables significantly more efficient adaptation than reactive baselines across different scenarios.
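A schematic sketch of the two-stage idea, where the distillation loss and the Reptile-style outer update are our stand-ins rather than the exact SAML update: per-task teacher policies are distilled into one unified policy, which is then meta-updated so that a few gradient steps suffice for a new task.

```python
# Schematic sketch: distill single-task teacher policies into a unified policy
# and meta-update it with a Reptile-style step (a stand-in, not the SAML rule).
import copy
import torch
import torch.nn as nn

OBS, ACT = 12, 5


def make_policy():
    return nn.Sequential(nn.Linear(OBS, 64), nn.ReLU(), nn.Linear(64, ACT))


teachers = [make_policy() for _ in range(3)]       # stand-ins for trained single-task policies
unified = make_policy()
meta_lr, inner_lr, inner_steps = 0.1, 1e-2, 5

for outer in range(100):
    task = torch.randint(len(teachers), (1,)).item()
    student = copy.deepcopy(unified)
    opt = torch.optim.SGD(student.parameters(), lr=inner_lr)
    obs = torch.randn(64, OBS)                      # placeholder on-task observations
    for _ in range(inner_steps):                    # inner loop: distill this task's teacher
        loss = ((student(obs) - teachers[task](obs).detach()) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Outer update: move the unified policy toward the task-adapted weights.
    with torch.no_grad():
        for p, q in zip(unified.parameters(), student.parameters()):
            p += meta_lr * (q - p)
```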


Author(s):  
Taisei Sugiyama ◽  
Nicolas Schweighofer ◽  
Jun Izawa

Reinforcement learning enables the brain to learn optimal action selection, such as go or not go, by forming state-action and action-outcome associations. Does this mechanism also optimize the brain’s willingness to learn, such as learn or not learn? Learning to learn from rewards, i.e., reinforcement meta-learning, is a crucial mechanism for machines to develop flexibility in learning, and it has also been hypothesized to operate in the brain, though without empirical examination. Here, we show that humans learn to learn or not to learn so as to maximize rewards in visuomotor learning tasks. We also show that this regulation of learning is not a motivational bias but the result of an instrumental, active process that takes the learning-outcome structure into account. Our results thus demonstrate the existence of reinforcement meta-learning in the human brain. Because motor learning is a process of minimizing sensory errors, our findings uncover an essential mechanism of the interaction between reward and error.

