AUBER: Automated BERT regularization

PLoS ONE ◽  
2021 ◽  
Vol 16 (6) ◽  
pp. e0253241
Author(s):  
Hyun Dong Lee ◽  
Seongmin Lee ◽  
U. Kang

How can we effectively regularize BERT? Although BERT proves its effectiveness in various NLP tasks, it often overfits when there are only a small number of training instances. A promising direction for regularizing BERT is pruning its attention heads using a proxy score for head importance. However, such methods are usually suboptimal: they prune an arbitrarily predetermined number of attention heads and do not directly aim at improving performance. To overcome this limitation, we propose AUBER, an automated BERT regularization method that leverages reinforcement learning to automatically prune the proper attention heads from BERT. We also minimize the model complexity and the action search space by proposing a low-dimensional state representation and a dually-greedy approach for training. Experimental results show that AUBER outperforms existing pruning methods, achieving up to 9.58% better performance. In addition, an ablation study demonstrates the effectiveness of AUBER's design choices.
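
As a rough illustration of the idea, the sketch below casts per-layer head pruning as a small reinforcement learning problem. Everything in it is a stand-in: `dev_accuracy` and the per-head importance scores simulate what a real implementation would obtain by masking heads in a fine-tuned BERT and re-evaluating on a dev set, and the bandit-style update is a simplification of AUBER's actual state representation and dually-greedy training.

```python
import numpy as np

rng = np.random.default_rng(0)
N_LAYERS, N_HEADS = 12, 12
importance = rng.random((N_LAYERS, N_HEADS))    # hypothetical per-head usefulness

def dev_accuracy(mask):
    # Stand-in for masked-BERT dev accuracy: useful heads help, and pruning
    # any head yields a small regularization bonus.
    return (importance * mask).sum() / importance.sum() + 0.003 * (mask == 0).sum()

mask = np.ones((N_LAYERS, N_HEADS))
for layer in range(N_LAYERS):                   # prune layer by layer
    q = np.zeros(N_HEADS)                       # value of "prune head h next"
    for episode in range(200):
        trial, base = mask.copy(), dev_accuracy(mask)
        for _ in range(N_HEADS):
            a = int(rng.integers(N_HEADS)) if rng.random() < 0.2 else int(q.argmax())
            if trial[layer, a] == 0:            # already pruned: end the episode
                break
            trial[layer, a] = 0
            new = dev_accuracy(trial)
            q[a] += 0.3 * ((new - base) - q[a]) # reward = change in dev accuracy
            base = new
    mask[layer, q > 0] = 0                      # keep prunes the agent found helpful

print("heads pruned:", int((mask == 0).sum()), "dev accuracy:", round(dev_accuracy(mask), 4))
```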

2021 ◽  
Vol 15 ◽  
Author(s):  
Zheyu Feng ◽  
Asako Mitsuto Nagase ◽  
Kenji Morita

Procrastination is the voluntary but irrational postponing of a task despite being aware that the delay can lead to worse consequences. It has been extensively studied in psychology, from contributing factors to theoretical models. From the perspective of value-based decision making and reinforcement learning (RL), procrastination has been suggested to result from non-optimal choices caused by cognitive limitations. Exactly what sort of cognitive limitations are involved, however, remains elusive. In the current study, we examined whether a particular type of cognitive limitation, namely inaccurate valuation resulting from inadequate state representation, can cause procrastination. Recent work has suggested that humans may adopt a particular type of state representation called the successor representation (SR) and that humans can learn to represent states by relatively low-dimensional features. Combining these suggestions, we assumed a dimension-reduced version of the SR. We modeled a series of behaviors of a “student” doing assignments during the school term, when putting off the assignments (i.e., procrastinating) is not allowed, and during the vacation, when whether to procrastinate can be freely chosen. We assumed that the “student” had acquired a rigid reduced SR of each state, corresponding to each step in completing an assignment, under the policy without procrastination. The “student” learned the approximated value of each state, computed as a linear function of the state features in the rigid reduced SR, through temporal-difference (TD) learning. During the vacation, the “student” decided at each time step whether to procrastinate based on these approximated values. Simulation results showed that the reduced-SR-based RL model generated procrastination behavior, which worsened across episodes. According to the values approximated by the “student,” procrastinating was the better choice, whereas according to the true values, not procrastinating was mostly better. Thus, the model generated procrastination caused by inaccurate value approximation, which in turn resulted from adopting the reduced SR as the state representation. These findings indicate that the reduced SR, or more generally dimension reduction in state representation, is a potential form of cognitive limitation that leads to procrastination.
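
A toy rendition of this setup appears below: the exact SR of a short assignment-completion chain is computed under an always-work policy, reduced to low rank, and a linear value function is fit by TD(0) on the reduced features. The chain length, discount, and rank are illustrative choices, not the paper's, and whether the approximate values actually flip any choices depends on those choices; the point is the decision rule, which compares "work" against "procrastinate" under approximated versus true values.

```python
import numpy as np

N, gamma = 10, 0.9                        # steps to finish an assignment; discount
P = np.eye(N, k=1); P[-1, -1] = 1.0       # always-work policy: advance one step
SR = np.linalg.inv(np.eye(N) - gamma * P) # exact SR under that policy
U, S, _ = np.linalg.svd(SR)
features = U[:, :3] * S[:3]               # rigid reduced SR (rank-3 features)

r = np.zeros(N); r[-1] = 1.0              # reward for completing the assignment
w, rng = np.zeros(3), np.random.default_rng(0)
for _ in range(20000):                    # TD(0) with linear value approximation
    s = int(rng.integers(N - 1)); s2 = s + 1
    target = r[s2] + (gamma * features[s2] @ w if s2 < N - 1 else 0.0)
    w += 0.005 * (target - features[s] @ w) * features[s]

V_hat = features @ w                                        # approximated values
V_true = np.append(gamma ** np.arange(N - 2, -1, -1), 0.0)  # exact episodic values

def choice(V, s):  # one-step deviation: advance and collect reward, or stay put
    return "work" if r[s + 1] + gamma * V[s + 1] >= gamma * V[s] else "procrastinate"

for s in range(N - 1):
    print(s, "approx:", choice(V_hat, s), "| true:", choice(V_true, s))
```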


Author(s):  
Nicolo Botteghi ◽  
Ruben Obbink ◽  
Daan Geijs ◽  
Mannes Poel ◽  
Beril Sirmacek ◽  
...  

2021 ◽  
Vol 11 (3) ◽  
pp. 1013
Author(s):  
Zvezdan Lončarević ◽  
Rok Pahič ◽  
Aleš Ude ◽  
Andrej Gams

Autonomous robot learning in unstructured environments often faces the problem that the dimensionality of the search space is too large for practical applications. Dimensionality reduction techniques have been developed to address this problem and describe motor skills in low-dimensional latent spaces. Most of these techniques require a sufficiently large database of example task executions to compute the latent space. However, generating many example task executions on a real robot is tedious and prone to errors and equipment failures. The main result of this paper is a new approach to efficient database gathering: performing a small number of task executions with a real robot and applying statistical generalization, e.g., Gaussian process regression, to generate more data. Our experiments show that the data generated this way can be used for dimensionality reduction with autoencoder neural networks. The resulting latent spaces can be exploited to implement robot learning more efficiently. The proposed approach has been evaluated on the problem of robotic throwing at a target. Simulation and real-world results with the humanoid robot TALOS confirm the effectiveness of generalization-based database acquisition and the efficiency of learning in a low-dimensional latent space.
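
The snippet below sketches the database-gathering step: a Gaussian process is fit to a handful of "real" executions of a fake one-dimensional throw-parameter-to-hit-position mapping, then queried densely to synthesize a larger database. The toy dynamics, sizes, and confidence threshold are illustrative, not the paper's TALOS setup.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
theta_real = rng.uniform(0.2, 1.2, size=(8, 1))   # 8 real robot executions
hit_real = np.sin(3 * theta_real).ravel() + 0.01 * rng.standard_normal(8)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), alpha=1e-4)
gp.fit(theta_real, hit_real)                       # statistical generalization

theta_synth = np.linspace(0.2, 1.2, 500).reshape(-1, 1)  # dense synthetic queries
hit_synth, std = gp.predict(theta_synth, return_std=True)
keep = std < 0.05                                  # trust only confident regions
database = np.hstack([theta_synth[keep], hit_synth[keep, None]])
print(f"{keep.sum()} synthetic executions generated from 8 real ones")
# `database` would then feed an autoencoder that learns the low-dimensional latent space.
```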


Author(s):  
Nancy Fulda ◽  
Daniel Ricks ◽  
Ben Murdoch ◽  
David Wingate

Autonomous agents must often detect affordances: the set of behaviors enabled by a situation. Affordance extraction is particularly helpful in domains with large action spaces, allowing the agent to prune its search space by avoiding futile behaviors. This paper presents a method for affordance extraction via word embeddings trained on a tagged Wikipedia corpus. The resulting word vectors are treated as a common knowledge database which can be queried using linear algebra. We apply this method to a reinforcement learning agent in a text-only environment and show that affordance-based action selection improves performance in most cases. Our method increases the computational complexity of each learning step but significantly reduces the total number of steps needed. In addition, the agent's action selections begin to resemble those a human would choose.
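
A minimal sketch of such a query follows, using off-the-shelf GloVe vectors via gensim as a stand-in for the paper's Wikipedia-trained embeddings (the model name triggers a one-time download; the canonical verb-noun pair and candidate vocabulary are illustrative choices). The affordance query is pure linear algebra: "song is to sing as X is to ?".

```python
import numpy as np
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-50")        # downloads on first use

def affordance_scores(noun, verbs, pair=("song", "sing")):
    # Analogy query: a verb affords `noun` if verb ≈ noun - pair_noun + pair_verb.
    target = model[noun] - model[pair[0]] + model[pair[1]]
    target /= np.linalg.norm(target)
    scores = {v: float(model[v] / np.linalg.norm(model[v]) @ target) for v in verbs}
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(affordance_scores("sword", ["swing", "eat", "read", "wield", "drink"]))
# An RL agent over text commands would keep only the top-scoring verbs per object,
# pruning its action space before value learning begins.
```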


2022 ◽  
Author(s):  
Jie Zhou ◽  
Xueyan Wang ◽  
Zhiqingzi Chen ◽  
Libo Zhang ◽  
Chengyu Yao ◽  
...  

Abstract With the rapid development of terahertz technology, terahertz detectors are expected to play a key role in diverse areas such as homeland security, imaging, materials diagnostics, biology, medical sciences, and communications. However, self-powered, rapid-response, room-temperature terahertz photodetectors still face major challenges. Here, we report a novel rapid-response, self-powered terahertz photothermoelectric (PTE) photodetector based on a low-dimensional material: palladium diselenide (PdSe2). An order-of-magnitude performance enhancement was observed in photodetection based on a PdSe2/graphene heterojunction, resulting from the integration of graphene and the enhanced Seebeck effect. Under 0.1 THz and 0.3 THz irradiation, the device displays a stable and repeatable photoresponse at room temperature without bias. Furthermore, rapid rise (5.0 μs) and decay (5.4 μs) times are recorded under 0.1 THz irradiation. Our results demonstrate the promise of the PdSe2-based detector in terms of air stability, sensitivity, and speed, which may find wide application in terahertz detection.


Author(s):  
Brighter Agyemang ◽  
Wei-Ping Wu ◽  
Daniel Addo ◽  
Michael Y Kpiebaareh ◽  
Ebenezer Nanor ◽  
...  

Abstract The size and quality of the chemical libraries feeding the drug discovery pipeline are crucial for developing new drugs or repurposing existing ones. Existing techniques such as combinatorial organic synthesis and high-throughput screening make the process extraordinarily difficult and complicated, since the search space of synthetically feasible drugs is enormous. While reinforcement learning has mostly been exploited in the literature for generating novel compounds, designing a reward function that succinctly represents the learning objective can be daunting in complex domains. Generative adversarial network-based methods can be hard to train and mostly discard the discriminator after training. In this study, we propose a framework for training a compound generator and learning a transferable reward function based on the entropy-maximization inverse reinforcement learning (IRL) paradigm. Our experiments show that the IRL route offers a rational alternative for generating chemical compounds in domains where reward function engineering may be less appealing or impossible, while data exhibiting the desired objective is readily available.
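
The toy below illustrates the entropy-maximization IRL gradient at the heart of such a framework: learn a linear reward over token-count features so that the maximum-entropy generator matches demonstration statistics. Real compound generators work over SMILES strings with a recurrent policy; this strips the domain down to i.i.d. tokens (a hypothetical atom vocabulary and "good" compounds) so the update, demonstration features minus policy features, is easy to see.

```python
import numpy as np

tokens = list("CNOF")                       # hypothetical atom vocabulary
demos = ["CCOC", "CCNC", "CCCC", "CCOF"]    # illustrative "good" compounds
f_demo = np.zeros(len(tokens))
for d in demos:
    for t in d:
        f_demo[tokens.index(t)] += 1
f_demo /= sum(len(d) for d in demos)        # empirical token frequencies

w = np.zeros(len(tokens))                   # reward weights, one per token
for step in range(500):
    policy = np.exp(w) / np.exp(w).sum()    # max-entropy policy for a linear reward
    w += 0.5 * (f_demo - policy)            # IRL gradient: match feature expectations

print(dict(zip(tokens, np.round(policy, 3))))  # generator ≈ demonstration statistics
# The learned w is the transferable reward: any RL generator can be trained against it.
```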


Author(s):  
Vincent Francois-Lavet ◽  
Yoshua Bengio ◽  
Doina Precup ◽  
Joelle Pineau

In the quest for efficient and robust reinforcement learning methods, both model-free and model-based approaches offer advantages. In this paper we propose a new way of explicitly bridging both approaches via a shared low-dimensional learned encoding of the environment, meant to capture summarizing abstractions. We show that the modularity brought by this approach leads to good generalization while being computationally efficient, with planning happening in a smaller latent state space. In addition, this approach recovers a sufficient low-dimensional representation of the environment, which opens up new strategies for interpretable AI, exploration and transfer learning.
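
The snippet below sketches that architecture in PyTorch under invented dimensions: one shared encoder feeds both a model-free Q-head and learned latent dynamics/reward models, and action selection performs a one-step lookahead in the low-dimensional latent space. Training losses are omitted, and nothing here is the paper's exact network.

```python
import torch
import torch.nn as nn

OBS, LATENT, ACTIONS = 16, 3, 4

encoder = nn.Sequential(nn.Linear(OBS, 32), nn.Tanh(), nn.Linear(32, LATENT))
q_head = nn.Linear(LATENT, ACTIONS)              # model-free branch
dynamics = nn.Linear(LATENT + ACTIONS, LATENT)   # latent transition model
reward_head = nn.Linear(LATENT + ACTIONS, 1)     # latent reward model

def plan(obs, gamma=0.95):
    """One-step lookahead in latent space, bootstrapped with the model-free Q-head."""
    z = encoder(obs)
    values = []
    for a in range(ACTIONS):
        za = torch.cat([z, torch.eye(ACTIONS)[a]])   # latent state + one-hot action
        values.append(reward_head(za) + gamma * q_head(dynamics(za)).max())
    return int(torch.stack(values).argmax())

print("planned action:", plan(torch.randn(OBS)))
# Training (not shown) would fit the Q-head with TD errors and the dynamics/reward
# heads with one-step prediction losses, all backpropagated through the shared encoder.
```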

