Unregistered Biological Words Recognition by Q-Learning with Transfer Learning

2014, Vol 2014, pp. 1-9
Author(s): Fei Zhu, Quan Liu, Hui Wang, Xiaoke Zhou, Yuchen Fu

Unregistered biological words recognition is the process of identifying terms that are out of vocabulary. Although many approaches have been developed, their performance is not satisfactory. As the identification process can be viewed as a Markov process, we put forward a Q-learning with transfer learning algorithm to detect unregistered biological words from texts. With Q-learning, the recognizer can attain the optimal identification solution during its interaction with the texts and contexts. During processing, a transfer learning approach is utilized to take full advantage of the knowledge gained in a source task to speed up learning in a different but related target task. A mapping relating features from the source task to the target task, required by many transfer learning methods, is carried out automatically under the reinforcement learning framework. We examined the performance of three approaches with the GENIA corpus and JNLPBA04 data. The proposed approach improved performance in both experiments. The precision, recall, and F-score results of our approach surpassed those of a conventional unregistered word recognizer as well as those of a Q-learning approach without transfer learning.
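
Below is a minimal sketch of the mechanism described above: tabular Q-learning whose Q-table is warm-started from a source task via a state mapping. The action set, the mapping, and all names are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict

# Hypothetical sketch: Q-learning warm-started from a source task.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
ACTIONS = ["extend", "accept", "reject"]  # assumed word-boundary actions

def transfer_init(source_q, map_state):
    """Initialize the target Q-table by mapping source states to target states."""
    target_q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})
    for s, values in source_q.items():
        target_q[map_state(s)].update(values)
    return target_q

def q_update(q, s, a, r, s_next):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[s_next].values())
    q[s][a] += ALPHA * (r + GAMMA * best_next - q[s][a])

def choose_action(q, s):
    """Epsilon-greedy selection over the candidate labeling actions."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(q[s], key=q[s].get)
```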

2017, Vol 59, pp. 495-541
Author(s): Ramya Ramakrishnan, Chongjie Zhang, Julie Shah

In this work, we design and evaluate a computational learning model that enables a human-robot team to co-develop joint strategies for performing novel tasks that require coordination. The joint strategies are learned through "perturbation training," a human team-training strategy that requires team members to practice variations of a given task to help their team generalize to new variants of that task. We formally define the problem of human-robot perturbation training and develop and evaluate the first end-to-end framework for such training, which incorporates a multi-agent transfer learning algorithm, a human-robot co-learning framework, and a communication protocol. Our transfer learning algorithm, Adaptive Perturbation Training (AdaPT), is a hybrid of transfer and reinforcement learning techniques that learns quickly and robustly on new task variants. We empirically validate the benefits of AdaPT through comparison to other hybrid reinforcement and transfer learning techniques aimed at transferring knowledge from multiple source tasks to a single target task. We also demonstrate that AdaPT's rapid learning supports live interaction between a person and a robot, during which the human-robot team trains to achieve a high level of performance for new task variants. We augment AdaPT with a co-learning framework and a computational bi-directional communication protocol so that the robot can co-train with a person during live interaction. Results from large-scale human subject experiments (n = 48) indicate that AdaPT enables an agent to learn in a manner compatible with a human's own learning process, and that a robot undergoing perturbation training with a human achieves a high level of team performance. Finally, we demonstrate that human-robot training using AdaPT in a simulation environment produces effective performance for a team incorporating an embodied robot partner.
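
To make the multiple-source transfer setting concrete, the sketch below (not the published AdaPT code) keeps Q-tables learned on several source perturbations and warm-starts a new variant from whichever source evaluates best; the environment interface (`reset`/`step`) is an assumption.

```python
import numpy as np

# Illustrative sketch of selecting source knowledge for a new task variant.
def evaluate(q_table, env, episodes=5):
    """Roll out a greedy policy for a few episodes; return mean episode reward.
    Assumes env.reset() -> state and env.step(a) -> (state, reward, done)."""
    total = 0.0
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = int(np.argmax(q_table[s]))   # greedy action from the source values
            s, r, done = env.step(a)
            total += r
    return total / episodes

def select_source(q_library, env):
    """Warm-start from the source Q-table that performs best on the new variant."""
    return max(q_library, key=lambda q: evaluate(q, env))
```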


2019, Vol 15 (3), pp. 283-293
Author(s): Yohann Rioual, Johann Laurent, Jean-Philippe Diguet

IoT and autonomous systems are in charge of an increasing number of sensing, processing, and communication tasks. These systems may be equipped with energy harvesting devices; nevertheless, the harvested energy is uncertain and variable, which makes energy management in these systems difficult. Reinforcement learning algorithms can handle such uncertainties; however, selecting a well-adapted algorithm is a difficult problem. Many algorithms are available, and each has its own advantages and drawbacks. In this paper, we provide an overview of different approaches to help designers determine the most appropriate algorithm for their application and system. We focus on Q-learning, a popular reinforcement learning algorithm, and several of its variants. Q-learning is based on the use of a lookup table, although some variants use a neural network instead. We compare different variants of Q-learning for the energy management of a sensor node and show that the appropriate approach changes depending on the desired performance and the constraints inherent in the node's application.
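
As a baseline for such comparisons, here is a minimal tabular Q-learning sketch for node energy management; discretized battery levels as states, duty-cycle settings as actions, and the reward shape are all illustrative assumptions, not the paper's formulation.

```python
import random

# Hypothetical energy-management sketch: states are battery buckets,
# actions are duty-cycle settings.
BATTERY_LEVELS = 10                      # battery charge, discretized
DUTY_CYCLES = [0.1, 0.5, 1.0]            # fraction of time the node is active
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1
Q = [[0.0] * len(DUTY_CYCLES) for _ in range(BATTERY_LEVELS)]

def reward(battery, duty):
    """Trade throughput against survival: reward activity, punish depletion."""
    return duty - (5.0 if battery == 0 else 0.0)

def choose(state):
    """Epsilon-greedy duty-cycle choice for the current battery level."""
    if random.random() < EPS:
        return random.randrange(len(DUTY_CYCLES))
    return max(range(len(DUTY_CYCLES)), key=lambda a: Q[state][a])

def update(state, action, r, next_state):
    """Standard Q-learning update on the lookup table."""
    Q[state][action] += ALPHA * (r + GAMMA * max(Q[next_state]) - Q[state][action])
```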


Sensors, 2020, Vol 20 (23), pp. 6942
Author(s): Motahareh Mobasheri, Yangwoo Kim, Woongsup Kim

The term big data has emerged in network concepts since the Internet of Things (IoT) made data generation faster through various smart environments. Bandwidth improvement, by contrast, has been slower and has therefore become a bottleneck, creating the need to address bandwidth constraints. Over time, owing to smart environment extensions and the increasing number of IoT devices, the number of fog nodes has increased. In this study, we introduce fog fragment computing, in contrast to conventional fog computing. We address bandwidth management using fog nodes and their cooperation to provide the extra bandwidth required by IoT devices with emergencies under bandwidth limitations. We formulate the decision-making problem of the fog nodes using a reinforcement learning approach and develop a Q-learning algorithm that achieves efficient decisions by forcing the fog nodes to help each other under special conditions. To the best of our knowledge, there has been no research with this objective thus far; we therefore compare this study with a scenario that considers a single fog node and show that our extended method performs considerably better.
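
One way to picture the cooperation incentive is through the reward function the Q-learner optimizes. The sketch below is an assumption about how such a reward could look, not the paper's exact formulation; the action names and constants are hypothetical.

```python
# Hypothetical reward sketch for the fog-node cooperation decision.
HELP, SERVE_LOCAL, DROP = range(3)

def reward(action, emergency, local_bw_free, neighbor_bw_free):
    """Encourage serving locally when bandwidth allows, reward helping a
    neighbor during emergencies, and penalize dropped requests."""
    if action == SERVE_LOCAL and local_bw_free:
        return 1.0
    if action == HELP and emergency and neighbor_bw_free:
        return 2.0   # extra credit for cooperation under special conditions
    if action == DROP:
        return -2.0
    return -1.0      # infeasible choice (no bandwidth available)
```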


Author(s): Selvarathi C. et al.

Malware is one of the predominant challenges for Internet users. In recent times, the injection of malware into machines by anonymous hackers has increased, creating an urgent need for systems that detect malware. Our idea is to build a system that learns from previously collected malware-related data and detects malware in a given file if it is present. We propose various machine learning algorithms to detect malware and warn the user of the danger. In particular, we propose an algorithm that gives an optimal solution for both hardware- and software-oriented malware.
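
A minimal sketch of the supervised setup described above, assuming a numeric feature matrix X (e.g., static file attributes) and binary labels y (malware versus benign); the feature set and model choice are illustrative assumptions, not the authors' system.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

def train_detector(X, y):
    """Learn from previously collected samples, then report held-out accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_tr, y_tr)
    print(classification_report(y_te, model.predict(X_te)))
    return model
```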


Author(s): Seungwhan Moon, Jaime Carbonell

We study a transfer learning framework where source and target datasets are heterogeneous in both feature and label spaces. Specifically, we do not assume explicit relations between source and target tasks a priori, and thus it is crucial to determine what and what not to transfer from the source knowledge. Towards this goal, we define a new heterogeneous transfer learning approach that (1) selects and attends to an optimized subset of source samples to transfer knowledge from, and (2) builds a unified transfer network that learns from both source and target knowledge. This method, termed "Attentional Heterogeneous Transfer," along with a newly proposed unsupervised transfer loss, improves upon the previous state-of-the-art approaches in extensive simulations as well as on a challenging hetero-lingual text classification task.
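
As a rough picture of the sample-attention idea (not the authors' implementation), one can weight source embeddings by softmax similarity to a target representation so that only the most relevant source samples contribute:

```python
import numpy as np

# Conceptual sketch: attend to relevant source samples before transferring.
def attention_weights(source_emb, target_emb):
    """Softmax over dot-product similarity between each source sample
    (rows of source_emb) and a target embedding vector."""
    scores = source_emb @ target_emb     # shape: (n_source,)
    scores -= scores.max()               # numerical stability
    w = np.exp(scores)
    return w / w.sum()

def attended_source(source_emb, target_emb):
    """Convex combination of source samples, focused on the relevant subset."""
    w = attention_weights(source_emb, target_emb)
    return w @ source_emb
```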


2014, Vol 32 (12), pp. 1090-1101
Author(s): Sarang Khim, Sungjin Hong, Yoonyoung Kim, Phill Kyu Rhee

Author(s): Mohd Rashdan Abdul Kadir, Ali Selamat, Ondrej Krejcar

Normative multi-agent research offers an alternative viewpoint on the design of adaptive autonomous agent architectures. Norms specify standards of behavior, such as which actions or states should be achieved or avoided. Norm synthesis is the process of generating useful normative rules. This study proposes a model for extracting normative rules from implicit learning, namely the Q-learning algorithm, into an explicit norm representation by implementing Dynamic Deontics and a Hierarchical Knowledge Base (HKB) to synthesize useful normative rules in the form of weighted state-action pairs with deontic modality. OpenAI Gym is used to simulate the agent environment. Our proposed model is able to generate both obligative and prohibitive norms, as well as to deliberate on and execute those norms. Results show that the generated norms are best used as prior knowledge to guide agent behavior, and that they perform poorly if not complemented by another agent coordination mechanism. Performance increases when using both obligation and prohibition norms, and, in general, norms do speed up reachability of the optimum policy.
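
The extraction step can be pictured as thresholding a learned Q-table into deontically qualified, weighted rules. The sketch below assumes illustrative thresholds and a simple tuple format rather than the paper's exact HKB encoding.

```python
# Hypothetical norm-synthesis sketch: turn extreme Q-values into
# weighted state-action rules with a deontic modality.
def synthesize_norms(q_table, oblige_thr=0.8, prohibit_thr=-0.8):
    """Emit (modality, state, action, weight) tuples from a Q-table, where
    q_table maps each state to a dict of action -> Q-value."""
    norms = []
    for state, action_values in q_table.items():
        for action, q in action_values.items():
            if q >= oblige_thr:
                norms.append(("OBLIGED", state, action, q))
            elif q <= prohibit_thr:
                norms.append(("PROHIBITED", state, action, q))
    return norms
```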


2019, Vol 34
Author(s): Mao Li, Yi Wei, Daniel Kudenko

One way to address the low sample efficiency of reinforcement learning (RL) is to employ human expert demonstrations to speed up the RL process (RL from demonstration, or RLfD). Research so far has focused on demonstrations from a single expert. However, little attention has been given to the case where demonstrations are collected from multiple experts whose expertise may vary on different aspects of the task. In such scenarios, the demonstrations are likely to contain conflicting advice in many parts of the state space. We propose a two-level Q-learning algorithm in which the RL agent not only learns a policy for deciding on the optimal action but also learns to select the most trustworthy expert according to the current state. Thus, our approach removes the traditional assumption that demonstrations come from a single source and are mostly conflict-free. We evaluate our technique on three different domains; the results show that the state-of-the-art RLfD baseline fails to converge or performs similarly to conventional Q-learning. In contrast, the performance of our novel algorithm increases as more experts are involved in the learning process, and the proposed approach handles demonstration conflicts well.
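
A minimal sketch of the two-level idea: a high-level Q-function learns which expert to trust in each state while a low-level Q-function learns the action policy, both updated from the environment reward. The class and its names are illustrative assumptions, not the authors' code.

```python
import random
from collections import defaultdict

class TwoLevelQ:
    """Hypothetical two-level Q-learner: expert selection plus action policy."""

    def __init__(self, n_experts, actions, alpha=0.1, gamma=0.99, eps=0.1):
        self.q_expert = defaultdict(lambda: [0.0] * n_experts)   # trust per expert
        self.q_action = defaultdict(lambda: {a: 0.0 for a in actions})
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def pick_expert(self, state):
        """Epsilon-greedy choice of which expert's advice to follow here."""
        trust = self.q_expert[state]
        if random.random() < self.eps:
            return random.randrange(len(trust))
        return max(range(len(trust)), key=trust.__getitem__)

    def update(self, state, expert, action, reward, next_state):
        """Update both levels from the same environment reward."""
        best_a = max(self.q_action[next_state].values())
        self.q_action[state][action] += self.alpha * (
            reward + self.gamma * best_a - self.q_action[state][action])
        best_e = max(self.q_expert[next_state])
        self.q_expert[state][expert] += self.alpha * (
            reward + self.gamma * best_e - self.q_expert[state][expert])
```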


2018, Vol 31 (07), pp. 937-945
Author(s): Massimiliano Grassi, David A. Loewenstein, Daniela Caldirola, Koen Schruers, Ranjan Duara, ...

Background: In a previous study, we developed a highly performant and clinically translatable machine learning algorithm for the prediction of three-year conversion to Alzheimer's disease (AD) in subjects with Mild Cognitive Impairment (MCI) and Pre-mild Cognitive Impairment. Further tests are necessary to demonstrate its accuracy when applied to subjects not used in the original training process. In this study, we aimed to provide preliminary evidence of this via a transfer learning approach.

Methods: We initially employed the same baseline information (i.e., clinical and neuropsychological test scores, cardiovascular risk indexes, and a visual rating scale for brain atrophy) and the same machine learning technique (support vector machine with radial-basis function kernel) used in our previous study to retrain the algorithm to discriminate between participants with AD (n = 75) and normal cognition (n = 197). Then, the algorithm was applied to perform its original task of predicting three-year conversion to AD in the sample of 61 MCI subjects from our previous study.

Results: Even after retraining, the algorithm demonstrated significant predictive performance in the MCI sample (AUC = 0.821, 95% bootstrap CI = 0.705-0.912, best balanced accuracy = 0.779, sensitivity = 0.852, specificity = 0.706).

Conclusions: These results provide first, indirect evidence that our original algorithm can also make relevant generalized predictions when applied to new MCI individuals. This motivates future efforts to bring the algorithm to levels of optimization and trustworthiness sufficient for application in both clinical and research settings.
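
For readers who want the shape of the Methods step in code, here is a minimal sketch of retraining an RBF-kernel SVM on the AD-versus-normal-cognition task and then scoring the MCI sample. Variable names, preprocessing, and hyperparameters are assumptions, not the study's pipeline.

```python
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def retrain_and_transfer(X_ad_nc, y_ad_nc, X_mci):
    """Retrain on the AD-vs-normal task, then score the MCI conversion sample."""
    clf = make_pipeline(StandardScaler(),
                        SVC(kernel="rbf", probability=True))
    clf.fit(X_ad_nc, y_ad_nc)
    return clf.predict_proba(X_mci)[:, 1]   # predicted probability per subject
```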


Author(s): Sanmit Narvekar

Transfer learning in reinforcement learning is an area of research that seeks to speed up or improve learning of a complex target task by leveraging knowledge from one or more source tasks. This thesis extends the concept of transfer learning to curriculum learning, where the goal is to design a sequence of source tasks for an agent to train on such that final performance or learning speed is improved. We discuss completed work on this topic, including methods for semi-automatically generating source tasks tailored to an agent and the characteristics of a target domain, and for automatically sequencing such tasks into a curriculum. Finally, we present ideas for future work.
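
Conceptually, a curriculum reduces to training on an ordered sequence of tasks while carrying learned values forward. Here is a minimal sketch under that assumption; the `train` routine and task interface are hypothetical, not the thesis's method.

```python
# Hypothetical curriculum-learning sketch: later tasks start from the
# knowledge accumulated on earlier tasks.
def run_curriculum(tasks, train, q_init=None):
    """`tasks` is an ordered list ending with the target task;
    `train(task, q)` returns the value table learned on `task`
    starting from initialization `q` (None means learn from scratch)."""
    q = q_init
    for task in tasks:
        q = train(task, q)   # transfer: each task warm-starts from the last
    return q
```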

