Joint Extraction of Entities and Relations Using Reinforcement Learning and Deep Learning

2017 ◽  
Vol 2017 ◽  
pp. 1-11 ◽  
Author(s):  
Yuntian Feng ◽  
Hongjun Zhang ◽  
Wenning Hao ◽  
Gang Chen

We use both reinforcement learning and deep learning to simultaneously extract entities and relations from unstructured texts. For reinforcement learning, we model the task as a two-step decision process. Deep learning is used to automatically capture the most important information from unstructured texts, which represents the state in the decision process. By designing the reward function per step, our proposed method can pass the information of entity extraction to relation extraction and obtain feedback in order to extract entities and relations simultaneously. Firstly, we use a bidirectional LSTM to model the context information, which realizes preliminary entity extraction. On the basis of the extraction results, an attention-based method represents the sentences that include the target entity pair to generate the initial state in the decision process. Then we use a Tree-LSTM to represent relation mentions to generate the transition state in the decision process. Finally, we employ the Q-Learning algorithm to obtain the control policy π in the two-step decision process. Experiments on ACE2005 demonstrate that our method attains better performance than the state-of-the-art method and achieves a 2.4% increase in recall.
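
The Q-Learning step mentioned in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' model: the neural state representations (bidirectional LSTM, attention, Tree-LSTM) are replaced by two fixed states, and the reward table, the function names and all hyperparameters are assumptions chosen only to show the update rule on a two-step episode.

```python
import random


def q_learning_two_step(rewards, n_actions=2, episodes=2000,
                        alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy two-step episodic task.

    rewards[step][action] is a hypothetical per-step reward table.
    Simplification: the state at step 1 does not depend on the action
    taken at step 0, so there is one Q row per step.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(2)]
    for _ in range(episodes):
        for step in range(2):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda i: q[step][i])
            reward = rewards[step][action]
            # Bootstrap from the best next-step value; step 1 is terminal.
            future = max(q[step + 1]) if step == 0 else 0.0
            q[step][action] += alpha * (reward + gamma * future - q[step][action])
    return q


def greedy_policy(q):
    """Return the greedy action for each step of the episode."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in q]


if __name__ == "__main__":
    toy_rewards = [[0.0, 1.0], [1.0, 0.0]]  # assumed values, not from the paper
    print(greedy_policy(q_learning_two_step(toy_rewards)))
```

Because the value at step 0 bootstraps from the value at step 1, the reward earned in the second decision feeds back into the first, which is the mechanism the abstract relies on to couple entity extraction with relation extraction.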

2021 ◽  
Author(s):  
Peter Wurman ◽  
Samuel Barrett ◽  
Kenta Kawamoto ◽  
James MacGlashan ◽  
Kaushik Subramanian ◽  
...  

Many potential applications of artificial intelligence involve making real-time decisions in physical systems. Automobile racing represents an extreme case of real-time decision making in close proximity to other highly-skilled drivers while near the limits of vehicular control. Racing simulations, such as the PlayStation game Gran Turismo, faithfully reproduce the nonlinear control challenges of real race cars while also encapsulating the complex multi-agent interactions. We attack, and solve for the first time, the simulated racing challenge using model-free deep reinforcement learning. We introduce a novel reinforcement learning algorithm and enhance the learning process with mixed scenario training to encourage the agent to incorporate racing tactics into an integrated control policy. In addition, we construct a reward function that enables the agent to adhere to the sport's under-specified racing etiquette rules. We demonstrate the capabilities of our agent, GT Sophy, by winning two of three races against four of the world's best Gran Turismo drivers and being competitive in the overall team score. By showing that these techniques can be successfully used to train championship-level race car drivers, we open up the possibility of their use in other complex dynamical systems and real-world applications.


2021 ◽  
Vol 54 (1) ◽  
pp. 1-39
Author(s):  
Zara Nasar ◽  
Syed Waqar Jaffry ◽  
Muhammad Kamran Malik

With the advent of Web 2.0, many online platforms now produce massive amounts of textual data. With ever-increasing textual data at hand, it is of immense importance to extract information nuggets from this data. One approach towards effective harnessing of this unstructured textual data could be its transformation into structured text. Hence, this study aims to present an overview of approaches that can be applied to extract key insights from textual data in a structured way. To this end, Named Entity Recognition and Relation Extraction are the main focus of this review. The former deals with the identification of named entities, and the latter deals with the problem of extracting relations between sets of entities. This study covers early approaches as well as the developments made to date using machine learning models. Survey findings conclude that deep-learning-based hybrid and joint models currently define the state of the art. It is also observed that annotated benchmark datasets for various textual-data generators such as Twitter and other social forums are not available. This scarcity of datasets has resulted in relatively little progress in these domains. Additionally, the majority of the state-of-the-art techniques are offline and computationally expensive. Last, with increasing focus on deep-learning frameworks, there is a need to understand and explain the underlying processes in deep architectures.


2021 ◽  
pp. 2150011
Author(s):  
Wei Dong ◽  
Jianan Wang ◽  
Chunyan Wang ◽  
Zhenqiang Qi ◽  
Zhengtao Ding

In this paper, the optimal consensus control problem is investigated for heterogeneous linear multi-agent systems (MASs) under a spanning tree condition, based on game theory and reinforcement learning. First, the graphical minimax game algebraic Riccati equation (ARE) is derived by converting the consensus problem into a zero-sum game problem between each agent and its neighbors. The asymptotic stability and minimax validation of the closed-loop systems are proved theoretically. Then, a data-driven off-policy reinforcement learning algorithm is proposed to learn the optimal control policy online without knowledge of the system dynamics. A certain rank condition is established to guarantee the convergence of the proposed algorithm to the unique solution of the ARE. Finally, the effectiveness of the proposed method is demonstrated through a numerical simulation.


Author(s):  
Usman Ahmed ◽  
Jerry Chun-Wei Lin ◽  
Gautam Srivastava

Deep learning methods have led to state-of-the-art medical applications, such as image classification and segmentation. Data-driven deep learning applications can help stakeholders to collaborate. However, limited labelled data restrict the ability of a deep learning algorithm to generalize from one domain to another. To handle this problem, meta-learning helps to learn from a small set of data. We propose a meta-learning-based image segmentation model that combines the learning of state-of-the-art models and then uses it to achieve domain adaptation and high accuracy. We also propose a preprocessing algorithm to increase the usability of the segmented parts and remove noise from the new test image. The proposed model achieves 0.94 precision and 0.92 recall, an improvement of 3.3% over state-of-the-art algorithms.


Author(s):  
Fangjian Li ◽  
John R Wagner ◽  
Yue Wang

Inverse reinforcement learning (IRL) has been successfully applied in many robotics and autonomous driving studies without the need for hand-tuning a reward function. However, it suffers from safety issues. Compared to reinforcement learning (RL) algorithms, IRL is even more vulnerable to unsafe situations, as it can only infer the importance of safety from expert demonstrations. In this paper, we propose a safety-aware adversarial inverse reinforcement learning algorithm (S-AIRL). First, the control barrier function (CBF) is used to guide the training of a safety critic, which leverages the knowledge of system dynamics in the sampling process without training an additional guiding policy. The trained safety critic is then integrated into the discriminator to help distinguish generated data from expert demonstrations from the standpoint of safety. Finally, to further improve safety awareness, a regulator is introduced in the loss function of the discriminator training to prevent the recovered reward function from assigning high rewards to risky behaviors. We tested S-AIRL in a highway autonomous driving scenario. Compared to the original AIRL algorithm, at the same level of imitation learning (IL) performance, the proposed S-AIRL reduces the collision rate by 32.6%.


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1818
Author(s):  
Jaein Song ◽  
Yun Ji Cho ◽  
Min Hee Kang ◽  
Kee Yeon Hwang

As ridesharing services (including taxis) are often run by private companies, profitability is the top priority in operation. This leads drivers to refuse more often to take passengers to areas with low demand, where they will have difficulty finding subsequent passengers, causing problems such as extended waiting times when hailing a vehicle for passengers bound for these regions. To resolve this problem during the worst time period, the study used Seoul’s taxi data and a reinforcement learning algorithm to find appropriate surge rates of ridesharing services by region between 10:00 p.m. and 4:00 a.m. In reinforcement learning, the outcome of centrality analysis was applied as a weight affecting drivers’ destination choice probability. Furthermore, the reward function used in the learning was adjusted according to whether the passenger waiting time value was applied or not. Profit was used as the reward value. By using a negative reward for the passenger waiting time, the study was able to identify a more appropriate surge level. Across the region, the surge averaged a value of 1.6. More specifically, regions on the outskirts of the city and in residential areas showed a higher surge, while central areas had a lower surge. With these differentiated surge rates, drivers’ refusals to take passengers can be reduced and the passenger waiting time can be shortened. The supply of ridesharing services in low-demand regions can be increased by as much as 7.5%, allowing regional equity problems related to ridesharing services in Seoul to be reduced to a greater extent.
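
The reward design described in the abstract (profit as the reward, with an optional negative term for passenger waiting time) could look roughly like the sketch below. The function name, the linear form of the penalty and the `value_of_time` parameter are assumptions for illustration; the abstract does not give the actual formula.

```python
def step_reward(profit, waiting_minutes, value_of_time=0.0):
    """Hypothetical reward: profit minus a waiting-time penalty.

    value_of_time (assumed parameter) converts minutes of passenger
    waiting into the same unit as profit. Leaving it at 0.0 corresponds
    to the variant in which the waiting time value is not applied.
    """
    return profit - value_of_time * waiting_minutes


if __name__ == "__main__":
    print(step_reward(10.0, 5.0))        # waiting time not applied
    print(step_reward(10.0, 5.0, 0.5))   # waiting time penalized
```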


2010 ◽  
Vol 44-47 ◽  
pp. 3611-3615 ◽  
Author(s):  
Zhi Cong Zhang ◽  
Kai Shun Hu ◽  
Hui Yu Huang ◽  
Shuai Li ◽  
Shao Yong Zhao

Reinforcement learning (RL) is a state- or action-value-based machine learning method which approximately solves large-scale Markov Decision Processes (MDPs) or Semi-Markov Decision Processes (SMDPs). A multi-step RL algorithm called Sarsa(λ,k) is proposed, which is a compromise between Sarsa and Sarsa(λ). It is equivalent to Sarsa if k is 1 and is equivalent to Sarsa(λ) if k is infinite. Sarsa(λ,k) adjusts its performance by setting the value of k. Two forms of Sarsa(λ,k), forward view Sarsa(λ,k) and backward view Sarsa(λ,k), are constructed and proved equivalent in off-line updating.
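
One way to picture a forward-view target with the two limiting cases stated above is a λ-return truncated after k steps, where λ is the trace-decay parameter. The sketch below is an assumption about what such a target could look like, not the paper's definition: it weights the first k-1 n-step Sarsa returns by (1-λ)λ^(n-1) and puts the remaining weight λ^(k-1) on the k-step return. The function names and the `next_q` convention are hypothetical.

```python
def n_step_return(rewards, next_q, t, n, gamma):
    """n-step Sarsa target from time t.

    rewards[i] is the reward received after step i; next_q[i] is
    Q(s_{i+1}, a_{i+1}), with 0.0 when s_{i+1} is terminal.
    """
    n = min(n, len(rewards) - t)  # cannot look past the end of the episode
    discounted = sum(gamma ** i * rewards[t + i] for i in range(n))
    return discounted + gamma ** n * next_q[t + n - 1]


def truncated_lambda_return(rewards, next_q, t, k, lam, gamma):
    """Lambda-weighted mix of n-step returns, truncated at k steps (assumed form)."""
    k = min(k, len(rewards) - t)
    mixed = sum((1 - lam) * lam ** (n - 1)
                * n_step_return(rewards, next_q, t, n, gamma)
                for n in range(1, k))
    return mixed + lam ** (k - 1) * n_step_return(rewards, next_q, t, k, gamma)
```

With k = 1 the sum is empty and the target reduces to the one-step Sarsa target; with k at least as large as the remaining episode length it coincides with the usual episodic λ-return, matching the two limiting cases in the abstract.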


2021 ◽  
Vol 2138 (1) ◽  
pp. 012011
Author(s):  
Yanwei Zhao ◽  
Yinong Zhang ◽  
Shuying Wang

Path planning refers to a mobile robot obtaining information about its surrounding environment and its own state through its onboard sensors, so that it can avoid obstacles and move towards the target point. Deep reinforcement learning, which combines reinforcement learning and deep learning and is mainly used to deal with perception and decision-making problems, has become an important research branch in the field of artificial intelligence. This paper first introduces the basic knowledge of deep learning and reinforcement learning. Then, the research status of deep reinforcement learning algorithms based on value functions and policy gradients in path planning is described, along with applications of deep reinforcement learning in computer games, video games and autonomous navigation. Finally, a brief summary and outlook on the algorithms and applications of deep reinforcement learning are given.


2021 ◽  
Vol 103 (4) ◽  
Author(s):  
Bartomeu Rubí ◽  
Bernardo Morcego ◽  
Ramon Pérez

A deep reinforcement learning approach for solving the quadrotor path following and obstacle avoidance problem is proposed in this paper. The problem is solved with two agents: one for the path following task and another one for the obstacle avoidance task. A novel structure is proposed, where the action computed by the obstacle avoidance agent becomes the state of the path following agent. Compared to traditional deep reinforcement learning approaches, the proposed method makes the training process outcomes interpretable, is faster and can be safely trained on the real quadrotor. Both agents implement the Deep Deterministic Policy Gradient algorithm. The path following agent was developed in a previous work. The obstacle avoidance agent uses the information provided by a low-cost LIDAR to detect obstacles around the vehicle. Since the LIDAR has a narrow field of view, an approach for providing the agent with a memory of the previously seen obstacles is developed. A detailed description of the process of defining the state vector, the reward function and the action of this agent is given. The agents are programmed in Python/TensorFlow and are trained and tested in the RotorS/Gazebo platform. Simulation results prove the validity of the proposed approach.

