Hierarchical Active Tracking Control for UAVs via Deep Reinforcement Learning

2021 · Vol 11 (22) · pp. 10595
Author(s):  
Wenlong Zhao ◽  
Zhijun Meng ◽  
Kaipeng Wang ◽  
Jiahui Zhang ◽  
Shaoze Lu

Active tracking control is essential for UAVs performing autonomous operations in GPS-denied environments. In the active tracking task, UAVs take high-dimensional raw images as input and execute motor actions to actively follow a dynamic target. Most research focuses on three-stage methods: perception first, then high-level decision-making based on the extracted spatial information of the dynamic target, and finally UAV movement control using a low-level dynamic controller. Perception methods based on deep neural networks are powerful but require considerable effort for manual ground-truth labeling. Instead, we unify the perception and decision-making stages in a single high-level controller and leverage deep reinforcement learning to learn the mapping from raw images to high-level action commands in a V-REP-based environment, where simulation data are effectively unlimited and inexpensive. This end-to-end method also has the advantages of a small parameter size and reduced effort for parameter tuning in the decision-making stage. The high-level controller, which has a novel architecture, explicitly encodes the spatial and temporal features of the dynamic target. Auxiliary segmentation and motion-in-depth losses are introduced to generate denser training signals for the high-level controller's fast and stable training. The high-level controller and a conventional low-level PID controller constitute our hierarchical active tracking control framework for the UAVs' active tracking task. Simulation experiments show that our controller, trained with several augmentation techniques, generalizes well to dynamic targets with random appearances and velocities and achieves significantly better performance compared with three-stage methods.
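As an illustrative sketch only (not the authors' implementation), the hierarchy can be thought of as a learned high-level policy producing an action command that per-axis PID loops then track; the command layout, gains, and image shape below are assumptions.

```python
import numpy as np

class PID:
    """Minimal PID loop standing in for the low-level dynamic controller."""
    def __init__(self, kp, ki, kd, dt=0.05):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def high_level_policy(image_stack):
    """Stand-in for the learned high-level controller: in the paper this is a
    deep network trained with RL; here it returns a zero command so the loop runs."""
    return np.zeros(4)  # assumed command layout: (vx, vy, vz, yaw_rate)

def control_step(image_stack, measured_velocity, pids):
    """One tick of the hierarchy: high-level command, then per-axis PID tracking."""
    command = high_level_policy(image_stack)
    return np.array([pid.step(sp, v)
                     for pid, sp, v in zip(pids, command, measured_velocity)])

pids = [PID(kp=1.2, ki=0.01, kd=0.3) for _ in range(4)]          # one PID per axis
actuation = control_step(np.zeros((4, 84, 84)), np.zeros(4), pids)
```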

2021 · pp. 002224372199837
Author(s):  
Walter Herzog ◽  
Johannes D. Hattula ◽  
Darren W. Dahl

This research explores how marketing managers can avoid the so-called false consensus effect—the egocentric tendency to project personal preferences onto consumers. Two pilot studies were conducted to provide evidence for the managerial importance of this research question and to explore how marketing managers attempt to avoid false consensus effects in practice. The results suggest that the debiasing tactic most frequently used by marketers is to suppress their personal preferences when predicting consumer preferences. Four subsequent studies show that, ironically, this debiasing tactic can backfire and increase managers’ susceptibility to the false consensus effect. Specifically, the results suggest that these backfire effects are most likely to occur for managers with a low level of preference certainty. In contrast, the results imply that preference suppression does not backfire but instead decreases false consensus effects for managers with a high level of preference certainty. Finally, the studies explore the mechanism behind these results and show how managers can ultimately avoid false consensus effects—regardless of their level of preference certainty and without risking backfire effects.


2021 · Vol 31 (3) · pp. 1-26
Author(s):  
Aravind Balakrishnan ◽  
Jaeyoung Lee ◽  
Ashish Gaurav ◽  
Krzysztof Czarnecki ◽  
Sean Sedwards

Reinforcement learning (RL) is an attractive way to implement high-level decision-making policies for autonomous driving, but learning directly from a real vehicle or a high-fidelity simulator is variously infeasible. We therefore consider the problem of transfer reinforcement learning and study how a policy learned in a simple environment using WiseMove can be transferred to our high-fidelity simulator, WiseSim. WiseMove is a framework to study safety and other aspects of RL for autonomous driving. WiseSim accurately reproduces the dynamics and software stack of our real vehicle. We find that the accurately modelled perception errors in WiseSim contribute the most to the transfer problem. These errors, even when naively modelled in WiseMove, yield an RL policy that performs better in WiseSim than a hand-crafted rule-based policy. Applying domain randomization to the environment in WiseMove yields an even better policy. The final RL policy reduces failures due to perception errors from 10% to 2.75%. We also observe that the RL policy relies significantly less on velocity than the rule-based policy, having learned that its measurement is unreliable.
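As a rough sketch of the domain-randomization idea (the observation fields, noise ranges, and drop probability are assumptions, not the paper's error model):

```python
import random

def randomize_perception(observation, rng=None):
    """Illustrative randomization of perception errors in the training environment."""
    rng = rng or random.Random()
    noisy = dict(observation)
    # Scale the velocity reading, which the abstract reports as unreliable.
    noisy["ego_velocity"] = observation["ego_velocity"] * rng.uniform(0.8, 1.2)
    # Occasionally drop the lead-vehicle detection; otherwise jitter its range.
    if rng.random() < 0.05:
        noisy["lead_vehicle_distance"] = None
    else:
        noisy["lead_vehicle_distance"] = observation["lead_vehicle_distance"] + rng.gauss(0.0, 0.5)
    return noisy

print(randomize_perception({"ego_velocity": 12.0, "lead_vehicle_distance": 30.0},
                           random.Random(0)))
```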


Author(s):  
N. Sandhya Rani ◽  
M. Sarada Devi

Empowerment of tribal women is one of the central issues in the process of development all over the world. Empowerment is the process that allows one to gain the knowledge and attitude needed to cope with the changing world and the circumstances in which one lives [1]. Women's empowerment is a process in which women gain a greater share of control over material, human and intellectual resources, as well as control over decision-making in their home, community, society and nation. Given the need to analyze the empowerment status of tribal women, the present study aimed to enhance the empowerment status of tribal working women in India by strengthening their decision-making skills. The specific objective was to study the impact of an intervention on enhancing the empowerment status, through decision-making skills, of tribal working women in Utnoor Mandal, Adilabad district. The total sample for the study was 50 tribal working women, and the data were analyzed using a paired t-test. Results revealed that at pretest, the majority of the women were at an average level of decision-making skills (78%), 12% were at a low level, and only 10% were at a high level. After the intervention, post-test results revealed that 74% of the women were high in decision-making skills and the remaining 26% were at an average level; interestingly, none of the respondents remained at a low level. Thus, the intervention was found to be effective in developing and enhancing the respondents' empowerment status through decision-making skills.
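For reference, a pre/post comparison of this kind is commonly run with SciPy's paired t-test; the scores below are synthetic and purely illustrative, not the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
pre = rng.normal(20, 3, 50)         # illustrative pretest scores for 50 respondents
post = pre + rng.normal(4, 2, 50)   # assumed improvement after the intervention

t_stat, p_value = stats.ttest_rel(post, pre)
print(f"paired t = {t_stat:.2f}, p = {p_value:.4f}")
```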


Author(s):  
Rey Pocius ◽  
Lawrence Neal ◽  
Alan Fern

Commonly used sequential decision-making tasks, such as the games in the Arcade Learning Environment (ALE), provide rich observation spaces suitable for deep reinforcement learning. However, they consist mostly of low-level control tasks, which are of limited use for the development of explainable artificial intelligence (XAI) due to their fine temporal resolution. Many of these domains also lack built-in high-level abstractions and symbols. Existing tasks that provide both strategic decision-making and rich observation spaces are either difficult to simulate or intractable. We provide a set of new strategic decision-making tasks specialized for the development and evaluation of explainable AI methods, built as constrained mini-games within the StarCraft II Learning Environment.
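For orientation, the StarCraft II Learning Environment is usually driven through the pysc2 package; the sketch below loads a stock mini-game map and runs a no-op agent (the paper's constrained XAI mini-games would be supplied as their own map files).

```python
from pysc2.env import sc2_env
from pysc2.lib import actions, features

env = sc2_env.SC2Env(
    map_name="MoveToBeacon",          # a stock pysc2 mini-game, not one of the paper's maps
    players=[sc2_env.Agent(sc2_env.Race.terran)],
    agent_interface_format=features.AgentInterfaceFormat(
        feature_dimensions=features.Dimensions(screen=64, minimap=64)),
    step_mul=8,
    visualize=False)

timesteps = env.reset()
while not timesteps[0].last():
    # Replace the no-op with a learned, strategically acting agent.
    timesteps = env.step([actions.FunctionCall(actions.FUNCTIONS.no_op.id, [])])
env.close()
```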


Author(s):  
Daoming Lyu ◽  
Fangkai Yang ◽  
Bo Liu ◽  
Daesub Yoon

Deep reinforcement learning (DRL) has achieved great success by learning directly from high-dimensional sensory inputs, yet it is notorious for its lack of interpretability. Interpretability of the subtasks is critical in hierarchical decision-making, as it increases the transparency of black-box-style DRL approaches and helps RL practitioners better understand the high-level behavior of the system. In this paper, we introduce symbolic planning into DRL and propose a framework of Symbolic Deep Reinforcement Learning (SDRL) that can handle both high-dimensional sensory inputs and symbolic planning. Task-level interpretability is enabled by relating symbolic actions to options. The framework features a planner-controller-meta-controller architecture, whose components take charge of subtask scheduling, data-driven subtask learning, and subtask evaluation, respectively. The three components cross-fertilize each other and eventually converge to an optimal symbolic plan along with the learned subtasks, bringing together the long-term planning capability of symbolic knowledge and end-to-end reinforcement learning directly from high-dimensional sensory input. Experimental results validate the interpretability of the subtasks, along with improved data efficiency compared with state-of-the-art approaches.
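Schematically, the interaction of the three components might look like the loop below; every interface name here is a hypothetical placeholder for the role described in the abstract, not the authors' API.

```python
def sdrl_episode(planner, controller, meta_controller, env):
    """One episode of the planner-controller-meta-controller loop (schematic)."""
    plan = planner.plan(meta_controller.symbolic_state(env))    # subtask scheduling
    for symbolic_action in plan:
        option = controller.option_for(symbolic_action)         # symbolic action -> option
        trajectory = option.execute(env)                        # DRL policy on raw sensory input
        controller.update(option, trajectory)                   # data-driven subtask learning
        meta_controller.evaluate(symbolic_action, trajectory)   # subtask evaluation
    planner.refine(meta_controller.learned_costs())             # feed evaluations back to planning
```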


2019
Author(s):  
Charles Findling ◽  
Nicolas Chopin ◽  
Etienne Koechlin

Everyday life features uncertain and ever-changing situations. In such environments, optimal adaptive behavior requires higher-order inferential capabilities to grasp the volatility of external contingencies. These capabilities, however, involve complex computations that rapidly become intractable, so we poorly understand how humans develop efficient adaptive behaviors in such environments. Here we demonstrate a counterintuitive result: simple, low-level inferential processes involving imprecise computations conforming to the psychophysical Weber's law actually lead to near-optimal adaptive behavior, regardless of the environment's volatility. Using volatile experimental settings, we further show that such imprecise, low-level inferential processes accounted for observed human adaptive performance, unlike optimal adaptive models involving higher-order inferential capabilities, their biologically more plausible algorithmic approximations, and non-inferential adaptive models such as reinforcement learning. Thus, minimal inferential capabilities may have evolved along with imprecise neural computations, contributing to near-optimal adaptive behavior in real-life environments while leading humans to make suboptimal choices in canonical decision-making tasks.
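As a toy illustration of the core idea, an exact belief update can be corrupted by noise whose standard deviation scales with the update's magnitude, in the spirit of Weber's law; the log-odds formulation and the 0.2 Weber fraction are assumptions, not the authors' model.

```python
import numpy as np

def weber_noisy_update(belief_logodds, evidence_logodds, weber_fraction=0.2, rng=None):
    """Imprecise inference step: exact log-odds update plus Weber-scaled noise."""
    rng = rng or np.random.default_rng()
    noise_sd = weber_fraction * abs(evidence_logodds)
    return belief_logodds + evidence_logodds + rng.normal(0.0, noise_sd)

rng = np.random.default_rng(0)
belief = 0.0                           # prior log-odds
for evidence in [0.8, -0.3, 1.1]:      # illustrative evidence stream
    belief = weber_noisy_update(belief, evidence, rng=rng)
print(belief)
```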


Author(s):  
Lin Lan ◽  
Zhenguo Li ◽  
Xiaohong Guan ◽  
Pinghui Wang

Despite significant progress, deep reinforcement learning (RL) suffers from data inefficiency and limited generalization. Recent efforts apply meta-learning to learn a meta-learner from a set of RL tasks such that a novel but related task can be solved quickly. Though specific in some ways, different tasks in meta-RL are generally similar at a high level. However, most meta-RL methods do not explicitly and adequately model the specific and shared information among different tasks, which limits their ability to learn training tasks and to generalize to novel tasks. In this paper, we propose to capture the shared information on the one hand and to meta-learn how to quickly abstract the task-specific information on the other. Methodologically, we train an SGD meta-learner to quickly optimize a task encoder for each task, which generates a task embedding based on past experience. Meanwhile, we learn a policy that is shared across all tasks and conditioned on the task embeddings. Empirical results on four simulated tasks demonstrate that our method has better learning capacity on both training and novel tasks and attains up to 3 to 4 times higher returns than the baselines.
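A minimal PyTorch sketch of the two ingredients described above: a task encoder adapted per task by inner-loop SGD, and a policy shared across tasks and conditioned on the resulting embedding (layer sizes and the adaptation schedule are illustrative assumptions).

```python
import torch
import torch.nn as nn

class TaskEncoder(nn.Module):
    """Maps a batch of past transitions to a single task embedding."""
    def __init__(self, transition_dim, embed_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(transition_dim, 64), nn.ReLU(),
                                 nn.Linear(64, embed_dim))
    def forward(self, transitions):               # [N, transition_dim]
        return self.net(transitions).mean(dim=0)  # aggregate over past experience

class SharedPolicy(nn.Module):
    """Policy shared across all tasks, conditioned on the task embedding."""
    def __init__(self, obs_dim, embed_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + embed_dim, 64), nn.ReLU(),
                                 nn.Linear(64, act_dim))
    def forward(self, obs, task_embedding):       # obs: [B, obs_dim]
        z = task_embedding.expand(obs.size(0), -1)
        return self.net(torch.cat([obs, z], dim=-1))

def adapt_encoder(encoder, transitions, embedding_loss, steps=5, lr=1e-2):
    """Inner-loop SGD: quickly optimize the task encoder on a new task's experience."""
    opt = torch.optim.SGD(encoder.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        embedding_loss(encoder(transitions)).backward()
        opt.step()
```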


Author(s):  
Qiuyuan Huang ◽  
Zhe Gan ◽  
Asli Celikyilmaz ◽  
Dapeng Wu ◽  
Jianfeng Wang ◽  
...  

We propose a hierarchically structured reinforcement learning approach to address the challenges of planning coherent multi-sentence stories for the visual storytelling task. Within our framework, the task of generating a story from a sequence of images is divided across a two-level hierarchical decoder. The high-level decoder constructs a plan by generating a semantic concept (i.e., topic) for each image in sequence. The low-level decoder generates a sentence for each image using a semantic compositional network, which effectively grounds the sentence generation in the topic. The two decoders are jointly trained end-to-end using reinforcement learning. We evaluate our model on the visual storytelling (VIST) dataset. Empirical results from both automatic and human evaluations demonstrate that the proposed hierarchically structured reinforced training achieves significantly better performance than a strong flat deep reinforcement learning baseline.
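A rough sketch of the two-level decoder (the GRU low-level decoder stands in for the paper's semantic compositional network, and all dimensions are illustrative):

```python
import torch
import torch.nn as nn

class HighLevelDecoder(nn.Module):
    """Consumes per-image features in sequence and emits one topic vector per image."""
    def __init__(self, feat_dim, topic_dim):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, topic_dim, batch_first=True)
    def forward(self, image_feats):            # [B, num_images, feat_dim]
        topics, _ = self.rnn(image_feats)      # [B, num_images, topic_dim]
        return topics

class LowLevelDecoder(nn.Module):
    """Greedily generates one sentence per image, conditioned on its topic."""
    def __init__(self, topic_dim, vocab_size, hidden_dim=256):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.rnn = nn.GRUCell(hidden_dim + topic_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)
    def forward(self, topic, max_len=20, bos_id=1):   # topic: [B, topic_dim]
        h = torch.zeros(topic.size(0), self.hidden_dim)
        token = torch.full((topic.size(0),), bos_id, dtype=torch.long)
        words = []
        for _ in range(max_len):
            h = self.rnn(torch.cat([self.embed(token), topic], dim=-1), h)
            token = self.out(h).argmax(dim=-1)
            words.append(token)
        return torch.stack(words, dim=1)               # [B, max_len] token ids
```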

