Hierarchical Active Tracking Control for UAVs via Deep Reinforcement Learning

2021 · Vol 11 (22) · pp. 10595
Author(s):  
Wenlong Zhao ◽  
Zhijun Meng ◽  
Kaipeng Wang ◽  
Jiahui Zhang ◽  
Shaoze Lu

Active tracking control is essential for UAVs performing autonomous operations in GPS-denied environments. In the active tracking task, UAVs take high-dimensional raw images as input and execute motor actions to actively follow a dynamic target. Most research focuses on three-stage methods: perception first, then high-level decision-making based on the extracted spatial information of the dynamic target, and finally UAV movement control using a low-level dynamic controller. Perception methods based on deep neural networks are powerful but require considerable effort for manual ground-truth labeling. Instead, we unify the perception and decision-making stages in a single high-level controller and leverage deep reinforcement learning to learn the mapping from raw images to high-level action commands in a V-REP-based environment, where simulation data are effectively unlimited and inexpensive. This end-to-end method also has the advantages of a small parameter size and reduced effort for parameter tuning in the decision-making stage. The high-level controller, which has a novel architecture, explicitly encodes the spatial and temporal features of the dynamic target. Auxiliary segmentation and motion-in-depth losses are introduced to generate denser training signals for the high-level controller's fast and stable training. The high-level controller and a conventional low-level PID controller constitute our hierarchical active tracking control framework for the UAVs' active tracking task. Simulation experiments show that our controller, trained with several augmentation techniques, generalizes well to dynamic targets with random appearances and velocities and achieves significantly better performance compared with three-stage methods.
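As an illustrative sketch only (not the authors' implementation), the hierarchy can be thought of as a learned high-level policy producing an action command that per-axis PID loops then track; the command layout, gains, and image shape below are assumptions.

```python
import numpy as np

class PID:
    """Minimal PID loop standing in for the low-level dynamic controller."""
    def __init__(self, kp, ki, kd, dt=0.05):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def high_level_policy(image_stack):
    """Stand-in for the learned high-level controller: in the paper this is a
    deep network trained with RL; here it returns a zero command so the loop runs."""
    return np.zeros(4)  # assumed command layout: (vx, vy, vz, yaw_rate)

def control_step(image_stack, measured_velocity, pids):
    """One tick of the hierarchy: high-level command, then per-axis PID tracking."""
    command = high_level_policy(image_stack)
    return np.array([pid.step(sp, v)
                     for pid, sp, v in zip(pids, command, measured_velocity)])

pids = [PID(kp=1.2, ki=0.01, kd=0.3) for _ in range(4)]          # one PID per axis
actuation = control_step(np.zeros((4, 84, 84)), np.zeros(4), pids)
```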

2021 · pp. 002224372199837
Author(s):  
Walter Herzog ◽  
Johannes D. Hattula ◽  
Darren W. Dahl

This research explores how marketing managers can avoid the so-called false consensus effect—the egocentric tendency to project personal preferences onto consumers. Two pilot studies were conducted to provide evidence for the managerial importance of this research question and to explore how marketing managers attempt to avoid false consensus effects in practice. The results suggest that the debiasing tactic most frequently used by marketers is to suppress their personal preferences when predicting consumer preferences. Four subsequent studies show that, ironically, this debiasing tactic can backfire and increase managers’ susceptibility to the false consensus effect. Specifically, the results suggest that these backfire effects are most likely to occur for managers with a low level of preference certainty. In contrast, the results imply that preference suppression does not backfire but instead decreases false consensus effects for managers with a high level of preference certainty. Finally, the studies explore the mechanism behind these results and show how managers can ultimately avoid false consensus effects—regardless of their level of preference certainty and without risking backfire effects.


2021 · Vol 31 (3) · pp. 1-26
Author(s):  
Aravind Balakrishnan ◽  
Jaeyoung Lee ◽  
Ashish Gaurav ◽  
Krzysztof Czarnecki ◽  
Sean Sedwards

Reinforcement learning (RL) is an attractive way to implement high-level decision-making policies for autonomous driving, but learning directly from a real vehicle or a high-fidelity simulator is variously infeasible. We therefore consider the problem of transfer reinforcement learning and study how a policy learned in a simple environment using WiseMove can be transferred to our high-fidelity simulator, WiseSim. WiseMove is a framework to study safety and other aspects of RL for autonomous driving. WiseSim accurately reproduces the dynamics and software stack of our real vehicle. We find that the accurately modelled perception errors in WiseSim contribute the most to the transfer problem. These errors, even when naively modelled in WiseMove, yield an RL policy that performs better in WiseSim than a hand-crafted rule-based policy. Applying domain randomization to the environment in WiseMove yields an even better policy. The final RL policy reduces failures due to perception errors from 10% to 2.75%. We also observe that the RL policy relies significantly less on velocity than the rule-based policy, having learned that its measurement is unreliable.
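As a rough sketch of the domain-randomization idea (the observation fields, noise ranges, and drop probability are assumptions, not the paper's error model):

```python
import random

def randomize_perception(observation, rng=None):
    """Illustrative randomization of perception errors in the training environment."""
    rng = rng or random.Random()
    noisy = dict(observation)
    # Scale the velocity reading, which the abstract reports as unreliable.
    noisy["ego_velocity"] = observation["ego_velocity"] * rng.uniform(0.8, 1.2)
    # Occasionally drop the lead-vehicle detection; otherwise jitter its range.
    if rng.random() < 0.05:
        noisy["lead_vehicle_distance"] = None
    else:
        noisy["lead_vehicle_distance"] = observation["lead_vehicle_distance"] + rng.gauss(0.0, 0.5)
    return noisy

print(randomize_perception({"ego_velocity": 12.0, "lead_vehicle_distance": 30.0},
                           random.Random(0)))
```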


Author(s):  
N. Sandhya Rani ◽  
M. Sarada Devi

Empowerment of tribal women is one of the central issues in the process of development all over the world. Empowerment is the process that allows one to gain the knowledge and attitude needed to cope with the changing world and the circumstances in which one lives [1]. Women's empowerment is a process in which women gain a greater share of control over material, human and intellectual resources, as well as control over decision-making in their home, community, society and nation. Given the need to analyze the empowerment status of tribal women, the present study aimed to enhance the empowerment status of tribal working women in India by strengthening their decision-making skills. The specific objective was to study the impact of an intervention on enhancing the empowerment status, through decision-making skills, of tribal working women in Utnoor Mandal, Adilabad district. The total sample for the study was 50 tribal working women, and the data were analyzed using a paired t-test. Results revealed that at pretest, the majority of the women were at an average level of decision-making skills (78%), 12% were at a low level, and only 10% were at a high level. After the intervention, post-test results revealed that 74% of the women were high in decision-making skills and the remaining 26% were at an average level; interestingly, none of the respondents remained at a low level. Thus, the intervention was found to be effective in developing and enhancing the respondents' empowerment status through decision-making skills.
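For reference, a pre/post comparison of this kind is commonly run with SciPy's paired t-test; the scores below are synthetic and purely illustrative, not the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
pre = rng.normal(20, 3, 50)         # illustrative pretest scores for 50 respondents
post = pre + rng.normal(4, 2, 50)   # assumed improvement after the intervention

t_stat, p_value = stats.ttest_rel(post, pre)
print(f"paired t = {t_stat:.2f}, p = {p_value:.4f}")
```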


Author(s):  
Rey Pocius ◽  
Lawrence Neal ◽  
Alan Fern

Commonly used sequential decision-making tasks, such as the games in the Arcade Learning Environment (ALE), provide rich observation spaces suitable for deep reinforcement learning. However, they consist mostly of low-level control tasks, which are of limited use for the development of explainable artificial intelligence (XAI) due to their fine temporal resolution. Many of these domains also lack built-in high-level abstractions and symbols. Existing tasks that provide both strategic decision-making and rich observation spaces are either difficult to simulate or intractable. We provide a set of new strategic decision-making tasks specialized for the development and evaluation of explainable AI methods, built as constrained mini-games within the StarCraft II Learning Environment.
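For orientation, the StarCraft II Learning Environment is usually driven through the pysc2 package; the sketch below loads a stock mini-game map and runs a no-op agent (the paper's constrained XAI mini-games would be supplied as their own map files).

```python
from pysc2.env import sc2_env
from pysc2.lib import actions, features

env = sc2_env.SC2Env(
    map_name="MoveToBeacon",          # a stock pysc2 mini-game, not one of the paper's maps
    players=[sc2_env.Agent(sc2_env.Race.terran)],
    agent_interface_format=features.AgentInterfaceFormat(
        feature_dimensions=features.Dimensions(screen=64, minimap=64)),
    step_mul=8,
    visualize=False)

timesteps = env.reset()
while not timesteps[0].last():
    # Replace the no-op with a learned, strategically acting agent.
    timesteps = env.step([actions.FunctionCall(actions.FUNCTIONS.no_op.id, [])])
env.close()
```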


Author(s):  
Daoming Lyu ◽  
Fangkai Yang ◽  
Bo Liu ◽  
Daesub Yoon

Deep reinforcement learning (DRL) has achieved great success by learning directly from high-dimensional sensory inputs, yet it is notorious for its lack of interpretability. Interpretability of the subtasks is critical in hierarchical decision-making, as it increases the transparency of black-box-style DRL approaches and helps RL practitioners better understand the high-level behavior of the system. In this paper, we introduce symbolic planning into DRL and propose a framework of Symbolic Deep Reinforcement Learning (SDRL) that can handle both high-dimensional sensory inputs and symbolic planning. Task-level interpretability is enabled by relating symbolic actions to options. The framework features a planner-controller-meta-controller architecture, whose components take charge of subtask scheduling, data-driven subtask learning, and subtask evaluation, respectively. The three components cross-fertilize each other and eventually converge to an optimal symbolic plan along with the learned subtasks, bringing together the long-term planning capability of symbolic knowledge and end-to-end reinforcement learning directly from high-dimensional sensory input. Experimental results validate the interpretability of the subtasks, along with improved data efficiency compared with state-of-the-art approaches.
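Schematically, the interaction of the three components might look like the loop below; every interface name here is a hypothetical placeholder for the role described in the abstract, not the authors' API.

```python
def sdrl_episode(planner, controller, meta_controller, env):
    """One episode of the planner-controller-meta-controller loop (schematic)."""
    plan = planner.plan(meta_controller.symbolic_state(env))    # subtask scheduling
    for symbolic_action in plan:
        option = controller.option_for(symbolic_action)         # symbolic action -> option
        trajectory = option.execute(env)                        # DRL policy on raw sensory input
        controller.update(option, trajectory)                   # data-driven subtask learning
        meta_controller.evaluate(symbolic_action, trajectory)   # subtask evaluation
    planner.refine(meta_controller.learned_costs())             # feed evaluations back to planning
```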


2019
Author(s):  
Charles Findling ◽  
Nicolas Chopin ◽  
Etienne Koechlin

Everyday life features uncertain and ever-changing situations. In such environments, optimal adaptive behavior requires higher-order inferential capabilities to grasp the volatility of external contingencies. These capabilities, however, involve complex computations that rapidly become intractable, so we poorly understand how humans develop efficient adaptive behaviors in such environments. Here we demonstrate a counterintuitive result: simple, low-level inferential processes involving imprecise computations conforming to the psychophysical Weber's law actually lead to near-optimal adaptive behavior, regardless of the environment's volatility. Using volatile experimental settings, we further show that such imprecise, low-level inferential processes accounted for observed human adaptive performance, unlike optimal adaptive models involving higher-order inferential capabilities, their biologically more plausible algorithmic approximations, and non-inferential adaptive models such as reinforcement learning. Thus, minimal inferential capabilities may have evolved along with imprecise neural computations, contributing to near-optimal adaptive behavior in real-life environments while leading humans to make suboptimal choices in canonical decision-making tasks.
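As a toy illustration of the core idea, an exact belief update can be corrupted by noise whose standard deviation scales with the update's magnitude, in the spirit of Weber's law; the log-odds formulation and the 0.2 Weber fraction are assumptions, not the authors' model.

```python
import numpy as np

def weber_noisy_update(belief_logodds, evidence_logodds, weber_fraction=0.2, rng=None):
    """Imprecise inference step: exact log-odds update plus Weber-scaled noise."""
    rng = rng or np.random.default_rng()
    noise_sd = weber_fraction * abs(evidence_logodds)
    return belief_logodds + evidence_logodds + rng.normal(0.0, noise_sd)

rng = np.random.default_rng(0)
belief = 0.0                           # prior log-odds
for evidence in [0.8, -0.3, 1.1]:      # illustrative evidence stream
    belief = weber_noisy_update(belief, evidence, rng=rng)
print(belief)
```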


Author(s):  
Lin Lan ◽  
Zhenguo Li ◽  
Xiaohong Guan ◽  
Pinghui Wang

Despite significant progress, deep reinforcement learning (RL) suffers from data inefficiency and limited generalization. Recent efforts apply meta-learning to learn a meta-learner from a set of RL tasks such that a novel but related task can be solved quickly. Though specific in some ways, different tasks in meta-RL are generally similar at a high level. However, most meta-RL methods do not explicitly and adequately model the specific and shared information among different tasks, which limits their ability to learn training tasks and to generalize to novel tasks. In this paper, we propose to capture the shared information on the one hand and to meta-learn how to quickly abstract the task-specific information on the other. Methodologically, we train an SGD meta-learner to quickly optimize a task encoder for each task, which generates a task embedding based on past experience. Meanwhile, we learn a policy that is shared across all tasks and conditioned on the task embeddings. Empirical results on four simulated tasks demonstrate that our method has better learning capacity on both training and novel tasks and attains up to 3 to 4 times higher returns than the baselines.
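A minimal PyTorch sketch of the two ingredients described above: a task encoder adapted per task by inner-loop SGD, and a policy shared across tasks and conditioned on the resulting embedding (layer sizes and the adaptation schedule are illustrative assumptions).

```python
import torch
import torch.nn as nn

class TaskEncoder(nn.Module):
    """Maps a batch of past transitions to a single task embedding."""
    def __init__(self, transition_dim, embed_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(transition_dim, 64), nn.ReLU(),
                                 nn.Linear(64, embed_dim))
    def forward(self, transitions):               # [N, transition_dim]
        return self.net(transitions).mean(dim=0)  # aggregate over past experience

class SharedPolicy(nn.Module):
    """Policy shared across all tasks, conditioned on the task embedding."""
    def __init__(self, obs_dim, embed_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + embed_dim, 64), nn.ReLU(),
                                 nn.Linear(64, act_dim))
    def forward(self, obs, task_embedding):       # obs: [B, obs_dim]
        z = task_embedding.expand(obs.size(0), -1)
        return self.net(torch.cat([obs, z], dim=-1))

def adapt_encoder(encoder, transitions, embedding_loss, steps=5, lr=1e-2):
    """Inner-loop SGD: quickly optimize the task encoder on a new task's experience."""
    opt = torch.optim.SGD(encoder.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        embedding_loss(encoder(transitions)).backward()
        opt.step()
```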


Author(s):  
Qiuyuan Huang ◽  
Zhe Gan ◽  
Asli Celikyilmaz ◽  
Dapeng Wu ◽  
Jianfeng Wang ◽  
...  

We propose a hierarchically structured reinforcement learning approach to address the challenges of planning coherent multi-sentence stories for the visual storytelling task. Within our framework, the task of generating a story from a sequence of images is divided across a two-level hierarchical decoder. The high-level decoder constructs a plan by generating a semantic concept (i.e., topic) for each image in sequence. The low-level decoder generates a sentence for each image using a semantic compositional network, which effectively grounds the sentence generation in the topic. The two decoders are jointly trained end-to-end using reinforcement learning. We evaluate our model on the visual storytelling (VIST) dataset. Empirical results from both automatic and human evaluations demonstrate that the proposed hierarchically structured reinforced training achieves significantly better performance than a strong flat deep reinforcement learning baseline.
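A rough sketch of the two-level decoder (the GRU low-level decoder stands in for the paper's semantic compositional network, and all dimensions are illustrative):

```python
import torch
import torch.nn as nn

class HighLevelDecoder(nn.Module):
    """Consumes per-image features in sequence and emits one topic vector per image."""
    def __init__(self, feat_dim, topic_dim):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, topic_dim, batch_first=True)
    def forward(self, image_feats):            # [B, num_images, feat_dim]
        topics, _ = self.rnn(image_feats)      # [B, num_images, topic_dim]
        return topics

class LowLevelDecoder(nn.Module):
    """Greedily generates one sentence per image, conditioned on its topic."""
    def __init__(self, topic_dim, vocab_size, hidden_dim=256):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.rnn = nn.GRUCell(hidden_dim + topic_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)
    def forward(self, topic, max_len=20, bos_id=1):   # topic: [B, topic_dim]
        h = torch.zeros(topic.size(0), self.hidden_dim)
        token = torch.full((topic.size(0),), bos_id, dtype=torch.long)
        words = []
        for _ in range(max_len):
            h = self.rnn(torch.cat([self.embed(token), topic], dim=-1), h)
            token = self.out(h).argmax(dim=-1)
            words.append(token)
        return torch.stack(words, dim=1)               # [B, max_len] token ids
```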

