Air Learning: a deep reinforcement learning gym for autonomous aerial robot visual navigation

2021 ◽  
Author(s):  
Srivatsan Krishnan ◽  
Behzad Boroujerdian ◽  
William Fu ◽  
Aleksandra Faust ◽  
Vijay Janapa Reddi

We introduce Air Learning, an open-source simulator and gym environment for deep reinforcement learning research on resource-constrained aerial robots. Equipped with domain randomization, Air Learning exposes a UAV agent to a diverse set of challenging scenarios. We seed the toolset with point-to-point obstacle-avoidance tasks in three different environments, along with Deep Q-Network (DQN) and Proximal Policy Optimization (PPO) trainers. Air Learning assesses the policies’ performance under various quality-of-flight (QoF) metrics, such as energy consumed, endurance, and average trajectory length, on resource-constrained embedded platforms like a Raspberry Pi. We find that trajectories on an embedded Raspberry Pi differ vastly from those predicted on a high-end desktop system, resulting in up to 40% longer trajectories in one of the environments. To understand the source of such discrepancies, we use Air Learning to artificially degrade high-end desktop performance to mimic what happens on a low-end embedded system. We then propose a mitigation technique that uses hardware-in-the-loop to determine the latency distribution of running the policy on the target platform (the onboard compute of the aerial robot). A latency randomly sampled from this distribution is then added as an artificial delay within the training loop. Training the policy with artificial delays allows us to minimize the hardware gap (the discrepancy in the flight-time metric is reduced from 37.73% to 0.5%). Thus, Air Learning with hardware-in-the-loop characterizes those differences and exposes how the choice of onboard compute affects the aerial robot’s performance. We also conduct reliability studies to assess the effect of sensor failures on the learned policies. All put together, Air Learning enables a broad class of deep RL research on UAVs. The source code is available at: https://github.com/harvard-edge/AirLearning.
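The artificial-delay idea can be sketched as follows; the class name, latency values, and tick length are illustrative stand-ins, not taken from the Air Learning codebase:

```python
import random

class LatencyWrapper:
    """Minimal sketch of hardware-in-the-loop latency injection: each
    training step samples an inference latency from a distribution
    measured on the target embedded platform, and the previous action
    is held for that long, narrowing the sim-to-hardware gap."""

    def __init__(self, latencies_ms, dt_ms=10, seed=0):
        self.latencies_ms = latencies_ms  # measured on-target latencies (ms)
        self.dt_ms = dt_ms                # simulator tick length (ms)
        self.rng = random.Random(seed)

    def delay_steps(self):
        # Convert one sampled inference latency into whole simulator
        # ticks; the agent's last action is repeated for this many steps.
        latency = self.rng.choice(self.latencies_ms)
        return max(1, round(latency / self.dt_ms))

# Hypothetical latency samples, e.g. policy inference on a Raspberry Pi.
wrapper = LatencyWrapper([30, 45, 38, 52, 41])
steps = [wrapper.delay_steps() for _ in range(5)]
```

A training loop would call `delay_steps()` once per action and advance the simulation that many ticks before the new action takes effect.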

2020 ◽  
Vol 2020 ◽  
pp. 1-7
Author(s):  
Fanyu Zeng ◽  
Chen Wang

Vanilla policy gradient methods suffer from high variance, leading to unstable policies during training, whose performance fluctuates drastically between iterations. To address this issue, we analyze the policy optimization process of a navigation method based on deep reinforcement learning (DRL) that uses asynchronous gradient descent for optimization. A variant navigation method (asynchronous proximal policy optimization navigation, appoNav) is presented that can guarantee monotonic policy improvement during policy optimization. Our experiments are conducted in DeepMind Lab, and the results show that artificial agents using appoNav outperform the baseline algorithm.
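appoNav builds on proximal policy optimization, whose clipped surrogate objective is what bounds how far each update can move the policy; a minimal, generic sketch of that objective (not the paper's implementation) is:

```python
def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate for one sample.

    ratio     : pi_new(a|s) / pi_old(a|s), the probability ratio
    advantage : estimated advantage of the taken action
    eps       : clip range limiting the per-update policy shift

    Taking the min of the clipped and unclipped terms is pessimistic:
    the objective never rewards moving the ratio outside [1-eps, 1+eps],
    which underpins the approximate monotonic-improvement behavior.
    """
    unclipped = ratio * advantage
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps) * advantage
    return min(unclipped, clipped)
```

For a positive advantage, a ratio of 1.5 is clipped to 1.2 (with the default `eps=0.2`), so the gradient through the ratio vanishes once the policy has moved far enough.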


Author(s):  
Zhenhuan Rao ◽  
Yuechen Wu ◽  
Zifei Yang ◽  
Wei Zhang ◽  
Shijian Lu ◽  
...  

2009 ◽  
Vol 29 (3) ◽  
pp. 87-90
Author(s):  
Chad Loseby ◽  
Peter Chapin ◽  
Carl Brandon

2020 ◽  
Vol 53 (2) ◽  
pp. 10810-10815
Author(s):  
Miguel S.E. Martins ◽  
Joaquim L. Viegas ◽  
Tiago Coito ◽  
Bernardo Marreiros Firme ◽  
João M.C. Sousa ◽  
...  

Aerospace ◽  
2021 ◽  
Vol 8 (6) ◽  
pp. 167
Author(s):  
Bartłomiej Brukarczyk ◽  
Dariusz Nowak ◽  
Piotr Kot ◽  
Tomasz Rogalski ◽  
Paweł Rzucidło

The paper presents automatic control of an aircraft in the longitudinal channel during automatic landing. The system presented in the paper has two crucial components: a vision system and an automatic landing system. The vision system processes images of dedicated on-ground markers, captured by an on-board video camera, to determine the glide path. The image processing algorithms were implemented on an embedded system and tested under laboratory conditions using the hardware-in-the-loop method. The output of the vision system served as one of the input signals to the automatic landing system, whose major components are control algorithms based on a fuzzy-logic expert system, created to imitate pilot actions while landing the aircraft. The two systems were connected to cooperate in controlling an aircraft model in a simulation environment. Selected test results showing control efficiency and precision are presented in the final section of the paper.
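A fuzzy-logic controller of the kind described can be sketched with triangular membership functions; the rule base, deviation ranges, and pitch outputs below are hypothetical illustrations, not the paper's actual expert system:

```python
def triangular(x, a, b, c):
    """Triangular membership function peaking at b over the support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def pitch_command(glide_dev_deg):
    """Map glide-path deviation (degrees, from the vision system) to a
    pitch command, imitating pilot action: below the path -> pitch up,
    on the path -> hold, above the path -> pitch down."""
    rules = [  # (rule activation over deviation, output pitch in degrees)
        (triangular(glide_dev_deg, -4.0, -2.0, 0.0), +2.0),  # below path
        (triangular(glide_dev_deg, -1.0,  0.0, 1.0),  0.0),  # on path
        (triangular(glide_dev_deg,  0.0,  2.0, 4.0), -2.0),  # above path
    ]
    # Weighted-average (centroid-style) defuzzification.
    num = sum(w * out for w, out in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.0
```

Intermediate deviations blend neighboring rules, giving the smooth, pilot-like response the expert-system approach aims for.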


2021 ◽  
Vol 11 (4) ◽  
pp. 1514 ◽  
Author(s):  
Quang-Duy Tran ◽  
Sang-Hoon Bae

To reduce the impact of congestion, it is necessary to improve our overall understanding of the influence of autonomous vehicles. Recently, deep reinforcement learning has become an effective means of solving complex control tasks. Accordingly, we present an advanced deep reinforcement learning approach that investigates how leading autonomous vehicles affect an urban network in a mixed-traffic environment, and we suggest a set of hyperparameters for achieving better performance. Firstly, we feed this set of hyperparameters into our deep reinforcement learning agents. Secondly, we run the leading-autonomous-vehicle experiment on the urban network with different autonomous vehicle penetration rates. Thirdly, the advantage of leading autonomous vehicles is evaluated against all-manual-vehicle and leading-manual-vehicle experiments. Finally, proximal policy optimization with a clipped objective is compared to proximal policy optimization with an adaptive Kullback–Leibler penalty to verify the superiority of the proposed hyperparameters. We demonstrate that fully automated traffic increased the average speed by a factor of 1.27 compared with the all-manual-vehicle experiment. Our proposed method becomes significantly more effective at higher autonomous vehicle penetration rates. Furthermore, the leading autonomous vehicles could help to mitigate traffic congestion.
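The adaptive Kullback–Leibler variant compared here adjusts its penalty coefficient between updates; a common update rule, following the original PPO formulation rather than necessarily the authors' exact settings, can be sketched as:

```python
def update_kl_beta(beta, observed_kl, kl_target=0.01):
    """Adaptive KL penalty coefficient update (PPO-Penalty style).

    If the policy moved too little relative to the target KL, loosen
    the penalty; if it moved too much, tighten it. The factors 1.5 and
    2 follow the standard heuristic from the original PPO paper.
    """
    if observed_kl < kl_target / 1.5:
        return beta / 2.0   # policy barely moved: penalize less
    if observed_kl > kl_target * 1.5:
        return beta * 2.0   # policy moved too far: penalize more
    return beta             # within tolerance: keep beta unchanged
```

In contrast, the clipped-objective variant needs no such coefficient, which is one reason it is often easier to tune; the paper's comparison makes that trade-off concrete.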


Transmisi ◽  
2020 ◽  
Vol 22 (4) ◽  
pp. 117-122
Author(s):  
Sadr Lufti Mufreni ◽  
Esi Putri Silmina

Indonesia is an archipelagic nation with more than 13,000 islands. Its territory lies between the Indian and Pacific Oceans and along the Pacific Ring of Fire, so it has many active volcanoes. Given this geography, the potential for tsunamis and earthquakes is quite high. A good disaster-management plan is needed to reduce the risks, and disaster mitigation is one part of it. Disaster mitigation is a series of efforts to reduce disaster risk, whether through physical construction or through raising awareness and building the capacity to face disaster threats. Mitigation is needed to reduce the resulting impact, especially loss of life; one approach is an early-warning system. An early-warning system consists of three main components: sensors that obtain values from the environment, a controller that processes the received values, and an action taken based on the processing results. Building an effective system requires adequate communication. Messaging queues are used in industry for communication between software, hardware, and embedded systems. This research focuses on using ActiveMQ Artemis as a messaging-queue server for communication with the Internet of Things (IoT). An advantage of ActiveMQ Artemis is that it can run on a Raspberry Pi 3 with only minor modifications. The results show that ActiveMQ Artemis can be used for IoT communication in a simulated disaster-mitigation system.
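The three-component pipeline (sensor, controller, action) can be sketched with an in-process queue standing in for a broker queue such as an ActiveMQ Artemis address; the threshold, message values, and alert format below are purely illustrative:

```python
import queue

# In-process stand-in for a broker queue (e.g. an ActiveMQ Artemis
# address); in the real system, sensors would publish over the network.
sensor_queue = queue.Queue()

def run_pipeline(readings, threshold=7.0):
    """Sensors publish readings to the queue; the controller consumes
    and processes each value; an alert action fires when a reading
    crosses the (illustrative) hazard threshold."""
    for r in readings:
        sensor_queue.put(r)                # sensor component: publish
    alerts = []
    while not sensor_queue.empty():
        magnitude = sensor_queue.get()     # controller component: consume
        if magnitude >= threshold:         # action component: decide
            alerts.append(f"ALERT: magnitude {magnitude}")
    return alerts
```

Decoupling producers and consumers through the queue is what lets heterogeneous sensors, controllers, and embedded devices communicate without knowing about one another, which is the property the broker provides at system scale.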

