Software Architecture for Autonomous and Coordinated Navigation of UAV Swarms in Forest and Urban Firefighting

Advances in the field of unmanned aerial vehicles (UAVs) have led to an exponential increase in their market, thanks to the development of innovative technological solutions aimed at a wide range of applications and services, such as emergency response and firefighting. The expansion of this market has been accompanied by the birth and growth of so-called UAV swarms, which owe their spread to their robustness, versatility, and efficiency. One aspect of these swarms that remains an open field of study is their autonomous and cooperative navigation. In this paper we present an architecture comprising a set of complementary methods that establish different control layers to enable the autonomous and cooperative navigation of a swarm of UAVs. These layers include a sampling-based global trajectory planner, algorithms for obstacle detection and avoidance, and methods for autonomous decision making based on deep reinforcement learning. The paper shows satisfactory results for a line-of-sight-based algorithm that smooths the global path planner's trajectories in 2D and 3D. In addition, a novel method for autonomous navigation of UAVs based on deep reinforcement learning is presented; it has been tested in two different simulation environments, with promising results for the use of these techniques to achieve autonomous navigation of UAVs.
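The abstract does not spell out the line-of-sight smoothing algorithm, so the following is only a minimal sketch of one common variant: greedy shortcutting, where each kept waypoint is connected to the farthest later waypoint it can "see". The spherical obstacle model and the names `segment_hits_sphere`, `has_line_of_sight`, and `smooth_path` are assumptions made for illustration, not the authors' implementation. Points are plain tuples, so the same code handles 2D and 3D.

```python
import math


def segment_hits_sphere(p, q, center, radius):
    """Return True if the segment p-q passes within `radius` of `center`.

    Works for 2D (circles) and 3D (spheres); points are coordinate tuples.
    """
    direction = [b - a for a, b in zip(p, q)]
    seg_len_sq = sum(c * c for c in direction)
    if seg_len_sq == 0.0:
        t = 0.0  # degenerate segment: p and q coincide
    else:
        # Project the center onto the segment and clamp to its endpoints.
        t = sum((c - a) * d for a, c, d in zip(p, center, direction)) / seg_len_sq
        t = max(0.0, min(1.0, t))
    nearest = [a + t * d for a, d in zip(p, direction)]
    return math.dist(nearest, center) <= radius


def has_line_of_sight(p, q, obstacles):
    """True if the straight segment p-q avoids every (center, radius) obstacle."""
    return not any(segment_hits_sphere(p, q, c, r) for c, r in obstacles)


def smooth_path(path, obstacles):
    """Greedy line-of-sight shortcutting of a waypoint list.

    Assumes (hypothetically) that consecutive waypoints of the input path
    are already collision-free, as a global planner would guarantee.
    """
    if len(path) < 3:
        return list(path)
    smoothed = [path[0]]
    i = 0
    while i < len(path) - 1:
        j = len(path) - 1
        # Walk back from the goal until a visible waypoint is found.
        while j > i + 1 and not has_line_of_sight(path[i], path[j], obstacles):
            j -= 1
        smoothed.append(path[j])
        i = j
    return smoothed
```

For example, with a single circular obstacle of radius 0.6 centered at (2, 1), the staircase path (0,0), (1,0), (2,0), (3,0), (4,0), (4,1), (4,2) reduces to the three waypoints (0,0), (4,0), (4,2). The greedy choice keeps the sketch short; it does not guarantee the shortest possible smoothed path.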


Generative Exploration and Exploitation

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i04.5858 ◽ 2020 ◽ Vol 34 (04) ◽ pp. 4337-4344

Author(s): Jiechuan Jiang ◽ Zongqing Lu

Keyword(s): Reinforcement Learning ◽ Prior Knowledge ◽ Single Agent ◽ Exploration And Exploitation ◽ Cooperative Navigation ◽ Multi Agent ◽ Novel Method

Sparse reward is one of the biggest challenges in reinforcement learning (RL). In this paper, we propose a novel method called Generative Exploration and Exploitation (GENE) to overcome sparse reward. GENE automatically generates start states to encourage the agent to explore the environment and to exploit received reward signals. GENE can adaptively trade off exploration and exploitation according to the varying distributions of states experienced by the agent as learning progresses. GENE relies on no prior knowledge about the environment and can be combined with any RL algorithm, whether on-policy or off-policy, single-agent or multi-agent. Empirically, we demonstrate that GENE significantly outperforms existing methods in three tasks with only binary rewards: Maze, Maze Ant, and Cooperative Navigation. Ablation studies verify the emergence of progressive exploration and automatic reversing.


Variable Precision Depth Encoding for 3D Range Geometry Compression

Electronic Imaging ◽ 10.2352/issn.2470-1173.2020.17.3dmp-034 ◽ 2020 ◽ Vol 2020 (17) ◽ pp. 34-1-34-7

Author(s): Matthew G. Finley ◽ Tyler Bell

Keyword(s): Normal Distribution ◽ Arbitrary Distribution ◽ File Size ◽ Geometry Compression ◽ Variable Precision ◽ Wide Range ◽ Novel Method ◽ Encoding Method ◽ Rgb Image ◽ Color Channels

This paper presents a novel method for accurately encoding 3D range geometry within the color channels of a 2D RGB image that allows the encoding frequency, and therefore the encoding precision, to be uniquely determined for each coordinate. The proposed method can thus be used to balance encoding precision against file size by encoding geometry along a normal distribution: more precisely where the density of data is high and less precisely where the density is low. Alternative distributions may be followed to produce encodings optimized for specific applications. In general, the nature of the proposed encoding method is such that the precision of each point can be freely controlled or derived from an arbitrary distribution, ideally enabling this method for use within a wide range of applications.


Deep Q Reinforcement Learning for Autonomous Navigation of Surgical Snake Robot in Confined Spaces

10.31256/hsmr2019.18 ◽ 2019

Author(s): S Athiniotis ◽ R A Srivatsan ◽ H Choset

Keyword(s): Reinforcement Learning ◽ Autonomous Navigation ◽ Snake Robot ◽ Confined Spaces


Expressing the Self

10.1093/oso/9780198786658.001.0001 ◽ 2018 ◽ Cited By ~ 1

Keyword(s): Philosophy Of Mind ◽ The Self ◽ Epistemic Status ◽ First Person ◽ Semantic Categories ◽ Wide Range ◽ Person Reference ◽ De Se ◽ Novel Method ◽ Cultural Pragmatics

This book addresses different linguistic and philosophical aspects of referring to the self in a wide range of languages from different language families, including Amharic, English, French, Japanese, Korean, Mandarin, Newari (Sino-Tibetan), Polish, Tariana (Arawak), and Thai. In the domain of speaking about oneself, languages use a myriad of expressions that cut across grammatical and semantic categories, as well as a wide variety of constructions. Languages of Southeast and East Asia famously employ a great number of terms for first-person reference to signal honorification. The number and mixed properties of these terms make them debatable candidates for pronounhood, with many grammar-driven classifications opting to classify them with nouns. Some languages make use of egophors or logophors, and many exhibit an interaction between expressing the self and expressing evidentiality qua the epistemic status of information held from the ego perspective. The volume’s focus on expressing the self, however, is not directly motivated by an interest in the grammar or lexicon, but instead stems from philosophical discussions of the special status of thoughts about oneself, known as de se thoughts. It is this interdisciplinary understanding of expressing the self that underlies this volume, comprising philosophy of mind at one end of the spectrum and cross-cultural pragmatics of self-expression at the other. This unprecedented juxtaposition results in a novel method of approaching de se and de se expressions, in which research methods from linguistics and philosophy inform each other. The importance of this interdisciplinary perspective on expressing the self cannot be overemphasized. 
Crucially, the volume also demonstrates that linguistic research on first-person reference makes a valuable contribution to research on the self tout court, by exploring the ways in which the self is expressed, and thereby adding to the insights gained through philosophy, psychology, and cognitive science.


Homogeneous 2D and 3D alignment of cardiomyocyte in dilated cardiomyopathy revealed by intravital heart imaging

Scientific Reports ◽ 10.1038/s41598-021-94100-z ◽ 2021 ◽ Vol 11 (1)

Author(s): Kiyoshi Masuyama ◽ Tomoaki Higo ◽ Jong-Kook Lee ◽ Ryohei Matsuura ◽ Ian Jones ◽ ...

Keyword(s): Heart Failure ◽ Dilated Cardiomyopathy ◽ Cardiac Dysfunction ◽ Three Dimensional ◽ Single Layer ◽ Specific Pattern ◽ Two Dimensional ◽ Heart Imaging ◽ Novel Method ◽ 2D And 3D

In contrast to hypertrophic cardiomyopathy, no specific pattern of cardiomyocyte arrangement has been reported in dilated cardiomyopathy (DCM), partly because alignment has not been assessed in three dimensions (3D). Here we have established a novel method to evaluate cardiomyocyte alignment in 3D using intravital heart imaging and demonstrated homogeneous alignment in DCM mice. Whereas cardiomyocytes of control mice changed their alignment from layer to layer in 3D and were positioned in a twisted manner even within a single layer, termed myocyte twist, cardiomyocytes of DCM mice aligned homogeneously in both two dimensions (2D) and 3D and lost myocyte twist. Manipulating cultured cardiomyocytes toward homogeneous alignment increased their contractility, suggesting that homogeneous alignment in DCM mice results from a form of alignment remodelling that compensates for cardiac dysfunction. Our findings provide the first intravital evidence of cardiomyocyte alignment and will bring new insights into understanding the mechanism of heart failure.


VAGADRONE: Intelligent and Fully Automatic Drone Based on Raspberry Pi and Android

Applied Sciences ◽ 10.3390/app11073153 ◽ 2021 ◽ Vol 11 (7) ◽ pp. 3153

Author(s): Saifeddine Benhadhria ◽ Mohamed Mansouri ◽ Ameni Benkhlifa ◽ Imed Gharbi ◽ Nadhem Jlili

Keyword(s): Remote Control ◽ Autonomous Navigation ◽ Object Identification ◽ External Control ◽ Raspberry Pi ◽ Optimal Trajectories ◽ Live Streaming ◽ Wide Range ◽ Fully Automatic ◽ Direct Use

Multirotor drones are currently widely used in several areas of life. Their main advantages are their suitable size and the tasks that they can perform. However, to the best of our knowledge, they must be controlled via remote control to fly from one point to another, and they can only be used for a specific mission (tracking, searching, computing, and so on). In this paper, we present an autonomous UAV based on Raspberry Pi and Android. Android offers a wide range of applications for direct use by the UAV depending on the context of the assigned mission. The applications cover a large number of areas such as object identification, facial recognition, and counting objects such as panels, people, and so on. In addition, the proposed UAV calculates optimal trajectories, provides autonomous navigation without external control, detects obstacles, and ensures live streaming during the mission. Experiments are carried out to test the above-mentioned criteria.


Collision-free path planning for welding manipulator via hybrid algorithm of deep reinforcement learning and inverse kinematics

Complex & Intelligent Systems ◽ 10.1007/s40747-021-00366-1 ◽ 2021

Author(s): Jie Zhong ◽ Tao Wang ◽ Lianglun Cheng

Keyword(s): Reinforcement Learning ◽ Path Planning ◽ Free Path ◽ Inverse Kinematics ◽ Multiple Dimensions ◽ Continuous State ◽ Planning Algorithm ◽ Convergence Performance ◽ Path Planner ◽ Action Spaces

In actual welding scenarios, an effective path planner is needed to find a collision-free path in the configuration space for a welding manipulator surrounded by obstacles. However, the state-of-the-art sampling-based planner satisfies only probabilistic completeness, and its computational complexity is sensitive to the state dimension. In this paper, we propose a path planner for welding manipulators based on deep reinforcement learning for solving path planning problems in high-dimensional continuous state and action spaces. Compared with the sampling-based method, it is more robust and less sensitive to the state dimension. In detail, to improve learning efficiency, we introduce an inverse kinematics module to provide prior knowledge, and we design a gain module to avoid locally optimal policies; both are integrated into the training algorithm. To evaluate our proposed planning algorithm in multiple dimensions, we conducted multiple sets of path planning experiments for welding manipulators. The results show that our method not only improves convergence performance but is also superior in optimality and robustness of planning compared with most other planning algorithms.


A Reinforcement Learning Framework for Spiking Networks with Dynamic Synapses

Computational Intelligence and Neuroscience ◽ 10.1155/2011/869348 ◽ 2011 ◽ Vol 2011 ◽ pp. 1-12 ◽ Cited By ~ 3

Author(s): Karim El-Laithy ◽ Martin Bogdan

Keyword(s): Reinforcement Learning ◽ Spike Timing ◽ Neural Representation ◽ Model Parameters ◽ Learning Framework ◽ Reference Target ◽ Wide Range ◽ Spiking Network ◽ Dynamic Synapses ◽ Exclusive Or

An integration of Hebbian-based and reinforcement learning (RL) rules is presented for dynamic synapses. The proposed framework permits the Hebbian rule to update the hidden synaptic model parameters regulating the synaptic response rather than the synaptic weights. This is performed using both the value and the sign of the temporal difference in the reward signal after each trial. Applying this framework, a spiking network with spike-timing-dependent synapses is tested on learning the exclusive-OR computation on a temporally coded basis. Reward values are calculated from the distance between the output spike train of the network and a reference target spike train. Results show that the network is able to capture the required dynamics and that the proposed framework does indeed yield an integrated version of Hebbian and RL. The proposed framework is tractable and less computationally expensive. The framework is applicable to a wide class of synaptic models and is not restricted to the neural representation used. This generality, along with the reported results, supports adopting the introduced approach to benefit from biologically plausible synaptic models in a wide range of intuitive signal processing.
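The abstract names the ingredients of the update (a spike-train distance, a per-trial reward, and the value and sign of the reward's temporal difference) but not their exact form. The sketch below is therefore only illustrative: the Hamming distance over binned spike trains, the reward scaling, and the names `spike_train_distance`, `reward_from_trains`, and `update_parameter` are all assumptions, not the paper's definitions.

```python
def spike_train_distance(output, target):
    """Count the time bins in which two binary spike trains differ.

    Hamming distance is an assumed stand-in; the paper's metric may differ.
    """
    if len(output) != len(target):
        raise ValueError("spike trains must have the same number of bins")
    return sum(1 for o, t in zip(output, target) if o != t)


def reward_from_trains(output, target):
    """Map distance to a reward in [0, 1]; 1.0 means a perfect match."""
    return 1.0 - spike_train_distance(output, target) / len(target)


def update_parameter(param, hebbian_term, reward, previous_reward,
                     learning_rate=0.1):
    """Scale a Hebbian term by the trial-to-trial change in reward.

    The product uses both the magnitude and the sign of the temporal
    difference: improving trials reinforce the change, worsening trials
    reverse it. `param` stands for a hidden synaptic model parameter,
    not a synaptic weight.
    """
    td = reward - previous_reward
    return param + learning_rate * td * hebbian_term
```

If the reward rises from 0.75 to 1.0 between two trials, a parameter with a Hebbian term of 1.0 moves up by 0.025 at the default learning rate; if the reward falls by the same amount, it moves down by 0.025.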


Goal-driven active learning

Autonomous Agents and Multi-Agent Systems ◽ 10.1007/s10458-021-09527-5 ◽ 2021 ◽ Vol 35 (2)

Author(s): Nicolas Bougie ◽ Ryutaro Ichise

Keyword(s): Decision Making ◽ Reinforcement Learning ◽ Learning Process ◽ Real World ◽ Imitation Learning ◽ Learning Approaches ◽ Wide Range ◽ Fixed Set ◽ Complex Decision Making ◽ Complex Decision

Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impracticable due to lack of state coverage or to distribution mismatch, which arises when the learner’s goal deviates from the demonstrated behaviors. Besides, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations to query the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the MuJoCo domain. Experimental results show that our method outperforms prior imitation learning approaches in most of the tasks in terms of exploration efficiency and average scores.
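The goal-prioritization strategy can be illustrated with a small sketch. The abstract does not define how disagreement is measured or how it is turned into sampling probabilities, so the choices here are assumptions: disagreement is the mean Euclidean distance between expert and policy actions over a batch of states, and goals are sampled in proportion to it. The names `disagreement`, `goal_sampling_weights`, and `sample_goal` are hypothetical.

```python
import math
import random


def disagreement(policy, expert, states, goal):
    """Mean Euclidean distance between policy and expert actions for a goal.

    `policy` and `expert` are callables (state, goal) -> action tuple.
    """
    return sum(math.dist(policy(s, goal), expert(s, goal)) for s in states) / len(states)


def goal_sampling_weights(policy, expert, states, goals):
    """Normalize per-goal disagreement into sampling probabilities."""
    scores = [disagreement(policy, expert, states, g) for g in goals]
    total = sum(scores)
    if total == 0.0:
        # Policy already matches the expert everywhere: fall back to uniform.
        return [1.0 / len(goals)] * len(goals)
    return [s / total for s in scores]


def sample_goal(policy, expert, states, goals, rng=random):
    """Draw one goal, favoring those where policy and expert disagree most."""
    weights = goal_sampling_weights(policy, expert, states, goals)
    return rng.choices(goals, weights=weights, k=1)[0]
```

Proportional sampling is used here instead of a hard argmax so that lower-disagreement goals are still visited occasionally; a strict "maximize disagreement" rule would simply pick the goal with the largest score.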


Deep Reinforcement Learning for End-to-End Local Motion Planning of Autonomous Aerial Robots in Unknown Outdoor Environments: Real-Time Flight Experiments

Sensors ◽ 10.3390/s21072534 ◽ 2021 ◽ Vol 21 (7) ◽ pp. 2534

Author(s): Oualid Doukhi ◽ Deok-Jin Lee

Keyword(s): Reinforcement Learning ◽ Real Time ◽ Autonomous Navigation ◽ Control Technique ◽ Local Motion ◽ Aerial Robots ◽ Novel Approach ◽ Outdoor Environments ◽ Aerial Vehicle ◽ High Level

Autonomous navigation and collision avoidance missions represent a significant challenge for robotics systems, as they generally operate in dynamic environments that require a high level of autonomy and flexible decision-making capabilities. This challenge is more acute for micro aerial vehicles (MAVs) due to their limited size and computational power. This paper presents a novel approach for enabling a micro aerial vehicle system equipped with a laser range finder to autonomously navigate among obstacles and reach a user-specified goal location in a GPS-denied environment, without the need for mapping or path planning. The proposed system uses an actor–critic-based reinforcement learning technique to train the aerial robot in a Gazebo simulator to perform a point-goal navigation task by directly mapping the MAV’s noisy state and laser scan measurements to continuous motion control. The obtained policy can perform collision-free flight in the real world despite being trained entirely in a 3D simulator. Intensive simulations and real-time experiments were conducted and compared with a nonlinear model predictive control technique to show the policy's generalization to new, unseen environments and its robustness against localization noise. The obtained results demonstrate our system’s effectiveness in flying safely and reaching the desired points by planning smooth forward linear velocity and heading rates.
