Drone Navigation and Avoidance of Obstacles Through Deep Reinforcement Learning

Bibliographic Details
Main Authors: Cetin, Ender, Barrado, Cristina, Munoz, Guillem, Macias, Miquel, Pastor, Enric
Format: Conference Proceeding
Language: English
Description
Summary: Unmanned aerial vehicles (UAVs), specifically drones, have been used for surveillance, shipping and delivery, wildlife monitoring, disaster management, and more. The increasing number of drones in the airspace worldwide will necessarily lead to fully autonomous drones. Given the expected huge number of drones, if they were all operated by human pilots, the probability of collisions could be too high. In this paper, a deep reinforcement learning (DRL) architecture is proposed to make drones behave autonomously inside a suburban neighborhood environment. The simulated environment contains plenty of obstacles, such as trees, cables, parked cars, and houses. In addition, other drones act as moving obstacles inside the environment while the learner drone has a goal to reach. In this way, the drone can be trained to detect stationary and moving obstacles inside the neighborhood, so that drones can be used safely in public areas in the future. The drone has a front camera that continuously captures depth images. Each depth image is part of the state used in the DRL architecture. Another part of the state is the distance to the geo-fence (a virtual barrier around the environment), added as a scalar value. The agent is rewarded negatively when it tries to cross the geo-fence limits. In addition, the heading angle to the goal and the elevation angle between the goal and the drone are added to the state. These scalar values are expected to improve the DRL performance and the reward obtained. The drone is trained using a Q-network, and its convergence and final reward are evaluated. The states, containing an image and several scalars, are processed by a neural network that joins the two state parts into a single flow. This neural network is named the Joint Neural Network (JNN) [1]. The training and test results show that the agent successfully learns to avoid the obstacles in the environment. The results for three scenarios are very promising: the learner drone reaches the destination with a success rate of 100% in the first two tests and 98% in the last test, which involves a total of three drones.
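
As an illustration of the architecture described in the abstract, below is a minimal sketch of a JNN-style Q-network in PyTorch. It is not the authors' published code: the depth-image resolution (84x84), the three scalar inputs (geo-fence distance, angle to goal, elevation angle), the layer widths, and the five discrete actions are all illustrative assumptions.

import torch
import torch.nn as nn

class JointQNetwork(nn.Module):
    def __init__(self, num_scalars: int = 3, num_actions: int = 5):
        super().__init__()
        # Convolutional branch for the 1-channel depth image (assumed 84x84).
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        conv_out = 64 * 7 * 7  # flattened feature size for an 84x84 input
        # Fully connected head that joins image features and scalar state.
        self.head = nn.Sequential(
            nn.Linear(conv_out + num_scalars, 512), nn.ReLU(),
            nn.Linear(512, num_actions),
        )

    def forward(self, depth: torch.Tensor, scalars: torch.Tensor) -> torch.Tensor:
        features = self.conv(depth)                    # (batch, 3136)
        joint = torch.cat([features, scalars], dim=1)  # join the two state parts
        return self.head(joint)                        # (batch, num_actions) Q-values

# Example: Q-values for a single state (zero image, zero scalars).
net = JointQNetwork()
q_values = net(torch.zeros(1, 1, 84, 84), torch.zeros(1, 3))

The key design choice, per the abstract, is concatenating the flattened convolutional features with the scalar state values before the fully connected layers, so that a single network outputs one Q-value per discrete action from the joint image-plus-scalar state.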
ISSN: 2155-7209
DOI: 10.1109/DASC43569.2019.9081749