Image-based Mapless Navigation of a Hybrid Aerial-Underwater Vehicle using Prioritized Deep Reinforcement Learning

de Jesus, Junior Costa; Kich, Victor Augusto; Kolling, Alisson Henrique; Grando, Ricardo Bedin; Guerra, Rodrigo da Silva; Drews-Jr, Paulo Lilles Jorge

Author(s):	de Jesus, Junior Costa; Kich, Victor Augusto; Kolling, Alisson Henrique; Grando, Ricardo Bedin; Guerra, Rodrigo da Silva; Drews-Jr, Paulo Lilles Jorge
Total number of authors:	6
Document type:	Scientific article
Source:	JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS; v. 111, n. 1, p. 13-pg., 2025-02-13.
Abstract
In recent years, Reinforcement Learning (RL) has made promising progress in several areas, such as control tasks and video games, by using simple, low-dimensional data. However, it struggles to process more complex, high-dimensional inputs such as raw pixel images, yielding results that fall short of those obtained with laser sensor data and of what many robotics applications demand. This paper introduces a new technique for mobile robotics called Contrastive Unsupervised Prioritized Representations in Reinforcement Learning (CUPRL). The approach combines RL and Contrastive Learning to handle high-dimensional observations effectively, an area that has not been fully explored. This capability is crucial for navigating complex environments, especially for hybrid robots such as Hybrid Unmanned Aerial-Underwater Vehicles (HUAUVs), which experience strong changes in light when moving between air and water. Our approach extracts important information from depth maps and RGB images during training, aiming to improve the ability of RL agents to navigate without a map in the context of HUAUVs, a field with much left to explore. Our tests in a robot simulator show that CUPRL, which learns from both RGB and depth images, performs better than current methods that rely only on pixel data. This is especially true for 3D mapless navigation, where only RGB images are used at test time. These results indicate that CUPRL could be useful for decision-making in HUAUVs. We believe our work not only offers improved solutions for navigation but also encourages further research into the use of high-dimensional data in RL, presenting a more efficient and adaptable method for complex environments than earlier strategies. (AU)
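
As an illustration of the contrastive-learning ingredient mentioned in the abstract, the following is a minimal sketch of a generic InfoNCE-style contrastive loss. It is not the authors' CUPRL implementation: the function name `info_nce_loss`, the temperature value, and the choice to treat the RGB and depth embeddings of the same observation as the positive pair are all assumptions made here for illustration, and the encoder, the RL agent, and the prioritization scheme are left out entirely.

```python
import math

def info_nce_loss(queries, keys, temperature=0.1):
    """Mean InfoNCE loss; queries[i] and keys[i] form the positive pair.

    Hypothetical helper for illustration only, not taken from the paper.
    """
    def normalize(v):
        # Scale to unit length so dot products become cosine similarities.
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    q = [normalize(v) for v in queries]
    k = [normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        # Similarity of query i to every key, scaled by the temperature.
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        # Numerically stable log-sum-exp over the positive and all negatives.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching key as the target class.
        total += log_sum - logits[i]
    return total / len(q)

# Assumed toy embeddings: row i of each list comes from the same observation.
rgb_embeddings = [[1.0, 0.0], [0.0, 1.0]]
depth_embeddings = [[0.9, 0.1], [0.2, 0.8]]

aligned = info_nce_loss(rgb_embeddings, depth_embeddings)
shuffled = info_nce_loss(rgb_embeddings, depth_embeddings[::-1])
print(aligned < shuffled)
```

With correctly paired embeddings the loss is close to zero, while mismatched pairs are penalized heavily; a loss of this kind is what pushes an encoder to produce similar representations for two views of the same scene.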

FAPESP grant:	24/10523-5 - Prh22.1 - digital technologies for the coastal ocean ecosystem in the oil, gas, and biofuel industry
Grantee:	Emanuel da Silva Diaz Estrada
Support type:	Research Grants - Regular
