RAPIDFlow: Recurrent Adaptable Pyramids with Iterative Decoding for Efficient Optical Flow Estimation

Author(s): Morimitsu, Henrique; Zhu, Xiaobin; Cesar-Jr, Roberto M.; Ji, Xiangyang; Yin, Xu-Cheng
Total number of authors: 5
Document type: Scientific article
Source: 2024 IEEE International Conference on Robotics and Automation (ICRA 2024); 7 pp., 2024.
Abstract

Extracting motion information from videos through optical flow estimation is vital in many practical robot applications. Current optical flow approaches achieve remarkable accuracy, but the top-performing methods have high computational costs and are unsuitable for embedded devices. Although some previous works have focused on developing low-cost optical flow strategies, their estimation quality lags noticeably behind that of more robust methods. In this paper, we develop a novel method to efficiently estimate high-quality optical flow on embedded devices. Our proposed RAPIDFlow model combines efficient NeXt1D convolution blocks with a fully recurrent structure based on feature pyramids to decrease computational costs without significantly impacting estimation accuracy. The adaptable recurrent encoder produces multi-scale features with a single shared block, which allows us to adjust the pyramid length at inference time and makes the model more robust to changes in input size. It also enables our model to offer multiple accuracy-speed tradeoffs to suit different applications. Experiments with a Jetson Orin NX embedded system on the public MPI-Sintel and KITTI benchmarks show that RAPIDFlow outperforms previous approaches by significant margins at faster speeds. Our code is available at https://github.com/hmorimitsu/ptlflow/tree/main/ptlflow/models/rapidflow.
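As a rough illustration of the recurrent encoder described in the abstract, the sketch below applies one shared block repeatedly, downsampling between applications, so the number of pyramid levels becomes an inference-time argument rather than a fixed architectural choice. This is a minimal PyTorch sketch under assumed details: the class name SharedPyramidEncoder is hypothetical, and the factorized 1x7/7x1 depthwise convolutions only approximate the paper's NeXt1D block; the linked repository contains the actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedPyramidEncoder(nn.Module):
    """Recurrent pyramid encoder: a single shared block is reused at every
    level, so pyramid depth can be chosen at inference time (hypothetical
    sketch, not the RAPIDFlow implementation)."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, kernel_size=3, stride=2, padding=1)
        # One shared block, loosely modeled on a ConvNeXt-style unit with
        # the 2D depthwise convolution factorized into 1D convolutions
        # (an assumption about what "NeXt1D" denotes).
        self.shared_block = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=(1, 7), padding=(0, 3), groups=channels),
            nn.Conv2d(channels, channels, kernel_size=(7, 1), padding=(3, 0), groups=channels),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.GELU(),
        )

    def forward(self, x: torch.Tensor, num_levels: int = 4):
        # Returns a list of features, one per pyramid level, coarsest last.
        feats = []
        x = self.stem(x)
        for _ in range(num_levels):
            x = self.shared_block(x)  # same weights at every level
            feats.append(x)
            x = F.avg_pool2d(x, kernel_size=2)  # halve resolution for next level
        return feats

encoder = SharedPyramidEncoder()
frame = torch.randn(1, 3, 256, 512)
# Pyramid length is an inference-time knob, trading accuracy for speed.
pyramid = encoder(frame, num_levels=4)
print([tuple(f.shape) for f in pyramid])

Because every level shares the same weights, deepening or shortening the pyramid at inference time changes compute without changing the parameter count, which is what lets a model like this expose several accuracy-speed operating points.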

FAPESP Grant: 15/22308-2 - Intermediate representations in Computational Science for knowledge discovery
Grantee: Roberto Marcondes Cesar Junior
Support type: Research Grants - Thematic
FAPESP Grant: 22/15304-4 - Learning context-rich representations for computer vision
Grantee: Nina Sumiko Tomita Hirata
Support type: Research Grants - Thematic