CV-C3D: Action Recognition on Compressed Videos with Convolutional 3D Networks

dos Santos, Samuel Felipe; Sebe, Nicu; Almeida, Jurandy; IEEE

Author(s):	dos Santos, Samuel Felipe ; Sebe, Nicu ; Almeida, Jurandy ; IEEE
Total Authors:	4
Document type:	Journal article
Source:	2019 32ND SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI); 7 pp., 2019-01-01.
Abstract
Action recognition in videos has gained substantial attention from the computer vision community due to its wide range of possible applications. Recent works have addressed this problem with deep learning methods. The main limitation of existing approaches is their difficulty in learning temporal dynamics, owing to the high computational load of processing the large amounts of data required to train a model. To overcome this problem, we propose a Compressed Video Convolutional 3D network (CV-C3D). It exploits information from the compressed representation of a video to avoid the high computational cost of fully decoding the video stream. This computational speed-up enables our network to use 3D convolutions to capture temporal context efficiently. Our network has the lowest computational complexity among all the compared approaches. On two public action recognition benchmarks, UCF-101 and HMDB-51, our approach achieved results comparable to the baselines, with the advantage of faster inference.
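
The abstract's central idea, that a 3D convolution captures temporal context by letting a kernel span several frames, can be illustrated with a minimal sketch. This is not the CV-C3D architecture and it does not read a compressed stream; it is a naive single-channel 3D convolution in plain Python, and the names `conv3d_valid`, `clip`, and `kernel` are hypothetical, introduced only for illustration.

```python
def conv3d_valid(clip, kernel):
    """Naive single-channel 3D cross-correlation (no padding, stride 1).

    clip:   nested list indexed as [t][y][x]
    kernel: nested list indexed as [dt][dy][dx]
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # The kernel covers kt consecutive frames, so each output
                # value mixes information across time as well as space.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Toy clip (an assumption for illustration): 4 frames of 3x3 pixels,
# where every pixel of frame t holds the value t.
clip = [[[float(t)] * 3 for _ in range(3)] for t in range(4)]

# Temporal-difference kernel of size 2x1x1: next frame minus current frame.
kernel = [[[-1.0]], [[1.0]]]

response = conv3d_valid(clip, kernel)
out_shape = (len(response), len(response[0]), len(response[0][0]))
```

With this toy input the output has one fewer frame than the clip, and each response value is the per-pixel change between consecutive frames. A 2D convolution applied to each frame independently cannot produce that kind of temporal signal.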

FAPESP's process:	17/25908-6 - Weakly supervised learning for compressed video analysis on retrieval and classification tasks for visual alert
Grantee:	João Paulo Papa
Support Opportunities:	Research Grants - Research Partnership for Technological Innovation - PITE


FAPESP's process:	18/21837-0 - Human Activity Understanding with Discriminative Models through Deep Learning on Compressed Videos
Grantee:	Jurandy Gomes de Almeida Junior
Support Opportunities:	Scholarships abroad - Research
