Human Activity Understanding with Discriminative Models through Deep Learning on Compressed Videos

Grant number: 18/21837-0
Support type: Scholarships abroad - Research
Effective date (Start): February 16, 2019
Effective date (End): February 15, 2020
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal researcher: Jurandy Gomes de Almeida Junior
Grantee: Jurandy Gomes de Almeida Junior
Host: Niculae Sebe
Home Institution: Instituto de Ciência e Tecnologia (ICT), Universidade Federal de São Paulo (UNIFESP), Campus São José dos Campos, São José dos Campos, SP, Brazil
Research place: Università degli Studi di Trento, Italy


Digital videos have become the medium of choice for a growing number of people communicating via the Internet and their mobile devices. Over the past decade, the world has witnessed explosive growth in the amount of video data, fostered by astonishing technological developments. In this scenario, there is a growing demand for efficient systems that reduce the work and information overload for people. Making efficient use of video content requires the development of intelligent tools capable of understanding videos much as humans do. This has been the goal of a quickly evolving research area known as video understanding. A crucial step toward video understanding is understanding human actions and activities. One of the main issues in human activity understanding is the extraction of useful information from video content. Recently, deep learning has been successfully used to train discriminative models that learn powerful and interpretable features for understanding visual content. However, due to the temporal dimension, training deep learning models on video data faces a number of practical difficulties, such as limited training samples and high computational cost. The goal of this research proposal is to tackle the computational overhead of training a deep learning model in order to improve its capacity to handle video data and advance the state of the art in human activity understanding. To this end, we plan to exploit relevant information about the visual content that is already available in the compressed representation used for video storage and transmission. This avoids the high computational cost of fully decoding the video stream and therefore greatly speeds up training, which has become a major bottleneck of deep learning.
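The proposal's key idea is to learn from signals already present in the compressed bitstream (e.g., decoded I-frames plus the per-macroblock motion vectors of P-frames) rather than from fully decoded video. A minimal, purely illustrative sketch of this two-stream, late-fusion setup follows; all shapes, function names, and the random untrained "classifier" are assumptions standing in for actual CNNs and a real bitstream parser, not the project's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_gop(h=64, w=64, n_p=11):
    # Toy stand-in for one group of pictures (GOP) from a compressed video:
    # a single decoded I-frame plus the motion vectors of its P-frames,
    # which the bitstream carries directly, so no full decoding is needed.
    i_frame = rng.random((h, w, 3))                  # RGB I-frame
    motion = rng.random((n_p, h // 16, w // 16, 2))  # per-macroblock vectors
    return i_frame, motion

def appearance_features(i_frame):
    # Placeholder for a 2D CNN on the I-frame: global average per channel.
    return i_frame.mean(axis=(0, 1))                 # shape (3,)

def motion_features(motion):
    # Placeholder for a lightweight network on motion vectors: mean
    # magnitude plus mean horizontal/vertical components over time and space.
    mag = np.linalg.norm(motion, axis=-1).mean()
    return np.array([mag, motion[..., 0].mean(), motion[..., 1].mean()])

def classify(video_gops, n_classes=4):
    # Late fusion: average per-GOP features from both streams, then apply
    # a random (untrained) linear classifier as a stand-in.
    feats = np.concatenate([
        np.mean([appearance_features(i) for i, _ in video_gops], axis=0),
        np.mean([motion_features(m) for _, m in video_gops], axis=0),
    ])
    w = rng.random((n_classes, feats.size))
    return int(np.argmax(w @ feats))

video = [make_gop() for _ in range(8)]  # 8 GOPs of 12 frames each
print("predicted class:", classify(video))
```

The point of the sketch is the data flow: the motion stream consumes tensors roughly 16× smaller per spatial axis than decoded frames, which is where the training-time savings the abstract targets would come from.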
