Video-to-video dynamics transfer with deep generative models

Grant number: 17/16144-2
Support type: Scholarships in Brazil - Doctorate
Effective date (Start): September 01, 2018
Effective date (End): October 12, 2021
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computer Systems
Principal researcher: Gerberth Adín Ramírez Rivera
Grantee: Juan Felipe Hernández Albarracín
Home Institution: Instituto de Computação (IC). Universidade Estadual de Campinas (UNICAMP). Campinas, SP, Brazil

Abstract

Automatic content generation (or synthesis) is a field that has seen a remarkable boost in recent years with the advent of deep generative models. Nowadays, neural networks can create text, images, and videos conditioned on class labels or other media. Most of the research in this area focuses on semantic image editing (e.g., style transfer or object transfiguration) and video prediction. In the video domain in particular, relatively little research addresses goals other than generating future frames of an input sequence. Therefore, in this proposal we intend to explore video synthesis by extending existing ideas toward dynamics transfer, a scarcely explored scenario with potential new applications. We aim to generate videos by transferring the dynamics of the objects in a video A onto the objects of a video B. Our proposed approach consists of training an autoencoder-like architecture that splits the given videos into an appearance-dynamics hyperspace used to synthesize the videos. To perform the transfer, we combine one video's appearance representation with another video's dynamics representation (both extracted by the encoder), and then use the decoder to generate the output video. We will use a video-dynamics cross-training approach to learn robust appearance and dynamics spaces. The main challenge we face is the lack of robust autoencoders for video data, as current generative methods do not yet achieve fully natural motion in video sequences. Thus, our work needs to create robust autoencoders for video generation while solving the inter-frame motion consistency problem. (AU)
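
To make the appearance-dynamics factorization concrete, the following is a minimal sketch (in PyTorch) of an autoencoder where transfer amounts to decoding one clip's appearance code with another clip's dynamics code. It is not the proposal's actual architecture: the module names, layer sizes, the choice of a single-frame 2D appearance encoder, and the 3D-convolutional dynamics encoder are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class DynamicsTransferAE(nn.Module):
    """Hypothetical autoencoder that factors a video clip into an
    appearance code and a dynamics code, following the abstract's idea."""

    def __init__(self, channels=3, app_dim=128, dyn_dim=128):
        super().__init__()
        # Appearance encoder: 2D convs over a single reference frame.
        self.app_enc = nn.Sequential(
            nn.Conv2d(channels, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, app_dim, 4, stride=2, padding=1), nn.ReLU(),
        )
        # Dynamics encoder: 3D convs over the whole clip (keeps the time axis).
        self.dyn_enc = nn.Sequential(
            nn.Conv3d(channels, 64, (3, 4, 4), stride=(1, 2, 2), padding=1), nn.ReLU(),
            nn.Conv3d(64, dyn_dim, (3, 4, 4), stride=(1, 2, 2), padding=1), nn.ReLU(),
        )
        # Decoder: reconstructs a clip from the concatenated codes.
        self.dec = nn.Sequential(
            nn.ConvTranspose3d(app_dim + dyn_dim, 64, (3, 4, 4),
                               stride=(1, 2, 2), padding=1), nn.ReLU(),
            nn.ConvTranspose3d(64, channels, (3, 4, 4),
                               stride=(1, 2, 2), padding=1), nn.Sigmoid(),
        )

    def encode(self, video):
        # video: (B, C, T, H, W)
        appearance = self.app_enc(video[:, :, 0])   # (B, app_dim, H/4, W/4)
        dynamics = self.dyn_enc(video)              # (B, dyn_dim, T, H/4, W/4)
        return appearance, dynamics

    def decode(self, appearance, dynamics):
        # Broadcast the static appearance code over the time axis, then decode.
        app = appearance.unsqueeze(2).expand(-1, -1, dynamics.size(2), -1, -1)
        return self.dec(torch.cat([app, dynamics], dim=1))  # (B, C, T, H, W)

# Transfer sketch: animate video_b's appearance with video_a's dynamics.
# video_a, video_b: tensors of shape (B, 3, T, H, W) with values in [0, 1].
# app_b, _ = model.encode(video_b)
# _, dyn_a = model.encode(video_a)
# transferred = model.decode(app_b, dyn_a)
```

In this sketch, the cross-training mentioned in the abstract would correspond to also reconstructing each clip from its own two codes and penalizing the swapped decodings, so that the appearance and dynamics subspaces remain disentangled; the specific losses are left unspecified here.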