Scholarship 20/14452-4 - Graphs, Natural language processing - BV FAPESP

Visual question answering task with graph convolution networks

Grant number: 20/14452-4
Support Opportunities: Scholarships in Brazil - Master
Start date: May 01, 2021
End date: August 31, 2023
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Gerberth Adín Ramírez Rivera
Grantee: Bruno César de Oliveira Souza
Host Institution: Instituto de Computação (IC). Universidade Estadual de Campinas (UNICAMP). Campinas, SP, Brazil
Associated scholarship(s): 22/09849-8 - Noisy scene graph with self-supervised learning on graph neural network for visual question answering task, BE.EP.MS

Abstract

Visual Question Answering (VQA) is a task that aims to answer a user's question grounded in a given image. This task normally requires a combination of concepts from Computer Vision and Natural Language Processing. Most existing VQA systems merge the extracted image and question features in order to predict an answer. Nonetheless, this multi-modal fusion leaves a significant gap in the semantic understanding of the relationship between the image and the question. To achieve a more holistic understanding of the scene, we propose a graph-based approach that combines the question features with those of the input image. The main objective of our research is to advance visual question answering by using a graph representation whose structure improves the connections between features. For this purpose, it is necessary to create architectures that produce a graph representation encoding the features of the image's content, the natural language question, and their relationships. We then intend to use a graph neural network (GNN) to learn, from this VQA graph representation, the relationships between a specific question and the input image in which it is grounded, in order to predict the correct answer. Finally, to bring more 'reason' to our proposal, we aim to use the novel 'fact-based' visual question answering (FVQA) task. A 'fact-based' approach provides the model with a candidate list of facts related to the question. The method receives these facts from a knowledge base (KB) extracted from different sources of information. (AU)
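
As a rough illustration of the graph-based idea in the abstract, the sketch below builds a small graph whose nodes are image-region features plus one question node, and applies a single graph convolution step of the common form ReLU(D^-1/2 (A + I) D^-1/2 H W). This is not the project's implementation: the function names, the star-shaped connectivity (the question node linked to every region), and the toy feature values are all assumptions made for the example.

```python
import math


def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj, features, weights):
    """One graph convolution step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


def build_vqa_graph(region_feats, question_feat):
    """Hypothetical graph: region nodes plus one question node linked to each region."""
    nodes = region_feats + [question_feat]
    n = len(nodes)
    q = n - 1  # index of the question node
    adj = [[0.0] * n for _ in range(n)]
    for i in range(q):
        adj[i][q] = adj[q][i] = 1.0
    return nodes, adj


# Toy inputs: two image regions and one question embedding, all 2-dimensional.
regions = [[1.0, 0.0], [0.0, 1.0]]
question = [1.0, 1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned weights

nodes, adj = build_vqa_graph(regions, question)
out = gcn_layer(adj, nodes, identity)
```

After one step, each region node's features are mixed with the question node's, and the question node aggregates both regions. A real system would learn the weights, stack several layers, and feed the resulting node features to an answer classifier.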


Academic Publications
(References retrieved automatically from State of São Paulo Research Institutions)
SOUZA, Bruno César de Oliveira. Melhoramento de informações visuais em tarefas de respostas a questões baseadas em imagens com dados em grafos de cena utilizando aprendizagem autossupervisionada [Improving visual information in image-based question answering tasks with scene graph data using self-supervised learning]. 2023. Master's Dissertation - Universidade Estadual de Campinas (UNICAMP). Instituto de Computação, Campinas, SP.