Author(s): | Veltroni, Wellington Cristiano; Caseli, Helena de Medeiros; Villavicencio, A; Moreira, V; Abad, A; Caseli, H; Gamallo, P; Ramisch, C; Oliveira, HG; Paetzold, GH. Total number of authors: 10 |
Document type: | Scientific Article |
Source: | COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROPOR 2018; v. 11122, 11 pp., 2018-01-01. |
Abstract | |
Text-image alignment is the task of aligning elements in a text with elements in the image accompanying it. It can be applied, for example, to news articles, improving clarity by explicitly defining the correspondence between regions in the article's image and words or named entities in the article's text. It can also be a useful step in many multimodal applications such as image captioning or image description/comprehension. In this paper we present LinkPICS, an automatic aligner that combines Natural Language Processing (NLP) and Computer Vision (CV) techniques to explicitly define the correspondence between regions of an image (bounding boxes) and elements (words or named entities) in a text. LinkPICS performs the alignment of people and the alignment of objects (or animals, vehicles, etc.) as two distinct processes. In the experiments presented in this paper, LinkPICS obtained a precision of 97% in the alignment of people and 73% in the alignment of objects in Portuguese-language articles from a Brazilian news site. (AU) | |
FAPESP grant: | 16/13002-0 - MMeaning - multimodal distributed semantic representation |
Grantee: | Helena de Medeiros Caseli |
Support type: | Research Grant - Regular |