Multimodal Representation Space for Text-Guided Data Generation
Author(s): Veltroni, Wellington Cristiano; Caseli, Helena de Medeiros; Villavicencio, A.; Moreira, V.; Abad, A.; Caseli, H.; Gamallo, P.; Ramisch, C.; Oliveira, H. G.; Paetzold, G. H.
Total Authors: 10
Document type: Journal article
Source: COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROPOR 2018; v. 11122, 11 pp., 2018-01-01.
Abstract:
Text-image alignment is the task of aligning elements in a text with elements in the image accompanying it. Text-image alignment can be applied, for example, to news articles to improve clarity by explicitly defining the correspondence between regions in the article's image and words or named entities in the article's text. It can also be a useful step in many multimodal applications such as image captioning or image description/comprehension. In this paper we present LinkPICS, an automatic aligner which combines Natural Language Processing (NLP) and Computer Vision (CV) techniques to explicitly define the correspondence between regions of an image (bounding boxes) and elements (words or named entities) in a text. LinkPICS performs the alignment of people and of objects (or animals, vehicles, etc.) as two distinct processes. In the experiments presented in this paper, LinkPICS obtained a precision of 97% in the alignment of people and of 73% in the alignment of objects in articles in Portuguese from a Brazilian news site. (AU)
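The abstract describes aligning detected image regions (bounding boxes) with words or named entities, treating people and objects as two separate alignment processes. The sketch below is a minimal, self-contained illustration of that idea only; the data structures, the left-to-right and label-matching heuristics, and the example inputs are assumptions for illustration and are not the LinkPICS method itself.

```python
# Hypothetical sketch of text-image alignment by label matching.
# The names, heuristics, and inputs here are illustrative assumptions;
# they do not reproduce the LinkPICS implementation described in the paper.
from dataclasses import dataclass


@dataclass
class BoundingBox:
    label: str  # detector class, e.g. "person", "dog", "car"
    x: int
    y: int
    w: int
    h: int


@dataclass
class TextElement:
    surface: str  # word or named entity as it appears in the text
    kind: str     # "PERSON" for named entities, "NOUN" for object words


def align_people(elements, boxes):
    """Pair PERSON entities with person boxes, left to right (assumed heuristic)."""
    people = sorted((b for b in boxes if b.label == "person"), key=lambda b: b.x)
    names = [e for e in elements if e.kind == "PERSON"]
    return list(zip(names, people))


def align_objects(elements, boxes):
    """Pair object words with non-person boxes whose detector label matches the word."""
    pairs = []
    for element in elements:
        if element.kind != "NOUN":
            continue
        for box in boxes:
            if box.label != "person" and box.label == element.surface.lower():
                pairs.append((element, box))
                break
    return pairs


if __name__ == "__main__":
    boxes = [
        BoundingBox("person", 30, 40, 80, 200),
        BoundingBox("dog", 150, 120, 60, 50),
    ]
    text = [
        TextElement("Maria Silva", "PERSON"),
        TextElement("dog", "NOUN"),
    ]
    for element, box in align_people(text, boxes) + align_objects(text, boxes):
        print(f"{element.surface!r} -> {box.label} at ({box.x}, {box.y})")
```

In practice, the text elements would come from an NER/POS pipeline and the boxes from an object detector; the two-process split above simply mirrors the people-versus-objects distinction stated in the abstract.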
FAPESP's process: 16/13002-0 - MMeaning - multimodal distributional semantic models
Grantee: Helena de Medeiros Caseli
Support Opportunities: Regular Research Grants