

Multi-Script Video Caption Localization Based on Visual Rhythms

Author(s):
Roberto e Souza, Marcos ; Maia, Helena de Almeida ; Souza e Santos, Anderson Carlos ; Vieira, Marcelo Bernardes ; Pedrini, Helio
Total number of authors: 5
Document type: Scientific article
Source: APPLIED ARTIFICIAL INTELLIGENCE; v. 36, n. 1, p. 32-pg., 2022-02-05.
Abstract

Localization of video captions plays an important role in information retrieval in multimedia applications. In this work, we present and evaluate a novel method for localizing video captions using visual rhythms, which enable the representation and analysis of a specific feature over time. We build visual rhythms from the text location maps produced by general text localization methods, which are far more common in the literature than caption-oriented ones. We then process the maps to keep only the captions, generating caption localization masks. To meet the need for a large, standardized dataset, we constructed a new one in which captions in thirteen different scripts are added to the video frames, generating a total of 221 videos with ground truth. Experiments demonstrate that our method achieves competitive results when compared to other approaches in the literature.
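To make the idea concrete, here is a minimal sketch of how a visual rhythm could be built from per-frame text location maps and used to keep only caption-like regions. It is not the authors' implementation: the abstract does not specify the sampling or filtering steps, so the row-wise projection, the persistence filter, and the names `build_visual_rhythm`, `keep_persistent_rows`, `caption_masks`, and `min_frames` are all assumptions made for illustration. The underlying intuition is that captions stay in the same rows for many consecutive frames, whereas incidental scene text tends to be short-lived.

```python
import numpy as np

def build_visual_rhythm(text_maps):
    """Collapse each frame's text map into one column and stack the columns over time.

    text_maps: array of shape (T, H, W), nonzero where a text detector fired.
    Returns an (H, T) image: fraction of text pixels in each row of each frame.
    (Row-wise projection is an assumed choice, not taken from the paper.)
    """
    maps = np.asarray(text_maps, dtype=float)
    return maps.mean(axis=2).T

def keep_persistent_rows(rhythm, min_frames, threshold=0.0):
    """Keep only runs along the time axis that last at least min_frames frames."""
    active = rhythm > threshold                 # (H, T) boolean
    kept = np.zeros_like(active)
    n_rows, n_frames = active.shape
    for r in range(n_rows):
        start = None
        # Iterate one step past the end so a run touching the last frame is closed.
        for t in range(n_frames + 1):
            on = t < n_frames and active[r, t]
            if on and start is None:
                start = t
            elif not on and start is not None:
                if t - start >= min_frames:
                    kept[r, start:t] = True
                start = None
    return kept

def caption_masks(text_maps, min_frames):
    """Back-project the filtered rhythm onto the frames to get caption masks."""
    maps = np.asarray(text_maps, dtype=bool)
    kept = keep_persistent_rows(build_visual_rhythm(maps), min_frames)  # (H, T)
    # kept.T has shape (T, H); broadcasting over W keeps text only in persistent rows.
    return maps & kept.T[:, :, None]

# Toy input: 6 frames of 4x5 pixels (hypothetical data).
toy_maps = np.zeros((6, 4, 5), dtype=bool)
toy_maps[:, 1, 1:4] = True    # caption-like text present in every frame
toy_maps[2, 3, :] = True      # scene text visible in a single frame
toy_masks = caption_masks(toy_maps, min_frames=3)
```

In this toy input the text in row 1 spans all six frames and survives, while the single-frame text in row 3 is discarded. A real pipeline would take `text_maps` from an actual text detector and would likely need a more careful filter than a fixed duration threshold.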

FAPESP grant: 17/12646-3 - Déjà vu: temporal, spatial, and characterization coherence of heterogeneous data for integrity analysis and interpretation
Grantee: Anderson de Rezende Rocha
Support type: Research Grant - Thematic