What should we pay attention to when classifying violent videos?

Author(s):
Teixeira, Marcos ; Avila, Sandra ; ASSOC COMP MACHINERY
Total Authors: 3
Document type: Journal article
Source: ARES 2021: 16TH INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY AND SECURITY; 10 pp., 2021.
Abstract

Many works on violent video classification have proposed solutions ranging from local descriptors to deep neural networks. Most approaches use the entire representation of the video as input to extract the relevant features. However, some scenes may contain noisy and irrelevant parts that confuse the algorithm. We investigated the effectiveness of attention-based models in dealing with this problem. We extended the initial implementations to work with multimodal features using a late fusion approach. We performed experiments on three datasets with different concepts of violence: Hockey Fights, MediaEval 2015, and RWF-2000. We conducted quantitative experiments, analyzing the performance of attention-based models and comparing them with traditional methods, and qualitative experiments, analyzing the relevance scores produced by the attention-based models. Attention-based models surpassed their traditional counterparts in all cases. They also achieved better results than many more expensive approaches, highlighting the advantage of their use. (AU)
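The core idea the abstract describes — weighting video segments by learned relevance instead of treating the whole video uniformly, then combining per-modality predictions by late fusion — can be sketched as follows. This is a minimal illustration with NumPy, not the paper's implementation; the function names, the scoring vector `w`, and the toy features are all assumptions for demonstration.

```python
import numpy as np

def attention_pool(features, w):
    """Attention pooling sketch: score each segment, softmax the scores
    into relevance weights, and return the weighted sum of segment features.
    `features`: (T, D) per-segment feature matrix; `w`: (D,) scoring vector.
    Both are illustrative placeholders, not the paper's actual parameters."""
    scores = features @ w                            # (T,) raw relevance scores
    scores = scores - scores.max()                   # subtract max for numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()    # softmax attention weights, sum to 1
    return alpha @ features, alpha                   # pooled (D,) clip descriptor + weights

def late_fusion(modality_probs):
    """Late fusion sketch: average the per-modality violence probabilities."""
    return float(np.mean(modality_probs))

# Toy example: one modality with 4 segments of 3-dim features.
rgb = np.array([[0.1, 0.2, 0.0],
                [0.9, 0.8, 0.7],    # a highly "relevant" segment
                [0.1, 0.1, 0.2],
                [0.0, 0.2, 0.1]])
w = np.ones(3)                      # hypothetical scoring vector
pooled, alpha = attention_pool(rgb, w)
fused = late_fusion([0.8, 0.6])     # e.g. visual and audio classifier outputs
```

In this sketch the second segment receives the largest attention weight, which mirrors the qualitative analysis in the abstract: the relevance scores indicate which parts of the scene drive the prediction, while noisy segments are down-weighted rather than averaged in.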

FAPESP's process: 13/08293-7 - CCES - Center for Computational Engineering and Sciences
Grantee: Munir Salomao Skaf
Support Opportunities: Research Grants - Research, Innovation and Dissemination Centers - RIDC
FAPESP's process: 17/16246-0 - Sensitive media analysis through deep learning architectures
Grantee: Sandra Eliza Fontes de Avila
Support Opportunities: Regular Research Grants