Post or Block? Advances in Automatically Filtering Undesired Comments

Alberto, Tulio C.; Lochter, Johannes V.; Almeida, Tiago A.

Author(s):	Alberto, Tulio C.; Lochter, Johannes V.; Almeida, Tiago A. Total number of authors: 3
Document type:	Scientific Article
Source:	JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS; v. 80, p. 15-pg., 2015-12-01.
Abstract
Currently, a great volume of the information available on websites such as social networks, forums and blogs comes from interaction with users, who can post comments and sometimes develop the habit of visiting these sites regularly. Some blogs specialized in certain subjects gain their users' trust and become references in the field. Nevertheless, the ease of inserting content through text comments makes room for unwanted messages, which affect the user experience, reduce the quality of the information provided by the websites and indirectly cause personal and economic losses. In this scenario, this paper presents a comprehensive study of established machine learning techniques applied to automatically detecting undesired comments posted on blogs. Furthermore, different sets of attributes were evaluated along with text normalization techniques. Experiments carried out with a real and public database indicate that support vector machines, logistic regression and stacking ensemble methods, trained with attributes extracted from both the text messages and the posting information, are promising for the task of blocking undesired comments. (AU)

FAPESP Grant:	13/10005-0 - Contributions to combating blog comment spamming
Grantee:	Túlio Casagrande Alberto
Support type:	Scholarships in Brazil - Scientific Initiation
