An Ensemble Approach to Cross-Domain Authorship Attribution

Author(s):
Custodio, Jose Eleandro ; Paraboni, Ivandre ; Crestani, F ; Braschler, M ; Savoy, J ; Rauber, A ; Muller, H ; Losada, DE ; Burki, GH ; Cappellato, L ; Ferro, N
Total Authors: 11
Document type: Journal article
Source: EXPERIMENTAL IR MEETS MULTILINGUALITY, MULTIMODALITY, AND INTERACTION (CLEF 2019); v. 11696, 12 pp., 2019-01-01.
Abstract

This paper presents an ensemble approach to cross-domain authorship attribution that combines the predictions of three independent classifiers based on, respectively, standard character n-grams, character n-grams with non-diacritic distortion, and word n-grams. Our proposal relies on variable-length n-gram models and multinomial logistic regression, and selects the highest-probability prediction among the three models as the output for the task. The approach is compared against a number of baseline systems, and we report results on both the PAN-CLEF 2018 test data and a new corpus of song lyrics in English and Portuguese. (AU)
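
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`char_ngrams`, `select_most_confident`), the model labels, and the probability values are all hypothetical, and the three classifiers are assumed to have already produced one probability per candidate author. The helper `char_ngrams` only illustrates what "variable-length" n-grams could mean, namely every character n-gram whose length falls within a range.

```python
from typing import Dict, List, Tuple


def char_ngrams(text: str, n_min: int, n_max: int) -> List[str]:
    """Return all character n-grams of length n_min..n_max (illustrative helper)."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            grams.append(text[i:i + n])
    return grams


def select_most_confident(
    model_probs: Dict[str, Dict[str, float]]
) -> Tuple[str, str, float]:
    """Pick the single most confident prediction across all models.

    model_probs maps a (hypothetical) model name to its per-author
    probabilities. Returns (model_name, author, probability). On a tie,
    the model listed first wins.
    """
    best = None
    for model_name, probs in model_probs.items():
        # Each model's own top prediction.
        author, p = max(probs.items(), key=lambda kv: kv[1])
        if best is None or p > best[2]:
            best = (model_name, author, p)
    return best


# Made-up probabilities for one unseen text and two candidate authors.
example = {
    "char_ngrams": {"author_a": 0.55, "author_b": 0.45},
    "distorted_char_ngrams": {"author_a": 0.30, "author_b": 0.70},
    "word_ngrams": {"author_a": 0.60, "author_b": 0.40},
}

result = select_most_confident(example)
print(result)
```

In this made-up example, two of the three models favour `author_a`, yet the output is `author_b` because the distorted character n-gram model is the most confident. This is what distinguishes selecting the highest-probability prediction from majority voting.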

FAPESP's process: 16/14223-0 - Computational Treatment of Human Personality for Natural Language Processing Applications
Grantee: Ivandre Paraboni
Support Opportunities: Regular Research Grants