(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

ML-MDLText: An efficient and lightweight multilabel text classifier with incremental learning

Author(s):
Bittencourt, Marciele M. [1] ; Silva, Renato M. [1] ; Almeida, Tiago A. [1]
Total Authors: 3
Affiliation:
[1] Univ Fed Sao Carlos, Dept Comp Sci, UFSCar Sorocaba, Sao Paulo, SP - Brazil
Total Affiliations: 1
Document type: Journal article
Source: APPLIED SOFT COMPUTING; v. 96, NOV 2020.
Web of Science Citations: 0
Abstract

Single-label text classification has been extensively studied in the last decades, and usually, more attention has been given to offline learning scenarios, where all of the training data is available in advance. However, real-world text classification problems often involve multilabel instances and have dynamic textual patterns that can change frequently. In this context, the methods must predict a subset of target labels rather than a single one, and ideally should be able to update their model incrementally to be scalable and adaptable to changes in data patterns using limited time and memory. In this study, we present a text classification method based on the minimum description length principle that can be applied to multilabel classification without requiring the transformation of the classification problem. It also takes advantage of dependency information among labels and naturally supports online learning. We evaluated its performance using fifteen datasets from different application domains and compared it with traditional benchmark classifiers, considering three online learning scenarios. Even without requiring problem transformation tricks, the results obtained by the proposed method were very competitive with existing state-of-the-art online learning methods and those that transform multilabel problems into several single-label ones.

FAPESP's process: 18/02146-6 - Distributed text representation model with online learning
Grantee: Renato Moraes Silva
Support type: Scholarships in Brazil - Post-Doctorate
FAPESP's process: 17/09387-6 - A continuously evolving distributed text representation model
Grantee: Tiago Agostinho de Almeida
Support type: Regular Research Grants