(Reference retrieved automatically from Web of Science, based on the FAPESP funding information and the corresponding grant number included in the publication by the authors.)

ML-MDLText: An efficient and lightweight multilabel text classifier with incremental learning

Author(s):
Bittencourt, Marciele M. [1] ; Silva, Renato M. [1] ; Almeida, Tiago A. [1]
Total number of authors: 3
Author affiliation(s):
[1] Univ Fed Sao Carlos, Dept Comp Sci, UFSCar Sorocaba, Sao Paulo, SP - Brazil
Total number of affiliations: 1
Document type: Scientific article
Source: APPLIED SOFT COMPUTING; v. 96, NOV 2020.
Web of Science citations: 0
Abstract

Single-label text classification has been extensively studied in the last decades, and usually, more attention has been given to offline learning scenarios, where all of the training data is available in advance. However, real-world text classification problems often involve multilabel instances and have dynamic textual patterns that can change frequently. In this context, the methods must predict a subset of target labels rather than a single one, and ideally should be able to update their model incrementally to be scalable and adaptable to changes in data patterns using limited time and memory. In this study, we present a text classification method based on the minimum description length principle that can be applied to multilabel classification without requiring the transformation of the classification problem. It also takes advantage of dependency information among labels and naturally supports online learning. We evaluated its performance using fifteen datasets from different application domains and compared it with traditional benchmark classifiers, considering three online learning scenarios. Even without requiring problem transformation tricks, the results obtained by the proposed method were very competitive with existing state-of-the-art online learning methods and those that transform multilabel problems into several single-label ones. (AU)
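
To make the ideas in the abstract concrete, the following is a minimal sketch of a multilabel text classifier that learns one document at a time and scores labels by description length. It is an illustrative toy, not the ML-MDLText algorithm: the class name `ToyMDLMultilabel`, the smoothing constant `alpha`, and the decision rule (assign a label when the document costs fewer bits under that label's token model than under the model of everything else) are all assumptions made for this example, and it ignores the label-dependency information that the paper exploits.

```python
import math
from collections import defaultdict


class ToyMDLMultilabel:
    """Hypothetical toy classifier; NOT the method from the paper."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                     # additive smoothing (assumed)
        self.vocab = set()                     # all tokens seen so far
        self.all_counts = defaultdict(int)     # token counts over all documents
        self.all_total = 0
        self.pos = defaultdict(lambda: defaultdict(int))  # per-label token counts
        self.pos_total = defaultdict(int)
        self.labels = set()

    def partial_fit(self, tokens, labels):
        """Update the counts with a single document (incremental learning)."""
        for tok in tokens:
            self.vocab.add(tok)
            self.all_counts[tok] += 1
            self.all_total += 1
            for lab in labels:
                self.pos[lab][tok] += 1
                self.pos_total[lab] += 1
        self.labels.update(labels)

    def _bits(self, tokens, counts, total):
        """Code length in bits of `tokens` under a smoothed unigram model."""
        v = len(self.vocab) + 1  # one extra slot for unseen tokens
        return sum(
            -math.log2((counts.get(tok, 0) + self.alpha) / (total + self.alpha * v))
            for tok in tokens
        )

    def predict(self, tokens):
        """Return the set of labels that describe the document more compactly."""
        chosen = set()
        for lab in self.labels:
            pos = self.pos[lab]
            # Counts for documents WITHOUT the label are derived on the fly,
            # so labels that first appear late need no retraining.
            neg = {t: self.all_counts.get(t, 0) - pos.get(t, 0) for t in set(tokens)}
            with_bits = self._bits(tokens, pos, self.pos_total[lab])
            without_bits = self._bits(tokens, neg, self.all_total - self.pos_total[lab])
            if with_bits < without_bits:
                chosen.add(lab)
        return chosen


clf = ToyMDLMultilabel()
clf.partial_fit(["goal", "match", "team"], {"sports"})
clf.partial_fit(["election", "vote", "party"], {"politics"})
clf.partial_fit(["team", "vote"], {"sports", "politics"})
print(clf.predict(["team", "vote"]))
```

Because the model stores only token counts, each update costs time proportional to the document length times its number of labels, and no earlier document has to be kept in memory. The prediction is a subset of labels decided directly, without first splitting the task into separate single-label problems.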

FAPESP grant: 18/02146-6 - Distributed text representation with incremental updating
Grantee: Renato Moraes Silva
Funding line: Scholarships in Brazil - Post-Doctoral
FAPESP grant: 17/09387-6 - Distributed text representation model capable of evolving continuously
Grantee: Tiago Agostinho de Almeida
Funding line: Research Grants - Regular