Sequential Short-Text Classification from Multiple Textual Representations with Weak Supervision

Author(s):
Reis Filho, Ivan J. ; Martins, Luiz H. D. ; Parmezan, Antonio R. S. ; Marcacini, Ricardo M. ; Rezende, Solange O. ; Xavier-Junior, JC ; Rios, RA
Total number of authors: 7
Document type: Scientific article
Source: INTELLIGENT SYSTEMS, PT I; v. 13653, p. 15-pg., 2022-01-01.
Abstract

The amount of news generated on the internet has increased significantly in recent years. As a trend, text data has gained attention from industry, government, academia, and the financial market. This information is potentially valuable to assist domain experts in decision making. Therefore, related applications based on machine learning have been widely available in several areas of knowledge. However, for supervised learning tasks, the availability of annotated texts in quantity and quality is a recurring problem. This work proposes a time-series-driven approach to labeling chronologically arranged documents. Our proposal categorizes short texts for a particular domain according to the level and trend patterns of a given time series. We use the obtained weak labels with the understanding that they are imperfect but still useful for building predictive text models. Documents and agribusiness commodity price series were employed to assess performance in four classification scenarios. The experimental evaluation considered nine textual representations and different learning paradigms. Neural language-based models demonstrated better classification performance than traditional ones. The results indicate that the proposed approach can be an alternative for automatically labeling a large news volume. (AU)
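
The abstract does not spell out the labeling rule, so the following is only a minimal sketch of the general idea of deriving weak labels for dated short texts from the trend of a price series. The function names, the (date, price) input format, the `horizon` and `tol` parameters, and the three label names are all assumptions made for illustration, not the authors' method.

```python
from bisect import bisect_right
from datetime import date


def trend_label(prices, day, horizon=1, tol=0.005):
    """Return 'up', 'down', or 'stable' for the price move that follows `day`.

    prices: list of (date, price) pairs sorted by date (assumed input format).
    horizon: number of observations to look ahead (assumed parameter).
    tol: relative change below which the move counts as 'stable' (assumed).
    """
    dates = [d for d, _ in prices]
    i = bisect_right(dates, day) - 1   # last observation on or before `day`
    j = i + horizon                    # observation `horizon` steps later
    if i < 0 or j >= len(prices):
        return None                    # no price data to derive a label from
    change = (prices[j][1] - prices[i][1]) / prices[i][1]
    if change > tol:
        return "up"
    if change < -tol:
        return "down"
    return "stable"


def weak_label_documents(docs, prices, horizon=1, tol=0.005):
    """Attach a weak trend label to each (date, text) pair.

    Documents whose date cannot be matched to a price move are skipped.
    """
    labeled = []
    for day, text in sorted(docs):
        label = trend_label(prices, day, horizon, tol)
        if label is not None:
            labeled.append((day, text, label))
    return labeled
```

Labels produced this way are noisy by construction, since a headline may have nothing to do with the subsequent price move; they would serve as imperfect training targets for a text classifier rather than as ground truth.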

FAPESP grant: 19/07665-4 - Center for Artificial Intelligence
Grantee: Fabio Gagliardi Cozman
Support type: Research Grants - eScience and Data Science Program - Engineering Research Centers
FAPESP grant: 19/25010-5 - Semantically enriched representations for Portuguese text mining: models and applications
Grantee: Solange Oliveira Rezende
Support type: Research Grants - Regular