(Reference retrieved automatically from Web of Science, using the FAPESP funding information and the corresponding grant number included in the publication by the authors.)

Using unsupervised information to improve semi-supervised tweet sentiment classification

Author(s):
Felipe da Silva, Nadia Felix [1] ; Coletta, Luiz F. S. [1] ; Hruschka, Eduardo R. [1] ; Hruschka, Jr., Estevam R. [2]
Total number of authors: 4
Author affiliation(s):
[1] Univ Sao Paulo, Dept Comp Sci, Ave Trabalhador Sao Carlense 400, BR-13560970 Sao Carlos, SP - Brazil
[2] Fed Univ UFSCAR, Dept Comp Sci, Rodovia Washington Luis, Km 235-SP-310, BR-13565905 Sao Carlos, SP - Brazil
Total number of affiliations: 2
Document type: Journal article
Source: INFORMATION SCIENCES; v. 355, p. 348-365, AUG 10 2016.
Web of Science citations: 11
Abstract

Supervised algorithms require a set of representative labeled data for building classification models. However, labeled data are usually difficult and expensive to obtain, which motivates the interest in semi-supervised learning. This type of learning uses both labeled and unlabeled data in the training process and is particularly useful in applications such as tweet sentiment analysis, where a large amount of unlabeled data is available. Semi-supervised learning for tweet sentiment analysis, although quite appealing, is relatively new. We propose a semi-supervised learning framework that combines unsupervised information, captured from a similarity matrix constructed from unlabeled data, with a classifier. Our motivation is that such a similarity matrix is a powerful knowledge-discovery tool that can help classify unlabeled tweet sets. Our framework makes use of the well-known Self-training algorithm to induce a better tweet sentiment classifier. Experimental results on real-world datasets demonstrate that the proposed framework can improve the accuracy of tweet sentiment analysis. (C) 2016 Elsevier Inc. All rights reserved. (AU)
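
The general idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the classifier, the way the similarity matrix is built, or how the two are combined, so everything below is an assumption made for illustration. The sketch uses a toy similarity-based classifier, a cosine similarity matrix over bag-of-words vectors of the unlabeled tweets, a simple blending step (the hypothetical `refine` function with weight `alpha`), and a plain self-training loop with a hypothetical confidence `threshold`.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a tweet as word counts (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def class_probs(vec, labeled):
    """Toy classifier: mean similarity to each class, normalized to sum to 1."""
    labels = sorted({y for _, y in labeled})
    scores = {}
    for label in labels:
        sims = [cosine(vec, x) for x, y in labeled if y == label]
        scores[label] = sum(sims) / len(sims)
    total = sum(scores.values())
    if total == 0:
        return {label: 1.0 / len(labels) for label in labels}
    return {label: s / total for label, s in scores.items()}


def refine(probs, sim, alpha=0.5):
    """Blend each tweet's probabilities with the similarity-weighted
    average of the other unlabeled tweets' probabilities."""
    refined = []
    for i, p in enumerate(probs):
        others = [(sim[i][j], probs[j]) for j in range(len(probs)) if j != i]
        wsum = sum(w for w, _ in others)
        out = {}
        for label in p:
            if wsum:
                neigh = sum(w * q[label] for w, q in others) / wsum
            else:
                neigh = p[label]  # no similar tweets: keep the classifier output
            out[label] = (1 - alpha) * p[label] + alpha * neigh
        refined.append(out)
    return refined


def self_train(labeled_texts, unlabeled_texts, threshold=0.6, max_rounds=5):
    """Self-training: repeatedly move confidently labeled tweets from the
    unlabeled pool into the labeled set. Returns {tweet text: label}."""
    labeled = [(bag_of_words(t), y) for t, y in labeled_texts]
    pool = [(t, bag_of_words(t)) for t in unlabeled_texts]
    assigned = {}
    for _ in range(max_rounds):
        if not pool:
            break
        # Unsupervised information: pairwise similarity of unlabeled tweets.
        sim = [[cosine(a, b) for _, b in pool] for _, a in pool]
        probs = refine([class_probs(v, labeled) for _, v in pool], sim)
        confident = {}
        for idx, p in enumerate(probs):
            label = max(sorted(p), key=p.get)
            if p[label] >= threshold:
                confident[idx] = label
        if not confident:
            break
        for idx, label in confident.items():
            text, vec = pool[idx]
            assigned[text] = label
            labeled.append((vec, label))
        pool = [item for i, item in enumerate(pool) if i not in confident]
    return assigned


if __name__ == "__main__":
    seed = [("love this great phone", "pos"), ("hate this awful service", "neg")]
    tweets = ["great phone love it", "awful service hate it", "awful hate"]
    print(self_train(seed, tweets))
```

In this sketch the similarity matrix only adjusts the classifier's probabilities before the confidence test; other ways of combining the two sources of information are equally plausible under the abstract's description.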

FAPESP grant: 10/20830-0 - Evolutionary algorithms for aggregating classifiers and clusterers
Grantee: Luiz Fernando Sommaggio Coletta
Support type: Scholarships in Brazil - Doctorate
FAPESP grant: 13/07375-0 - CeMEAI - Center for Mathematical Sciences Applied to Industry
Grantee: José Alberto Cuminato
Support type: Research Grants - Research, Innovation and Dissemination Centers - CEPIDs