(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

Using unsupervised information to improve semi-supervised tweet sentiment classification

Author(s):
Felipe da Silva, Nadia Felix [1] ; Coletta, Luiz F. S. [1] ; Hruschka, Eduardo R. [1] ; Hruschka, Jr., Estevam R. [2]
Total Authors: 4
Affiliation:
[1] Univ Sao Paulo, Dept Comp Sci, Ave Trabalhador Sao Carlense 400, BR-13560970 Sao Carlos, SP - Brazil
[2] Fed Univ UFSCAR, Dept Comp Sci, Rodovia Washington Luis, Km 235-SP-310, BR-13565905 Sao Carlos, SP - Brazil
Total Affiliations: 2
Document type: Journal article
Source: INFORMATION SCIENCES; v. 355, p. 348-365, AUG 10 2016.
Web of Science Citations: 13
Abstract

Supervised algorithms require a set of representative labeled data for building classification models. However, labeled data are usually difficult and expensive to obtain, which motivates the interest in semi-supervised learning. This type of learning uses both labeled and unlabeled data in the training process and is particularly useful in applications such as tweet sentiment analysis, where a large amount of unlabeled data is available. Semi-supervised learning for tweet sentiment analysis, although quite appealing, is relatively new. We propose a semi-supervised learning framework that combines unsupervised information, captured from a similarity matrix constructed from unlabeled data, with a classifier. Our motivation is that such a similarity matrix is a powerful knowledge-discovery tool that can help classify unlabeled tweet sets. Our framework makes use of the well-known Self-training algorithm to induce a better tweet sentiment classifier. Experimental results on real-world datasets demonstrate that the proposed framework can improve the accuracy of tweet sentiment analysis. (C) 2016 Elsevier Inc. All rights reserved. (AU)
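
The general idea in the abstract (a classifier whose predictions on unlabeled tweets are adjusted with a similarity matrix, inside a self-training loop) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the bag-of-words features, the similarity-based classifier, the blending rule in `refine`, and the parameters `alpha` and `threshold` are all assumptions chosen for illustration, using only the Python standard library.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector: token counts of the lowercased text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def classifier_scores(vec, labeled, classes):
    """Toy classifier (assumption): mean similarity to each class, normalized."""
    raw = {}
    for c in classes:
        sims = [cosine(vec, v) for v, y in labeled if y == c]
        raw[c] = sum(sims) / len(sims) if sims else 0.0
    total = sum(raw.values())
    if total == 0:
        return {c: 1.0 / len(classes) for c in classes}
    return {c: raw[c] / total for c in classes}

def refine(scores, sim, alpha):
    """Blend each tweet's scores with the similarity-weighted scores of the
    other unlabeled tweets. Hypothetical rule, not taken from the paper."""
    refined = []
    for i, p in enumerate(scores):
        neighbors = [(sim[i][j], scores[j]) for j in range(len(scores)) if j != i]
        total = sum(w for w, _ in neighbors)
        out = {}
        for c in p:
            avg = sum(w * q[c] for w, q in neighbors) / total if total else p[c]
            out[c] = (1 - alpha) * p[c] + alpha * avg
        refined.append(out)
    return refined

def self_train(labeled_texts, unlabeled_texts, alpha=0.3, threshold=0.6, max_rounds=10):
    """Self-training: repeatedly move confidently classified tweets into the
    labeled set. Returns (assigned labels, texts left unlabeled)."""
    labeled = [(bow(t), y) for t, y in labeled_texts]
    classes = sorted({y for _, y in labeled})
    pool = [(t, bow(t)) for t in unlabeled_texts]
    assigned = {}
    for _ in range(max_rounds):
        if not pool:
            break
        # Similarity matrix built from the unlabeled tweets only.
        sim = [[cosine(a, b) for _, b in pool] for _, a in pool]
        scores = [classifier_scores(v, labeled, classes) for _, v in pool]
        refined = refine(scores, sim, alpha)
        remaining, added = [], False
        for (text, vec), p in zip(pool, refined):
            label = max(p, key=p.get)
            if p[label] >= threshold:
                labeled.append((vec, label))
                assigned[text] = label
                added = True
            else:
                remaining.append((text, vec))
        pool = remaining
        if not added:  # nothing confident enough; stop early
            break
    return assigned, [t for t, _ in pool]

if __name__ == "__main__":
    # Toy data, invented for illustration only.
    seed = [("i love this phone", "pos"), ("i hate this traffic", "neg")]
    tweets = ["love this great phone", "hate this awful traffic"]
    print(self_train(seed, tweets))
```

Setting `alpha=0` in this sketch reduces it to plain self-training, which makes it easy to compare runs with and without the similarity information.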

FAPESP's process: 13/07375-0 - CeMEAI - Center for Mathematical Sciences Applied to Industry
Grantee: Francisco Louzada Neto
Support Opportunities: Research Grants - Research, Innovation and Dissemination Centers - RIDC
FAPESP's process: 10/20830-0 - Evolutionary Algorithms for Aggregating Ensembles of Classifiers and Clusterers
Grantee: Luiz Fernando Sommaggio Coletta
Support Opportunities: Scholarships in Brazil - Doctorate