Distributed text representation model with online learning

Grant number: 18/02146-6
Support Opportunities: Scholarships in Brazil - Post-Doctorate
Effective date (Start): August 01, 2018
Effective date (End): July 31, 2021
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computer Systems
Principal Investigator: Tiago Agostinho de Almeida
Grantee: Renato Moraes Silva
Host Institution: Centro de Ciências em Gestão e Tecnologia (CCGT). Universidade Federal de São Carlos (UFSCAR). Campus de Sorocaba. Sorocaba, SP, Brazil

Abstract

The amount of digital information stored as text has grown sharply over the last decade, driven by digital inclusion and the popularization of smartphones. As a result, demand for automatic systems that can extract knowledge from text has increased, and such systems have become increasingly essential. Their quality depends heavily on the computational model used to represent the text. The most traditional model, the "bag of words", does not capture context or semantic relations, is highly sparse, and cannot reflect, within an acceptable time, the constant changes in textual patterns generated by applications such as social networks and instant messaging systems. Even recent distributed text representation models have limitations in these scenarios: they need incremental learning, because new terms such as slang, symbols, and abbreviations arise very frequently. This research project therefore aims to propose and develop a distributed text representation that can be updated online. To this end, unsupervised clustering techniques and recurrent neural networks can be combined to associate new terms with groups of known terms, enabling the model to appropriately represent previously unseen terms. (AU)
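
The idea of associating a new term with a group of known terms can be illustrated with a minimal sketch. This is not the project's method: the abstract mentions unsupervised clustering and recurrent neural networks, whereas the sketch below swaps in plain character trigram similarity to pick the group, and every name, vector, and cluster in it (`known_vectors`, `clusters`, `represent_new_term`) is a hypothetical toy value chosen for illustration. The unseen term inherits the centroid of the closest group and is then added to the vocabulary, which stands in for an online update.

```python
from collections import Counter
import math

def char_ngrams(term, n=3):
    """Count character n-grams of a term padded with boundary markers."""
    padded = f"<{term}>"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical toy embeddings for terms already in the vocabulary.
known_vectors = {
    "thanks": [0.9, 0.1],
    "thank": [0.8, 0.2],
    "tomorrow": [0.1, 0.9],
    "tomorow": [0.2, 0.8],
}

# Hypothetical groups of known terms; in practice these would come
# from an unsupervised clustering step over the known vectors.
clusters = {0: ["thanks", "thank"], 1: ["tomorrow", "tomorow"]}

def centroid(terms):
    """Average the vectors of the given known terms."""
    vecs = [known_vectors[t] for t in terms]
    return [sum(component) / len(vecs) for component in zip(*vecs)]

def represent_new_term(term):
    """Assign an unseen term to the most similar group and reuse its centroid."""
    grams = char_ngrams(term)

    def group_score(members):
        # Similarity to a group = similarity to its closest member.
        return max(cosine(grams, char_ngrams(m)) for m in members)

    best = max(clusters, key=lambda cid: group_score(clusters[cid]))
    vector = centroid(clusters[best])
    # Online update: the new term joins the group and the vocabulary.
    clusters[best].append(term)
    known_vectors[term] = vector
    return best, vector

cluster_id, vector = represent_new_term("thanx")
```

With these toy values, the slang form "thanx" shares trigrams only with the first group, so it is assigned there and receives that group's average vector. A learned model would replace the surface-similarity step, since slang and abbreviations often look nothing like the terms they stand for.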


Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
BITTENCOURT, MARCIELE M.; SILVA, RENATO M.; ALMEIDA, TIAGO A. ML-MDLText: An efficient and lightweight multilabel text classifier with incremental learning. APPLIED SOFT COMPUTING, v. 96, . (18/02146-6, 17/09387-6)
SILVA, RENATO M.; SANTOS, RONEY L. S.; ALMEIDA, TIAGO A.; PARDO, THIAGO A. S. Towards automatically filtering fake news in Portuguese. EXPERT SYSTEMS WITH APPLICATIONS, v. 146, . (18/02146-6, 17/09387-6)
FREITAS, BRENO L.; SILVA, RENATO M.; ALMEIDA, TIAGO A. Gaussian Mixture Descriptors Learner. KNOWLEDGE-BASED SYSTEMS, v. 188, . (18/02146-6, 17/09387-6)
