Semantically enriched representations for Portuguese text mining: models and applications

Abstract

Text Mining techniques have become essential for supporting text analysis and knowledge discovery as the volume and variety of digital text documents have increased, whether on social networks and the Web or inside organizations. Regardless of the application task or the technique applied, the treatment of text semantics is an important challenge in the Text Mining process. The challenge is even greater for Portuguese texts because of the particularities of the language and the scarcity of Portuguese-language resources and research. In this context, this project aims to advance Text Mining research, focusing on the Portuguese language, and to disseminate knowledge of the field by applying Text Mining techniques to different real-world problems. We will investigate and propose semantically enriched text representation models, considering both the vector-space model and network-based representations, as well as their application in one-class learning. As a first step to support this research, we will collect, prepare and characterize collections of texts written in Portuguese, and make consolidated information about labeled collections available to the research community. Lastly, we will evaluate and apply semantically enriched text representations in different Text Mining problems, such as sentiment analysis, recommendation systems, fake news detection, literature-based discovery and event mining. (AU)

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
ARAUJO, ADAILTON F.; GOLO, MARCOS P. S.; MARCACINI, RICARDO M. Opinion mining for app reviews: an analysis of textual representation and predictive models. AUTOMATED SOFTWARE ENGINEERING, v. 29, n. 1. (19/25010-5, 19/07665-4)
SANTOS, BRUCCE NEVES DOS; MARCACINI, RICARDO MARCONDES; REZENDE, SOLANGE OLIVEIRA. Multi-Domain Aspect Extraction Using Bidirectional Encoder Representations From Transformers. IEEE ACCESS, v. 9, p. 91604-91613. (19/25010-5, 19/07665-4)
DE SOUZA, MARIANA CARAVANTI; NOGUEIRA, BRUNO MAGALHAES; ROSSI, RAFAEL GERALDELI; MARCACINI, RICARDO MARCONDES; DOS SANTOS, BRUCCE NEVES; REZENDE, SOLANGE OLIVEIRA. A network-based positive and unlabeled learning approach for fake news detection. MACHINE LEARNING. (19/25010-5)
