
Combining complex networks and word embeddings in text classification tasks

Grant number: 20/06271-0
Support Opportunities: Regular Research Grants
Start date: March 01, 2022
End date: February 29, 2024
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Diego Raphael Amancio
Grantee: Diego Raphael Amancio
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil

Abstract

Complex networks have been used to model a myriad of complex systems. Although network models have already been employed in text classification tasks, most studies rely on the co-occurrence model, which has some drawbacks. We propose here an extension of the traditional co-occurrence model that uses information obtained from word embeddings. The proposed model includes additional virtual edges reflecting the similarity between words (vertices). The enriched networks are expected to provide an improved characterization of texts. As a consequence, we expect a gain in performance and robustness in the considered classification tasks. Because the proposed method is generic, the same approach could be applied to analyze any networked system by using information encoded in node embeddings. (AU)
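The idea of enriching a co-occurrence network with virtual edges can be illustrated with a minimal Python sketch. This is not the project's implementation: the function names, the window size, the cosine-similarity criterion, the 0.9 threshold, and the two-dimensional toy embeddings are all assumptions made for illustration. In practice the vectors would come from a trained word embedding model.

```python
import math
from itertools import combinations


def cooccurrence_edges(tokens, window=1):
    """Link each word to the words that follow it within `window` positions."""
    edges = set()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                edges.add(frozenset((word, other)))
    return edges


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def virtual_edges(words, embeddings, existing, threshold=0.9):
    """Add an edge between words that are not yet linked but whose
    embeddings are similar (assumed criterion: cosine >= threshold)."""
    extra = set()
    for a, b in combinations(sorted(words), 2):
        if a in embeddings and b in embeddings:
            edge = frozenset((a, b))
            if edge not in existing and cosine(embeddings[a], embeddings[b]) >= threshold:
                extra.add(edge)
    return extra


# Toy input: hypothetical 2-D vectors standing in for real word embeddings.
tokens = "the cat chased the dog".split()
toy_embeddings = {
    "cat": [0.9, 0.1],
    "dog": [0.8, 0.2],
    "chased": [0.1, 0.9],
    "the": [0.5, 0.5],
}

co_edges = cooccurrence_edges(tokens)
extra_edges = virtual_edges(set(tokens), toy_embeddings, co_edges)
enriched = co_edges | extra_edges  # co-occurrence edges plus virtual edges
```

In this toy example "cat" and "dog" never appear side by side, so the plain co-occurrence network does not connect them; the similarity step adds that link as a virtual edge. Features for a classifier would then be computed on the enriched network rather than on the original one.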


Scientific publications (5)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
BRITO, ANA C. M.; SILVA, FILIPI N.; AMANCIO, DIEGO R.. Analyzing the influence of prolific collaborations on authors productivity and visibility. SCIENTOMETRICS, v. 128, n. 4, p. 17-pg., . (20/14817-2, 20/06271-0)
DE ARRUDA, HENRIQUE FERRAZ; REIA, SANDRO MARTINELLI; SILVA, FILIPI NASCIMENTO; AMANCIO, DIEGO RAPHAEL; COSTA, LUCIANO DA FONTOURA. Finding contrasting patterns in rhythmic properties between prose and poetry. PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, v. 598, p. 13-pg., . (18/10489-0, 20/06271-0, 15/22308-2)
SOUZA, BARBARA C. E.; SILVA, FILIPI N.; DE ARRUDA, HENRIQUE F.; DA SILVA, GIOVANA D.; COSTA, LUCIANO DA F.; AMANCIO, DIEGO R.. Text characterization based on recurrence networks. INFORMATION SCIENCES, v. 641, p. 15-pg., . (19/07665-4, 21/01744-0, 20/06271-0, 18/10489-0, 15/22308-2)
BRITO, ANA CAROLINE M.; OLIVEIRA, MARIA CRISTINA F.; OLIVEIRA JR, OSVALDO N.; SILVA, FILIPI N.; AMANCIO, DIEGO R.. Network Analysis and Natural Language Processing to Obtain a Landscape of the Scientific Literature on Materials Applications. ACS APPLIED MATERIALS & INTERFACES, v. 15, n. 23, p. 10-pg., . (20/14817-2, 20/06271-0, 18/22214-6)