(Reference retrieved automatically from Web of Science, based on the FAPESP funding information and the corresponding grant number included in the publication by the authors.)

Semantic flow in language networks discriminates texts by genre and publication date

Author(s):
Correa Jr, Edilson A. ; Marinho, Vanessa Q. [1] ; Amancio, Diego R. [1]
Total number of authors: 3
Author affiliation(s):
[1] Correa Jr, Edilson A., Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP - Brazil
Total number of affiliations: 1
Document type: Scientific article
Source: PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS; v. 557, NOV 1 2020.
Web of Science citations: 0
Abstract

We propose a framework to characterize documents based on their semantic flow. The proposed framework encompasses a network-based model that connects sentences based on their semantic similarity. Semantic fields are detected using standard community detection methods. As the story unfolds, transitions between semantic fields are represented in Markov networks, which in turn are characterized via network motifs (subgraphs). Here we show that different book characteristics (such as genre and publication date) are discriminated by the adopted semantic flow representation. Remarkably, even without a systematic optimization of parameters, philosophy and investigative books were discriminated with an accuracy rate of 92.5%. While the objective of this study is not to create a text classification method, we believe that semantic flow features could be used in traditional network-based models of texts that capture only syntactical/stylistic information to improve the characterization of texts. (C) 2020 Elsevier B.V. All rights reserved. (AU)
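
The pipeline outlined in the abstract (sentence network, semantic fields, Markov transitions) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bag-of-words cosine similarity, the 0.3 threshold, the use of connected components as a crude stand-in for community detection, and all function names are assumptions made here for illustration. The final motif-counting step is omitted.

```python
import math
import re
from collections import Counter, defaultdict


def bag_of_words(sentence):
    """Lowercased word counts; an assumed stand-in for a semantic representation."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_fields(sentences, threshold=0.3):
    """Link sentences whose similarity reaches the (assumed) threshold, then
    label each sentence with its connected component. A real study would use
    a proper community detection method instead of connected components."""
    vectors = [bag_of_words(s) for s in sentences]
    n = len(sentences)
    neighbors = defaultdict(set)
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(vectors[i], vectors[j]) >= threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)
    labels = [None] * n
    next_label = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            for other in neighbors[node]:
                if labels[other] is None:
                    labels[other] = next_label
                    stack.append(other)
        next_label += 1
    return labels


def transition_probabilities(labels):
    """Row-normalized counts of moves between the fields of consecutive sentences."""
    counts = defaultdict(Counter)
    for src, dst in zip(labels, labels[1:]):
        counts[src][dst] += 1
    return {
        src: {dst: c / sum(row.values()) for dst, c in row.items()}
        for src, row in counts.items()
    }


# Toy text (invented for this sketch): two topics that alternate as the story unfolds.
sentences = [
    "The detective examined the crime scene.",
    "The detective questioned the suspect about the crime.",
    "Philosophy asks what knowledge is.",
    "Knowledge and belief are central to philosophy.",
    "The suspect confessed the crime to the detective.",
]
labels = semantic_fields(sentences)
flow = transition_probabilities(labels)
```

Here `labels` assigns each sentence to a semantic field in reading order, and `flow` holds the transition probabilities between fields, that is, the weighted edges of a small Markov network. Counting subgraph patterns (motifs) in that network would then provide the features the abstract refers to.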

FAPESP grant: 15/05676-8 - Development of new models for authorship recognition using complex networks
Grantee: Vanessa Queiroz Marinho
Support type: Scholarships in Brazil - Master's
FAPESP grant: 16/19069-9 - Document classification using semantic information in complex networks
Grantee: Diego Raphael Amancio
Support type: Research Grants - Regular