(Reference retrieved automatically from Web of Science, based on the FAPESP funding information and the corresponding grant number included in the publication by the authors.)

Concentric network symmetry grasps authors' styles in word adjacency networks

Amancio, Diego R. [1] ; Silva, Filipi N. [2] ; Costa, Luciano da F. [2]
Total number of authors: 3
Author affiliation(s):
[1] Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP - Brazil
[2] Univ Sao Paulo, Sao Carlos Inst Phys, Sao Carlos, SP - Brazil
Total number of affiliations: 2
Document type: Journal article
Source: EPL; v. 110, n. 6, JUN 2015.
Web of Science citations: 8

Several characteristics of written texts have been inferred from statistical analysis derived from networked models. Even though many network measurements have been adapted to study textual properties at several levels of complexity, some textual aspects have been disregarded. In this paper, we study the symmetry of word adjacency networks, a well-known representation of text as a graph. A statistical analysis of the symmetry distribution performed in several novels showed that most of the words do not display symmetric patterns of connectivity. More specifically, the merged symmetry displayed a distribution similar to the ubiquitous power-law distribution. Our experiments also revealed that the studied metrics do not correlate with other traditional network measurements, such as the degree or the betweenness centrality. The discriminability power of the symmetry measurements was verified in the authorship attribution task. Interestingly, we found that specific authors prefer particular types of symmetric motifs. As a consequence, the authorship of books could be accurately identified in 82.5% of the cases, in a dataset comprising books written by 8 authors. Because the proposed measurements for text analysis are complementary to the traditional approach, they can be used to improve the characterization of text networks, which might be useful for applications based on stylistic classification. Copyright (C) EPLA, 2015. (AU)
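
As a rough illustration of the representation the abstract refers to, the sketch below builds a word adjacency network by linking each pair of consecutive words and then reads off node degrees, one of the traditional measurements the abstract mentions. It rests on assumptions not stated in the abstract: edges are undirected, tokenization is a naive whitespace split with punctuation stripped, and no lemmatization or stopword removal is applied. The function names `word_adjacency_network` and `degree` are hypothetical, and the symmetry measurements themselves are not computed here.

```python
from collections import defaultdict


def word_adjacency_network(text):
    """Build an undirected graph linking each pair of consecutive words.

    Assumption: naive tokenization (whitespace split, punctuation stripped,
    lowercased); the paper's actual preprocessing may differ.
    """
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]  # drop tokens that were only punctuation
    graph = defaultdict(set)
    for a, b in zip(words, words[1:]):
        if a != b:  # skip self-loops from immediate repetitions
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree(graph):
    """Number of distinct neighbours of each word."""
    return {word: len(neighbours) for word, neighbours in graph.items()}


sample = "the cat saw the dog and the dog saw the cat"
network = word_adjacency_network(sample)
degrees = degree(network)
print(sorted(degrees.items()))
```

A graph built this way could then be passed to whichever network measurement is of interest; computing the symmetry measurements studied in the paper would require the definitions given in the full text.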

FAPESP grant: 14/20830-0 - Modeling and recognition of patterns in texts with complex networks
Grantee: Diego Raphael Amancio
Funding line: Research Grant - Regular
FAPESP grant: 11/50761-2 - Models and methods of e-Science for life and agricultural sciences
Grantee: Roberto Marcondes Cesar Junior
Funding line: Research Grant - Thematic