(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

Concentric network symmetry grasps authors' styles in word adjacency networks

Author(s):
Amancio, Diego R. [1] ; Silva, Filipi N. [2] ; Costa, Luciano da F. [2]
Total Authors: 3
Affiliation:
[1] Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP - Brazil
[2] Univ Sao Paulo, Sao Carlos Inst Phys, Sao Carlos, SP - Brazil
Total Affiliations: 2
Document type: Journal article
Source: EPL; v. 110, n. 6 JUN 2015.
Web of Science Citations: 8
Abstract

Several characteristics of written texts have been inferred from statistical analysis derived from networked models. Even though many network measurements have been adapted to study textual properties at several levels of complexity, some textual aspects have been disregarded. In this paper, we study the symmetry of word adjacency networks, a well-known representation of text as a graph. A statistical analysis of the symmetry distribution performed in several novels showed that most of the words do not display symmetric patterns of connectivity. More specifically, the merged symmetry displayed a distribution similar to the ubiquitous power-law distribution. Our experiments also revealed that the studied metrics do not correlate with other traditional network measurements, such as the degree or the betweenness centrality. The discriminability power of the symmetry measurements was verified in the authorship attribution task. Interestingly, we found that specific authors prefer particular types of symmetric motifs. As a consequence, the authorship of books could be accurately identified in 82.5% of the cases, in a dataset comprising books written by 8 authors. Because the proposed measurements for text analysis are complementary to the traditional approach, they can be used to improve the characterization of text networks, which might be useful for applications based on stylistic classification. Copyright (C) EPLA, 2015. (AU)

FAPESP's process: 14/20830-0 - Using complex networks to recognize patterns in written texts
Grantee: Diego Raphael Amancio
Support Opportunities: Regular Research Grants
FAPESP's process: 11/50761-2 - Models and methods of e-Science for life and agricultural sciences
Grantee: Roberto Marcondes Cesar Junior
Support Opportunities: Research Projects - Thematic Grants