(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

Labelled network subgraphs reveal stylistic subtleties in written texts

Author(s):
Marinho, Vanessa Queiroz [1] ; Hirst, Graeme [2] ; Amancio, Diego Raphael [1]
Total Authors: 3
Affiliation:
[1] Univ Sao Paulo, Inst Math & Comp Sci, Ave Trabalhador Sancarlense, 400 Ctr, BR-13566590 Sao Carlos, SP - Brazil
[2] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 3G4 - Canada
Total Affiliations: 2
Document type: Journal article
Source: JOURNAL OF COMPLEX NETWORKS; v. 6, n. 4, p. 620-638, AUG 2018.
Web of Science Citations: 0
Abstract

The vast amount of data and the increase in computational capacity have allowed the analysis of texts from several perspectives, including the representation of texts as complex networks. Nodes of the network represent the words, and edges represent some relationship, usually word co-occurrence. Even though networked representations have been applied to several tasks, such approaches are not usually combined with traditional models relying upon statistical paradigms. Because networked models are able to grasp textual patterns, we devised a hybrid classifier, called labelled subgraphs, that combines the frequency of common words with small structures found in the topology of the network. Our approach is illustrated in two contexts, authorship attribution and translationese identification. In the former, a set of novels written by different authors is analysed. To identify translationese, texts from the Canadian Hansard and the European Parliament were classified as original or translated instances. Our results suggest that labelled subgraphs are able to represent texts and that they should be further explored in other tasks, such as the analysis of text complexity, language proficiency and machine translation. (AU)
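The idea of combining word frequency with small network structures can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify which subgraphs are used or how edges are built, so the sketch assumes that edges link adjacent words and that the "small structure" is a three-node chain a -> w -> b centred on a chosen common word w. The function names `cooccurrence_edges` and `labelled_chain_features` and the `labels` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_edges(tokens):
    """Return the set of directed edges between adjacent, distinct words.

    Assumption: co-occurrence means immediate adjacency in the token list.
    """
    return {(a, b) for a, b in zip(tokens, tokens[1:]) if a != b}


def labelled_chain_features(tokens, labels):
    """For each word in `labels`, return (frequency, chain_count).

    frequency   : number of occurrences of the word in `tokens`
    chain_count : number of three-node chains a -> word -> b in the
                  co-occurrence network, with a and b distinct
    Assumption: the chain is only one illustrative subgraph type.
    """
    preds = defaultdict(set)  # word -> words that precede it
    succs = defaultdict(set)  # word -> words that follow it
    for a, b in cooccurrence_edges(tokens):
        succs[a].add(b)
        preds[b].add(a)

    freq = Counter(tokens)
    features = {}
    for w in labels:
        chains = sum(1 for a in preds[w] for b in succs[w] if a != b)
        features[w] = (freq[w], chains)
    return features
```

Calling `labelled_chain_features(text.split(), ["the", "of"])` on each text would give one small feature vector per document. Vectors of this kind could then be passed to any standard classifier, for instance to separate authors or to separate original from translated texts.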

FAPESP's process: 15/05676-8 - Development of new models for authorship recognition using complex networks
Grantee: Vanessa Queiroz Marinho
Support Opportunities: Scholarships in Brazil - Master
FAPESP's process: 14/20830-0 - Using complex networks to recognize patterns in written texts
Grantee: Diego Raphael Amancio
Support Opportunities: Regular Research Grants
FAPESP's process: 15/23803-7 - Authorship attribution with traditional methods and complex networks
Grantee: Vanessa Queiroz Marinho
Support Opportunities: Scholarships abroad - Research Internship - Master's degree
FAPESP's process: 16/19069-9 - Using semantical information to classify texts modelled as complex networks
Grantee: Diego Raphael Amancio
Support Opportunities: Regular Research Grants