Development of new models for authorship recognition using complex networks

Grant number: 15/05676-8
Support Opportunities: Scholarships in Brazil - Master
Start date: July 01, 2015
End date: July 31, 2017
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Agreement: Coordination of Improvement of Higher Education Personnel (CAPES)
Principal Investigator: Diego Raphael Amancio
Grantee: Vanessa Queiroz Marinho
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil
Associated scholarship(s): 15/23803-7 - Authorship attribution with traditional methods and complex networks, BE.EP.MS

Abstract

Graph and complex network modeling has been applied successfully in many fields and is studied in areas such as mathematics and computer science. The discovery that methods derived from the study of complex networks can be used to analyze texts at different levels of complexity led to significant advances in natural language processing tasks. Applications analyzed with complex network methods and tools include the detection of relevant concepts, the development of automatic summarizers, and authorship recognition systems. The last of these tasks, which is the focus of this research project, has been studied with some success by representing texts as word adjacency networks, which connect only the closest words. The purpose of this project is to extend this traditional model by choosing the connection window that is optimal for the problem, given a training set. In addition, we intend to use the connectivity information of function words to complement the characterization of authors' styles. Finally, we intend to create hybrid classifiers that combine traditional factors with properties provided by the topological analysis of complex networks. By adapting, combining and improving the model, we aim not only to improve the performance of textual stylistic characterization and authorship recognition systems, but also to better understand which quantitative textual factors (measured through networks) can be used in stylometry. The advances obtained during this project may also be useful for studying related applications, such as the analysis of stylistic inconsistencies and plagiarism. (AU)
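
The word adjacency model and the connection window mentioned in the abstract can be illustrated with a minimal sketch. This is not the project's implementation: the function names, the toy sentence, and the short function-word list are all assumptions chosen for illustration, and the degree is used only as one example of a connectivity measure.

```python
from collections import defaultdict


def build_cooccurrence_network(tokens, window=1):
    """Link each token to the tokens that follow it within `window` positions.

    With window=1 only immediately adjacent words are linked, which
    corresponds to a word adjacency network. Edge weights count how
    often each directed pair occurs. (Hypothetical helper.)
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    edges = defaultdict(int)
    for i, source in enumerate(tokens):
        for target in tokens[i + 1 : i + 1 + window]:
            if source != target:  # skip self-loops
                edges[(source, target)] += 1
    return dict(edges)


def degree(edges):
    """Count the distinct neighbours of each word, ignoring edge direction."""
    neighbours = defaultdict(set)
    for source, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)
    return {word: len(linked) for word, linked in neighbours.items()}


# Toy input; a real study would use tokenized texts of known authorship.
tokens = "the cat sat on the mat and the dog sat on the rug".split()

adjacency = build_cooccurrence_network(tokens, window=1)
wider = build_cooccurrence_network(tokens, window=2)

# Assumed, illustrative list of function words.
function_words = ["the", "on", "and"]
degrees = degree(adjacency)
function_word_degrees = {word: degrees[word] for word in function_words}
```

Increasing `window` adds links between words that are near each other but not adjacent, so the window size could be treated as a parameter selected on a training set. Values such as `function_word_degrees` could then be placed alongside traditional stylometric features as input to a classifier.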

Scientific publications (7)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
DE ARRUDA, HENRIQUE F.; MARINHO, VANESSA Q.; LIMA, THALES S.; AMANCIO, DIEGO R.; COSTA, LUCIANO DA F.. An image analysis approach to text analytics based on complex networks. PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, v. 510, p. 110-120, . (16/19069-9, 11/50761-2, 15/22308-2, 15/05676-8)
CORREA, EDILSON A., JR.; MARINHO, VANESSA Q.; DOS SANTOS, LEANDRO B.; BERTAGLIA, THALES F. C.; TREVISO, MARCOS V.; BRUM, HENRICO B.; IEEE. PELESent: Cross-domain polarity classification using distant supervision. 2017 6TH BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), v. N/A, p. 6-pg., . (15/05676-8)
CORREA JR, EDILSON A.; MARINHO, VANESSA Q.; AMANCIO, DIEGO R.. Semantic flow in language networks discriminates texts by genre and publication date. PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, v. 557, . (15/05676-8, 16/19069-9)
MARINHO, VANESSA QUEIROZ; HIRST, GRAEME; AMANCIO, DIEGO RAPHAEL. Labelled network subgraphs reveal stylistic subtleties in written texts. JOURNAL OF COMPLEX NETWORKS, v. 6, n. 4, p. 620-638, . (15/05676-8, 14/20830-0, 15/23803-7, 16/19069-9)
DE ARRUDA, HENRIQUE F.; MARINHO, VANESSA Q.; COSTA, LUCIANO DA F.; AMANCIO, DIEGO R.. Paragraph-based representation of texts: A complex networks approach. INFORMATION PROCESSING & MANAGEMENT, v. 56, n. 3, p. 479-494, . (17/13464-6, 15/22308-2, 16/19069-9, 11/50761-2, 15/05676-8)
DE ARRUDA, HENRIQUE FERRAZ; SILVA, FILIPI NASCIMENTO; MARINHO, VANESSA QUEIROZ; AMANCIO, DIEGO RAPHAEL; COSTA, LUCIANO DA FONTOURA. Representation of texts as complex networks: a mesoscopic approach. JOURNAL OF COMPLEX NETWORKS, v. 6, n. 1, p. 125-144, . (16/19069-9, 11/50761-2, 15/05676-8, 14/20830-0, 15/08003-4)
MARINHO, VANESSA QUEIROZ; HIRST, GRAEME; AMANCIO, DIEGO RAPHAEL; IEEE. Authorship attribution via network motifs identification. PROCEEDINGS OF 2016 5TH BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS 2016), v. N/A, p. 6-pg., . (15/05676-8, 14/20830-0, 15/23803-7)
Academic Publications
(References retrieved automatically from State of São Paulo Research Institutions)
MARINHO, Vanessa Queiroz. Development of new models for authorship recognition using complex networks. 2017. Master's Dissertation - Universidade de São Paulo (USP). Instituto de Ciências Matemáticas e de Computação (ICMC/SB) São Carlos.