(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

A multi-centrality index for graph-based keyword extraction

Author(s):
Vega-Oliveros, Didier A. [1, 2] ; Gomes, Pedro Spoljaric [3] ; Milios, Evangelos E. [4] ; Berton, Lilian [3]
Total Authors: 4
Affiliation:
[1] Indiana Univ, Sch Informat Comp & Engn, Bloomington, IN - USA
[2] Univ Sao Paulo, Dept Comp & Math, Ribeirao Preto, SP - Brazil
[3] Univ Fed Sao Paulo, Inst Sci & Technol, Sao Jose Dos Campos, SP - Brazil
[4] Dalhousie Univ, Fac Comp Sci, Halifax, NS - Canada
Total Affiliations: 4
Document type: Journal article
Source: INFORMATION PROCESSING & MANAGEMENT; v. 56, n. 6 NOV 2019.
Web of Science Citations: 0
Abstract

Keyword extraction aims to capture the main topics of a document and is an important step in natural language processing (NLP) applications. Different graph centrality measures have been proposed for automatic keyword extraction. However, there is no consensus yet on how these measures compare in this task. Here, we present the multi-centrality index (MCI) approach, which aims to find the optimal combination of word rankings according to the selection of centrality measures. We analyze nine centrality measures (Betweenness, Clustering Coefficient, Closeness, Degree, Eccentricity, Eigenvector, K-Core, PageRank, Structural Holes) for identifying keywords in co-occurrence word-graph representations of documents. We perform experiments on three datasets of documents and demonstrate that all individual centrality methods achieve statistically similar results, while the proposed MCI approach significantly outperforms the individual centralities, three clustering algorithms, and previously reported results in the literature.
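
The general idea of ranking words by several centralities on a co-occurrence graph can be illustrated with a minimal sketch. This is not the MCI method itself: the abstract does not say how the rankings are combined, so the mean-rank combination below is an assumption, as are the adjacent-word window, the choice of only two measures (Degree and Closeness), and all function names.

```python
from collections import defaultdict, deque


def cooccurrence_graph(tokens, window=2):
    """Undirected word graph: link words that occur within `window` tokens."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        graph[word]  # make sure isolated words still appear as nodes
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def degree(graph):
    """Degree centrality as the raw neighbour count."""
    return {word: len(neighbours) for word, neighbours in graph.items()}


def closeness(graph):
    """Closeness: reachable nodes divided by the sum of BFS distances."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def to_ranks(scores):
    """Rank 0 is best; a word's rank counts the words scoring strictly higher."""
    return {w: sum(1 for v in scores.values() if v > s) for w, s in scores.items()}


def combined_rank_keywords(tokens, k=3):
    """Assumed combination rule: average the per-measure ranks, keep the top k."""
    graph = cooccurrence_graph(tokens)
    rankings = [to_ranks(measure(graph)) for measure in (degree, closeness)]
    mean_rank = {w: sum(r[w] for r in rankings) / len(rankings) for w in graph}
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(mean_rank, key=lambda w: (mean_rank[w], w))[:k]


tokens = "graph centrality keyword graph extraction keyword graph ranking".split()
print(combined_rank_keywords(tokens, k=2))
```

A real pipeline would also remove stop words and normalize tokens before building the graph, and would likely draw on a graph library for the remaining measures listed in the abstract.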

FAPESP's process: 18/01722-3 - Semi-supervised learning via complex networks: network construction, selection and propagation of labels and applications
Grantee: Lilian Berton
Support Opportunities: Regular Research Grants
FAPESP's process: 18/24260-5 - Spatiotemporal Data Analytics based on Complex Networks
Grantee: Didier Augusto Vega Oliveros
Support Opportunities: Scholarships abroad - Research Internship - Post-doctor
FAPESP's process: 16/23698-1 - Dynamical Processes in Complex Network based on Machine Learning
Grantee: Didier Augusto Vega Oliveros
Support Opportunities: Scholarships in Brazil - Post-Doctoral
FAPESP's process: 15/50122-0 - Dynamic phenomena in complex networks: basics and applications
Grantee: Elbert Einstein Nehrer Macau
Support Opportunities: Research Projects - Thematic Grants