| Author(s): | Vega-Oliveros, Didier A. [1, 2]; Gomes, Pedro Spoljaric [3]; Milios, Evangelos E. [4]; Berton, Lilian [3]
Total Authors: 4
|
| Affiliation: | [1] Indiana Univ, Sch Informat Comp & Engn, Bloomington, IN - USA
[2] Univ Sao Paulo, Dept Comp & Math, Ribeirao Preto, SP - Brazil
[3] Univ Fed Sao Paulo, Inst Sci & Technol, Sao Jose Dos Campos, SP - Brazil
[4] Dalhousie Univ, Fac Comp Sci, Halifax, NS - Canada
Total Affiliations: 4
|
| Document type: | Journal article |
| Source: | INFORMATION PROCESSING & MANAGEMENT; v. 56, n. 6 NOV 2019. |
| Web of Science Citations: | 0 |
| Abstract | |
Keyword extraction aims to capture the main topics of a document and is an important step in natural language processing (NLP) applications. Various graph centrality measures have been proposed for automatic keyword extraction. However, there is no consensus yet on how these measures compare in this task. Here, we present the multi-centrality index (MCI) approach, which aims to find the optimal combination of word rankings according to the selection of centrality measures. We analyze nine centrality measures (Betweenness, Clustering Coefficient, Closeness, Degree, Eccentricity, Eigenvector, K-Core, PageRank, Structural Holes) for identifying keywords in word co-occurrence graph representations of documents. We perform experiments on three document datasets and demonstrate that all individual centrality methods achieve statistically similar results, while the proposed MCI approach significantly outperforms the individual centralities, three clustering algorithms, and previously reported results in the literature. (AU) | |
| FAPESP's process: | 18/01722-3 - Semi-supervised learning via complex networks: network construction, selection and propagation of labels and applications |
| Grantee: | Lilian Berton |
| Support Opportunities: | Regular Research Grants |
| FAPESP's process: | 18/24260-5 - Spatiotemporal Data Analytics based on Complex Networks |
| Grantee: | Didier Augusto Vega Oliveros |
| Support Opportunities: | Scholarships abroad - Research Internship - Post-doctor |
| FAPESP's process: | 16/23698-1 - Dynamical Processes in Complex Network based on Machine Learning |
| Grantee: | Didier Augusto Vega Oliveros |
| Support Opportunities: | Scholarships in Brazil - Post-Doctoral |
| FAPESP's process: | 15/50122-0 - Dynamic phenomena in complex networks: basics and applications |
| Grantee: | Elbert Einstein Nehrer Macau |
| Support Opportunities: | Research Projects - Thematic Grants |
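
The abstract above describes ranking words by graph centrality on a word co-occurrence graph and then combining several rankings. The sketch below illustrates that general idea only. It is not the MCI method from the article: the record does not say how MCI combines rankings, so the mean-rank combination is an assumption made for illustration. The use of only two of the nine measures (Degree and PageRank), the window size, and all function names are likewise hypothetical choices.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Undirected graph linking words that appear within `window` tokens of each other."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        graph[word]  # touch the key so every word becomes a node
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank; assumes every node has at least one neighbour."""
    n = len(graph)
    score = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        updated = {}
        for node in graph:
            incoming = sum(score[nb] / len(graph[nb]) for nb in graph[node])
            updated[node] = (1 - damping) / n + damping * incoming
        score = updated
    return score


def ranks(scores):
    """Map each word to its rank (0 = most central); ties are broken alphabetically."""
    ordered = sorted(scores, key=lambda w: (-scores[w], w))
    return {w: r for r, w in enumerate(ordered)}


def combined_keywords(tokens, top_k=3):
    """Assumed combination rule: average the per-measure ranks, lowest mean rank first."""
    graph = cooccurrence_graph(tokens)
    degree = {w: len(nbrs) for w, nbrs in graph.items()}
    rankings = [ranks(degree), ranks(pagerank(graph))]
    mean_rank = {w: sum(r[w] for r in rankings) / len(rankings) for w in graph}
    return sorted(mean_rank, key=lambda w: (mean_rank[w], w))[:top_k]


# Hypothetical toy document: "hub" co-occurs with every other word.
print(combined_keywords("hub a hub b hub c hub d".split(), top_k=1))
```

Averaging ranks rather than raw scores sidesteps the different scales of the individual centralities. Whether that matches the article's notion of an "optimal combination" cannot be determined from this record.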