Word sense disambiguation: A complex network approach

Correa, Jr., Edilson A.; Lopes, Alneu A.; Amancio, Diego R.

Author(s):	Correa, Jr., Edilson A. ^[1] ; Lopes, Alneu A. ^[1] ; Amancio, Diego R. ^{[1, 2]} Total number of authors: 3
Author affiliation(s):	^[1] Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP - Brazil ^[2] Indiana Univ, Sch Informat Comp & Engn, Bloomington, IN 47408 - USA Total number of affiliations: 2
Document type:	Scientific article
Source:	INFORMATION SCIENCES; v. 442, p. 103-113, MAY 2018.
Web of Science citations:	6
Abstract
The word sense disambiguation (WSD) task aims to identify the meaning, in a given context, of words that convey multiple meanings. This task plays a prominent role in a myriad of real-world applications, such as machine translation, word processing and information retrieval. Recently, concepts and methods of complex networks have been employed to tackle this task by representing words as nodes, which are connected if they are semantically similar. Despite the increasing number of studies carried out with such models, most of them use networks just to represent the data, while pattern recognition on the attribute space is performed using traditional learning techniques. In other words, the structural relationships between words have not been explicitly used in the pattern recognition process. In addition, only a few investigations have probed the suitability of representations based on bipartite networks and graphs (bigraphs) for the problem, as many approaches consider all possible links between words. In this context, we assess the relevance of a bipartite network model representing both feature words (i.e. the words characterizing the context) and target (ambiguous) words to solve ambiguities in written texts. Here, we focus on semantic relationships between these two types of words, disregarding relationships between feature words. The adopted method not only serves to represent texts as graphs, but also constructs a structure on which the discrimination of senses is accomplished. Our results revealed that the adopted learning algorithm in such bipartite networks provides excellent results mostly when local features are employed to characterize the context. Surprisingly, our method even outperformed the support vector machine algorithm in particular cases, with the advantage of being robust even if a small training dataset is available.
Taken together, the results obtained here show that the representation/classification used for the WSD problem might be useful to improve the semantic characterization of written texts without the use of deep linguistic information. (C) 2018 Elsevier Inc. All rights reserved. (AU)
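
The bipartite structure described in the abstract can be illustrated with a small sketch. This is not the learning algorithm used in the paper, which the abstract does not specify; it swaps in a plain neighbor vote purely to show how occurrences of an ambiguous word and their feature words form two disjoint node sets with no links between feature words. The function names, the toy contexts for the word "bank", and the sense labels are all hypothetical.

```python
from collections import Counter, defaultdict


def build_bipartite_graph(occurrences):
    """Link each occurrence of the target word to the feature words in its context.

    `occurrences` maps an occurrence id to its list of context (feature) words.
    Edges only run between the two node types, never between feature words.
    """
    edges = defaultdict(set)
    for occ_id, context in occurrences.items():
        for word in context:
            edges[word].add(occ_id)
    return edges  # feature word -> set of occurrence ids


def predict_sense(edges, labels, context):
    """Assumed stand-in classifier: vote over labeled occurrences that share
    feature words with `context`. Returns None when no feature word is known."""
    votes = Counter()
    for word in set(context):
        for occ_id in edges.get(word, ()):
            if occ_id in labels:
                votes[labels[occ_id]] += 1
    return votes.most_common(1)[0][0] if votes else None


# Toy training data for the ambiguous word "bank" (made up for illustration).
train = {
    "occ1": ["river", "water", "fishing"],
    "occ2": ["money", "loan", "account"],
    "occ3": ["river", "shore", "mud"],
}
labels = {"occ1": "riverside", "occ2": "finance", "occ3": "riverside"}

graph = build_bipartite_graph(train)
print(predict_sense(graph, labels, ["loan", "interest", "money"]))
```

In this sketch the sense decision is made directly on the graph structure rather than on a separate attribute vector, which is the general idea the abstract emphasizes; any real system would need weighted edges and a proper training procedure.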

FAPESP grant:	17/13464-6 - Modeling citation and information graphs: a complex network-based approach
Grantee:	Diego Raphael Amancio
Support type:	Scholarships Abroad - Research


FAPESP grant:	16/19069-9 - Document classification using semantic information in complex networks
Grantee:	Diego Raphael Amancio
Support type:	Research Grants - Regular


FAPESP grant:	15/14228-9 - Social network analysis and mining
Grantee:	Alneu de Andrade Lopes
Support type:	Research Grants - Regular


FAPESP grant:	14/20830-0 - Modeling and pattern recognition in texts with complex networks
Grantee:	Diego Raphael Amancio
Support type:	Research Grants - Regular


FAPESP grant:	11/22749-8 - Challenges in exploratory visualization of multidimensional data: new paradigms, scalability and applications
Grantee:	Luis Gustavo Nonato
Support type:	Research Grants - Thematic
