(Reference obtained automatically from the Web of Science, based on the FAPESP funding information and the corresponding grant number included in the publication by the authors.)

Deep analysis of word sense disambiguation via semi-supervised learning and neural word representations

Author(s):
Duarte, Jose Marcio [1] ; Sousa, Samuel [1] ; Milios, Evangelos [2] ; Berton, Lilian [1]
Total number of authors: 4
Author affiliation(s):
[1] Univ Fed Sao Paulo, Inst Sci & Technol, BR-12247014 Sao Jose Dos Campos, SP - Brazil
[2] Dalhousie Univ, Fac Comp Sci, Halifax, NS B3H 1W5 - Canada
Total number of affiliations: 2
Document type: Journal article
Source: INFORMATION SCIENCES; v. 570, p. 278-297, SEP 2021.
Web of Science citations: 0
Abstract

Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context. Different approaches have been proposed in both supervised and unsupervised settings. In most cases, supervised learning provides superior WSD performance. However, sense-annotated corpora can be difficult and time-consuming to obtain, and the annotation effort must be repeated for new domains, languages, and sense inventories. As a result, semi-supervised learning (SSL) methods, which combine a small amount of sense-annotated data with unlabeled data, are becoming prominent. In SSL, graph-based methods are common because they capture the relationships between terms using an undirected graph. This paper investigates semi-supervised WSD by considering different graph-based SSL algorithms with features generated by word embeddings from Word2Vec, FastText, GloVe, BERT and ELECTRA models, combined with part-of-speech tags and word context. We test several combinations of word-embedding models, similarity measures for graph construction, and SSL classification algorithms to disambiguate classical lexical sample WSD datasets. The results indicate that our SSL algorithms achieved competitive results compared to supervised ones, and that the ELECTRA models performed better than other embeddings for SSL. (c) 2021 Elsevier Inc. All rights reserved. (AU)
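
The pipeline described in the abstract (embedding vectors, a similarity graph, then label propagation from a few annotated instances) can be illustrated with a minimal sketch. The abstract does not name the similarity measures or SSL algorithms used, so this example assumes cosine similarity, a symmetric k-nearest-neighbour graph, and a local-and-global-consistency style propagation rule. The function names `cosine_knn_graph` and `propagate_labels`, the parameters `k`, `alpha` and `n_iter`, and the toy vectors are all hypothetical stand-ins, not the authors' implementation or data.

```python
import numpy as np


def cosine_knn_graph(X, k=3):
    """Build a symmetric k-nearest-neighbour graph weighted by cosine similarity."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops from the neighbour search
    n = sim.shape[0]
    nearest = np.argsort(-sim, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nearest.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.clip(sim[rows, cols], 0.0, None)  # keep weights non-negative
    return np.maximum(W, W.T)  # symmetrise so the graph is undirected


def propagate_labels(W, labels, n_classes, alpha=0.9, n_iter=100):
    """Spread sense labels over the graph; `labels` uses -1 for unlabeled nodes."""
    labels = np.asarray(labels)
    Y = np.zeros((len(labels), n_classes))
    known = labels >= 0
    Y[known, labels[known]] = 1.0  # one-hot rows for the annotated instances
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.clip(degree, 1e-12, None))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalisation
    F = Y.copy()
    for _ in range(n_iter):
        # Mix neighbour scores with the initial annotations at every step.
        F = alpha * (S @ F) + (1 - alpha) * Y
    return F.argmax(axis=1)


# Toy 2-D vectors standing in for contextual embeddings of one ambiguous word
# (assumption: real features would come from a model such as BERT or ELECTRA).
toy_vectors = [
    [1.0, 0.1], [1.0, 0.0], [0.9, 0.2], [1.0, -0.1],   # contexts of sense 0
    [0.1, 1.0], [0.0, 1.0], [0.2, 0.9], [-0.1, 1.0],   # contexts of sense 1
]
toy_labels = [0, -1, -1, -1, 1, -1, -1, -1]  # one annotated instance per sense

graph = cosine_knn_graph(toy_vectors, k=2)
predicted = propagate_labels(graph, toy_labels, n_classes=2)
```

With only one annotated instance per sense, the remaining instances receive a label through their graph neighbours, which is the property that makes graph-based SSL attractive when sense-annotated data is scarce.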

FAPESP grant: 18/01722-3 - Semi-supervised learning via complex networks: network construction, label selection and propagation, and applications
Grantee: Lilian Berton
Support type: Research Grant - Regular
FAPESP grant: 18/09465-0 - Word sense disambiguation via graph-based semi-supervised algorithms
Grantee: Samuel Bruno da Silva Sousa
Support type: Scholarships in Brazil - Master's