(Reference obtained automatically from Web of Science, based on the FAPESP funding information and the corresponding grant number included in the publication by the authors.)

Multi-Domain Aspect Extraction Using Bidirectional Encoder Representations From Transformers

Author(s):
Santos, Brucce Neves Dos [1] ; Marcacini, Ricardo Marcondes [1] ; Rezende, Solange Oliveira [1]
Total number of authors: 3
Author affiliation(s):
[1] Univ Sao Paulo, Inst Math & Comp Sci, BR-13566590 Sao Carlos - Brazil
Total number of affiliations: 1
Document type: Scientific article
Source: IEEE ACCESS; v. 9, p. 91604-91613, 2021.
Web of Science citations: 0
Abstract

Deep learning and neural language models have obtained state-of-the-art results in aspect extraction tasks, in which the objective is to automatically extract characteristics of products and services that are the target of consumer opinion. However, these methods require a large amount of labeled data to achieve such results. Since data labeling is costly, labeled data are not available for all domains. In this paper, we propose an approach for aspect extraction in a multi-domain transfer learning scenario, thereby leveraging labeled data from different source domains to extract aspects of a new unlabeled target domain. Our approach, called MDAE-BERT (Multi-Domain Aspect Extraction using Bidirectional Encoder Representations from Transformers), explores neural language models to deal with two major challenges in multi-domain learning: (1) inconsistency of aspects between target and source domains and (2) context-based semantic distance between ambiguous aspects. We evaluated MDAE-BERT from two perspectives: (1) aspect extraction performance, using F1-Macro and Accuracy measures; and (2) a comparison between multi-domain and single-domain aspect extraction models. From the first perspective, our method outperforms the LSTM-based approach. From the second, our approach proved to be a competitive alternative to single-domain models trained on a specific domain, even in the absence of labeled data from the target domain. (AU)
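To make the technique in the abstract concrete, below is a minimal sketch assuming the common formulation of aspect extraction as BIO token classification with a pretrained BERT; the model name, label set, and example sentence are illustrative assumptions, not the authors' exact MDAE-BERT configuration. The classification head is randomly initialized here; real use would fine-tune it on labeled sentences pooled from the source domains before applying it to the unlabeled target domain.

import torch
from transformers import BertTokenizerFast, BertForTokenClassification

# BIO tagging scheme: each token is Outside, Beginning, or Inside an aspect term.
LABELS = ["O", "B-ASPECT", "I-ASPECT"]

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)

# In the multi-domain setting, labeled sentences from several source domains
# (e.g., laptop and restaurant reviews) would be pooled to fine-tune this model.
sentence = "The battery life of this laptop is amazing"
enc = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    logits = model(**enc).logits              # shape: (1, seq_len, num_labels)
pred = logits.argmax(dim=-1).squeeze(0).tolist()

tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
for tok, tag in zip(tokens, pred):
    print(f"{tok:12s} {LABELS[tag]}")         # predicted BIO tag per wordpiece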

FAPESP Grant: 19/25010-5 - Semantically enriched representations for text mining in Portuguese: models and applications
Grantee: Solange Oliveira Rezende
Support type: Research Grants - Regular
FAPESP Grant: 19/07665-4 - Center for Artificial Intelligence
Grantee: Fabio Gagliardi Cozman
Support type: Research Grants - eScience and Data Science Program - Research Centers in Engineering