(Reference obtained automatically from Web of Science, based on the FAPESP funding information and the corresponding grant number included in the publication by the authors.)

Multi-Domain Aspect Extraction Using Bidirectional Encoder Representations From Transformers

Author(s):
Santos, Brucce Neves Dos [1] ; Marcacini, Ricardo Marcondes [1] ; Rezende, Solange Oliveira [1]
Total number of authors: 3
Author affiliation(s):
[1] Univ Sao Paulo, Inst Math & Comp Sci, BR-13566590 Sao Carlos - Brazil
Total number of affiliations: 1
Document type: Scientific article
Source: IEEE ACCESS; v. 9, p. 91604-91613, 2021.
Web of Science citations: 0
Abstract

Deep learning and neural language models have obtained state-of-the-art results in aspect extraction tasks, in which the objective is to automatically extract characteristics of products and services that are the target of consumer opinion. However, these methods require a large amount of labeled data to achieve such results. Since data labeling is costly, labeled data are not available for all domains. In this paper, we propose an approach for aspect extraction in a multi-domain transfer learning scenario, thereby leveraging labeled data from different source domains to extract aspects of a new unlabeled target domain. Our approach, called MDAE-BERT (Multi-Domain Aspect Extraction using Bidirectional Encoder Representations from Transformers), explores neural language models to deal with two major challenges in multi-domain learning: (1) inconsistency of aspects between target and source domains and (2) context-based semantic distance between ambiguous aspects. We evaluated MDAE-BERT from two perspectives: (1) aspect extraction performance, using F1-Macro and Accuracy measures; and (2) a comparison between multi-domain and single-domain aspect extraction models. From the first perspective, our method outperforms the LSTM-based approach. From the second, our approach proved to be a competitive alternative to single-domain models trained on a specific domain, even in the absence of labeled data from the target domain. (AU)
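To make the technique in the abstract concrete, below is a minimal sketch assuming the common formulation of aspect extraction as BIO token classification with a pretrained BERT; the model name, label set, and example sentence are illustrative assumptions, not the authors' exact MDAE-BERT configuration. The classification head is randomly initialized here; real use would fine-tune it on labeled sentences pooled from the source domains before applying it to the unlabeled target domain.

import torch
from transformers import BertTokenizerFast, BertForTokenClassification

# BIO tagging scheme: each token is Outside, Beginning, or Inside an aspect term.
LABELS = ["O", "B-ASPECT", "I-ASPECT"]

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)

# In the multi-domain setting, labeled sentences from several source domains
# (e.g., laptop and restaurant reviews) would be pooled to fine-tune this model.
sentence = "The battery life of this laptop is amazing"
enc = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    logits = model(**enc).logits              # shape: (1, seq_len, num_labels)
pred = logits.argmax(dim=-1).squeeze(0).tolist()

tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
for tok, tag in zip(tokens, pred):
    print(f"{tok:12s} {LABELS[tag]}")         # predicted BIO tag per wordpiece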

FAPESP Grant: 19/25010-5 - Semantically enriched representations for text mining in Portuguese: models and applications
Grantee: Solange Oliveira Rezende
Support type: Research Grants - Regular
FAPESP Grant: 19/07665-4 - Center for Artificial Intelligence
Grantee: Fabio Gagliardi Cozman
Support type: Research Grants - eScience and Data Science Program - Research Centers in Engineering