Opinion mining for app reviews: an analysis of textual representation and predictive models

Araujo, Adailton F.; Golo, Marcos P. S.; Marcacini, Ricardo M.

Author(s):	Araujo, Adailton F. ^[1] ; Golo, Marcos P. S. ^[1] ; Marcacini, Ricardo M. ^[1] Total number of authors: 3
Author affiliation(s):	^[1] Univ Sao Paulo, Inst Math & Comp Sci, USP, POB 668, BR-13560970 Sao Carlos, SP - Brazil Total number of affiliations: 1
Document type:	Review Article
Source:	AUTOMATED SOFTWARE ENGINEERING; v. 29, n. 1 MAY 2022.
Web of Science citations:	0
Abstract
Popular mobile applications receive millions of user reviews. These reviews contain relevant information for software maintenance, such as bug reports and improvement suggestions. The information in reviews is a valuable knowledge source for software requirements engineering, since app review analysis helps make strategic decisions to improve app quality. However, due to the large volume of text, manual extraction of the relevant information is impracticable. Opinion mining is the field of study for analyzing people's sentiments and emotions through opinions expressed on the web, such as social networks, forums, and community platforms for product and service recommendation. In this paper, we investigate opinion mining for app reviews. In particular, we compare textual representation techniques for classification, sentiment analysis, and utility prediction from app reviews. We discuss and evaluate different techniques for the textual representation of reviews, from the traditional Bag-of-Words (BoW) to the most recent state-of-the-art Neural Language Models (NLM). Our findings show that the traditional Bag-of-Words model, combined with a careful analysis of text pre-processing techniques, is still competitive. It obtains results close to those of the NLM in the classification, sentiment analysis, and utility prediction tasks. However, NLM proved to be more advantageous since they achieved very competitive performance in all the predictive tasks covered in this work, provide significant dimensionality reduction, and deal more adequately with semantic proximity between the reviews' texts.

FAPESP grant:	19/25010-5 - Semantically enriched representations for Portuguese text mining: models and applications
Grantee:	Solange Oliveira Rezende
Support type:	Research Grants - Regular


FAPESP grant:	19/07665-4 - Center for Artificial Intelligence
Grantee:	Fabio Gagliardi Cozman
Support type:	Research Grants - eScience and Data Science Program - Engineering Research Centers
