Predicting Popularity of Video Streaming Services with Representation Learning: A Survey and a Real-World Case Study

de Sa, Sidney Loyola; Rocha, Antonio A. de A.; Paes, Aline

Author(s):	de Sa, Sidney Loyola ^[1] ; Rocha, Antonio A. de A. ^[1] ; Paes, Aline ^[1] Total number of authors: 3
Author affiliation(s):	^[1] Univ Fed Fluminense, Inst Comp, BR-24210330 Niteroi, RJ - Brazil Total number of affiliations: 1
Document type:	Scientific article
Source:	SENSORS; v. 21, n. 21 NOV 2021.
Web of Science citations:	0
Abstract
The Internet's popularization has increased the amount of content produced and consumed on the web. To take advantage of this new market, major content producers such as Netflix and Amazon Prime have emerged, focusing on video streaming services. However, despite the large number and diversity of videos made available by these content providers, few of them attract the attention of most users. For example, in the data explored in this article, only 6% of the videos, the most popular ones, account for 85% of total views. Finding out in advance which videos will be popular is not trivial, especially given the many influencing variables. Nevertheless, a tool with this ability would be of great value for sizing network infrastructure and for properly recommending new content to users. This manuscript therefore examines the machine learning-based approaches that have been proposed for predicting web content popularity. To this end, we first survey the literature and elaborate a taxonomy that classifies models according to their predictive features and describes the state-of-the-art features and techniques used to solve this task. While analyzing previous works, we saw an opportunity to use textual features for video popularity prediction. Thus, we additionally propose a case study that combines features obtained through feature engineering with word embeddings to predict the popularity of a video. The first approach is based on predictive attributes defined by feature engineering. The second takes advantage of word embeddings from video descriptions and titles. We experimented with the proposed techniques on a set of videos from GloboPlay, the largest provider of video streaming services in Latin America. A combination of engineered features and embeddings using the Random Forest algorithm achieved the best result, with an accuracy of 87%. (AU)

FAPESP grant:	15/24144-7 - Technologies and solutions to enable the cloud of things paradigm
Grantee:	José Neuman de Souza
Type of support:	Research Grant - Thematic
