Speech Quality Assessment in Wireless VoIP Communication Using Deep Belief Network

Affonso, Emmanuel T.; Nunes, Rodrigo D.; Rosa, Renata L.; Pivaro, Gabriel F.; Rodriguez, Demostenes Z.

Author(s):	Affonso, Emmanuel T. [1]; Nunes, Rodrigo D. [1]; Rosa, Renata L. [1]; Pivaro, Gabriel F. [2]; Rodriguez, Demostenes Z. [1]
Total Authors: 5
Affiliation:	[1] Univ Fed Lavras, BR-37200000 Lavras - Brazil
[2] Natl Inst Telecommun, Radiocommun Reference Ctr, Santa Rita Do Sapucai, MG - Brazil
Total Affiliations: 2
Document type:	Journal article
Source:	IEEE ACCESS; v. 6, p. 77022-77032, 2018.
Web of Science Citations:	0
Abstract
Voice over Internet Protocol (VoIP) communication is now widely adopted and has many users worldwide. However, users' quality of experience is not guaranteed, because the voice signal can be affected by several degradations that occur in the network infrastructure. A global speech quality assessment method that considers both wired and wireless networks is therefore needed to provide reliable results. In this paper, several network scenarios with different packet loss rates (PLRs) and wireless channel models are implemented, and the impaired signals are evaluated using the algorithm described in ITU-T Recommendation P.862. Preliminary results showed a relationship between the fading and PLR parameters and the global speech quality index. However, the P.862 algorithm is not viable in real VoIP scenarios, because it is an intrusive method that requires the original reference signal. ITU-T Recommendation P.563 describes a non-intrusive speech quality assessment method; nevertheless, its results are not reliable. In this context, the main objective of this paper is to propose a non-intrusive speech quality classification model based on a deep belief network (DBN) that considers both wired and wireless impairments of the speech signal. Experimental results demonstrated a high correlation between the proposed DBN-based model and the P.862 algorithm, reaching an F-measure of 97.01%. For validation, the non-intrusive P.563 algorithm was used as a baseline; the proposed model and P.563 reached average accuracies of 96.14% and 72.12%, respectively. Furthermore, subjective tests were carried out, in which the proposed DBN model reached an accuracy of 94%. (AU)
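The abstract reports its results as an F-measure and an average accuracy for a speech quality classifier. As a minimal sketch of how such figures are typically computed, the Python below scores a set of predicted quality classes against reference classes. The class labels, the sample data, the function names, and the choice of macro averaging are all assumptions made for illustration; the abstract does not state which averaging the authors used, and none of the numbers here come from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the reference class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of the per-class F-measure, F = 2PR / (P + R).

    Macro averaging is an assumption; the abstract does not specify it.
    """
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # class p was predicted wrongly
            fn[t] += 1   # class t was missed
    scores = []
    for c in classes:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)


# Hypothetical quality classes: "reference" stands in for labels derived
# from an intrusive score, "predicted" for a non-intrusive model's output.
reference = ["good", "good", "fair", "poor", "poor", "fair"]
predicted = ["good", "fair", "fair", "poor", "poor", "fair"]

acc = accuracy(reference, predicted)
f_measure = macro_f_measure(reference, predicted)
```

Accuracy counts every sample equally, whereas the macro F-measure gives each quality class the same weight, so the two can diverge when some classes are rare.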

FAPESP's process:	15/24496-0 - Evaluation of the service of communication operators using the voice Quality Index
Grantee:	Demostenes Zegarra Rodriguez
Support Opportunities:	Regular Research Grants


FAPESP's process:	15/25512-0 - Conditional Analysis of Audio and Speech Signals for Coding and Recognition
Grantee:	Miguel Arjona Ramírez
Support Opportunities:	Regular Research Grants
