A Speech Quality Classifier based on Tree-CNN Algorithm that Considers Network Degradations

Vieira, Samuel Terra; Rosa, Renata Lopes; Rodriguez, Demostenes Zegarra

Author(s):	Vieira, Samuel Terra ^[1] ; Rosa, Renata Lopes ^[1] ; Rodriguez, Demostenes Zegarra ^[1] Total Authors: 3
Affiliation:	^[1] Univ Fed Lavras, Dept Comp Sci, Lavras, MG - Brazil Total Affiliations: 1
Document type:	Journal article
Source:	JOURNAL OF COMMUNICATIONS SOFTWARE AND SYSTEMS; v. 16, n. 2, p. 180-187, JUN 2020.
Web of Science Citations:	0
Abstract
Many factors can affect users' quality of experience (QoE) in speech communication services. These impairments arise from physical phenomena in the transmission channels of wireless and wired networks, and monitoring users' QoE is important for service providers. In this context, a non-intrusive speech quality classifier based on the Tree Convolutional Neural Network (Tree-CNN) is proposed. The Tree-CNN is an adaptive network structure composed of hierarchical CNN models; its main advantage is a shorter training time, which is highly relevant for speech quality assessment methods. In the training phase of the proposed classifier, speech signals impaired by wired and wireless network degradations are used as input. The network scenario implements different modulation schemes and channel degradation intensities, such as packet loss rate, signal-to-noise ratio, and maximum Doppler shift frequency. Experimental results demonstrate that the proposed model significantly reduces training time, with a 25% reduction relative to another implementation based on DRBM. The accuracy reached by the Tree-CNN model is almost 95% for each quality class. Performance assessment results show that the proposed Tree-CNN classifier outperforms both the current standardized algorithm described in ITU-T Rec. P.563 and the speech quality assessment method ViSQOL.
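As a rough illustration of two of the channel degradations named in the abstract (signal-to-noise ratio and packet loss rate), the sketch below impairs a test signal with additive white Gaussian noise and random packet drops. It is not the authors' simulation setup: the function names `add_awgn` and `drop_packets`, the 8 kHz sampling rate, the 160-sample packet length, and the sine tone standing in for speech are all assumptions made for this example, and modulation schemes and Doppler shift are not modeled.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the requested SNR in dB."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def drop_packets(signal, loss_rate, packet_len, rng):
    """Zero out whole packets, each lost independently with probability loss_rate."""
    out = list(signal)
    for start in range(0, len(out), packet_len):
        if rng.random() < loss_rate:
            end = min(start + packet_len, len(out))
            out[start:end] = [0.0] * (end - start)
    return out


# Hypothetical test conditions (not taken from the paper).
rng = random.Random(0)          # fixed seed so the run is repeatable
fs = 8000                       # assumed narrowband sampling rate in Hz
clean = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]  # 1 s tone as a speech stand-in

noisy = add_awgn(clean, snr_db=20, rng=rng)
degraded = drop_packets(noisy, loss_rate=0.1, packet_len=160, rng=rng)
```

Sweeping `snr_db` and `loss_rate` over a grid would yield impaired signals at several degradation intensities, which is the kind of labeled input a quality classifier could be trained on.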

FAPESP's process:	15/24496-0 - Evaluation of the service of communication operators using the voice Quality Index
Grantee:	Demostenes Zegarra Rodriguez
Support Opportunities:	Regular Research Grants


FAPESP's process:	18/26455-8 - Audio-Visual Speech Processing by Machine Learning
Grantee:	Miguel Arjona Ramírez
Support Opportunities:	Regular Research Grants
