Application of predictability measures to voice signals for larynx pathology differentiation

Author(s):
Paulo Rogério Scalassara
Total Authors: 1
Document type: Doctoral Thesis
Press: São Carlos.
Institution: Universidade de São Paulo (USP). Escola de Engenharia de São Carlos (EESC/SBD)
Defense date:
Examining board members:
José Carlos Pereira; Aparecido Augusto de Carvalho; Francisco Javier Ramirez Fernandez; Carlos Dias Maciel; Marco Antonio Grivet Mattoso Maia
Advisor: José Carlos Pereira
Abstract

This thesis presents initial studies on the application of predictability measures to voice signal analysis. Its aim is to develop methods capable of differentiating healthy from pathological signals, and also of distinguishing among pathologies. To that end, we attempt to measure the variations in uncertainty and predictability of the signals from the analyzed groups. Some larynx pathologies, such as the nodule and Reinke's edema used in this study, change the voice signals by modifying the structure and function of the vocal tract and folds. The main modifications are higher amplitude and frequency perturbations, added noise, and suppression of high-frequency harmonic components. As a result, the signals lose some of their almost periodic structure, the vocal system's uncertainty increases and, therefore, the predictability decreases. We use several measures to evaluate these changes, such as Shannon's entropy and the relative entropy between healthy and pathological signals. In addition, we use the predictive power (PP), which is based on the relative entropy between the voice signal and its prediction error given by a model. At first, we used the autoregressive (AR) model, which is common in voice analysis; however, due to unsatisfactory results, we introduced a model based on wavelet decomposition. We also took advantage of another tool, called predictable component analysis (PrCA), which decomposes a signal into components ordered by their predictability. It is then possible to reconstruct the signals using only their most predictable components. Using this technique, we analyzed a kind of three-dimensional representation of the voice signals in a space with coordinates given by delayed versions of the signals. We tested the developed algorithms with the aid of simulated voice signals, which had varying noise levels and amplitude and frequency perturbations. 
This made it possible to detect errors and solve problems with the methods. After evaluating the algorithms, we estimated the entropy of the voice signals and the relative entropy between the healthy signals and all the signals. In addition, we estimated the PP using the AR and wavelet-based models. After that, we used the PrCA to obtain more predictable versions of the signals and then estimated the PP using these versions as the signals' predictions. We also applied the PrCA to the three-dimensional representations of the signals, using a multidimensional AR model as a predictor. Using the voice entropy results, we could not distinguish between the analyzed groups, but with the relative entropy values, the healthy and pathological signals were differentiated efficiently. Nevertheless, this measure has no practical application, because a diagnosed voice database is necessary as a basis of comparison. For the PP with AR modeling, no distinction between the groups was observed, but with the wavelet modeling, the healthy signals showed significantly higher predictability than the pathological ones; however, the pathologies were differentiated. Using the PrCA with both models, the pathological and healthy groups were distinguished, but for the AR model, the healthy signals presented lower predictability. This shows that the predictability depends on the analysis model; thus, the larynx pathologies can decrease or increase the prediction capacity of the voice signals depending on the model used. The results of the PrCA of the three-dimensional representations show behavior similar to that of the direct PrCA signal analysis with the AR model. Despite these results, this form of data representation seems promising for future studies. 
Considering these results, we concluded that this study was very useful for gaining a better understanding of the dynamics of voice production and that the predictability measures are of interest for the evaluation of larynx pathologies, especially the presence of nodules in the vocal folds and Reinke's edema, at least for this initial study using the available signals. More studies are still necessary, but this analysis method already presents good results, which can be applied to aid pathology diagnosis by health professionals. (AU)
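
As an illustration of the quantities named in the abstract, the following is a minimal Python sketch, not the thesis implementation. It assumes histogram-based estimates of Shannon's entropy and relative entropy, a least-squares AR fit for the prediction error, and a simple delay embedding for the three-dimensional representation. The function names, bin count, AR order, and delay are all hypothetical choices, and the exact PP formula is not given in the abstract, so only its ingredient (the relative entropy between a signal and its prediction error) is shown.

```python
import numpy as np

def shannon_entropy(x, bins=32):
    """Histogram estimate of Shannon's entropy in bits (assumed estimator)."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]  # empty bins contribute nothing to the sum
    return float(-np.sum(p * np.log2(p)))

def relative_entropy(x, y, bins=32, eps=1e-12):
    """Histogram estimate of the relative entropy D(x || y) in bits.

    Both histograms share the same bin edges; eps is a small smoothing
    constant (an assumption) that avoids division by zero in empty bins.
    """
    lo = min(np.min(x), np.min(y))
    hi = max(np.max(x), np.max(y))
    p, _ = np.histogram(x, bins=bins, range=(lo, hi))
    q, _ = np.histogram(y, bins=bins, range=(lo, hi))
    p = p + eps
    q = q + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log2(p / q)))

def ar_prediction_error(x, order=10):
    """One-step prediction error of an AR model fitted by least squares."""
    # Column k holds the signal delayed by k + 1 samples.
    lagged = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)]
    )
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    return target - lagged @ coeffs

def delay_embed(x, dim=3, delay=5):
    """Rows are points whose coordinates are delayed copies of the signal."""
    n = len(x) - (dim - 1) * delay
    return np.column_stack([x[i * delay : i * delay + n] for i in range(dim)])

# Hypothetical test signals: a clean tone and a noisy, less periodic version.
rng = np.random.default_rng(0)
t = np.arange(4000) / 8000.0
clean = np.sin(2 * np.pi * 150 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)

for name, sig in (("clean", clean), ("noisy", noisy)):
    err = ar_prediction_error(sig, order=10)
    print(
        name,
        "entropy:", shannon_entropy(sig),
        "D(signal || error):", relative_entropy(sig[10:], err),
        "embedding shape:", delay_embed(sig).shape,
    )
```

In this sketch, a signal that the AR model predicts well has a prediction error concentrated near zero, so its distribution differs strongly from that of the signal itself; adding noise narrows that gap, which is the intuition behind using a relative-entropy-based measure of predictability.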

FAPESP's process: 06/53238-0 - Use of digital voice signal processing for the discrimination of pathologies.
Grantee: Paulo Rogério Scalassara
Support Opportunities: Scholarships in Brazil - Doctorate