Some Parametric Methods of Speech Processing

Parametric modelling of speech signals is used in various speech processing applications. Recently, publications on sinusoidal speech modelling have appeared increasingly often in the scientific literature. The thesis is mainly devoted to the sinusoidal model with harmonically related component sine waves, i.e. the harmonic model. The main objective is to find new approaches to improving synthetic speech quality. A novel method for determining the speech spectrum envelope is introduced. This method uses a staircase envelope that takes into account the spectral behaviour in voiced as well as unvoiced speech frames. The staircase envelope is smoothed by a weighted moving average. The resulting envelope is parametrized using an autoregressive (AR) model or cepstral coefficients. The new method is shown to be most important for high-pitched speakers. In addition, new methods or modifications of known methods are presented for pitch synchronization, AR model order selection [114], maximum voiced frequency determination [108], use of the inverse fast Fourier transform (FFT) of the spectral envelope for AR parameter determination [78], gain correction for cepstral parameter determination [122], and use of an asymmetric Hanning window in pitch-synchronous overlap-and-add (OLA) synthesis. The methods are compared with respect to the spectral measure, the perceived speech quality, and the computational complexity. Experimental results have shown that the proposed envelope determination method outperformed known methods in AR as well as cepstral parametrization of the harmonic model. Use of the asymmetric Hanning window during OLA synthesis decreased the standard deviation of the RMS log spectral measure for all the analysis methods used. The greatest advantage of the asymmetric window was observed for AR parametrization of higher order, where both the mean and the standard deviation of the spectral measure were noticeably lower.
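As an illustration of obtaining AR parameters from a spectral envelope by an inverse Fourier transform, the following is a minimal sketch and not the thesis's own implementation. It assumes the envelope is given as magnitude samples on a uniform frequency grid from 0 to pi, computes the autocorrelation as the inverse DFT of the squared envelope, and solves for the AR coefficients with the standard Levinson-Durbin recursion. The function names, the grid layout, and the sign convention A(z) = 1 + a1 z^-1 + ... are assumptions made for this example.

```python
import math


def autocorr_from_envelope(envelope, n_lags):
    """Autocorrelation lags 0..n_lags from a sampled magnitude envelope.

    Assumption: `envelope` holds K+1 magnitude samples at uniformly spaced
    frequencies from 0 to pi inclusive. The power spectrum is treated as
    real and even, so its inverse DFT reduces to a cosine sum.
    """
    k_max = len(envelope) - 1
    n_fft = 2 * k_max
    power = [e * e for e in envelope]
    r = []
    for lag in range(n_lags + 1):
        # Contributions of the DC bin and the Nyquist bin.
        acc = power[0] + power[k_max] * math.cos(math.pi * lag)
        # Each interior bin appears twice in the full even spectrum.
        for k in range(1, k_max):
            acc += 2.0 * power[k] * math.cos(2.0 * math.pi * k * lag / n_fft)
        r.append(acc / n_fft)
    return r


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations for an AR model.

    Returns (a, err) where a[0] == 1.0, A(z) = sum(a[j] * z**-j), and
    err is the prediction error power (squared model gain).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i]
        for j in range(1, i):
            acc += a[j] * r[i - j]
        k = -acc / err  # reflection coefficient of stage i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def ar_from_envelope(envelope, order):
    """AR coefficients and error power fitted to a sampled envelope."""
    r = autocorr_from_envelope(envelope, order)
    return levinson_durbin(r, order)
```

A quick sanity check is to sample the envelope of a known first-order model, for example 1 / |1 - 0.5 e^{-jw}|, on a dense grid and confirm that `ar_from_envelope` returns coefficients close to [1, -0.5, 0] for order 2.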
Comparison of AR and cepstral parametrizations of the same spectral envelope favoured AR when OLA synthesis was performed. Listening tests confirmed the quantitative results given by the RMS log spectral measure. Comparison of the cepstral model with the harmonic model using cepstral parametrization showed better frequency properties for the harmonic model, though at the expense of higher computational complexity. A trade-off between quality and computational complexity can be found in each of the methods, although this relationship is not always proportional.
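The RMS log spectral measure used for the comparisons above can be sketched as follows. The abstract does not give the exact definition, so this assumes a common form: the root mean square of the difference between two magnitude spectra expressed in decibels, computed per frame and then summarized by its mean and standard deviation over frames. The function names and the `floor` parameter (which guards against taking the logarithm of zero) are hypothetical.

```python
import math
from statistics import mean, pstdev


def rms_log_spectral_distance(mag_a, mag_b, floor=1e-12):
    """RMS difference in dB between two magnitude spectra of equal length."""
    if len(mag_a) != len(mag_b):
        raise ValueError("spectra must have the same number of bins")
    if not mag_a:
        raise ValueError("spectra must not be empty")
    sq_sum = 0.0
    for a, b in zip(mag_a, mag_b):
        # Convert each magnitude to dB, clamping to avoid log10(0).
        d = 20.0 * math.log10(max(a, floor)) - 20.0 * math.log10(max(b, floor))
        sq_sum += d * d
    return math.sqrt(sq_sum / len(mag_a))


def summarize_frames(original_frames, synthetic_frames):
    """Mean and standard deviation of the per-frame distance in dB."""
    distances = [
        rms_log_spectral_distance(a, b)
        for a, b in zip(original_frames, synthetic_frames)
    ]
    return mean(distances), pstdev(distances)
```

Under this assumed definition, a lower mean indicates a closer spectral match on average, and a lower standard deviation indicates more uniform quality across frames.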

Publication Year: 2001
Author: Pribilova, Anna
Supervisors: Not Available
Institution: Slovak University of Technology