Noise Robust ASR: Missing data techniques and beyond

Speech recognition performance degrades in the presence of background noise. In this thesis, several methods are developed to improve noise robustness. Most of the work pertains to the use of sparse representations of speech: speech segments are described as a sparse linear combination of example speech segments, called exemplars. Using techniques from missing data theory and compressed sensing, it is proposed to find, for each noisy speech observation, a sparse linear combination of exemplars using only the speech features that are not corrupted by noise. This linear combination of clean speech exemplars is then used to reconstruct an estimate of the clean speech. Later in the thesis, it is proposed to augment this model by expressing noisy speech as a linear combination of speech and noise exemplars. Additionally, the weights of labelled exemplars in the sparse representation are used directly for exemplar-based speech decoding.
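
The core idea of fitting a sparse combination of exemplars on the uncorrupted features only, and then using that combination to estimate the clean speech, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the thesis's implementation: the function name `sparse_impute`, the toy four-dimensional exemplars, the binary reliability mask, the choice of non-negative multiplicative updates on a masked squared error with an L1 penalty, and the decision to keep reliable features as observed.

```python
def sparse_impute(noisy, reliable, exemplars, sparsity=0.01, iterations=500):
    """Estimate clean features from the reliable part of a noisy observation.

    noisy     -- non-negative feature vector (list of floats)
    reliable  -- mask of the same length: 1 = trusted, 0 = noise-corrupted
    exemplars -- list of non-negative clean-speech exemplar vectors
    Returns (weights, estimate). All names here are hypothetical.
    """
    n_features = len(noisy)
    n_exemplars = len(exemplars)
    weights = [1.0] * n_exemplars
    tiny = 1e-12  # guards against division by zero

    def combine(w):
        # Linear combination of all exemplars, evaluated on every feature.
        return [sum(w[k] * exemplars[k][f] for k in range(n_exemplars))
                for f in range(n_features)]

    for _ in range(iterations):
        model = combine(weights)
        updated = []
        for k in range(n_exemplars):
            # Only reliable features contribute to the fit (mask = 1).
            num = sum(reliable[f] * exemplars[k][f] * noisy[f]
                      for f in range(n_features))
            den = sum(reliable[f] * exemplars[k][f] * model[f]
                      for f in range(n_features))
            # Multiplicative update; the sparsity term shrinks the weights.
            updated.append(weights[k] * num / (den + sparsity + tiny))
        weights = updated

    model = combine(weights)
    # Keep reliable features as observed; impute the corrupted ones.
    estimate = [noisy[f] if reliable[f] else model[f]
                for f in range(n_features)]
    return weights, estimate


# Toy data: the clean signal is 2 x the first exemplar, i.e. [2, 0, 4, 0];
# noise has corrupted the third feature, which the mask marks as unreliable.
exemplars = [[1.0, 0.0, 2.0, 0.0],
             [0.0, 1.0, 0.0, 3.0],
             [1.0, 1.0, 1.0, 1.0]]
noisy = [2.0, 0.0, 9.0, 0.0]
reliable = [1, 1, 0, 1]

weights, estimate = sparse_impute(noisy, reliable, exemplars)
print("weights:", weights)
print("estimate:", estimate)
```

In this toy case the first exemplar receives almost all of the weight, and the corrupted third feature is replaced by a value close to the clean value of 4 instead of the observed 9. If each exemplar carried a label, the same weights could in principle serve as evidence for that label, which is the intuition behind the exemplar-based decoding mentioned in the abstract.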

File Type: pdf
File Size: 5 MB
Publication Year: 2011
Author: Gemmeke, Jort
Supervisors: Bert Cranen, Lou Boves
Institution: Radboud University Nijmegen
Keywords: speech recognition, missing data, noise robustness, compressed sensing, sparse representations, exemplar-based