Audio Visual Speech Enhancement

Abstract

This thesis presents a novel approach to speech enhancement that exploits the bimodality of speech production and the correlation between audio and visual speech information. An analysis of the correlation between a range of audio and visual features reveals significant correlation between visual speech features and audio filterbank features. The correlation was also found to be greater when analysed within individual phonemes rather than across all phonemes. This led to building a Gaussian Mixture Model (GMM) capable of estimating filterbank features from visual features. Phoneme-specific GMMs gave lower filterbank estimation errors, and a phoneme transcription is decoded using an audio-visual Hidden Markov Model (HMM). Clean ...
audio-visual – speech processing – speech enhancement

Information

Author

Almajai, Ibrahim

Institution

University of East Anglia

Publication Year

2009

Upload Date

Sept. 27, 2011

