Generative Speech Enhancement in Multimodal Applications
This dissertation advances generative speech enhancement by investigating both unsupervised and supervised machine learning approaches, with a focus on integrating visual information to improve robustness. The work is organized into three main contributions. The first addresses unsupervised generative speech enhancement: we explore a Bayesian framework combining variational autoencoders (VAEs) trained on clean speech…
