Deep neural networks for source separation and noise-robust speech recognition

Abstract

This thesis addresses the problem of multichannel audio source separation by exploiting deep neural networks (DNNs). We build upon the classical expectation-maximization (EM) based source separation framework employing a multichannel Gaussian model, in which the sources are characterized by their power spectral densities and their source spatial covariance matrices. We explore and optimize the use of DNNs for estimating these spectral and spatial parameters. Employing the estimated source parameters, we then derive a time-varying multichannel Wiener filter for the separation of each source. We extensively study the impact of various design choices for the spectral and spatial DNNs. We consider different cost functions, time-frequency representations, architectures, and training data sizes. Those cost functions notably include ...
multichannel audio source separation – multichannel Gaussian model – deep neural networks
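
As a rough illustration of the filtering step named in the abstract, the sketch below applies a time-varying multichannel Wiener filter in its commonly used form under a multichannel Gaussian model, W_j(f,n) = v_j(f,n) R_j(f) [sum_k v_k(f,n) R_k(f)]^(-1). The function name, argument names, array layout, and the small regularization term eps are assumptions made for this example; the thesis's exact formulation is not shown on this page, and the DNN and EM estimation of v and R is not covered here.

```python
import numpy as np

def multichannel_wiener_filter(x, v, R, eps=1e-10):
    """Estimate source spatial images from a multichannel mixture.

    Assumed (hypothetical) array layout:
      x: (F, N, I) complex mixture STFT (frequency bins, frames, channels)
      v: (J, F, N) nonnegative power spectral densities of the J sources
      R: (J, F, I, I) spatial covariance matrices of the J sources
    Returns an array of shape (J, F, N, I).
    """
    n_channels = x.shape[-1]
    # Covariance of each source image: v_j(f, n) * R_j(f) -> (J, F, N, I, I)
    source_cov = v[..., None, None] * R[:, :, None, :, :]
    # Mixture covariance is the sum over sources; eps keeps the inverse stable
    mix_cov = source_cov.sum(axis=0) + eps * np.eye(n_channels)
    # Wiener gain for each source and each time-frequency bin
    gains = source_cov @ np.linalg.inv(mix_cov)
    # Apply each gain matrix to the mixture vector of the same bin
    return np.einsum("jfnab,fnb->jfna", gains, x)
```

Because the gains of all sources sum to (almost exactly) the identity matrix in every time-frequency bin, the estimated source images add back up to the mixture, which makes for a convenient sanity check.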

Information

Author

Nugraha, Aditya Arie

Institution

Université de Lorraine

Supervisors

Publication Year

2017

Upload Date

Dec. 31, 2017
