Integration of Neural Networks and Probabilistic Spatial Models for Acoustic Blind Source Separation (2020)
Abstract / truncated to 115 words
Despite a lot of progress in speech separation, enhancement, and automatic speech recognition realistic meeting recognition is still fairly unsolved. Most research on speech separation either focuses on spectral cues to address single-channel recordings or spatial cues to separate multi-channel recordings and exclusively either rely on neural networks or probabilistic graphical models. Integrating a spatial clustering approach and a deep learning approach using spectral cues in a single framework can significantly improve automatic speech recognition performance and improve generalizability given that a neural network profits from a vast amount of training data while the probabilistic counterpart adapts to the current scene. This thesis at hand, therefore, concentrates on the integration of two fairly disjoint research ... toggle 5 keywordsblind source separation – speech processing – beamforming – deep clustering – neural networks
The current layout is optimized for mobile phones. Page previews, thumbnails, and full abstracts will remain hidden until the browser window grows in width.
The current layout is optimized for tablet devices. Page previews and some thumbnails will remain hidden until the browser window grows in width.