Speech Enhancement Algorithms for Audiological Applications

The improvement of speech intelligibility is a traditional problem which still remains open and unsolved. The recent boom of applications such as hands-free communications or automatic speech recognition systems and the ever-increasing demands of the hearing-impaired community have given a definitive impulse to the research in this area. This PhD thesis is focused on speech enhancement for audiological applications. Most of the research conducted in this thesis has been focused on the improvement of speech intelligibility in hearing aids, considering the variety of restrictions and limitations imposed by this type of devices. The combination of source separation techniques and spatial filtering with machine learning and evolutionary computation has originated novel and interesting algorithms which are included in this thesis. The thesis is divided in two main parts. The first one contains a preliminary study of the problem and a ...

Ayllón, David — Universidad de Alcalá


MIMO Instantaneous Blind Identification and Separation based on Arbitrary Order Temporal Structure in the Data

This thesis is concerned with three closely related problems. The first one is called Multiple-Input Multiple-Output (MIMO) Instantaneous Blind Identification, which we denote by MIBI. In this problem a number of mutually statistically independent source signals are mixed by a MIMO instantaneous mixing system and only the mixed signals are observed, i.e. both the mixing system and the original sources are unknown or ‘blind’. The goal of MIBI is to identify the MIMO system from the observed mixtures of the source signals only. The second problem is called Instantaneous Blind Signal Separation (IBSS) and deals with recovering mutually statistically independent source signals from their observed instantaneous mixtures only. The observation model and assumptions on the signals and mixing system are the same as those of MIBI. However, the main purpose of IBSS is the estimation of the source signals, whereas ...

van de Laar, Jakob — TU Eindhoven


Integrating monaural and binaural cues for sound localization and segregation in reverberant environments

The problem of segregating a sound source of interest from an acoustic background has been extensively studied due to applications in hearing prostheses, robust speech/speaker recognition and audio information retrieval. Computational auditory scene analysis (CASA) approaches the segregation problem by utilizing grouping cues involved in the perceptual organization of sound by human listeners. Binaural processing, where input signals resemble those that enter the two ears, is of particular interest in the CASA field. The dominant approach to binaural segregation has been to derive spatially selective filters in order to enhance the signal in a direction of interest. As such, the problems of sound localization and sound segregation are closely tied. While spatial filtering has been widely utilized, substantial performance degradation is incurred in reverberant environments and more fundamentally, segregation cannot be performed without sufficient spatial separation between sources. This dissertation ...

Woodruff, John — The Ohio State University


Sparse Multi-Channel Linear Prediction for Blind Speech Dereverberation

In many speech communication applications, such as hands-free telephony and hearing aids, the microphones are located at a distance from the speaker. Therefore, in addition to the desired speech signal, the microphone signals typically contain undesired reverberation and noise, caused by acoustic reflections and undesired sound sources. Since these disturbances tend to degrade the quality of speech communication, decrease speech intelligibility and negatively affect speech recognition, efficient dereverberation and denoising methods are required. This thesis deals with blind dereverberation methods, not requiring any knowledge about the room impulse responses between the speaker and the microphones. More specifically, we propose a general framework for blind speech dereverberation based on multi-channel linear prediction (MCLP) and exploiting sparsity of the speech signal in the time-frequency domain.

Jukić, Ante — University of Oldenburg


Integrated active noise control and noise reduction in hearing aids

In everyday conversations and listening scenarios the desired speech signal is rarely delivered alone. The listener most commonly faces a scenario where he has to understand speech in a noisy environment. Hearing impairments, and more particularly sensorineural losses, can cause a reduction of speech understanding in noise. Therefore, in a hearing aid compensating for this kind of loss it is not sufficient to just amplify the incoming sound. Hearing aids also need to integrate algorithms that make it possible to discriminate between speech and noise in order to extract the desired speech from a noisy environment. A standard noise reduction scheme in general aims at maximising the signal-to-noise ratio of the signal to be fed to the hearing aid loudspeaker. This signal, however, does not reach the eardrum directly. It first has to propagate through an acoustic path and encounter ...

Serizel, Romain — KU Leuven


Development and evaluation of psychoacoustically motivated binaural noise reduction and cue preservation techniques

Due to their decreased ability to understand speech, hearing-impaired persons may have difficulty interacting in social groups, especially when several people are talking simultaneously. Fortunately, in the last decades hearing aids have evolved from simple sound amplifiers to modern digital devices with complex functionalities including noise reduction algorithms, which are crucial to improve speech understanding in background noise for hearing-impaired persons. Since many hearing aid users are fitted with two hearing aids, so-called binaural hearing aids have been developed, which exchange data and signals through a wireless link such that the processing in both hearing aids can be synchronized. In addition to reducing noise and limiting speech distortion, another important objective of noise reduction algorithms in binaural hearing aids is the preservation of the listener’s impression of the acoustical scene, in order to exploit the binaural hearing advantage and ...

Marquardt, Daniel — University of Oldenburg, Germany


Preserving binaural cues in noise reduction algorithms for hearing aids

Hearing aid users experience great difficulty in understanding speech in noisy environments. This has led to the introduction of noise reduction algorithms in hearing aids. The development of these algorithms is typically done monaurally. However, the human auditory system is a binaural system, which compares and combines the signals received by both ears to perceive a sound source as a single entity in space. Providing two monaural, independently operating, noise reduction systems, i.e. a bilateral configuration, to the hearing aid user may disrupt binaural information, needed to localize sound sources correctly and to improve speech perception in noise. In this research project, we first examined the influence of commercially available, bilateral, noise reduction algorithms on binaural hearing. Extensive objective and perceptual evaluations showed that the bilateral adaptive directional microphone (ADM) and the bilateral fixed directional microphone, two of the most ...

Van den Bogaert, Tim — Katholieke Universiteit Leuven


Informed spatial filters for speech enhancement

In modern devices which provide hands-free speech capturing functionality, such as hands-free communication kits and voice-controlled devices, the received speech signal at the microphones is corrupted by background noise, interfering speech signals, and room reverberation. In many practical situations, the microphones are not necessarily located near the desired source, and hence, the ratio of the desired speech power to the power of the background noise, the interfering speech, and the reverberation at the microphones can be very low, often around or even below 0 dB. In such situations, the comfort of human-to-human communication, as well as the accuracy of automatic speech recognisers for voice-controlled applications can be significantly degraded. Therefore, effective speech enhancement algorithms are required to process the microphone signals before transmitting them to the far-end side for communication, or before feeding them into a speech recognition ...

Taseska, Maja — Friedrich-Alexander Universität Erlangen-Nürnberg


Signal Separation

The problem of signal separation is a very broad and fundamental one. A powerful paradigm within which signal separation can be achieved is the assumption that the signals/sources are statistically independent of one another. This is known as Independent Component Analysis (ICA). In this thesis, the theoretical aspects and derivation of ICA are examined, from which disparate approaches to signal separation are drawn together in a unifying framework. This is followed by a review of signal separation techniques based on ICA. Second order statistics based output decorrelation methods are employed to try to solve the challenging problem of separating convolutively mixed signals, in the context of mainly audio source separation and the Cocktail Party Problem. Various optimisation techniques are devised to implement second order signal separation of both artificially mixed signals and real mixtures. A study of the advantages and ...

Ahmed, Alijah — University of Cambridge


Cognitive-driven speech enhancement using EEG-based auditory attention decoding for hearing aid applications

Identifying the target speaker in hearing aid applications is an essential ingredient to improve speech intelligibility. Although several speech enhancement algorithms are available to reduce background noise or to perform source separation in multi-speaker scenarios, their performance depends on correctly identifying the target speaker to be enhanced. Recent advances in electroencephalography (EEG) have shown that it is possible to identify the target speaker which the listener is attending to using single-trial EEG-based auditory attention decoding (AAD) methods. However, in realistic acoustic environments the AAD performance is influenced by undesired disturbances such as interfering speakers, noise and reverberation. In addition, it is important for real-world hearing aid applications to close the AAD loop by presenting on-line auditory feedback. This thesis deals with the problem of identifying and enhancing the target speaker in realistic acoustic environments based on decoding the auditory attention ...

Aroudi, Ali — University of Oldenburg, Germany


New strategies for single-channel speech separation

We present new results on single-channel speech separation and suggest a new separation approach to improve the speech quality of separated signals from an observed mixture. The key idea is to derive a mixture estimator based on sinusoidal parameters. The proposed estimator is aimed at finding sinusoidal parameters in the form of codevectors from vector quantization (VQ) codebooks pre-trained for speakers that, when combined, best fit the observed mixed signal. The selected codevectors are then used to reconstruct the recovered signals for the speakers in the mixture. Compared to the log-max mixture estimator used in binary masks and the Wiener filtering approach, it is observed that the proposed method achieves an acceptable perceptual speech quality with less cross-talk at different signal-to-signal ratios. Moreover, the method is independent of pitch estimates and reduces the computational complexity of the ...

Mowlaee, Pejman — Department of Electronic Systems, Aalborg University


Source-Filter Model Based Single Channel Speech Separation

In a natural acoustic environment, multiple sources are usually active at the same time. The task of source separation is the estimation of individual source signals from this complex mixture. The challenge of single channel source separation (SCSS) is to recover more than one source from a single observation. Basically, SCSS can be divided into methods that try to mimic the human auditory system and model-based methods, which find a probabilistic representation of the individual sources and employ this prior knowledge for inference. This thesis presents several strategies for the separation of two speech utterances mixed into a single channel and is structured in four parts: The first part reviews factorial models in model-based SCSS and introduces the soft-binary mask for signal reconstruction. This mask shows improved performance compared to the soft and the binary masks in automatic speech recognition ...

Stark, Michael — Graz University of Technology


Some Contributions to Music Signal Processing and to Mono-Microphone Blind Audio Source Separation

For humans, sound is valuable mostly for its meaning. The voice is spoken language; music, artistic intent. Its physiological functioning is highly developed, as well as our understanding of the underlying process. It is a challenge to replicate this analysis using a computer: in many aspects, its capabilities do not match those of human beings when it comes to recognizing speech or musical instruments from sound, to name a few examples. In this thesis, two problems are investigated: source separation and music processing. The first part investigates source separation using only one microphone. The problem of source separation arises when several audio sources are present at the same moment, mixed together and acquired by some sensors (one in our case). In this kind of situation it is natural for a human to separate and to recognize ...

Schutz, Antony — Eurecome/Mobile


Speech Enhancement Using Nonnegative Matrix Factorization and Hidden Markov Models

Reducing interference noise in a noisy speech recording has been a challenging task for many years yet has a variety of applications, for example, in handsfree mobile communications, in speech recognition, and in hearing aids. Traditional single-channel noise reduction schemes, such as Wiener filtering, do not work satisfactorily in the presence of non-stationary background noise. Alternatively, supervised approaches, where the noise type is known in advance, lead to higher-quality enhanced speech signals. This dissertation proposes supervised and unsupervised single-channel noise reduction algorithms. We consider two classes of methods for this purpose: approaches based on nonnegative matrix factorization (NMF) and methods based on hidden Markov models (HMM). The contributions of this dissertation can be divided into three main (overlapping) parts. First, we propose NMF-based enhancement approaches that use temporal dependencies of the speech signals. In a standard NMF, the important temporal ...

Mohammadiha, Nasser — KTH Royal Institute of Technology
