Multi-microphone noise reduction and dereverberation techniques for speech applications

In typical speech communication applications, such as hands-free mobile telephony, voice-controlled systems and hearing aids, the recorded microphone signals are corrupted by background noise, room reverberation and far-end echo signals. This signal degradation can lead to total unintelligibility of the speech signal and decreases the performance of automatic speech recognition systems. In this thesis several multi-microphone noise reduction and dereverberation techniques are developed. In Part I we present a Generalised Singular Value Decomposition (GSVD) based optimal filtering technique for enhancing multi-microphone speech signals which are degraded by additive coloured noise. Several techniques are presented for reducing the computational complexity and we show that the GSVD-based optimal filtering technique can be integrated into a `Generalised Sidelobe Canceller' type structure. Simulations show that the GSVD-based optimal filtering technique achieves a larger signal-to-noise ratio improvement than standard fixed and adaptive beamforming techniques and ...

Doclo, Simon — Katholieke Universiteit Leuven

Contributions to Wideband Hands-free Systems and their Evaluation

This work deals with the advancement of wideband hands-free systems (HFS’s) for mono- and stereophonic cases of application. Furthermore, innovative contributions to the corr. field of quality evaluation are made. The proposed HFS approaches are based on frequency-domain adaptive filtering for system identification, making use of Kalman theory and state-space modeling. Functional enhancement modules are developed in this work, which improve one or more of key quality aspects, aiming at not to harm others. In so doing, these modules can be combined in a flexible way, dependent on the needs at hand. The enhanced monophonic HFS is evaluated according to automotive ITU-T recommendations, to prove its customized efficacy. Furthermore, a novel methodology and techn. framework are introduced in this work to improve the prototyping and evaluation process of automotive HF and in-car-communication (ICC) systems. The monophonic HFS in several configurations ...

Jung, Marc-André — Technische Universität Braunschweig

Acoustic echo reduction for multiple loudspeakers and microphones: Complexity reduction and convergence enhancement

Modern devices such as mobile phones, tablets or smart speakers are commonly equipped with several loudspeakers and microphones. If, for instance, one employs such a device for hands-free communication applications, the signals that are reproduced by the loudspeakers are propagated through the room and are inevitably acquired by the microphones. If no processing is applied, the participants in the far-end room receive delayed reverberated replicas of their own voice, which strongly degrades both speech intelligibility and user comfort. In order to prevent that so-called acoustic echoes are transmitted back to the far-end room, acoustic echo cancelers are commonly employed. The latter make use of adaptive filtering techniques to identify the propagation paths between loudspeakers and microphones. The estimated propagation paths are then employed to compute acoustic echo estimates, which are finally subtracted from the signals acquired by the microphones. In ...

Luis Valero, Maria — International Audio Laboratories Erlangen

Speech Enhancement Using Data-Driven Concepts

Speech communication frequently suffers from transmitted background noises. Numerous speech enhancement algorithms have thus been proposed to obtain a speech signal with a reduced amount of background noise and better speech quality. In most cases they are analytically derived as spectral weighting rules for given error criteria along with statistical models of the speech and noise spectra. However, as these spectral distributions are indeed not easy to be measured and modeled, such algorithms achieve in practice only a suboptimal performance. In the development of state-of-the-art algorithms, speech and noise training data is commonly exploited for the statistical modeling of the respective spectral distributions. In this thesis, the training data is directly applied to train data-driven speech enhancement algorithms, avoiding any modeling of the spectral distributions. Two applications are proposed: (1) A set of spectral weighting rules is trained from noise ...

Suhadi — Technische Universität Braunschweig

Speech derereverberation in noisy environments using time-frequency domain signal models

Reverberation is the sum of reflected sound waves and is present in any conventional room. Speech communication devices such as mobile phones in hands-free mode, tablets, smart TVs, teleconferencing systems, hearing aids, voice-controlled systems, etc. use one or more microphones to pick up the desired speech signals. When the microphones are not in the proximity of the desired source, strong reverberation and noise can degrade the signal quality at the microphones and can impair the intelligibility and the performance of automatic speech recognizers. Therefore, it is a highly demanded task to process the microphone signals such that reverberation and noise are reduced. The process of reducing or removing reverberation from recorded signals is called dereverberation. As dereverberation is usually a completely blind problem, where the only available information are the microphone signals, and as the acoustic scenario can be non-stationary, ...

Braun, Sebastian — Friedrich-Alexander Universität Erlangen-Nürnberg

Adaptive filtering algorithms for acoustic echo cancellation and acoustic feedback control in speech communication applications

Multimedia consumer electronics are nowadays everywhere from teleconferencing, hands-free communications, in-car communications to smart TV applications and more. We are living in a world of telecommunication where ideal scenarios for implementing these applications are hard to find. Instead, practical implementations typically bring many problems associated to each real-life scenario. This thesis mainly focuses on two of these problems, namely, acoustic echo and acoustic feedback. On the one hand, acoustic echo cancellation (AEC) is widely used in mobile and hands-free telephony where the existence of echoes degrades the intelligibility and listening comfort. On the other hand, acoustic feedback limits the maximum amplification that can be applied in, e.g., in-car communications or in conferencing systems, before howling due to instability, appears. Even though AEC and acoustic feedback cancellation (AFC) are functional in many applications, there are still open issues. This means that ...

Gil-Cacho, Jose Manuel — KU Leuven

Informed spatial filters for speech enhancement

In modern devices which provide hands-free speech capturing functionality, such as hands-free communication kits and voice-controlled devices, the received speech signal at the microphones is corrupted by background noise, interfering speech signals, and room reverberation. In many practical situations, the microphones are not necessarily located near the desired source, and hence, the ratio of the desired speech power to the power of the background noise, the interfering speech, and the reverberation at the microphones can be very low, often around or even below 0 dB. In such situations, the comfort of human-to-human communication, as well as the accuracy of automatic speech recognisers for voice-controlled applications can be signi cantly degraded. Therefore, e ffective speech enhancement algorithms are required to process the microphone signals before transmitting them to the far-end side for communication, or before feeding them into a speech recognition ...

Taseska, Maja — Friedrich-Alexander Universität Erlangen-Nürnberg

Domain-informed signal processing with application to analysis of human brain functional MRI data

Standard signal processing techniques are implicitly based on the assumption that the signal lies on a regular, homogeneous domain. In practice, however, many signals lie on an irregular or inhomogeneous domain. An application area where data are naturally defined on an irregular or inhomogeneous domain is human brain neuroimaging. The goal in neuroimaging is to map the structure and function of the brain using imaging techniques. In particular, functional magnetic resonance imaging (fMRI) is a technique that is conventionally used in non-invasive probing of human brain function. This doctoral dissertation deals with the development of signal processing schemes that adapt to the domain of the signal. It consists of four papers that in different ways deal with exploiting knowledge of the signal domain to enhance the processing of signals. In each paper, special focus is given to the analysis of ...

Behjat, Hamid — Lund University

A multimicrophone approach to speech processing in a smart-room environment

Recent advances in computer technology and speech and language processing have made possible that some new ways of person-machine communication and computer assistance to human activities start to appear feasible. Concretely, the interest on the development of new challenging applications in indoor environments equipped with multiple multimodal sensors, also known as smart-rooms, has considerably grown. In general, it is well-known that the quality of speech signals captured by microphones that can be located several meters away from the speakers is severely distorted by acoustic noise and room reverberation. In the context of the development of hands-free speech applications in smart-room environments, the use of obtrusive sensors like close-talking microphones is usually not allowed, and consequently, speech technologies must operate on the basis of distant-talking recordings. In such conditions, speech technologies that usually perform reasonably well in free of noise and ...

Abad, Alberto — Universitat Politecnica de Catalunya

Adaptive interference suppression algorithms for DS-UWB systems

In multiuser ultra-wideband (UWB) systems, a large number of multipath components (MPCs) are introduced by the channel. One of the main challenges for the receiver is to effectively suppress the interference with affordable complexity. In this thesis, we focus on the linear adaptive interference suppression algorithms for the direct-sequence ultrawideband (DS-UWB) systems in both time-domain and frequency-domain. In the time-domain, symbol by symbol transmission multiuser DS-UWB systems are considered. We first investigate a generic reduced-rank scheme based on the concept of joint and iterative optimization (JIO) that jointly optimizes a projection vector and a reduced-rank filter by using the minimum mean-squared error (MMSE) criterion. A low-complexity scheme, named Switched Approximations of Adaptive Basis Functions (SAABF), is proposed as a modification of the generic scheme, in which the complexity reduction is achieved by using a multi-branch framework to simplify the structure ...

Sheng Li — University of York

Filter Bank Techniques for the Physical Layer in Wireless Communications

Filter bank based multicarrier is an evolution with many advantages over the widespread OFDM multicarrier scheme. The author of the thesis stands behind this statement and proposes various solutions for practical physical layer problems based on filter bank processing of wireless communications signals. Filter banks are an evolved form of subband processing, harnessing the key advantages of original efficient subband processing based on the fast Fourier transforms and addressing some of its shortcomings, at the price of a somewhat increased implementation complexity. The main asset of the filter banks is the possibility to design very frequency selective subband filters to compartmentalize the overall spectrum into well isolated subbands, while still making very efficient use of the assigned bandwidth. This thesis first exploits this main feature of the filter banks in the subband system configuration, in which the analysis filter bank ...

Hidalgo Stitz, Tobias — Tampere University of Technology

Speech Enhancement Algorithms for Audiological Applications

The improvement of speech intelligibility is a traditional problem which still remains open and unsolved. The recent boom of applications such as hands-free communi- cations or automatic speech recognition systems and the ever-increasing demands of the hearing-impaired community have given a definitive impulse to the research in this area. This PhD thesis is focused on speech enhancement for audiological applications. Most of the research conducted in this thesis has been focused on the improvement of speech intelligibility in hearing aids, considering the variety of restrictions and limitations imposed by this type of devices. The combination of source separation techniques and spatial filtering with machine learning and evolutionary computation has originated novel and interesting algorithms which are included in this thesis. The thesis is divided in two main parts. The first one contains a preliminary study of the problem and a ...

Ayllón, David — Universidad de Alcalá

Spherical Microphone Array Processing for Acoustic Parameter Estimation and Signal Enhancement

In many distant speech acquisition scenarios, such as hands-free telephony or teleconferencing, the desired speech signal is corrupted by noise and reverberation. This degrades both the speech quality and intelligibility, making communication difficult or even impossible. Speech enhancement techniques seek to mitigate these effects and extract the desired speech signal. This objective is commonly achieved through the use of microphone arrays, which take advantage of the spatial properties of the sound field in order to reduce noise and reverberation. Spherical microphone arrays, where the microphones are arranged in a spherical configuration, usually mounted on a rigid baffle, are able to analyze the sound field in three dimensions; the captured sound field can then be efficiently described in the spherical harmonic domain (SHD). In this thesis, a number of novel spherical array processing algorithms are proposed, based in the SHD. In ...

Jarrett, Daniel P. — Imperial College London

Sparse Multi-Channel Linear Prediction for Blind Speech Dereverberation

In many speech communication applications, such as hands-free telephony and hearing aids, the microphones are located at a distance from the speaker. Therefore, in addition to the desired speech signal, the microphone signals typically contain undesired reverberation and noise, caused by acoustic reflections and undesired sound sources. Since these disturbances tend to degrade the quality of speech communication, decrease speech intelligibility and negatively affect speech recognition, efficient dereverberation and denoising methods are required. This thesis deals with blind dereverberation methods, not requiring any knowledge about the room impulse responses between the speaker and the microphones. More specifically, we propose a general framework for blind speech dereverberation based on multi-channel linear prediction (MCLP) and exploiting sparsity of the speech signal in the time-frequency domain.

Jukić, Ante — University of Oldenburg

Post-Filter Optimization for Multichannel Automotive Speech Enhancement

In an automotive environment, quality of speech communication using a hands-free equipment is often deteriorated by interfering car noise. In order to preserve the speech signal without car noise, a multichannel speech enhancement system including a beamformer and a post-filter can be applied. Since employing a beamformer alone is insufficient to substantially reducing the level of car noise, a post-filter has to be applied to provide further noise reduction, especially at low frequencies. In this thesis, two novel post-filter designs along with their optimization for different driving conditions are presented. The first post-filter design utilizes an adaptive smoothing factor for the power spectral density estimation as well as a hybrid noise coherence function. The hybrid noise coherence function is a mixture of the diffuse and the measured noise coherence functions for a specific driving condition. The second post-filter design applies ...

Yu, Huajun — Technische Universität Braunschweig

The current layout is optimized for mobile phones. Page previews, thumbnails, and full abstracts will remain hidden until the browser window grows in width.

The current layout is optimized for tablet devices. Page previews and some thumbnails will remain hidden until the browser window grows in width.