Some Contributions to Machine Learning-based System Identification and Speech Enhancement for Nonlinear Acoustic Echo Control (2024)
Multimedia consumer electronics are nowadays everywhere, from teleconferencing, hands-free communications, and in-car communications to smart TV applications and more. We are living in a world of telecommunication where ideal scenarios for implementing these applications are hard to find. Instead, practical implementations typically bring many problems associated with each real-life scenario. This thesis mainly focuses on two of these problems, namely, acoustic echo and acoustic feedback. On the one hand, acoustic echo cancellation (AEC) is widely used in mobile and hands-free telephony, where the existence of echoes degrades intelligibility and listening comfort. On the other hand, acoustic feedback limits the maximum amplification that can be applied in, e.g., in-car communications or conferencing systems before howling, caused by instability, appears. Even though AEC and acoustic feedback cancellation (AFC) are functional in many applications, there are still open issues. This means that ...
Gil-Cacho, Jose Manuel — KU Leuven
This thesis deals with several open problems in acoustic echo cancellation and acoustic feedback control. Our main goal has been to develop solutions that provide high performance and sound quality, and behave robustly in realistic conditions. This can be achieved by departing from the traditional ad-hoc methods, and instead deriving theoretically well-founded solutions, based on results from parameter estimation and system identification. In the development of these solutions, computational efficiency has consistently been taken into account as a design constraint, in that the complexity increase compared to the state-of-the-art solutions should not exceed 50 % of the original complexity. In the context of acoustic echo cancellation, we have investigated the problems of double-talk robustness, acoustic echo path undermodeling, and poor excitation. The first two problems have been tackled by including adaptive decorrelation filters in the ...
van Waterschoot, Toon — Katholieke Universiteit Leuven
Noise or interference is often assumed to be a random process. Conventional linear filtering, control or prediction techniques are used to cancel or reduce the noise. However, some noise processes have been shown to be nonlinear and deterministic. These nonlinear deterministic noise processes appear to be random when analysed with second-order statistics. As nonlinear processes are widespread in nature, it may be beneficial to exploit the coherence of the nonlinear deterministic noise with nonlinear filtering techniques. The nonlinear deterministic noise processes used in this thesis are generated from nonlinear difference or differential equations which are derived from real-world scenarios. Analysis tools from the theory of nonlinear dynamics are used to determine an appropriate sampling rate of the nonlinear deterministic noise processes and their embedding dimensions. Nonlinear models, such as the Volterra series filter and the radial basis function ...
Strauch, Paul E. — University of Edinburgh
Modern devices such as mobile phones, tablets or smart speakers are commonly equipped with several loudspeakers and microphones. If, for instance, one employs such a device for hands-free communication applications, the signals that are reproduced by the loudspeakers propagate through the room and are inevitably acquired by the microphones. If no processing is applied, the participants in the far-end room receive delayed, reverberated replicas of their own voice, which strongly degrades both speech intelligibility and user comfort. To prevent these so-called acoustic echoes from being transmitted back to the far-end room, acoustic echo cancelers are commonly employed. The latter make use of adaptive filtering techniques to identify the propagation paths between loudspeakers and microphones. The estimated propagation paths are then employed to compute acoustic echo estimates, which are finally subtracted from the signals acquired by the microphones. In ...
Luis Valero, Maria — International Audio Laboratories Erlangen
Recently emerging techniques like wave field synthesis (WFS) or Higher-Order Ambisonics (HOA) allow for high-quality spatial audio reproduction, which makes them candidates for the audio reproduction in future telepresence systems or interactive gaming environments with acoustic human-machine interfaces. In such scenarios, acoustic echo cancellation (AEC) will generally be necessary to remove the loudspeaker echoes in the recorded microphone signals before further processing. Moreover, the reproduction quality of WFS or HOA can be improved by adaptive pre-equalization of the loudspeaker signals, as facilitated by listening room equalization (LRE). However, AEC and LRE require adaptive filters, where the large number of reproduction channels of WFS and HOA implies major computational and algorithmic challenges for the implementation of adaptive filters. A technique called wave-domain adaptive filtering (WDAF) promises to master these challenges. However, the known literature is still far from providing sufficient insight ...
Schneider, Martin — Friedrich-Alexander-University Erlangen-Nuremberg
Efficient parametric modeling, identification and equalization of room acoustics
Room acoustic signal enhancement (RASE) applications, such as digital equalization, acoustic echo and feedback cancellation, which are commonly found in communication devices and audio equipment, aim at processing the acoustic signals with the final goal of improving the perceived sound quality in rooms. In order to do so, signal processing algorithms require the acoustic response of the room to be represented by means of parametric models and to be identified from the input and output signals of the room acoustic system. In particular, a good model should be both accurate, thus capturing those features of room acoustics that are physically and perceptually most relevant, and efficient, so that it can be implemented as a digital filter and used in practical signal processing tasks. This thesis addresses the fundamental question in room acoustic signal processing concerning the appropriateness of different parametric ...
Vairetti, Giacomo — KU Leuven
This thesis presents a new approach to the problem of localizing and tracking multiple acoustic sources using a microphone array. The use of microphone arrays offers enhancements of speech signals recorded in meeting rooms and office spaces. A common solution for speech enhancement in realistic environments with ambient noise and multi-path propagation is the application of so-called beamforming techniques, which enhance signals arriving from the desired angle through constructive interference while attenuating signals coming from other directions through destructive interference. Such beamforming algorithms require the source location as prior knowledge. Therefore, source localization and tracking algorithms are an integral part of such a system. However, the performance of conventional localization algorithms deteriorates in realistic scenarios with multiple concurrent speakers. In contrast to conventional localization algorithms, the localization algorithm presented in this thesis makes use of fundamental frequency or pitch information of speech signals in ...
Habib, Tania — Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria
Robust Equalization of Multichannel Acoustic Systems
In most real-world acoustical scenarios, speech signals captured by distant microphones from a source are reverberated due to multipath propagation, and the reverberation may impair speech intelligibility. Speech dereverberation can be achieved by equalizing the channels from the source to microphones. Equalization systems can be computed using estimates of multichannel acoustic impulse responses. However, the estimates obtained from system identification always include errors; the fact that an equalization system is able to equalize the estimated multichannel acoustic system does not mean that it is able to equalize the true system. The objective of this thesis is to propose and investigate robust equalization methods for multichannel acoustic systems in the presence of system identification errors. Equalization systems can be computed using the multiple-input/output inverse theorem or multichannel least-squares method. However, equalization systems obtained from these methods are very sensitive to system ...
Zhang, Wancheng — Imperial College London
Multi-microphone noise reduction and dereverberation techniques for speech applications
In typical speech communication applications, such as hands-free mobile telephony, voice-controlled systems and hearing aids, the recorded microphone signals are corrupted by background noise, room reverberation and far-end echo signals. This signal degradation can lead to total unintelligibility of the speech signal and decreases the performance of automatic speech recognition systems. In this thesis several multi-microphone noise reduction and dereverberation techniques are developed. In Part I we present a Generalised Singular Value Decomposition (GSVD) based optimal filtering technique for enhancing multi-microphone speech signals which are degraded by additive coloured noise. Several techniques are presented for reducing the computational complexity and we show that the GSVD-based optimal filtering technique can be integrated into a 'Generalised Sidelobe Canceller' type structure. Simulations show that the GSVD-based optimal filtering technique achieves a larger signal-to-noise ratio improvement than standard fixed and adaptive beamforming techniques and ...
Doclo, Simon — Katholieke Universiteit Leuven
Adaptive Noise Cancelation in Speech Signals
Today, adaptive algorithms represent one of the most frequently used computational tools for the processing of digital speech signals. This work investigates and analyzes the properties of adaptive algorithms in speech communication applications where rigorous conditions apply, such as noise and echo cancelation. Like other theses in this field, it tries to tackle the everlasting trade-off between computational complexity and rate of convergence. It introduces some new adaptive methods that stem from the existing algorithms as well as a novel concept which has been entitled Optimal Step-Size (OSS). In the first part of the thesis we investigate some well-known, widely used adaptive techniques such as the Normalized Least Mean Squares (NLMS) and the Recursive Least Squares (RLS). In spite of the fact that the NLMS and the RLS belong to the "simplest" principles, as far as complexity is ...
Malenovsky, Vladimir — Department of Telecommunications, Brno University of Technology, Czech Republic
Non-linear Spatial Filtering for Multi-channel Speech Enhancement
A large part of human speech communication takes place in noisy environments and is supported by technical devices. For example, a hearing-impaired person might use a hearing aid to take part in a conversation in a busy restaurant. These devices, but also telecommunication in noisy environments or voice-controlled assistants, make use of speech enhancement and separation algorithms that improve the quality and intelligibility of speech by separating speakers and suppressing background noise as well as other unwanted effects such as reverberation. If the devices are equipped with more than one microphone, which is very common nowadays, then multi-channel speech enhancement approaches can leverage spatial information in addition to single-channel tempo-spectral information to perform the task. Traditionally, linear spatial filters, so-called beamformers, have been employed to suppress the signal components from directions other than the target direction and thereby enhance the desired ...
Tesch, Kristina — Universität Hamburg
Adaptive filtering techniques for noise reduction and acoustic feedback cancellation in hearing aids
Understanding speech in noise and the occurrence of acoustic feedback are among the major problems of current hearing aid users. Hence, an urgent demand exists for efficient and well-working digital signal processing algorithms that offer a solution to these issues. In this thesis we develop adaptive filtering techniques for noise reduction and acoustic feedback cancellation. Thanks to the availability of low-power digital signal processors, these algorithms can be integrated in a hearing aid. Because of the ongoing miniaturization in the hearing aid industry and the growing tendency towards multi-microphone hearing aids, robustness against imperfections such as microphone mismatch has become a major issue in the design of a noise reduction algorithm. In this thesis we propose multi-microphone noise reduction techniques that are based on multi-channel Wiener filtering (MWF). Theoretical and experimental analyses demonstrate that these MWF-based techniques are less ...
Spriet, Ann — Katholieke Universiteit Leuven
Deep Learning-based Speaker Verification In Real Conditions
Smart applications like speaker verification have become essential for verifying a user's identity, based on the user's voice characteristics, before granting access to personal assistants or online banking services. However, far-field or distant speaker verification is constantly affected by surrounding noises, which can severely distort the speech signal. Moreover, speech signals propagating over long ranges are reflected by various objects in the surrounding area, which creates reverberation and further degrades the signal quality. This PhD thesis explores deep learning-based multichannel speech enhancement techniques to improve the performance of speaker verification systems in real conditions. Multichannel speech enhancement aims to enhance distorted speech using multiple microphones. It has become crucial to many smart devices, which are flexible and convenient for speech applications. Three novel approaches are proposed to improve the robustness of speaker verification systems in noisy and reverberated conditions. Firstly, we integrate ...
Dowerah, Sandipana — Université de Lorraine, CNRS, Inria, Loria
Particle Filters and Markov Chains for Learning of Dynamical Systems
Sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC) methods provide computational tools for systematic inference and learning in complex dynamical systems, such as nonlinear and non-Gaussian state-space models. This thesis builds upon several methodological advances within these classes of Monte Carlo methods. Particular emphasis is placed on the combination of SMC and MCMC in so-called particle MCMC algorithms. These algorithms rely on SMC for generating samples from the often highly autocorrelated state-trajectory. A specific particle MCMC algorithm, referred to as particle Gibbs with ancestor sampling (PGAS), is suggested. By making use of backward sampling ideas, albeit implemented in a forward-only fashion, PGAS enjoys good mixing even when using seemingly few particles in the underlying SMC sampler. This results in a computationally competitive particle MCMC algorithm. As illustrated in this thesis, PGAS is a useful tool for both ...
Lindsten, Fredrik — Linköping University
Adaptive filtering algorithms for acoustic echo and noise cancellation
In this thesis, we develop a number of algorithms for acoustic echo and noise cancellation. We derive a fast exact implementation of the affine projection algorithm (APA), and we also show that, when strong regularization is used, the existing (approximating) fast techniques exhibit problems. We develop a number of algorithms for noise cancellation based on optimal filtering techniques for multi-microphone systems. By using QR-decomposition based techniques, a complexity reduction by a factor of 50 to 100 is achieved compared to existing implementations. Finally, we show that instead of using a cascade of a noise-cancellation system and an echo-cancellation system, it is better to solve the combined problem as a global optimization problem. The aforementioned noise reduction techniques can be used to solve this optimization problem.
Rombouts, Geert — Katholieke Universiteit Leuven