Adaptive Algorithms for Intelligent Acoustic Interfaces (2012)
Modern devices such as mobile phones, tablets or smart speakers are commonly equipped with several loudspeakers and microphones. If, for instance, one employs such a device for hands-free communication applications, the signals that are reproduced by the loudspeakers are propagated through the room and are inevitably acquired by the microphones. If no processing is applied, the participants in the far-end room receive delayed reverberated replicas of their own voice, which strongly degrades both speech intelligibility and user comfort. In order to prevent that so-called acoustic echoes are transmitted back to the far-end room, acoustic echo cancelers are commonly employed. The latter make use of adaptive filtering techniques to identify the propagation paths between loudspeakers and microphones. The estimated propagation paths are then employed to compute acoustic echo estimates, which are finally subtracted from the signals acquired by the microphones. In ...
Luis Valero, Maria — International Audio Laboratories Erlangen
Non-linear Spatial Filtering for Multi-channel Speech Enhancement
A large part of human speech communication takes place in noisy environments and is supported by technical devices. For example, a hearing-impaired person might use a hearing aid to take part in a conversation in a busy restaurant. These devices, but also telecommunication in noisy environments or voiced-controlled assistants, make use of speech enhancement and separation algorithms that improve the quality and intelligibility of speech by separating speakers and suppressing background noise as well as other unwanted effects such as reverberation. If the devices are equipped with more than one microphone, which is very common nowadays, then multi-channel speech enhancement approaches can leverage spatial information in addition to single-channel tempo-spectral information to perform the task. Traditionally, linear spatial filters, so-called beamformers, have been employed to suppress the signal components from other than the target direction and thereby enhance the desired ...
Tesch, Kristina — Universität Hamburg
Given the widespread use of miniaturized audio interfaces, echo control systems are faced with increasing challenges to address a large variety of acoustic conditions observed by such interfaces. This motivates the use of sophisticated machine learning-based techniques to overcome the limitations of conventional methods. The contributions in this thesis can be outlined by decomposing the task of nonlinear acoustic echo control into two subtasks: Nonlinear Acoustic Echo Cancellation (NAEC) and Acoustic Echo Suppression (AES). In particular, by formulating the single-channel NAEC model-adaptation task as a Bayesian recursive filtering problem, an evolutionary resampling strategy for particle filtering is proposed. The resulting Elitist Resampling Particle Filter (ERPF) is shown experimentally to be an efficient and high-performing approach that can be extended to address challenging conditions such as non-stationary interferers. The fundamental problem of nonlinear model design is addressed by proposing a novel ...
Halimeh, Mhd Modar — Friedrich-Alexander-Universität Erlangen-Nürnberg
Identifying the target speaker in hearing aid applications is an essential ingredient to improve speech intelligibility. Although several speech enhancement algorithms are available to reduce background noise or to perform source separation in multi-speaker scenarios, their performance depends on correctly identifying the target speaker to be enhanced. Recent advances in electroencephalography (EEG) have shown that it is possible to identify the target speaker which the listener is attending to using single-trial EEG-based auditory attention decoding (AAD) methods. However, in realistic acoustic environments the AAD performance is influenced by undesired disturbances such as interfering speakers, noise and reverberation. In addition, it is important for real-world hearing aid applications to close the AAD loop by presenting on-line auditory feedback. This thesis deals with the problem of identifying and enhancing the target speaker in realistic acoustic environments based on decoding the auditory attention ...
Aroudi, Ali — University of Oldenburg, Germany
Multimedia consumer electronics are nowadays everywhere from teleconferencing, hands-free communications, in-car communications to smart TV applications and more. We are living in a world of telecommunication where ideal scenarios for implementing these applications are hard to find. Instead, practical implementations typically bring many problems associated to each real-life scenario. This thesis mainly focuses on two of these problems, namely, acoustic echo and acoustic feedback. On the one hand, acoustic echo cancellation (AEC) is widely used in mobile and hands-free telephony where the existence of echoes degrades the intelligibility and listening comfort. On the other hand, acoustic feedback limits the maximum amplification that can be applied in, e.g., in-car communications or in conferencing systems, before howling due to instability, appears. Even though AEC and acoustic feedback cancellation (AFC) are functional in many applications, there are still open issues. This means that ...
Gil-Cacho, Jose Manuel — KU Leuven
Informed spatial filters for speech enhancement
In modern devices which provide hands-free speech capturing functionality, such as hands-free communication kits and voice-controlled devices, the received speech signal at the microphones is corrupted by background noise, interfering speech signals, and room reverberation. In many practical situations, the microphones are not necessarily located near the desired source, and hence, the ratio of the desired speech power to the power of the background noise, the interfering speech, and the reverberation at the microphones can be very low, often around or even below 0 dB. In such situations, the comfort of human-to-human communication, as well as the accuracy of automatic speech recognisers for voice-controlled applications can be signi cantly degraded. Therefore, e ffective speech enhancement algorithms are required to process the microphone signals before transmitting them to the far-end side for communication, or before feeding them into a speech recognition ...
Taseska, Maja — Friedrich-Alexander Universität Erlangen-Nürnberg
Due to their decreased ability to understand speech hearing impaired may have difficulties to interact in social groups, especially when several people are talking simultaneously. Fortunately, in the last decades hearing aids have evolved from simple sound amplifiers to modern digital devices with complex functionalities including noise reduction algorithms, which are crucial to improve speech understanding in background noise for hearing-impaired persons. Since many hearing aid users are fitted with two hearing aids, so-called binaural hearing aids have been developed, which exchange data and signals through a wireless link such that the processing in both hearing aids can be synchronized. In addition to reducing noise and limiting speech distortion, another important objective of noise reduction algorithms in binaural hearing aids is the preservation of the listener’s impression of the acoustical scene, in order to exploit the binaural hearing advantage and ...
Marquardt, Daniel — University of Oldenburg, Germany
Speech derereverberation in noisy environments using time-frequency domain signal models
Reverberation is the sum of reflected sound waves and is present in any conventional room. Speech communication devices such as mobile phones in hands-free mode, tablets, smart TVs, teleconferencing systems, hearing aids, voice-controlled systems, etc. use one or more microphones to pick up the desired speech signals. When the microphones are not in the proximity of the desired source, strong reverberation and noise can degrade the signal quality at the microphones and can impair the intelligibility and the performance of automatic speech recognizers. Therefore, it is a highly demanded task to process the microphone signals such that reverberation and noise are reduced. The process of reducing or removing reverberation from recorded signals is called dereverberation. As dereverberation is usually a completely blind problem, where the only available information are the microphone signals, and as the acoustic scenario can be non-stationary, ...
Braun, Sebastian — Friedrich-Alexander Universität Erlangen-Nürnberg
Spherical Microphone Array Processing for Acoustic Parameter Estimation and Signal Enhancement
In many distant speech acquisition scenarios, such as hands-free telephony or teleconferencing, the desired speech signal is corrupted by noise and reverberation. This degrades both the speech quality and intelligibility, making communication difficult or even impossible. Speech enhancement techniques seek to mitigate these effects and extract the desired speech signal. This objective is commonly achieved through the use of microphone arrays, which take advantage of the spatial properties of the sound field in order to reduce noise and reverberation. Spherical microphone arrays, where the microphones are arranged in a spherical configuration, usually mounted on a rigid baffle, are able to analyze the sound field in three dimensions; the captured sound field can then be efficiently described in the spherical harmonic domain (SHD). In this thesis, a number of novel spherical array processing algorithms are proposed, based in the SHD. In ...
Jarrett, Daniel P. — Imperial College London
Prediction and Optimization of Speech Intelligibility in Adverse Conditions
In digital speech-communication systems like mobile phones, public address systems and hearing aids, conveying the message is one of the most important goals. This can be challenging since the intelligibility of the speech may be harmed at various stages before, during and after the transmission process from sender to receiver. Causes which create such adverse conditions include background noise, an unreliable internet connection during a Skype conversation or a hearing impairment of the receiver. To overcome this, many speech-communication systems include speech processing algorithms to compensate for these signal degradations like noise reduction. To determine the effect on speech intelligibility of these signal processing based solutions, the speech signal has to be evaluated by means of a listening test with human listeners. However, such tests are costly and time consuming. As an alternative, reliable and fast machine-driven intelligibility predictors are ...
Taal, Cees — Delft University of Technology
Spatio-Temporal Speech Enhancement in Adverse Acoustic Conditions
Never before has speech been captured as often by electronic devices equipped with one or multiple microphones, serving a variety of applications. It is the key aspect in digital telephony, hearing devices, and voice-driven human-to-machine interaction. When speech is recorded, the microphones also capture a variety of further, undesired sound components due to adverse acoustic conditions. Interfering speech, background noise and reverberation, i.e. the persistence of sound in a room after excitation caused by a multitude of reflections on the room enclosure, are detrimental to the quality and intelligibility of target speech as well as the performance of automatic speech recognition. Hence, speech enhancement aiming at estimating the early target-speech component, which contains the direct component and early reflections, is crucial to nearly all speech-related applications presently available. In this thesis, we compare, propose and evaluate existing and novel approaches ...
Dietzen, Thomas — KU Leuven
Non-intrusive Quality Evaluation of Speech Processed in Noisy and Reverberant Environments
In many speech applications such as hands-free telephony or voice-controlled home assistants, the distance between the user and the recording microphones can be relatively large. In such a far-field scenario, the recorded microphone signals are typically corrupted by noise and reverberation, which may severely degrade the performance of speech recognition systems and reduce intelligibility and quality of speech in communication applications. In order to limit these effects, speech enhancement algorithms are typically applied. The main objective of this thesis is to develop novel speech enhancement algorithms for noisy and reverberant environments and signal-based measures to evaluate these algorithms, focusing on solutions that are applicable in realistic scenarios. First, we propose a single-channel speech enhancement algorithm for joint noise and reverberation reduction. The proposed algorithm uses a spectral gain to enhance the input signal, where the gain is computed using a ...
Cauchi, Benjamin — University of Oldenburg
Efficient parametric modeling, identification and equalization of room acoustics
Room acoustic signal enhancement (RASE) applications, such as digital equalization, acoustic echo and feedback cancellation, which are commonly found in communication devices and audio equipment, aim at processing the acoustic signals with the final goal of improving the perceived sound quality in rooms. In order to do so, signal processing algorithms require the acoustic response of the room to be represented by means of parametric models and to be identified from the input and output signals of the room acoustic system. In particular, a good model should be both accurate, thus capturing those features of room acoustics that are physically and perceptually most relevant, and efficient, so that it can be implemented as a digital filter and used in practical signal processing tasks. This thesis addresses the fundamental question in room acoustic signal processing concerning the appropriateness of different parametric ...
Vairetti, Giacomo — KU Leuven
A multimicrophone approach to speech processing in a smart-room environment
Recent advances in computer technology and speech and language processing have made possible that some new ways of person-machine communication and computer assistance to human activities start to appear feasible. Concretely, the interest on the development of new challenging applications in indoor environments equipped with multiple multimodal sensors, also known as smart-rooms, has considerably grown. In general, it is well-known that the quality of speech signals captured by microphones that can be located several meters away from the speakers is severely distorted by acoustic noise and room reverberation. In the context of the development of hands-free speech applications in smart-room environments, the use of obtrusive sensors like close-talking microphones is usually not allowed, and consequently, speech technologies must operate on the basis of distant-talking recordings. In such conditions, speech technologies that usually perform reasonably well in free of noise and ...
Abad, Alberto — Universitat Politecnica de Catalunya
Embedded Optimization Algorithms for Perceptual Enhancement of Audio Signals
This thesis investigates the design and evaluation of an embedded optimization framework for the perceptual enhancement of audio signals which are degraded by linear and/or nonlinear distortion. In general, audio signal enhancement has the goal to improve the perceived audio quality, speech intelligibility, or another desired perceptual attribute of the distorted audio signal by applying a real-time digital signal processing algorithm. In the designed embedded optimization framework, the audio signal enhancement problem under consideration is formulated and solved as a per-frame numerical optimization problem, allowing to compute the enhanced audio signal frame that is optimal according to a desired perceptual attribute. The first stage of the embedded optimization framework consists in the formulation of the per-frame optimization problem aimed at maximally enhancing the desired perceptual attribute, by explicitly incorporating a suitable model of human sound perception. The second stage of ...
Defraene, Bruno — KU Leuven
The current layout is optimized for mobile phones. Page previews, thumbnails, and full abstracts will remain hidden until the browser window grows in width.
The current layout is optimized for tablet devices. Page previews and some thumbnails will remain hidden until the browser window grows in width.