Multi-microphone speech enhancement: An integration of a priori and data-dependent spatial information

A speech signal captured by multiple microphones is often subject to a reduced intelligibility and quality due to the presence of noise and room acoustic interferences. Multi-microphone speech enhancement systems therefore aim at the suppression or cancellation of such undesired signals without substantial distortion of the speech signal. A fundamental aspect to the design of several multi-microphone speech enhancement systems is that of the spatial information which relates each microphone signal to the desired speech source. This spatial information is unknown in practice and has to be somehow estimated. Under certain conditions, however, the estimated spatial information can be inaccurate, which subsequently degrades the performance of a multi-microphone speech enhancement system. This doctoral dissertation is focused on the development and evaluation of acoustic signal processing algorithms in order to address this issue. Specifically, as opposed to conventional means of estimating ...

Ali, Randall — KU Leuven

Development and evaluation of psychoacoustically motivated binaural noise reduction and cue preservation techniques

Due to their decreased ability to understand speech hearing impaired may have difficulties to interact in social groups, especially when several people are talking simultaneously. Fortunately, in the last decades hearing aids have evolved from simple sound amplifiers to modern digital devices with complex functionalities including noise reduction algorithms, which are crucial to improve speech understanding in background noise for hearing-impaired persons. Since many hearing aid users are fitted with two hearing aids, so-called binaural hearing aids have been developed, which exchange data and signals through a wireless link such that the processing in both hearing aids can be synchronized. In addition to reducing noise and limiting speech distortion, another important objective of noise reduction algorithms in binaural hearing aids is the preservation of the listener’s impression of the acoustical scene, in order to exploit the binaural hearing advantage and ...

Marquardt, Daniel — University of Oldenburg, Germany

Design and evaluation of noise reduction techniques for binaural hearing aids

One of the main complaints of hearing aid users is their degraded speech understanding in noisy environments. Modern hearing aids therefore include noise reduction techniques. These techniques are typically designed for a monaural application, i.e. in a single device. However, the majority of hearing aid users currently have hearing aids at both ears in a so-called bilateral fitting, as it is widely accepted that this leads to a better speech understanding and user satisfaction. Unfortunately, the independent signal processing (in particular the noise reduction) in a bilateral fitting can destroy the so-called binaural cues, namely the interaural time and level differences (ITDs and ILDs) which are used to localize sound sources in the horizontal plane. A recent technological advance are so-called binaural hearing aids, where a wireless link allows for the exchange of data (or even microphone signals) between the ...

Cornelis, Bram — KU Leuven

Informed spatial filters for speech enhancement

In modern devices which provide hands-free speech capturing functionality, such as hands-free communication kits and voice-controlled devices, the received speech signal at the microphones is corrupted by background noise, interfering speech signals, and room reverberation. In many practical situations, the microphones are not necessarily located near the desired source, and hence, the ratio of the desired speech power to the power of the background noise, the interfering speech, and the reverberation at the microphones can be very low, often around or even below 0 dB. In such situations, the comfort of human-to-human communication, as well as the accuracy of automatic speech recognisers for voice-controlled applications can be signi cantly degraded. Therefore, e ffective speech enhancement algorithms are required to process the microphone signals before transmitting them to the far-end side for communication, or before feeding them into a speech recognition ...

Taseska, Maja — Friedrich-Alexander Universität Erlangen-Nürnberg

Integrating monaural and binaural cues for sound localization and segregation in reverberant environments

The problem of segregating a sound source of interest from an acoustic background has been extensively studied due to applications in hearing prostheses, robust speech/speaker recognition and audio information retrieval. Computational auditory scene analysis (CASA) approaches the segregation problem by utilizing grouping cues involved in the perceptual organization of sound by human listeners. Binaural processing, where input signals resemble those that enter the two ears, is of particular interest in the CASA field. The dominant approach to binaural segregation has been to derive spatially selective filters in order to enhance the signal in a direction of interest. As such, the problems of sound localization and sound segregation are closely tied. While spatial filtering has been widely utilized, substantial performance degradation is incurred in reverberant environments and more fundamentally, segregation cannot be performed without sufficient spatial separation between sources. This dissertation ...

Woodruff, John — The Ohio State University

A multimicrophone approach to speech processing in a smart-room environment

Recent advances in computer technology and speech and language processing have made possible that some new ways of person-machine communication and computer assistance to human activities start to appear feasible. Concretely, the interest on the development of new challenging applications in indoor environments equipped with multiple multimodal sensors, also known as smart-rooms, has considerably grown. In general, it is well-known that the quality of speech signals captured by microphones that can be located several meters away from the speakers is severely distorted by acoustic noise and room reverberation. In the context of the development of hands-free speech applications in smart-room environments, the use of obtrusive sensors like close-talking microphones is usually not allowed, and consequently, speech technologies must operate on the basis of distant-talking recordings. In such conditions, speech technologies that usually perform reasonably well in free of noise and ...

Abad, Alberto — Universitat Politecnica de Catalunya

Preserving binaural cues in noise reduction algorithms for hearing aids

Hearing aid users experience great difficulty in understanding speech in noisy environments. This has led to the introduction of noise reduction algorithms in hearing aids. The development of these algorithms is typically done monaurally. However, the human auditory system is a binaural system, which compares and combines the signals received by both ears to perceive a sound source as a single entity in space. Providing two monaural, independently operating, noise reduction systems, i.e. a bilateral configuration, to the hearing aid user may disrupt binaural information, needed to localize sound sources correctly and to improve speech perception in noise. In this research project, we first examined the influence of commercially available, bilateral, noise reduction algorithms on binaural hearing. Extensive objective and perceptual evaluations showed that the bilateral adaptive directional microphone (ADM) and the bilateral fixed directional microphone, two of the most ...

Van den Bogaert, Tim — Katholieke Universiteit Leuven

Robust Direction-of-Arrival estimation and spatial filtering in noisy and reverberant environments

The advent of multi-microphone setups on a plethora of commercial devices in recent years has generated a newfound interest in the development of robust microphone array signal processing methods. These methods are generally used to either estimate parameters associated with acoustic scene or to extract signal(s) of interest. In most practical scenarios, the sources are located in the far-field of a microphone array where the main spatial information of interest is the direction-of-arrival (DOA) of the plane waves originating from the source positions. The focus of this thesis is to incorporate robustness against either lack of or imperfect/erroneous information regarding the DOAs of the sound sources within a microphone array signal processing framework. The DOAs of sound sources is by itself important information, however, it is most often used as a parameter for a subsequent processing method. One of the ...

Chakrabarty, Soumitro — Friedrich-Alexander Universität Erlangen-Nürnberg

Multi-microphone noise reduction and dereverberation techniques for speech applications

In typical speech communication applications, such as hands-free mobile telephony, voice-controlled systems and hearing aids, the recorded microphone signals are corrupted by background noise, room reverberation and far-end echo signals. This signal degradation can lead to total unintelligibility of the speech signal and decreases the performance of automatic speech recognition systems. In this thesis several multi-microphone noise reduction and dereverberation techniques are developed. In Part I we present a Generalised Singular Value Decomposition (GSVD) based optimal filtering technique for enhancing multi-microphone speech signals which are degraded by additive coloured noise. Several techniques are presented for reducing the computational complexity and we show that the GSVD-based optimal filtering technique can be integrated into a `Generalised Sidelobe Canceller' type structure. Simulations show that the GSVD-based optimal filtering technique achieves a larger signal-to-noise ratio improvement than standard fixed and adaptive beamforming techniques and ...

Doclo, Simon — Katholieke Universiteit Leuven

Speech derereverberation in noisy environments using time-frequency domain signal models

Reverberation is the sum of reflected sound waves and is present in any conventional room. Speech communication devices such as mobile phones in hands-free mode, tablets, smart TVs, teleconferencing systems, hearing aids, voice-controlled systems, etc. use one or more microphones to pick up the desired speech signals. When the microphones are not in the proximity of the desired source, strong reverberation and noise can degrade the signal quality at the microphones and can impair the intelligibility and the performance of automatic speech recognizers. Therefore, it is a highly demanded task to process the microphone signals such that reverberation and noise are reduced. The process of reducing or removing reverberation from recorded signals is called dereverberation. As dereverberation is usually a completely blind problem, where the only available information are the microphone signals, and as the acoustic scenario can be non-stationary, ...

Braun, Sebastian — Friedrich-Alexander Universität Erlangen-Nürnberg

Distributed Signal Processing Algorithms for Acoustic Sensor Networks

In recent years, there has been a proliferation of wireless devices for individual use to the point of being ubiquitous. Recent trends have been incorporating many of these devices (or nodes) together, which acquire signals and work in unison over wireless channels, in order to accomplish a predefined task. This type of cooperative sensing and communication between devices form the basis of a so-called wireless sensor network (WSN). Due to the ever increasing processing power of these nodes, WSNs are being assigned more complicated and computationally demanding tasks. Recent research has started to exploit this increased processing power in order for the WSNs to perform tasks pertaining to audio signal acquisition and processing forming so-called wireless acoustic sensor networks (WASNs). Audio signal processing poses new and unique problems when compared to traditional sensing applications as the signals observed often have ...

Szurley, Joseph C. — KU Leuven

Synthetic reproduction of head-related transfer functions by using microphone arrays

Spatial hearing for human listeners is based on the interaural as well as on the monaural analysis of the signals arriving at both ears, enabling the listeners to assign certain spatial components to these signals. This spatial aspect gets lost when the signals are reproduced via headphones without considering the acoustical influence of the head and torso, i.e. head-related transfer function (HRTFs). A common procedure to take into account spatial aspects in a binaural reproduction is to use so-called artificial heads. Artificial heads are replicas of a human head and torso with average anthropometric geometries and built-in microphones in the ears. Although, the signals recorded with artificial heads contain relevant spatial aspects, binaural recordings using artificial heads often suffer from front-back confusions and the perception of the sound source being inside the head (internalization). These shortcomings can be attributed to ...

Rasumow, Eugen — University of Oldenburg

Sparse Multi-Channel Linear Prediction for Blind Speech Dereverberation

In many speech communication applications, such as hands-free telephony and hearing aids, the microphones are located at a distance from the speaker. Therefore, in addition to the desired speech signal, the microphone signals typically contain undesired reverberation and noise, caused by acoustic reflections and undesired sound sources. Since these disturbances tend to degrade the quality of speech communication, decrease speech intelligibility and negatively affect speech recognition, efficient dereverberation and denoising methods are required. This thesis deals with blind dereverberation methods, not requiring any knowledge about the room impulse responses between the speaker and the microphones. More specifically, we propose a general framework for blind speech dereverberation based on multi-channel linear prediction (MCLP) and exploiting sparsity of the speech signal in the time-frequency domain.

Jukić, Ante — University of Oldenburg

Digital signal processing algorithms for noise reduction, dynamic range compression, and feedback cancellation in hearing aids

Hearing loss can be caused by many factors, e.g., daily exposure to excessive noise in the work environment and listening to loud music. Another important reason can be age-related, i.e., the slow loss of hearing that occurs as people get older. In general hearing impaired people suffer from a frequency-dependent hearing loss and from a reduced dynamic range between the hearing threshold and the uncomfortable level. This means that the uncomfortable level for normal hearing and hearing impaired people suffering from so called sensorineural hearing loss remains the same but the hearing threshold and the sensitivity to soft sounds are shifted as a result of the hearing loss. To compensate for this kind of hearing loss the hearing aid should include a frequency-dependent and a level-dependent gain. The corresponding digital signal processing (DSP) algorithm is referred to as dynamic range ...

Ngo, Kim — KU Leuven

Mixed structural models for 3D audio in virtual environments

In the world of Information and communications technology (ICT), strategies for innovation and development are increasingly focusing on applications that require spatial representation and real-time interaction with and within 3D-media environments. One of the major challenges that such applications have to address is user-centricity, reflecting e.g. on developing complexity-hiding services so that people can personalize their own delivery of services. In these terms, multimodal interfaces represent a key factor for enabling an inclusive use of new technologies by everyone. In order to achieve this, multimodal realistic models that describe our environment are needed, and in particular models that accurately describe the acoustics of the environment and communication through the auditory modality are required. Examples of currently active research directions and application areas include 3DTV and future internet, 3D visual-sound scene coding, transmission and reconstruction and teleconferencing systems, to name but ...

Geronazzo, Michele — University of Padova

The current layout is optimized for mobile phones. Page previews, thumbnails, and full abstracts will remain hidden until the browser window grows in width.

The current layout is optimized for tablet devices. Page previews and some thumbnails will remain hidden until the browser window grows in width.