Distributed Localization and Tracking of Acoustic Sources

Localization, separation and tracking of acoustic sources are ancient challenges that lots of animals and human beings are doing intuitively and sometimes with an impressive accuracy. Artificial methods have been developed for various applications and conditions. The majority of those methods are centralized, meaning that all signals are processed together to produce the estimation results. The concept of distributed sensor networks is becoming more realistic as technology advances in the fields of nano-technology, micro electro-mechanic systems (MEMS) and communication. A distributed sensor network comprises scattered nodes which are autonomous, self-powered modules consisting of sensors, actuators and communication capabilities. A variety of layout and connectivity graphs are usually used. Distributed sensor networks have a broad range of applications, which can be categorized in ecology, military, environment monitoring, medical, security and surveillance. In this dissertation we develop algorithms for distributed sensor networks ...

Dorfan, Yuval — Bar Ilan University


Auditory Inspired Methods for Multiple Speaker Localization and Tracking Using a Circular Microphone Array

This thesis presents a new approach to the problem of localizing and tracking multiple acoustic sources using a microphone array. The use of microphone arrays offers enhancements of speech signals recorded in meeting rooms and office spaces. A common solution for speech enhancement in realistic environments with ambient noise and multi-path propagation is the application of so-called beamforming techniques, that enhance signals at the desired angle, using constructive interference, while attenuating signals coming from other directions, by destructive interference. Such beamforming algorithms require as prior knowledge the source location. Therefore, source localization and tracking algorithms are an integral part of such a system. However, conventional localization algorithms deteriorate in realistic scenarios with multiple concurrent speakers. In contrast to conventional localization algorithms, the localization algorithm presented in this thesis makes use of fundamental frequency or pitch information of speech signals in ...

Habib, Tania — Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria


Auditory Inspired Methods for Multiple Speaker Localization and Tracking Using a Circular Microphone Array

This thesis presents a new approach to the problem of localizing and tracking multiple acoustic sources using a microphone array. The use of microphone arrays offers enhancements of speech signals recorded in meeting rooms and office spaces. A common solution for speech enhancement in realistic environments with ambient noise and multi-path propagation is the application of so-called beamforming techniques, that enhance signals at the desired angle, using constructive interference, while attenuating signals coming from other directions, by destructive interference. Such beamforming algorithms require as prior knowledge the source location. Therefore, source localization and tracking algorithms are an integral part of such a system. However, conventional localization algorithms deteriorate in realistic scenarios with multiple concurrent speakers. In contrast to conventional localization algorithms, the localization algorithm presented in this thesis makes use of fundamental frequency or pitch information of speech signals in ...

Tania Habib — Graz University of Technology


A multimicrophone approach to speech processing in a smart-room environment

Recent advances in computer technology and speech and language processing have made possible that some new ways of person-machine communication and computer assistance to human activities start to appear feasible. Concretely, the interest on the development of new challenging applications in indoor environments equipped with multiple multimodal sensors, also known as smart-rooms, has considerably grown. In general, it is well-known that the quality of speech signals captured by microphones that can be located several meters away from the speakers is severely distorted by acoustic noise and room reverberation. In the context of the development of hands-free speech applications in smart-room environments, the use of obtrusive sensors like close-talking microphones is usually not allowed, and consequently, speech technologies must operate on the basis of distant-talking recordings. In such conditions, speech technologies that usually perform reasonably well in free of noise and ...

Abad, Alberto — Universitat Politecnica de Catalunya


Cognitive Models for Acoustic and Audiovisual Sound Source Localization

Sound source localization algorithms have a long research history in the field of digital signal processing. Many common applications like intelligent personal assistants, teleconferencing systems and methods for technical diagnosis in acoustics require an accurate localization of sound sources in the environment. However, dynamic environments entail a particular challenge for these systems. For instance, voice controlled smart home applications, where the speaker, as well as potential noise sources, are moving within the room, are a typical example of dynamic environments. Classical sound source localization systems only have limited capabilities to deal with dynamic acoustic scenarios. In this thesis, three novel approaches to sound source localization that extend existing classical methods will be presented. The first system is proposed in the context of audiovisual source localization. Determining the position of sound sources in adverse acoustic conditions can be improved by including ...

Schymura, Christopher — Ruhr University Bochum


Robust Direction-of-Arrival estimation and spatial filtering in noisy and reverberant environments

The advent of multi-microphone setups on a plethora of commercial devices in recent years has generated a newfound interest in the development of robust microphone array signal processing methods. These methods are generally used to either estimate parameters associated with acoustic scene or to extract signal(s) of interest. In most practical scenarios, the sources are located in the far-field of a microphone array where the main spatial information of interest is the direction-of-arrival (DOA) of the plane waves originating from the source positions. The focus of this thesis is to incorporate robustness against either lack of or imperfect/erroneous information regarding the DOAs of the sound sources within a microphone array signal processing framework. The DOAs of sound sources is by itself important information, however, it is most often used as a parameter for a subsequent processing method. One of the ...

Chakrabarty, Soumitro — Friedrich-Alexander Universität Erlangen-Nürnberg


Speech derereverberation in noisy environments using time-frequency domain signal models

Reverberation is the sum of reflected sound waves and is present in any conventional room. Speech communication devices such as mobile phones in hands-free mode, tablets, smart TVs, teleconferencing systems, hearing aids, voice-controlled systems, etc. use one or more microphones to pick up the desired speech signals. When the microphones are not in the proximity of the desired source, strong reverberation and noise can degrade the signal quality at the microphones and can impair the intelligibility and the performance of automatic speech recognizers. Therefore, it is a highly demanded task to process the microphone signals such that reverberation and noise are reduced. The process of reducing or removing reverberation from recorded signals is called dereverberation. As dereverberation is usually a completely blind problem, where the only available information are the microphone signals, and as the acoustic scenario can be non-stationary, ...

Braun, Sebastian — Friedrich-Alexander Universität Erlangen-Nürnberg


Solving inverse problems in room acoustics using physical models, sparse regularization and numerical optimization

Reverberation consists of a complex acoustic phenomenon that occurs inside rooms. Many audio signal processing methods, addressing source localization, signal enhancement and other tasks, often assume absence of reverberation. Consequently, reverberant environments are considered challenging as state-ofthe-art methods can perform poorly. The acoustics of a room can be described using a variety of mathematical models, among which, physical models are the most complete and accurate. The use of physical models in audio signal processing methods is often non-trivial since it can lead to ill-posed inverse problems. These inverse problems require proper regularization to achieve meaningful results and involve the solution of computationally intensive large-scale optimization problems. Recently, however, sparse regularization has been applied successfully to inverse problems arising in different scientific areas. The increased computational power of modern computers and the development of new efficient optimization algorithms makes it possible ...

Antonello, Niccolò — KU Leuven


Spherical Microphone Array Processing for Acoustic Parameter Estimation and Signal Enhancement

In many distant speech acquisition scenarios, such as hands-free telephony or teleconferencing, the desired speech signal is corrupted by noise and reverberation. This degrades both the speech quality and intelligibility, making communication difficult or even impossible. Speech enhancement techniques seek to mitigate these effects and extract the desired speech signal. This objective is commonly achieved through the use of microphone arrays, which take advantage of the spatial properties of the sound field in order to reduce noise and reverberation. Spherical microphone arrays, where the microphones are arranged in a spherical configuration, usually mounted on a rigid baffle, are able to analyze the sound field in three dimensions; the captured sound field can then be efficiently described in the spherical harmonic domain (SHD). In this thesis, a number of novel spherical array processing algorithms are proposed, based in the SHD. In ...

Jarrett, Daniel P. — Imperial College London


Inferring Room Geometries

Determining the geometry of an acoustic enclosure using microphone arrays has become an active area of research. Knowledge gained about the acoustic environment, such as the location of reflectors, can be advantageous for applications such as sound source localization, dereverberation and adaptive echo cancellation by assisting in tracking environment changes and helping the initialization of such algorithms. A methodology to blindly infer the geometry of an acoustic enclosure by estimating the location of reflective surfaces based on acoustic measurements using an arbitrary array geometry is developed and analyzed. The starting point of this work considers a geometric constraint, valid both in two and three-dimensions, that converts time-of-arrival and time-difference-of-arrival information into elliptical constraints about the location of reflectors. Multiple constraints are combined to yield the line or plane parameters of the reflectors by minimizing a specific cost function in the ...

Filos, Jason — Imperial College London


Sparse Multi-Channel Linear Prediction for Blind Speech Dereverberation

In many speech communication applications, such as hands-free telephony and hearing aids, the microphones are located at a distance from the speaker. Therefore, in addition to the desired speech signal, the microphone signals typically contain undesired reverberation and noise, caused by acoustic reflections and undesired sound sources. Since these disturbances tend to degrade the quality of speech communication, decrease speech intelligibility and negatively affect speech recognition, efficient dereverberation and denoising methods are required. This thesis deals with blind dereverberation methods, not requiring any knowledge about the room impulse responses between the speaker and the microphones. More specifically, we propose a general framework for blind speech dereverberation based on multi-channel linear prediction (MCLP) and exploiting sparsity of the speech signal in the time-frequency domain.

Jukić, Ante — University of Oldenburg


Efficient parametric modeling, identification and equalization of room acoustics

Room acoustic signal enhancement (RASE) applications, such as digital equalization, acoustic echo and feedback cancellation, which are commonly found in communication devices and audio equipment, aim at processing the acoustic signals with the final goal of improving the perceived sound quality in rooms. In order to do so, signal processing algorithms require the acoustic response of the room to be represented by means of parametric models and to be identified from the input and output signals of the room acoustic system. In particular, a good model should be both accurate, thus capturing those features of room acoustics that are physically and perceptually most relevant, and efficient, so that it can be implemented as a digital filter and used in practical signal processing tasks. This thesis addresses the fundamental question in room acoustic signal processing concerning the appropriateness of different parametric ...

Vairetti, Giacomo — KU Leuven


Feedback Delay Networks in Artificial Reverberation and Reverberation Enhancement

In today's audio production and reproduction as well as in music performance practices it has become common practice to alter reverberation artificially through electronics or electro-acoustics. For music productions, radio plays, and movie soundtracks, the sound is often captured in small studio spaces with little to no reverberation to save real estate and to ensure a controlled environment such that the artistically intended spatial impression can be added during post-production. Spatial sound reproduction systems require flexible adjustment of artificial reverberation to the diffuse sound portion to help the reconstruction of the spatial impression. Many modern performance spaces are multi-purpose, and the reverberation needs to be adjustable to the desired performance style. Employing electro-acoustic feedback, also known as Reverberation Enhancement Systems (RESs), it is possible to extend the physical to the desired reverberation. These examples demonstrate a wide range of applications ...

Schlecht, Sebastian Jiro — Friedrich-Alexander-Universität Erlangen-Nürnberg


Flexible Multi-Microphone Acquisition and Processing of Spatial Sound Using Parametric Sound Field Representations

This thesis deals with the efficient and flexible acquisition and processing of spatial sound using multiple microphones. In spatial sound acquisition and processing, we use multiple microphones to capture the sound of multiple sources being simultaneously active at a rever- berant recording side and process the sound depending on the application at the application side. Typical applications include source extraction, immersive spatial sound reproduction, or speech enhancement. A flexible sound acquisition and processing means that we can capture the sound with almost arbitrary microphone configurations without constraining the application at the ap- plication side. This means that we can realize and adjust the different applications indepen- dently of the microphone configuration used at the recording side. For example in spatial sound reproduction, where we aim at reproducing the sound such that the listener perceives the same impression as if he ...

Thiergart, Oliver — Friedrich-Alexander-Universitat Erlangen-Nurnberg


Informed spatial filters for speech enhancement

In modern devices which provide hands-free speech capturing functionality, such as hands-free communication kits and voice-controlled devices, the received speech signal at the microphones is corrupted by background noise, interfering speech signals, and room reverberation. In many practical situations, the microphones are not necessarily located near the desired source, and hence, the ratio of the desired speech power to the power of the background noise, the interfering speech, and the reverberation at the microphones can be very low, often around or even below 0 dB. In such situations, the comfort of human-to-human communication, as well as the accuracy of automatic speech recognisers for voice-controlled applications can be signi cantly degraded. Therefore, e ffective speech enhancement algorithms are required to process the microphone signals before transmitting them to the far-end side for communication, or before feeding them into a speech recognition ...

Taseska, Maja — Friedrich-Alexander Universität Erlangen-Nürnberg

The current layout is optimized for mobile phones. Page previews, thumbnails, and full abstracts will remain hidden until the browser window grows in width.

The current layout is optimized for tablet devices. Page previews and some thumbnails will remain hidden until the browser window grows in width.