Distributed Localization and Tracking of Acoustic Sources (2018)
Acoustic sensor network geometry calibration and applications
In the modern world, we are increasingly surrounded by computing devices with communication links and one or more microphones. Such devices are, for example, smartphones, tablets, laptops or hearing aids. These devices can work together as nodes in an acoustic sensor network (ASN). Such networks are a growing platform that opens the possibility for many practical applications. ASN-based speech enhancement, source localization, and event detection can be applied for teleconferencing, camera control, automation, or assisted living. For these kinds of applications, the awareness of auditory objects and their spatial positioning are key properties. In order to provide these two kinds of information, novel methods have been developed in this thesis. Information on the type of auditory objects is provided by a novel real-time sound classification method. Information on the position of human speakers is provided by a novel localization ...
Plinge, Axel — TU Dortmund University
This thesis presents a new approach to the problem of localizing and tracking multiple acoustic sources using a microphone array. The use of microphone arrays offers enhancements of speech signals recorded in meeting rooms and office spaces. A common solution for speech enhancement in realistic environments with ambient noise and multi-path propagation is the application of so-called beamforming techniques, which enhance signals arriving from the desired angle through constructive interference while attenuating signals from other directions through destructive interference. Such beamforming algorithms require the source location as prior knowledge. Therefore, source localization and tracking algorithms are an integral part of such a system. However, conventional localization algorithms deteriorate in realistic scenarios with multiple concurrent speakers. In contrast to conventional localization algorithms, the localization algorithm presented in this thesis makes use of fundamental frequency or pitch information of speech signals in ...
Habib, Tania — Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria
A multimicrophone approach to speech processing in a smart-room environment
Recent advances in computer technology and in speech and language processing have made new forms of person-machine communication and computer assistance to human activities appear feasible. In particular, interest in the development of challenging new applications in indoor environments equipped with multiple multimodal sensors, also known as smart-rooms, has grown considerably. In general, it is well known that speech signals captured by microphones located several meters away from the speakers are severely distorted by acoustic noise and room reverberation. In the context of the development of hands-free speech applications in smart-room environments, the use of obtrusive sensors like close-talking microphones is usually not allowed, and consequently, speech technologies must operate on the basis of distant-talking recordings. In such conditions, speech technologies that usually perform reasonably well in free of noise and ...
Abad, Alberto — Universitat Politecnica de Catalunya
Signal processing algorithms for wireless acoustic sensor networks
Recent academic developments have initiated a paradigm shift in the way spatial sensor data can be acquired. Traditional localized and regularly arranged sensor arrays are replaced by sensor nodes that are randomly distributed over the entire spatial field, and which communicate with each other or with a master node through wireless communication links. Together, these nodes form a so-called ‘wireless sensor network’ (WSN). Each node of a WSN has a local sensor array and a signal processing unit to perform computations on the acquired data. The advantage of WSNs compared to traditional (wired) sensor arrays is that many more sensors can be used that physically cover the full spatial field, which typically yields more variety (and thus more information) in the signals. It is likely that future data acquisition, control and physical monitoring will heavily rely on this type of ...
Bertrand, Alexander — Katholieke Universiteit Leuven
Cognitive Models for Acoustic and Audiovisual Sound Source Localization
Sound source localization algorithms have a long research history in the field of digital signal processing. Many common applications like intelligent personal assistants, teleconferencing systems and methods for technical diagnosis in acoustics require an accurate localization of sound sources in the environment. However, dynamic environments entail a particular challenge for these systems. For instance, voice controlled smart home applications, where the speaker, as well as potential noise sources, are moving within the room, are a typical example of dynamic environments. Classical sound source localization systems only have limited capabilities to deal with dynamic acoustic scenarios. In this thesis, three novel approaches to sound source localization that extend existing classical methods will be presented. The first system is proposed in the context of audiovisual source localization. Determining the position of sound sources in adverse acoustic conditions can be improved by including ...
Schymura, Christopher — Ruhr University Bochum
Distributed Signal Processing Algorithms for Multi-Task Wireless Acoustic Sensor Networks
Recent technological advances in analogue and digital electronics as well as in hardware miniaturization have taken wireless sensing devices to another level by introducing low-power communication protocols, improved digital signal processing capabilities and compact sensors. When these devices perform a certain pre-defined signal processing task such as the estimation or detection of phenomena of interest, a cooperative scheme through wireless connections can significantly enhance the overall performance, especially in adverse conditions. The resulting network consisting of such connected devices (or nodes) is referred to as a wireless sensor network (WSN). In acoustical applications (e.g., speech enhancement) a variant of WSNs, called wireless acoustic sensor networks (WASNs) can be employed in which the sensing unit at each node consists of a single microphone or a microphone array. The nodes of such a WASN can then cooperate to perform a multi-channel acoustic ...
Hassani, Amin — KU Leuven
A Geometric Deep Learning Approach to Sound Source Localization and Tracking
The localization and tracking of sound sources using microphone arrays is a problem that remains open, even though it has attracted attention from the signal processing research community for decades. In recent years, deep learning models have surpassed the state of the art that had been established by classic signal processing techniques, but these models still struggle with handling rooms with strong reverberations or tracking multiple sources that dynamically appear and disappear, especially when we cannot apply any criteria to classify or order them. In this thesis, we follow the ideas of the Geometric Deep Learning framework to propose new models and techniques that advance the state of the art in the aforementioned scenarios. As the input of our models, we use acoustic power maps computed using the SRP-PHAT algorithm, a classic signal processing technique that allows us to estimate the acoustic energy ...
Diaz-Guerra, David — University of Zaragoza
Probabilistic Model-Based Multiple Pitch Tracking of Speech
Multiple pitch tracking of speech is an important task for the segregation of multiple speakers in a single-channel recording. In this thesis, a probabilistic model-based approach for estimation and tracking of multiple pitch trajectories is proposed. A probabilistic model that captures pitch-dependent characteristics of the single-speaker short-time spectrum is obtained a priori from clean speech data. The resulting speaker model, which is based on Gaussian mixture models, can be trained either in a speaker independent (SI) or a speaker dependent (SD) fashion. Speaker models are then combined using an interaction model to obtain a probabilistic description of the observed speech mixture. A factorial hidden Markov model is applied for tracking the pitch trajectories of multiple speakers over time. The probabilistic model-based approach is capable of explicitly incorporating timbral information and all associated uncertainties of spectral structure into the model. While ...
Wohlmayr, Michael — Graz University of Technology
Robust Adaptive Machine Learning Algorithms for Distributed Signal Processing
Distributed networks comprising a large number of nodes, e.g., Wireless Sensor Networks, Personal Computers (PCs), laptops, smart phones, etc., which cooperate with each other in order to reach a common goal, constitute a promising technology for several applications. Typical examples include distributed environmental monitoring, acoustic source localization, power spectrum estimation, etc. Sophisticated cooperation mechanisms can significantly benefit the learning process, through which the nodes achieve their common objective. In this dissertation, the problem of adaptive learning in distributed networks is studied, focusing on the task of distributed estimation. A set of nodes sense information related to certain parameters and the estimation of these parameters constitutes the goal. To this end, nodes exploit locally sensed measurements as well as information arising from interactions with other nodes of the network. Throughout this dissertation, the cooperation among the nodes follows the diffusion optimization ...
Chouvardas, Symeon — National and Kapodistrian University of Athens
Distributed Signal Processing Algorithms for Acoustic Sensor Networks
In recent years, there has been a proliferation of wireless devices for individual use to the point of being ubiquitous. Recent trends have been incorporating many of these devices (or nodes) together, which acquire signals and work in unison over wireless channels, in order to accomplish a predefined task. This type of cooperative sensing and communication between devices forms the basis of a so-called wireless sensor network (WSN). Due to the ever-increasing processing power of these nodes, WSNs are being assigned more complicated and computationally demanding tasks. Recent research has started to exploit this increased processing power in order for the WSNs to perform tasks pertaining to audio signal acquisition and processing, forming so-called wireless acoustic sensor networks (WASNs). Audio signal processing poses new and unique problems when compared to traditional sensing applications as the signals observed often have ...
Szurley, Joseph — KU Leuven
Sparsity-Aware Wireless Networks: Localization and Sensor Selection
Wireless networks have revolutionized today's world by providing real-time, cost-efficient service and connectivity. Even such an unprecedented level of service could not fulfill the insatiable desire of the modern world for more advanced technologies. As a result, a great deal of attention has been directed towards (mobile) wireless sensor networks (WSNs), which consist of considerably cheap nodes that can cooperate to perform complex tasks in a distributed fashion in extremely harsh environments. Unique features of wireless environments, added complexity owing to mobility, distributed nature of the network setup, and tight performance and energy constraints, pose a challenge for researchers to devise systems which strike a proper balance between performance and resource utilization. We study some of the fundamental challenges of wireless (sensor) networks associated with resource efficiency, scalability, and location-awareness. The pivotal point which distinguishes our studies from ...
Jamali-Rad, Hadi — TU Delft
Informed spatial filters for speech enhancement
In modern devices which provide hands-free speech capturing functionality, such as hands-free communication kits and voice-controlled devices, the received speech signal at the microphones is corrupted by background noise, interfering speech signals, and room reverberation. In many practical situations, the microphones are not necessarily located near the desired source, and hence, the ratio of the desired speech power to the power of the background noise, the interfering speech, and the reverberation at the microphones can be very low, often around or even below 0 dB. In such situations, the comfort of human-to-human communication, as well as the accuracy of automatic speech recognisers for voice-controlled applications can be significantly degraded. Therefore, effective speech enhancement algorithms are required to process the microphone signals before transmitting them to the far-end side for communication, or before feeding them into a speech recognition ...
Taseska, Maja — Friedrich-Alexander Universität Erlangen-Nürnberg
Robust Wireless Localization in Harsh Mixed Line-of-Sight/Non-Line-of-Sight Environments
This PhD thesis considers the problem of locating target nodes in different wireless infrastructures such as wireless cellular radio networks and wireless sensor networks. To be as realistic as possible, a mixed line-of-sight and non-line-of-sight (LOS/NLOS) localization environment is introduced. Both conventional non-cooperative localization and the newly emerging cooperative localization have been studied thoroughly. Owing to the random nature of the measurements, probabilistic methods are more advanced than the old-fashioned geometric methods. The gist of the probabilistic methods is to infer the unknown positions of the target nodes in an estimation process, given a set of noisy position-related measurements, a probabilistic measurement model, and a few known reference positions. In contrast to the majority of the existing methods, harsh but practical constraints are taken into account: neither offline calibration nor non-line-of-sight state identification is equipped in ...
Yin, Feng — Technische Universität Darmstadt