Solving inverse problems in room acoustics using physical models, sparse regularization and numerical optimization

Reverberation consists of a complex acoustic phenomenon that occurs inside rooms. Many audio signal processing methods, addressing source localization, signal enhancement and other tasks, often assume absence of reverberation. Consequently, reverberant environments are considered challenging as state-ofthe-art methods can perform poorly. The acoustics of a room can be described using a variety of mathematical models, among which, physical models are the most complete and accurate. The use of physical models in audio signal processing methods is often non-trivial since it can lead to ill-posed inverse problems. These inverse problems require proper regularization to achieve meaningful results and involve the solution of computationally intensive large-scale optimization problems. Recently, however, sparse regularization has been applied successfully to inverse problems arising in different scientific areas. The increased computational power of modern computers and the development of new efficient optimization algorithms makes it possible ...

Antonello, Niccolò — KU Leuven


Sensing physical fields: Inverse problems for the diffusion equation and beyond

Due to significant advances made over the last few decades in the areas of (wireless) networking, communications and microprocessor fabrication, the use of sensor networks to observe physical phenomena is rapidly becoming commonplace. Over this period, many aspects of sensor networks have been explored, yet a thorough understanding of how to analyse and process the vast amounts of sensor data collected remains an open area of research. This work, therefore, aims to provide theoretical, as well as practical, advances this area. In particular, we consider the problem of inferring certain underlying properties of the monitored phenomena, from our sensor measurements. Within mathematics, this is commonly formulated as an inverse problem; whereas in signal processing, it appears as a (multidimensional) sampling and reconstruction problem. Indeed it is well known that inverse problems are notoriously ill-posed and very demanding to solve; meanwhile ...

Murray-Bruce, John — Imperial College London


Implementation of the radiation characteristics of musical instruments in wave field synthesis applications

In this thesis a method to implement the radiation characteristics of musical instruments in wave field synthesis systems is developed. It is applied and tested in two loudspeaker systems. Because the loudspeaker systems have a comparably low number of loudspeakers the wave field is synthesized at discrete listening positions by solving a linear equation system. Thus, for every constellation of listening and source position all loudspeakers can be used for the synthesis. The calculations are done in spectral domain, denying sound propagation velocity at first. This approach causes artefacts in the loudspeaker signals and synthesis errors in the listening area which are compensated by means of psychoacoustic methods. With these methods the aliasing frequency is determined by the extent of the listening area whereas in other wave field synthesis systems it is determined by the distance of adjacent loudspeakers. Musical ...

Ziemer, Tim — University of Hamburg


Group-Sparse Regression - With Applications in Spectral Analysis and Audio Signal Processing

This doctorate thesis focuses on sparse regression, a statistical modeling tool for selecting valuable predictors in underdetermined linear models. By imposing different constraints on the structure of the variable vector in the regression problem, one obtains estimates which have sparse supports, i.e., where only a few of the elements in the response variable have non-zero values. The thesis collects six papers which, to a varying extent, deals with the applications, implementations, modifications, translations, and other analysis of such problems. Sparse regression is often used to approximate additive models with intricate, non-linear, non-smooth or otherwise problematic functions, by creating an underdetermined model consisting of candidate values for these functions, and linear response variables which selects among the candidates. Sparse regression is therefore a widely used tool in applications such as, e.g., image processing, audio processing, seismological and biomedical modeling, but is ...

Kronvall, Ted — Lund University


Cost functions for acoustic filters estimations in reverberant mixtures

This work is focused on the processing of multichannel and multisource audio signals. From an audio mixture of several audio sources recorded in a reverberant room, we wish to es- timate the acoustic responses (a.k.a. mixing filters) between the sources and the microphones. To solve this inverse problem one need to take into account additional hypotheses on the nature of the acoustic responses. Our approach consists in first identifying mathematically the neces- sary hypotheses on the acoustic responses for their estimation and then building cost functions and algorithms to effectively estimate them. First, we considered the case where the source signals are known. We developed a method to estimate the acoustic responses based on a convex regularization which exploits both the temporal sparsity of the filters and the exponentially decaying envelope. Real-world experi- ments confirmed the effectiveness of this method ...

Benichoux, Alexis — Université Rennes I


Bayesian Compressed Sensing using Alpha-Stable Distributions

During the last decades, information is being gathered and processed at an explosive rate. This fact gives rise to a very important issue, that is, how to effectively and precisely describe the information content of a given source signal or an ensemble of source signals, such that it can be stored, processed or transmitted by taking into consideration the limitations and capabilities of the several digital devices. One of the fundamental principles of signal processing for decades is the Nyquist-Shannon sampling theorem, which states that the minimum number of samples needed to reconstruct a signal without error is dictated by its bandwidth. However, there are many cases in our everyday life in which sampling at the Nyquist rate results in too many data and thus, demanding an increased processing power, as well as storage requirements. A mathematical theory that emerged ...

Tzagkarakis, George — University of Crete


Motion Analysis and Modeling for Activity Recognition and 3-D Animation based on Geometrical and Video Processing Algorithms

The analysis of audiovisual data aims at extracting high level information, equivalent with the one(s) that can be extracted by a human. It is considered as a fundamental, unsolved (in its general form) problem. Even though the inverse problem, the audiovisual (sound and animation) synthesis, is judged easier than the previous, it remains an unsolved problem. The systematic research on these problems yields solutions that constitute the basis for a great number of continuously developing applications. In this thesis, we examine the two aforementioned fundamental problems. We propose algorithms and models of analysis and synthesis of articulated motion and undulatory (snake) locomotion, using data from video sequences. The goal of this research is the multilevel information extraction from video, like object tracking and activity recognition, and the 3-D animation synthesis in virtual environments based on the results of analysis. An ...

Panagiotakis, Costas — University of Crete


Cognitive Models for Acoustic and Audiovisual Sound Source Localization

Sound source localization algorithms have a long research history in the field of digital signal processing. Many common applications like intelligent personal assistants, teleconferencing systems and methods for technical diagnosis in acoustics require an accurate localization of sound sources in the environment. However, dynamic environments entail a particular challenge for these systems. For instance, voice controlled smart home applications, where the speaker, as well as potential noise sources, are moving within the room, are a typical example of dynamic environments. Classical sound source localization systems only have limited capabilities to deal with dynamic acoustic scenarios. In this thesis, three novel approaches to sound source localization that extend existing classical methods will be presented. The first system is proposed in the context of audiovisual source localization. Determining the position of sound sources in adverse acoustic conditions can be improved by including ...

Schymura, Christopher — Ruhr University Bochum


Application of Sound Source Separation Methods to Advanced Spatial Audio Systems

This thesis is related to the field of Sound Source Separation (SSS). It addresses the development and evaluation of these techniques for their application in the resynthesis of high-realism sound scenes by means of Wave Field Synthesis (WFS). Because the vast majority of audio recordings are preserved in two-channel stereo format, special up-converters are required to use advanced spatial audio reproduction formats, such as WFS. This is due to the fact that WFS needs the original source signals to be available, in order to accurately synthesize the acoustic field inside an extended listening area. Thus, an object-based mixing is required. Source separation problems in digital signal processing are those in which several signals have been mixed together and the objective is to find out what the original signals were. Therefore, SSS algorithms can be applied to existing two-channel mixtures to ...

Cobos, Maximo — Universidad Politecnica de Valencia


Distributed Localization and Tracking of Acoustic Sources

Localization, separation and tracking of acoustic sources are ancient challenges that lots of animals and human beings are doing intuitively and sometimes with an impressive accuracy. Artificial methods have been developed for various applications and conditions. The majority of those methods are centralized, meaning that all signals are processed together to produce the estimation results. The concept of distributed sensor networks is becoming more realistic as technology advances in the fields of nano-technology, micro electro-mechanic systems (MEMS) and communication. A distributed sensor network comprises scattered nodes which are autonomous, self-powered modules consisting of sensors, actuators and communication capabilities. A variety of layout and connectivity graphs are usually used. Distributed sensor networks have a broad range of applications, which can be categorized in ecology, military, environment monitoring, medical, security and surveillance. In this dissertation we develop algorithms for distributed sensor networks ...

Dorfan, Yuval — Bar Ilan University


From Blind to Semi-Blind Acoustic Source Separation based on Independent Component Analysis

Typical acoustic scenes consist of multiple superimposed sources, where some of them represent desired signals, but often many of them are undesired sources, e.g., interferers or noise. Hence, source separation and extraction, i.e., the estimation of the desired source signals based on observed mixtures, is one of the central problems in audio signal processing. A promising class of approaches to address such problems is based on Independent Component Analysis (ICA), an unsupervised machine learning technique. These methods enjoyed a lot of attention from the research community due to the small number of assumptions that have to be made about the considered problem. Furthermore, the resulting generalization ability to unseen acoustic conditions, their mathematical rigor and the simplicity of resulting algorithms have been appreciated by many researchers working in audio signal processing. However, knowledge about the acoustic scenario is often available ...

Brendel, Andreas — Friedrich-Alexander-Universität Erlangen-Nürnberg


General Approaches for Solving Inverse Problems with Arbitrary Signal Models

Ill-posed inverse problems appear in many signal and image processing applications, such as deblurring, super-resolution and compressed sensing. The common approach to address them is to design a specific algorithm, or recently, a specific deep neural network, for each problem. Both signal processing and machine learning tactics have drawbacks: traditional reconstruction strategies exhibit limited performance for complex signals, such as natural images, due to the hardness of their mathematical modeling; while modern works that circumvent signal modeling by training deep convolutional neural networks (CNNs) suffer from a huge performance drop when the observation model used in training is inexact. In this work, we develop and analyze reconstruction algorithms that are not restricted to a specific signal model and are able to handle different observation models. Our main contributions include: (a) We generalize the popular sparsity-based CoSaMP algorithm to any signal ...

Tirer, Tom — Tel Aviv University


Adaptive Nonlocal Signal Restoration and Enhancement Techniques for High-Dimensional Data

The large number of practical applications involving digital images has motivated a significant interest towards restoration solutions that improve the visual quality of the data under the presence of various acquisition and compression artifacts. Digital images are the results of an acquisition process based on the measurement of a physical quantity of interest incident upon an imaging sensor over a specified period of time. The quantity of interest depends on the targeted imaging application. Common imaging sensors measure the number of photons impinging over a dense grid of photodetectors in order to produce an image similar to what is perceived by the human visual system. Different applications focus on the part of the electromagnetic spectrum not visible by the human visual system, and thus require different sensing technologies to form the image. In all cases, even with the advance of ...

Maggioni, Matteo — Tampere University of Technology


Dereverberation and noise reduction techniques based on acoustic multi-channel equalization

In many hands-free speech communication applications such as teleconferencing or voice-controlled applications, the recorded microphone signals do not only contain the desired speech signal, but also attenuated and delayed copies of the desired speech signal due to reverberation as well as additive background noise. Reverberation and background noise cause a signal degradation which can impair speech intelligibility and decrease the performance for many signal processing techniques. Acoustic multi-channel equalization techniques, which aim at inverting or reshaping the measured or estimated room impulse responses between the speech source and the microphone array, comprise an attractive approach to speech dereverberation since in theory perfect dereverberation can be achieved. However in practice, such techniques suffer from several drawbacks, such as uncontrolled perceptual effects, sensitivity to perturbations in the measured or estimated room impulse responses, and background noise amplification. The aim of this thesis ...

Kodrasi, Ina — University of Oldenburg


Acoustic sensor network geometry calibration and applications

In the modern world, we are increasingly surrounded by computation devices with communication links and one or more microphones. Such devices are, for example, smartphones, tablets, laptops or hearing aids. These devices can work together as nodes in an acoustic sensor network (ASN). Such networks are a growing platform that opens the possibility for many practical applications. ASN based speech enhancement, source localization, and event detection can be applied for teleconferencing, camera control, automation, or assisted living. For this kind of applications, the awareness of auditory objects and their spatial positioning are key properties. In order to provide these two kinds of information, novel methods have been developed in this thesis. Information on the type of auditory objects is provided by a novel real-time sound classification method. Information on the position of human speakers is provided by a novel localization ...

Plinge, Axel — TU Dortmund University

The current layout is optimized for mobile phones. Page previews, thumbnails, and full abstracts will remain hidden until the browser window grows in width.

The current layout is optimized for tablet devices. Page previews and some thumbnails will remain hidden until the browser window grows in width.