## Group-Sparse Regression - With Applications in Spectral Analysis and Audio Signal Processing (2017)

Parameter Estimation - in sparsity we trust

This thesis is based on nine papers, all concerned with parameter estimation. The thesis aims at solving problems related to real-world applications such as spectroscopy, DNA sequencing, and audio processing, using sparse modeling heuristics. For the problems considered in this thesis, one is concerned not only with finding the parameters in the signal model, but also with determining the number of signal components present in the measurements. In recent years, developments in sparse modeling have allowed for methods that jointly estimate the parameters in the model and the model order. Based on these achievements, the approach often taken in this thesis is as follows. First, a parametric model of the considered signal is derived, containing different parameters that capture the important characteristics of the signal. When the signal model has been determined, an optimization problem is formed aimed at finding ...

Swärd, Johan — Lund University

Probabilistic Model-Based Multiple Pitch Tracking of Speech

Multiple pitch tracking of speech is an important task for the segregation of multiple speakers in a single-channel recording. In this thesis, a probabilistic model-based approach for estimation and tracking of multiple pitch trajectories is proposed. A probabilistic model that captures pitch-dependent characteristics of the single-speaker short-time spectrum is obtained a priori from clean speech data. The resulting speaker model, which is based on Gaussian mixture models, can be trained either in a speaker-independent (SI) or a speaker-dependent (SD) fashion. Speaker models are then combined using an interaction model to obtain a probabilistic description of the observed speech mixture. A factorial hidden Markov model is applied for tracking the pitch trajectories of multiple speakers over time. The probabilistic model-based approach is capable of explicitly incorporating timbral information and all associated uncertainties of spectral structure into the model. While ...

Wohlmayr, Michael — Graz University of Technology

Sparsity in Linear Predictive Coding of Speech

This thesis deals with developing improved modeling methods for speech and audio processing based on the recent developments in sparse signal representation. In particular, this work is motivated by the need to address some of the limitations of the well-known linear prediction (LP) based all-pole models currently applied in many modern speech and audio processing systems. In the first part of this thesis, we introduce *Sparse Linear Prediction*, a set of speech processing tools created by introducing sparsity constraints into the LP framework. This approach defines predictors that look for a sparse residual rather than a minimum variance one, with direct applications to coding but also consistent with the speech production model of voiced speech, where the excitation of the all-pole filter is modeled as an impulse train. Introducing sparsity in the LP framework will also lead us to develop the ...

Giacobello, Daniele — Aalborg University

Sparse Modeling Heuristics for Parameter Estimation - Applications in Statistical Signal Processing

This thesis examines sparse statistical modeling on a range of applications in audio modeling, audio localization, DNA sequencing, and spectroscopy. In the examined cases, the resulting estimation problems are computationally cumbersome, both because one often lacks knowledge of the model order for this form of problem and because of the high dimensionality of the parameter spaces, which typically also yields optimization problems with numerous local minima. In this thesis, these problems are treated using sparse modeling heuristics, with the resulting criteria being solved using convex relaxations, inspired by disciplined convex programming ideas, to maintain tractability. The contributions to audio modeling and estimation focus on the estimation of the fundamental frequency of harmonically related sinusoidal signals, which is a commonly used model for, e.g., voiced speech or tonal audio. We examine both the problems of estimating multiple audio sources ...

Adalbjörnsson, Stefan Ingi — Lund University

Regularization techniques in model fitting and parameter estimation

We consider fitting data by linear and nonlinear models. Although the specific problems that we address encompass classic formulations, they share a common setting: they are ill-posed. In the linear case, we consider the total least squares problem. There exist special methods to approach the so-called nongeneric cases, but we propose extensions for the more commonly encountered close-to-nongeneric problems. Several methods of introducing regularization in the context of total least squares are analyzed. They are based on truncation methods or on penalty optimization. The obtained problems might not have closed-form solutions. We discuss numerical linear algebra and local optimization methods. Data fitting by nonlinear or nonparametric models is the second subject of the thesis. We extend the nonlinear regression theory to the case when we have to deal with supplementary ...

Sima, Diana — Katholieke Universiteit Leuven

On some aspects of inverse problems in image processing

This work is concerned with two image-processing problems, image deconvolution with incomplete observations and data fusion of spectral images, and with some of the algorithms that are used to solve these and related problems. In image-deconvolution problems, the diagonalization of the blurring operator by means of the discrete Fourier transform usually yields very large speedups. When there are incomplete observations (e.g., in the case of unknown boundaries), standard deconvolution techniques normally involve non-diagonalizable operators, resulting in rather slow methods, or, otherwise, use inexact convolution models, resulting in the occurrence of artifacts in the enhanced images. We propose a new deconvolution framework for images with incomplete observations that allows one to work with diagonalizable convolution operators, and therefore is very fast. The framework is also an efficient, high-quality alternative to existing methods of dealing with the image boundaries, such as edge ...

Simões, Miguel — Universidade de Lisboa, Instituto Superior Técnico & Université Grenoble Alpes

Parameter Estimation and Filtering Using Sparse Modeling

Sparsity-based estimation techniques deal with the problem of retrieving a data vector from an undercomplete set of linear observations, when the data vector is known to have few nonzero elements with unknown positions. It is also known as the atomic decomposition problem, and has been carefully studied in the field of compressed sensing. Recent findings have led to a method called basis pursuit, also known as Least Absolute Shrinkage and Selection Operator (LASSO), as a numerically reliable sparsity-based approach. Although the atomic decomposition problem is generally NP-hard, it has been shown that basis pursuit may provide exact solutions under certain assumptions. This has led to an extensive study of signals with sparse representation in different domains, providing a new general insight into signal processing. This thesis further investigates the role of sparsity-based techniques, especially basis pursuit, for solving parameter estimation ...

Panahi, Ashkan — Chalmers University of Technology
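
As a rough illustration of the sparse recovery problem described in the abstract above, the following sketch minimizes the LASSO criterion, 0.5 * ||y - A x||^2 + lam * ||x||_1, with iterative soft thresholding (ISTA). ISTA is chosen here only because it is short; the abstract does not say which solver the thesis uses. The function names (`soft_threshold`, `objective`, `ista_lasso`), the problem sizes, the synthetic data, and the choice of `lam` are all assumptions made for this example.

```python
import random


def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def objective(A, y, x, lam):
    """LASSO cost: 0.5 * ||y - A x||^2 + lam * ||x||_1."""
    m, n = len(A), len(A[0])
    resid = [y[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
    return 0.5 * sum(r * r for r in resid) + lam * sum(abs(v) for v in x)


def ista_lasso(A, y, lam, n_iter=500):
    """Approximately minimize the LASSO cost by iterative soft thresholding."""
    m, n = len(A), len(A[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of A^T A,
    # so this step size is conservative but keeps the cost from increasing.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(n_iter):
        # Residual of the current estimate: r = A x - y.
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        # Gradient of the quadratic term: g = A^T r.
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by elementwise shrinkage.
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x


# Hypothetical test problem: 10 noiseless observations of a 20-dimensional
# vector with two nonzero entries (fewer observations than unknowns).
random.seed(0)
m, n = 10, 20
A = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
x_true[3], x_true[12] = 1.5, -2.0
y = [sum(A[i][j] * x_true[j] for j in range(n)) for i in range(m)]

# Illustrative regularization level: a fraction of max |A^T y|, the smallest
# value of lam for which the all-zero vector is already optimal.
corr = [abs(sum(A[i][j] * y[i] for i in range(m))) for j in range(n)]
lam = 0.1 * max(corr)

x_hat = ista_lasso(A, y, lam)
print([round(v, 3) for v in x_hat])
```

Larger values of `lam` push more entries of the estimate to exactly zero, which is how a criterion of this kind selects the number of active components while estimating their amplitudes.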

Non-Intrusive Speech Intelligibility Prediction

The ability to communicate through speech is important for social interaction. We rely on the ability to communicate with each other even in noisy conditions. Ideally, the speech is easy to understand, but this is not always the case when the speech is degraded, e.g., by background noise, distortion, or hearing impairment. One of the most important factors to consider in relation to such degradations is speech intelligibility, which is a measure of how easy or difficult it is to understand the speech. In this thesis, the focus is on the topic of speech intelligibility prediction. The thesis consists of an introduction to the field of speech intelligibility prediction and a collection of scientific papers. The introduction provides a background to the challenges with speech communication in noisy conditions, followed by an introduction to how speech is produced and ...

Sørensen, Charlotte — Aalborg University

Enhancement of Speech Signals - with a Focus on Voiced Speech Models

The topic of this thesis is speech enhancement with a focus on models of voiced speech. Speech is divided into two subcategories depending on the characteristics of the signal: voiced speech and unvoiced speech. In this thesis, we primarily focus on the voiced speech parts and utilise the structure of the signal in relation to speech enhancement. The basis for the models is the harmonic model, which is widely used for voiced speech because it describes periodic signals perfectly. First, we consider the problem of non-stationarity in the speech signal. The speech signal changes its characteristics continuously over time, whereas most speech analysis and enhancement methods assume stationarity within 20-30 ms. We propose to change the model to allow the fundamental frequency to vary linearly over time by introducing a chirp ...

Nørholm, Sidsel Marie — Aalborg University

Linear Dynamical Systems with Sparsity Constraints: Theory and Algorithms

This thesis develops new mathematical theory and presents novel recovery algorithms for discrete linear dynamical systems (LDS) with sparsity constraints on either control inputs or initial state. The recovery problems in this framework manifest as the problem of reconstructing one or more sparse signals from a set of noisy underdetermined linear measurements. The goal of our work is to design algorithms for sparse signal recovery which can exploit the underlying structure in the measurement matrix and the unknown sparse vectors, and to analyze the impact of these structures on the efficacy of the recovery. We answer three fundamental and interconnected questions on sparse signal recovery problems that arise in the context of LDS. First, what are necessary and sufficient conditions for the existence of a sparse solution? Second, given that a sparse solution exists, what are good low-complexity algorithms that ...

Joseph, Geethu — Indian Institute of Science, Bangalore

Sparse approximation and dictionary learning with applications to audio signals

Over-complete transforms have recently become the focus of a wealth of research in signal processing, machine learning, statistics and related fields. Their great modelling flexibility allows one to find sparse representations and approximations of data that in turn prove to be very efficient in a wide range of applications. Sparse models express signals as linear combinations of a few basis functions called atoms taken from a so-called dictionary. Finding the optimal dictionary from a set of training signals of a given class is the objective of dictionary learning and the main focus of this thesis. The experimental evidence presented here focuses on the processing of audio signals, and the role of sparse algorithms in audio applications is accordingly highlighted. The first main contribution of this thesis is the development of a pitch-synchronous transform where the frame-by-frame analysis of audio data ...

Barchiesi, Daniele — Queen Mary University of London

The use of High-Order Sparse Linear Prediction for the Restoration of Archived Audio

Since the invention of the phonograph by Thomas Edison in 1877, vast amounts of cultural, entertainment, educational and historical audio recordings have been recorded and stored throughout the world. Through natural aging and improper storage, the recorded signal degrades and loses its information in terms of quality and intelligibility. Degradation of an audio signal is defined as any unwanted modification to the signal after it has been recorded. There are different degradations affecting recorded signals on analog storage media. The degradations that are often encountered are clicks, hiss and ‘Wow and Flutter’. Several studies have been conducted on restoring degraded audio recordings. Most of the methods rely on some prior information about the underlying data and the degradation process. The success of these methods heavily depends on the prior information available. When such information is not available, a model of the ...

Dufera, Bisrat Derebssa — School of Electrical and Computer Engineering, Addis Ababa Institute of Technology, Addis Ababa University

Embedded Optimization Algorithms for Perceptual Enhancement of Audio Signals

This thesis investigates the design and evaluation of an embedded optimization framework for the perceptual enhancement of audio signals which are degraded by linear and/or nonlinear distortion. In general, audio signal enhancement has the goal of improving the perceived audio quality, speech intelligibility, or another desired perceptual attribute of the distorted audio signal by applying a real-time digital signal processing algorithm. In the designed embedded optimization framework, the audio signal enhancement problem under consideration is formulated and solved as a per-frame numerical optimization problem, making it possible to compute the enhanced audio signal frame that is optimal according to a desired perceptual attribute. The first stage of the embedded optimization framework consists in the formulation of the per-frame optimization problem aimed at maximally enhancing the desired perceptual attribute, by explicitly incorporating a suitable model of human sound perception. The second stage of ...

Defraene, Bruno — KU Leuven

Cosparse regularization of physics-driven inverse problems

Inverse problems related to physical processes are of great importance in practically every field related to signal processing, such as tomography, acoustics, wireless communications, medical and radar imaging, to name only a few. At the same time, many of these problems are quite challenging due to their ill-posed nature. On the other hand, signals originating from physical phenomena are often governed by laws expressible through linear Partial Differential Equations (PDE), or equivalently, integral equations and the associated Green’s functions. In addition, these phenomena are usually induced by sparse singularities, appearing as sources or sinks of a vector field. In this thesis we primarily investigate the coupling of such physical laws with a prior assumption on the sparse origin of a physical process. This gives rise to a “dual” regularization concept, formulated either as sparse analysis (cosparse), yielded by a PDE ...

Kitić, Srđan — Université de Rennes 1

Acoustic Event Detection: Feature, Evaluation and Dataset Design

It takes more time to think of a silent scene, action or event than to find one that emanates sound. Not only speaking or playing music but almost everything that happens is accompanied by or results in one or more sounds mixed together. This makes acoustic event detection (AED) one of the most researched topics in audio signal processing nowadays, and it will probably not see a decline in the near future. This is due to the thirst for understanding and digitally abstracting more and more events in life via the enormous amount of audio recorded through thousands of applications in our daily routine. But it is also a result of two intrinsic properties of audio: it does not need a direct line of sight to be perceived, and it is less intrusive to record than image or video. Many applications such ...

Mina Mounir — KU Leuven, ESAT STADIUS
