Sparse approximation and dictionary learning with applications to audio signals

Over-complete transforms have recently become the focus of a wide wealth of research in signal processing, machine learning, statistics and related fields. Their great modelling flexibility allows to find sparse representations and approximations of data that in turn prove to be very efficient in a wide range of applications. Sparse models express signals as linear combinations of a few basis functions called atoms taken from a so-called dictionary. Finding the optimal dictionary from a set of training signals of a given class is the objective of dictionary learning and the main focus of this thesis. The experimental evidence presented here focuses on the processing of audio signals, and the role of sparse algorithms in audio applications is accordingly highlighted. The first main contribution of this thesis is the development of a pitch-synchronous transform where the frame-by-frame analysis of audio data ...

Barchiesi, Daniele — Queen Mary University of London

Parallel Dictionary Learning Algorithms for Sparse Representations

Sparse representations are intensively used in signal processing applications, like image coding, denoising, echo channels modeling, compression, classification and many others. Recent research has shown encouraging results when the sparse signals are created through the use of a learned dictionary. The current study focuses on finding new methods and algorithms, that have a parallel form where possible, for obtaining sparse representations of signals with improved dictionaries that lead to better performance in both representation error and execution time. We attack the general dictionary learning problem by first investigating and proposing new solutions for sparse representation stage and then moving on to the dictionary update stage where we propose a new parallel update strategy. Lastly, we study the effect of the representation algorithms on the dictionary update method. We also researched dictionary learning solutions where the dictionary has a specific form. ...

Irofti, Paul — Politehnica University of Bucharest

Sparsity in Linear Predictive Coding of Speech

This thesis deals with developing improved modeling methods for speech and audio processing based on the recent developments in sparse signal representation. In particular, this work is motivated by the need to address some of the limitations of the well-known linear prediction (LP) based all-pole models currently applied in many modern speech and audio processing systems. In the first part of this thesis, we introduce \emph{Sparse Linear Prediction}, a set of speech processing tools created by introducing sparsity constraints into the LP framework. This approach defines predictors that look for a sparse residual rather than a minimum variance one, with direct applications to coding but also consistent with the speech production model of voiced speech, where the excitation of the all-pole filter is model as an impulse train. Introducing sparsity in the LP framework, will also bring to develop the ...

Giacobello, Daniele — Aalborg University

Bayesian Compressed Sensing using Alpha-Stable Distributions

During the last decades, information is being gathered and processed at an explosive rate. This fact gives rise to a very important issue, that is, how to effectively and precisely describe the information content of a given source signal or an ensemble of source signals, such that it can be stored, processed or transmitted by taking into consideration the limitations and capabilities of the several digital devices. One of the fundamental principles of signal processing for decades is the Nyquist-Shannon sampling theorem, which states that the minimum number of samples needed to reconstruct a signal without error is dictated by its bandwidth. However, there are many cases in our everyday life in which sampling at the Nyquist rate results in too many data and thus, demanding an increased processing power, as well as storage requirements. A mathematical theory that emerged ...

Tzagkarakis, George — University of Crete

Toward sparse and geometry adapted video approximations

Video signals are sequences of natural images, where images are often modeled as piecewise-smooth signals. Hence, video can be seen as a 3D piecewise-smooth signal made of piecewise-smooth regions that move through time. Based on the piecewise-smooth model and on related theoretical work on rate-distortion performance of wavelet and oracle based coding schemes, one can better analyze the appropriate coding strategies that adaptive video codecs need to implement in order to be efficient. Efficient video representations for coding purposes require the use of adaptive signal decompositions able to capture appropriately the structure and redundancy appearing in video signals. Adaptivity needs to be such that it allows for proper modeling of signals in order to represent these with the lowest possible coding cost. Video is a very structured signal with high geometric content. This includes temporal geometry (normally represented by motion ...

Divorra Escoda, Oscar — EPFL / Signal Processing Institute

Parameter Estimation -in sparsity we trust

This thesis is based on nine papers, all concerned with parameter estimation. The thesis aims at solving problems related to real-world applications such as spectroscopy, DNA sequencing, and audio processing, using sparse modeling heuristics. For the problems considered in this thesis, one is not only concerned with finding the parameters in the signal model, but also to determine the number of signal components present in the measurements. In recent years, developments in sparse modeling have allowed for methods that jointly estimate the parameters in the model and the model order. Based on these achievements, the approach often taken in this thesis is as follows. First, a parametric model of the considered signal is derived, containing different parameters that capture the important characteristics of the signal. When the signal model has been determined, an optimization problem is formed aimed at finding ...

Swärd, Johan — Lund University

Novel texture synthesis methods and their application to image prediction and image inpainting

This thesis presents novel exemplar-based texture synthesis methods for image prediction (i.e., predictive coding) and image inpainting problems. The main contributions of this study can also be seen as extensions to simple template matching, however the texture synthesis problem here is well-formulated in an optimization framework with different constraints. The image prediction problem has first been put into sparse representations framework by approximating the template with a sparsity constraint. The proposed sparse prediction method with locally and adaptive dictionaries has been shown to give better performance when compared to static waveform (such as DCT) dictionaries, and also to the template matching method. The image prediction problem has later been placed into an online dictionary learning framework by adapting conventional dictionary learning approaches for image prediction. The experimental observations show a better performance when compared to H.264/AVC intra and sparse prediction. ...

Turkan, Mehmet — INRIA-Rennes, France

Contributions to signal analysis and processing using compressed sensing techniques

Chapter 2 contains a short introduction to the fundamentals of compressed sensing theory, which is the larger context of this thesis. We start with introducing the key concepts of sparsity and sparse representations of signals. We discuss the central problem of compressed sensing, i.e. how to adequately recover sparse signals from a small number of measurements, as well as the multiple formulations of the reconstruction problem. A large part of the chapter is devoted to some of the most important conditions necessary and/or sufficient to guarantee accurate recovery. The aim is to introduce the reader to the basic results, without the burden of detailed proofs. In addition, we also present a few of the popular reconstruction and optimization algorithms that we use throughout the thesis. Chapter 3 presents an alternative sparsity model known as analysis sparsity, that offers similar recovery ...

Cleju, Nicolae — "Gheorghe Asachi" Technical University of Iasi

Exploiting Sparsity for Efficient Compression and Analysis of ECG and Fetal-ECG Signals

Over the last decade there has been an increasing interest in solutions for the continuous monitoring of health status with wireless, and in particular, wearable devices that provide remote analysis of physiological data. The use of wireless technologies have introduced new problems such as the transmission of a huge amount of data within the constraint of limited battery life devices. The design of an accurate and energy efficient telemonitoring system can be achieved by reducing the amount of data that should be transmitted, which is still a challenging task on devices with both computational and energy constraints. Furthermore, it is not sufficient merely to collect and transmit data, and algorithms that provide real-time analysis are needed. In this thesis, we address the problems of compression and analysis of physiological data using the emerging frameworks of Compressive Sensing (CS) and sparse ...

Da Poian, Giulia — University of Udine

Dynamic Scheme Selection in Image Coding

This thesis deals with the coding of images with multiple coding schemes and their dynamic selection. In our society of information highways, electronic communication is taking everyday a bigger place in our lives. The number of transmitted images is also increasing everyday. Therefore, research on image compression is still an active area. However, the current trend is to add several functionalities to the compression scheme such as progressiveness for more comfortable browsing of web-sites or databases. Classical image coding schemes have a rigid structure. They usually process an image as a whole and treat the pixels as a simple signal with no particular characteristics. Second generation schemes use the concept of objects in an image, and introduce a model of the human visual system in the design of the coding scheme. Dynamic coding schemes, as their name tells us, make ...

Fleury, Pascal — Swiss Federal Institute of Technology

Distributed Signal Processing for Binaural Hearing Aids

Over the last centuries, hearing aids have evolved from crude and bulky horn-shaped instruments to lightweight and almost invisible digital signal processing devices. While most of the research has focused on the design of monaural apparatus, the use of a wireless link has been recently advocated to enable data transfer between hearing aids such as to obtain a binaural system. The availability of a wireless link offers brand new perspectives but also poses great technical challenges. It requires the design of novel signal processing schemes that address the restricted communication bitrates, processing delays and power consumption limitations imposed by wireless hearing aids. The goal of this dissertation is to address these issues at both a theoretical and a practical level. We start by taking a distributed source coding view on the problem of binaural noise reduction. The proposed analysis allows ...

Roy, Olivier — EPFL

Decompositions parcimonieuses: approches Baysiennes et application a la compression d' image

This thesis interests in different methods of image compression combining both Bayesian aspects and sparse decomposition'' aspects. Two compression methods are in particular investigated. Transform coding, first, is addressed from a transform optimization point of view. The optimization is considered at two levels: in the spatial domain by adapting the support of the transform, and in the transform domain by selecting local bases among finite sets. The study of bases learned with an algorithm from the literature constitutes an introduction to a novel learning algorithm, which encourages the sparsity of the decompositions. Predictive coding is then addressed. Motivated by recent contributions based on sparse decompositions, we propose a novel Bayesian prediction algorithm based on mixtures of sparse decompositions. Finally, these works allowed to underline the interest of structuring the sparsity of the decompositions. For example, a weighting of the decomposition ...

Dremeau, Angelique — INRIA

Robust and multiresolution video delivery : From H.26x to Matching pursuit based technologies

With the joint development of networking and digital coding technologies multimedia and more particularly video services are clearly becoming one of the major consumers of the new information networks. The rapid growth of the Internet and computer industry however results in a very heterogeneous infrastructure commonly overloaded. Video service providers have nevertheless to oer to their clients the best possible quality according to their respective capabilities and communication channel status. The Quality of Service is not only inuenced by the compression artifacts, but also by unavoidable packet losses. Hence, the packet video stream has clearly to fulll possibly contradictory requirements, that are coding eciency and robustness to data loss. The rst contribution of this thesis is the complete modeling of the video Quality of Service (QoS) in standard and more particularly MPEG-2 applications. The performance of Forward Error Control (FEC) ...

Frossard, Pascal — Swiss Federal Institute of Technology

Mixed structural models for 3D audio in virtual environments

In the world of Information and communications technology (ICT), strategies for innovation and development are increasingly focusing on applications that require spatial representation and real-time interaction with and within 3D-media environments. One of the major challenges that such applications have to address is user-centricity, reflecting e.g. on developing complexity-hiding services so that people can personalize their own delivery of services. In these terms, multimodal interfaces represent a key factor for enabling an inclusive use of new technologies by everyone. In order to achieve this, multimodal realistic models that describe our environment are needed, and in particular models that accurately describe the acoustics of the environment and communication through the auditory modality are required. Examples of currently active research directions and application areas include 3DTV and future internet, 3D visual-sound scene coding, transmission and reconstruction and teleconferencing systems, to name but ...

Geronazzo, Michele — University of Padova

Sound Source Separation in Monaural Music Signals

Sound source separation refers to the task of estimating the signals produced by individual sound sources from a complex acoustic mixture. It has several applications, since monophonic signals can be processed more efficiently and flexibly than polyphonic mixtures. This thesis deals with the separation of monaural, or, one-channel music recordings. We concentrate on separation methods, where the sources to be separated are not known beforehand. Instead, the separation is enabled by utilizing the common properties of real-world sound sources, which are their continuity, sparseness, and repetition in time and frequency, and their harmonic spectral structures. One of the separation approaches taken here use unsupervised learning and the other uses model-based inference based on sinusoidal modeling. Most of the existing unsupervised separation algorithms are based on a linear instantaneous signal model, where each frame of the input mixture signal is modeled ...

Virtanen, Tuomas — Tampere University of Technology

The current layout is optimized for mobile phones. Page previews, thumbnails, and full abstracts will remain hidden until the browser window grows in width.

The current layout is optimized for tablet devices. Page previews and some thumbnails will remain hidden until the browser window grows in width.