Robust and multiresolution video delivery : From H.26x to Matching pursuit based technologies

With the joint development of networking and digital coding technologies, multimedia and more particularly video services are clearly becoming one of the major consumers of the new information networks. The rapid growth of the Internet and computer industry, however, results in a very heterogeneous infrastructure that is commonly overloaded. Video service providers nevertheless have to offer their clients the best possible quality according to their respective capabilities and communication channel status. The Quality of Service is influenced not only by compression artifacts, but also by unavoidable packet losses. Hence, the packet video stream clearly has to fulfill possibly contradictory requirements, namely coding efficiency and robustness to data loss. The first contribution of this thesis is the complete modeling of the video Quality of Service (QoS) in standard and more particularly MPEG-2 applications. The performance of Forward Error Control (FEC) ...

Frossard, Pascal — Swiss Federal Institute of Technology


A statistical approach to motion estimation

Digital video technology has been characterized by steady growth in the last decade. New applications like video e-mail, third generation mobile phone video communications, videoconferencing, and video streaming on the web continuously push for further evolution of research in digital video coding. In order to be sent over the Internet or even wireless networks, video information clearly needs compression to meet bandwidth requirements. Compression is mainly realized by exploiting the redundancy present in the data. A sequence of images contains an intrinsic, intuitive and simple idea of redundancy: two successive images are very similar. This simple concept is called temporal redundancy. The search for a proper scheme to exploit the temporal redundancy completely changes the scenario between compression of still pictures and compression of image sequences. It also represents the key to very high performance in image sequence coding when compared ...

Moschetti, Fulvio — Swiss Federal Institute of Technology


Adaptive Sparse Coding and Dictionary Selection

Sparse coding is the approximation or representation of signals with the minimum number of coefficients using an overcomplete set of elementary functions. This kind of approximation or representation has found numerous applications in source separation, denoising, coding and compressed sensing. The adaptation of the sparse approximation framework to the signal coding problem is investigated in this thesis. Open problems are the selection of appropriate models and their orders, coefficient quantization and the sparse approximation method. Some of these questions are addressed in this thesis and novel methods are developed. Because almost all recent communication and storage systems are digital, an easy method to compute quantized sparse approximations is introduced in the first part. The model selection problem is investigated next. The linear model can be adapted to better fit a given signal class. It can also be designed based on some a priori information ...

Yaghoobi, Mehrdad — University of Edinburgh


A floating polygon soup representation for 3D video

This thesis presents a new representation called floating polygon soup for applications like 3DTV and FTV (Free Viewpoint Television). The polygon soup is designed for compactness, compression efficiency, and view synthesis quality. The polygons are stored in 2D, with depth values at each corner. They are not necessarily connected to each other and can be deformed (or floated) with respect to viewpoints and time. Starting from multi-view video plus depth (MVD), the construction proceeds in two steps: quadtree decomposition and multi-view redundancy reduction. It results in a compact set of polygons replacing the depth maps while preserving depth discontinuities and geometric details. Next, compression efficiency and view-synthesis quality are evaluated. Classical methods such as inpainting and post-processing are implemented and adapted to the polygon soup. A new compression method is proposed. It exploits the quadtree structure and uses spatial prediction. ...

Colleu, Thomas — INRIA Rennes Bretagne Atlantique / Orange Labs / IETR


A flexible scalable video coding framework with adaptive spatio-temporal decompositions

The work presented in this thesis covers topics that extend the scalability functionalities in video coding and improve the compression performance. Two main novel approaches are presented, each targeting a different part of the scalable video coding (SVC) architecture: motion adaptive wavelet transform based on the wavelet transform in lifting implementation, and a design of a flexible framework for generalised spatio-temporal decomposition. Motion adaptive wavelet transform is based on the newly introduced concept of connectivity-map. The connectivity-map describes the underlying irregular structure of regularly sampled data. To enable a scalable representation of the connectivity-map, the corresponding analysis and synthesis operations have been derived. These are then employed to define a joint wavelet connectivity-map decomposition that serves as an adaptive alternative to the conventional wavelet decomposition. To demonstrate its applicability, the presented decomposition scheme is used in the proposed SVC framework, ...

Sprljan, Nikola — Queen Mary University of London


Sparsity Models for Signals: Theory and Applications

Many signal and image processing applications have benefited remarkably from the theory of sparse representations. In its classical form this theory models a signal as having a sparse representation under a given dictionary -- this is referred to as the "Synthesis Model". In this work we focus on greedy methods for the problem of recovering a signal from a set of deteriorated linear measurements. We consider four different sparsity frameworks that extend the aforementioned synthesis model: (i) the cosparse analysis model; (ii) the signal space paradigm; (iii) the transform domain strategy; and (iv) the sparse Poisson noise model. Our algorithms of interest in the first part of the work are the greedy-like schemes: CoSaMP, subspace pursuit (SP), iterative hard thresholding (IHT) and hard thresholding pursuit (HTP). It has been shown for the synthesis model that these can achieve a stable recovery ...

Giryes, Raja — Technion


ROBUST WATERMARKING TECHNIQUES FOR SCALABLE CODED IMAGE AND VIDEO

In scalable image/video coding, high resolution content is encoded to the highest visual quality and the bit-streams are adapted to cater to various communication channels, display devices and usage requirements. These content adaptations, which include quality, resolution and frame rate scaling, may also affect content protection data such as watermarks, and are considered a potential watermark attack. In this thesis, robust watermarking techniques for scalable coded image and video are proposed, and the improvements in robustness against various content adaptation attacks, such as JPEG 2000 for image and Motion JPEG 2000, MC-EZBC and H.264/SVC for video, are reported. Spread spectrum domain schemes, particularly wavelet-based image watermarking schemes, often provide better robustness to compression attacks due to their multi-resolution decomposition and are hence chosen for this work. A comprehensive and comparative analysis of the available wavelet-based watermarking schemes is performed ...

Bhowmik, Deepayan — University of Sheffield


Decompositions Parcimonieuses Structurees: Application a la presentation objet de la musique

The amount of digital music available both on the Internet and to each listener has risen considerably over the past ten years or so. The organization and accessibility of this amount of data demand that additional information be available, such as artist, album and song names, musical genre, tempo, mood or other symbolic or semantic attributes. Automatic music indexing has thus become a challenging research area. While some tasks are now correctly handled for certain types of music, such as automatic genre classification for stereotypical music, musical instrument recognition on solo performances and tempo extraction, others are more difficult to perform. For example, automatic transcription of polyphonic signals and instrument ensemble recognition are still limited to some particular cases. The goal of our study is not to obtain a perfect transcription of the signals and an exact classification of all the instruments ...

Leveau, Pierre — Universite Pierre et Marie Curie, Telecom ParisTech


Parallel Dictionary Learning Algorithms for Sparse Representations

Sparse representations are intensively used in signal processing applications, like image coding, denoising, echo channel modeling, compression, classification and many others. Recent research has shown encouraging results when the sparse signals are created through the use of a learned dictionary. The current study focuses on finding new methods and algorithms, with a parallel form where possible, for obtaining sparse representations of signals with improved dictionaries that lead to better performance in both representation error and execution time. We attack the general dictionary learning problem by first investigating and proposing new solutions for the sparse representation stage and then moving on to the dictionary update stage, where we propose a new parallel update strategy. Lastly, we study the effect of the representation algorithms on the dictionary update method. We also investigate dictionary learning solutions where the dictionary has a specific form. ...

Irofti, Paul — Politehnica University of Bucharest


Adaptive Nonlocal Signal Restoration and Enhancement Techniques for High-Dimensional Data

The large number of practical applications involving digital images has motivated significant interest in restoration solutions that improve the visual quality of the data in the presence of various acquisition and compression artifacts. Digital images are the result of an acquisition process based on the measurement of a physical quantity of interest incident upon an imaging sensor over a specified period of time. The quantity of interest depends on the targeted imaging application. Common imaging sensors measure the number of photons impinging on a dense grid of photodetectors in order to produce an image similar to what is perceived by the human visual system. Other applications focus on the part of the electromagnetic spectrum not visible to the human visual system, and thus require different sensing technologies to form the image. In all cases, even with the advance of ...

Maggioni, Matteo — Tampere University of Technology


Ondelette et decompositions spatio-temporelles avancees : application au codage video scalable

The objective of this thesis is the study and construction of new scalable transforms used in the t+2D video coding scheme, in order to improve the compression gain. Using the lifting formalism to construct these spatio-temporal transforms allows the introduction of non-linear operators, which are particularly useful for efficiently representing the singularities and discontinuities present in a video sequence. We first focus on the optimization and construction of new motion-compensated temporal transforms, in order to increase objective and subjective coding efficiency. We then consider the design and implementation of M-band filter banks to spatially decompose the temporal subbands. Next, we address the extension of the scalability properties of the M-band synthesis bank to arbitrary rational factors. Finally, we describe the construction of adaptive, non-linear spatial wavelet decompositions and ...

Pau, Gregoire — Telecom Paris


Error Resilience and Concealment Techniques for High Efficiency Video Coding

This thesis investigates the problem of robust coding and error concealment in High Efficiency Video Coding (HEVC). After a review of the current state of the art, a simulation study of error robustness revealed that HEVC has weak protection against network losses, with a significant impact on video quality degradation. Based on this evidence, the first contribution of this work is a new method to reduce the temporal dependencies between motion vectors, which improves the decoded video quality without compromising the compression efficiency. The second contribution of this thesis is a two-stage approach for reducing the mismatch of temporal predictions in case of video streams received with errors or lost data. At the encoding stage, the reference pictures are dynamically distributed based on a constrained Lagrangian rate-distortion optimization to reduce the number of predictions from a single reference. At the ...

João Filipe Monteiro Carreira — Loughborough University London


Novel texture synthesis methods and their application to image prediction and image inpainting

This thesis presents novel exemplar-based texture synthesis methods for image prediction (i.e., predictive coding) and image inpainting problems. The main contributions of this study can also be seen as extensions to simple template matching; however, the texture synthesis problem here is well-formulated in an optimization framework with different constraints. The image prediction problem has first been put into a sparse representations framework by approximating the template with a sparsity constraint. The proposed sparse prediction method with locally adaptive dictionaries has been shown to give better performance when compared to static waveform (such as DCT) dictionaries, and also to the template matching method. The image prediction problem has later been placed into an online dictionary learning framework by adapting conventional dictionary learning approaches for image prediction. The experimental observations show better performance when compared to H.264/AVC intra and sparse prediction. ...

Turkan, Mehmet — INRIA-Rennes, France


Decompositions parcimonieuses: approches Bayesiennes et application a la compression d'image

This thesis investigates different methods of image compression combining both Bayesian aspects and "sparse decomposition" aspects. Two compression methods in particular are investigated. Transform coding, first, is addressed from a transform optimization point of view. The optimization is considered at two levels: in the spatial domain by adapting the support of the transform, and in the transform domain by selecting local bases among finite sets. The study of bases learned with an algorithm from the literature serves as an introduction to a novel learning algorithm, which encourages the sparsity of the decompositions. Predictive coding is then addressed. Motivated by recent contributions based on sparse decompositions, we propose a novel Bayesian prediction algorithm based on mixtures of sparse decompositions. Finally, this work highlights the value of structuring the sparsity of the decompositions. For example, a weighting of the decomposition ...

Dremeau, Angelique — INRIA


WATERMARKING FOR 3D REPRESENTATIONS

In this thesis, a number of novel watermarking techniques for different 3D representations are presented. A novel watermarking method is proposed for mono-view video, which might be interpreted as the basic implicit representation of 3D scenes. The proposed method solves the common flickering problem in existing video watermarking schemes by adjusting the watermark strength with respect to the temporal contrast thresholds of the human visual system (HVS), which define the maximum invisible distortions in the temporal direction. The experimental results indicate that the proposed method gives better results in both objective and subjective measures, compared to some recognized methods in the literature. The watermarking techniques for the geometry and image based representations of 3D scenes, denoted as 3D watermarking, are examined and classified into three groups, as 3D-3D, 3D-2D and 2D-2D watermarking, in which the pair of symbols ...

Koz, Alper — Middle East Technical University, Department of Electrical and Electronics Engineering
