Stereoscopic depth map estimation and coding techniques for multiview video systems

The dissertation deals with the problems of stereoscopic depth estimation and coding in multiview video systems, which are vital for development of the next generation three-dimensional television. The depth estimation algorithms known from literature, along with theoretical foundations are discussed. The problem of estimation of depth maps with high quality, expressed by means of accuracy, precision and temporal consistency, has been stated. Next, original solutions have been proposed. Author has proposed a novel, theoretically founded approach to depth estimation which employs Maximum A posteriori Probability (MAP) rule for modeling of the cost function used in optimization algorithms. The proposal has been presented along with a method for estimation of parameters of such model. In order to attain that, an analysis of the noise existing in multiview video and a study of inter-view correlation of corresponding samples of pictures have been ...

Stankiewicz, Olgierd — Poznan University of Technology


Planar 3D Scene Representations for Depth Compression

The recent invasion of stereoscopic 3D television technologies is expected to be followed by autostereoscopic and holographic technologies. Glasses-free multiple stereoscopic pair displaying capabilities of these technologies will advance the 3D experience. The prospective 3D format to create the multiple views for such displays is Multiview Video plus Depth (MVD) format based on the Depth Image Based Rendering (DIBR) techniques. The depth modality of the MVD format is an active research area whose main objective is to develop DIBR friendly efficient compression methods. As a part this research, the thesis proposes novel 3D planar-based depth representations. The planar approximation of the stereo depth images is formulated as an energy-based co-segmentation problem by a Markov Random Field model. The energy terms of this problem are designed to mimic the rate-distortion tradeoff for a depth compression application. A heuristic algorithm is developed ...

Özkalaycı, Burak Oğuz — Middle East Technical University


Real Time Stereo to Multi-view Video Conversion

A novel and efficient methodology is presented for the conversion of stereo to multi-view video in order to address the 3D content requirements for the next generation 3D-TVs and auto-stereoscopic multi-view displays. There are two main algorithmic blocks in such a conversion system; stereo matching and virtual view rendering that enable extraction of 3D information from stereo video and synthesis of inexistent virtual views, respectively. In the intermediate steps of these functional blocks, a novel edge-preserving filter is proposed that recursively constructs connected support regions for each pixel among color-wise similar neighboring pixels. The proposed recursive update structure eliminates pre-defined window dependency of the conventional approaches, providing complete content adaptibility with quite low computational complexity. Based on extensive tests, it is observed that the proposed filtering technique yields better or competetive results against some leading techniques in the literature. The ...

Cigla, Cevahir — Middle East Technical University


Distributed Compressed Representation of Correlated Image Sets

Vision sensor networks and video cameras find widespread usage in several applications that rely on effective representation of scenes or analysis of 3D information. These systems usually acquire multiple images of the same 3D scene from different viewpoints or at different time instants. Therefore, these images are generally correlated through displacement of scene objects. Efficient compression techniques have to exploit this correlation in order to efficiently communicate the 3D scene information. Instead of joint encoding that requires communication between the cameras, in this thesis we concentrate on distributed representation, where the captured images are encoded independently, but decoded jointly to exploit the correlation between images. One of the most important and challenging tasks relies in estimation of the underlying correlation from the compressed correlated images for effective reconstruction or analysis in the joint decoder. This thesis focuses on developing efficient ...

Thirumalai, Vijayaraghavan — EPFL, Switzerland


Nonlinear rate control techniques for constant bit rate MPEG video coders

Digital visual communication has been increasingly adopted as an efficient new medium in a variety of different fields; multi-media computers, digital televisions, telecommunications, etc. Exchange of visual information between remote sites requires that digital video is encoded by compressing the amount of data and transmitting it through specified network connections. The compression and transmission of digital video is an amalgamation of statistical data coding processes, which aims at efficient exchange of visual information without technical barriers due to different standards, services, media, etc. It is associated with a series of different disciplines of digital signal processing, each of which can be applied independently. It includes a few different technical principles; distortion, rate theory, prediction techniques and control theory. The MPEG (Moving Picture Experts Group) video compression standard is based on this paradigm, thus, it contains a variety of different coding ...

Saw, Yoo-Sok — University Of Edinburgh


WATERMARKING FOR 3D REPRESENTATIONS

In this thesis, a number of novel watermarking techniques for different 3D representations are presented. A novel watermarking method is proposed for the mono-view video, which might be interpreted as the basic implicit representation of 3D scenes. The proposed method solves the common flickering problem in the existing video watermarking schemes by means of adjusting the watermark strength with respect to temporal contrast thresholds of human visual system (HVS), which define the maximum invisible distortions in the temporal direction. The experimental results indicate that the proposed method gives better results in both objective and subjective measures, compared to some recognized methods in the literature. The watermarking techniques for the geometry and image based representations of 3D scenes, denoted as 3D watermarking, are examined and classified into three groups, as 3D-3D, 3D-2D and 2D-2D watermarking, in which the pair of symbols ...

Koz, Alper — Middle East Technical University, Department of Electrical and Electronics Engineering


Robust and multiresolution video delivery : From H.26x to Matching pursuit based technologies

With the joint development of networking and digital coding technologies multimedia and more particularly video services are clearly becoming one of the major consumers of the new information networks. The rapid growth of the Internet and computer industry however results in a very heterogeneous infrastructure commonly overloaded. Video service providers have nevertheless to oer to their clients the best possible quality according to their respective capabilities and communication channel status. The Quality of Service is not only inuenced by the compression artifacts, but also by unavoidable packet losses. Hence, the packet video stream has clearly to fulll possibly contradictory requirements, that are coding eciency and robustness to data loss. The rst contribution of this thesis is the complete modeling of the video Quality of Service (QoS) in standard and more particularly MPEG-2 applications. The performance of Forward Error Control (FEC) ...

Frossard, Pascal — Swiss Federal Institute of Technology


Efficient representation, generation and compression of digital holograms

Digital holography is a discipline of science that measures or reconstructs the wavefield of light by means of interference. The wavefield encodes three-dimensional information, which has many applications, such as interferometry, microscopy, non-destructive testing and data storage. Moreover, digital holography is emerging as a display technology. Holograms can recreate the wavefield of a 3D object, thereby reproducing all depth cues for all viewpoints, unlike current stereoscopic 3D displays. At high quality, the appearance of an object on a holographic display system becomes indistinguishable from a real one. High-quality holograms need large volumes of data to be represented, approaching resolutions of billions of pixels. For holographic videos, the data rates needed for transmitting and encoding of the raw holograms quickly become unfeasible with currently available hardware. Efficient generation and coding of holograms will be of utmost importance for future holographic displays. ...

Blinder, David — Vrije Universiteit Brussel


Motion Analysis and Modeling for Activity Recognition and 3-D Animation based on Geometrical and Video Processing Algorithms

The analysis of audiovisual data aims at extracting high level information, equivalent with the one(s) that can be extracted by a human. It is considered as a fundamental, unsolved (in its general form) problem. Even though the inverse problem, the audiovisual (sound and animation) synthesis, is judged easier than the previous, it remains an unsolved problem. The systematic research on these problems yields solutions that constitute the basis for a great number of continuously developing applications. In this thesis, we examine the two aforementioned fundamental problems. We propose algorithms and models of analysis and synthesis of articulated motion and undulatory (snake) locomotion, using data from video sequences. The goal of this research is the multilevel information extraction from video, like object tracking and activity recognition, and the 3-D animation synthesis in virtual environments based on the results of analysis. An ...

Panagiotakis, Costas — University of Crete


Novel Methods in H.264/AVC (Inter Prediction, Data Hiding, Bit Rate Transcoding)

H.264 Advanced Video Coding has become the dominant video coding standard in the market, within a few years after the first version of the standard was completed by the ISO/IEC MPEG and the ITU-T VCEG groups in May 2003. That happened mainly due to the great coding efficiency of H.264. Compared to MPEG-2, the previous dominant standard, the H.264 compression ratio is about twice as higher for the same video quality. That makes H.264 ideal for a numerous of applications, such as video broadcasting, video streaming and video conferencing. However, the H.264 efficiency is achieved at the expense of the codec¢s complexity. H.264 complexity is about four times that of MPEG-2. As a consequence, many video coding issues, which have been addressed in previous standards, need to be re-considered. For example the H.264 encoding of a video in real time ...

Kapotas, Spyridon — Hellenic Open University


MPEGII Video Coding For Noisy Channels

This thesis considers the performance of MPEG-II compressed video when transmitted over noisy channels, a subject of relevance to digital terrestrial television, video communication and mobile digital video. Results of bit sensitivity and resynchronisation sensitivity measurements are presented and techniques proposed for substantially improving the resilience of MPEG-II to transmission errors without the addition of any extra redundancy into the bitstream. It is errors in variable length encoded data which are found to cause the greatest artifacts as errors in these data can cause loss of bitstream synchronisation. The concept of a ‘black box transcoder’ is developed where MPEG-II is losslessly transcoded into a different structure for transmission. Bitstream resynchronisation is achieved using a technique known as error-resilient entropy coding (EREC). The error-resilience of differentially coded information is then improved by replacing the standard 1D-DPCM with a more resilient hierarchical ...

Swan, Robert — University of Cambridge


3D motion capture by computer vision and virtual rendering

Networked 3D virtual environments allow multiple users to interact with each other over the Internet. Users can share some sense of telepresence by remotely animating an avatar that represents them. However, avatar control may be tedious and still render user gestures poorly. This work aims at animating a user‟s avatar from real time 3D motion capture by monoscopic computer vision, thus allowing virtual telepresence to anyone using a personal computer with a webcam. The approach followed consists of registering a 3D articulated upper-body model to a video sequence. This involves searching iteratively for the best match between features extracted from the 3D model and from the image. A two-step registration process matches regions and then edges. The first contribution of this thesis is a method of allocating computing iterations under real-time constrain that achieves optimal robustness and accuracy. The major ...

Gomez Jauregui, David Antonio — Telecom SudParis


Parallel Magnetic Resonance Imaging reconstruction problems using wavelet representations

To reduce scanning time or improve spatio-temporal resolution in some MRI applications, parallel MRI acquisition techniques with multiple coils have emerged since the early 90’s as powerful methods. In these techniques, MRI images have to be reconstructed from ac- quired undersampled “k-space” data. To this end, several reconstruction techniques have been proposed such as the widely-used SENSitivity Encoding (SENSE) method. However, the reconstructed images generally present artifacts due to the noise corrupting the ob- served data and coil sensitivity profile estimation errors. In this work, we present novel SENSE-based reconstruction methods which proceed with regularization in the complex wavelet domain so as to promote the sparsity of the solution. These methods achieve ac- curate image reconstruction under degraded experimental conditions, in which neither the SENSE method nor standard regularized methods (e.g. Tikhonov) give convincing results. The proposed approaches relies on ...

Lotfi CHAARI — University Paris-Est


Multiple Objective Optimization for Video Streaming

In this thesis, we propose Multiple Objective Optimization (MOO) frameworks for efficient video streaming. Firstly, we introduce pre-roll delay-distortion optimization (DDO) for uninterrupted content-adaptive video streaming over low capacity, constant bitrate (CBR) channels using MOO. Content analysis is used to divide the input video into shots with assigned relevance levels. The video is adaptively encoded and streamed aiming minimum pre-roll delay and distortion with the optimal spatial and temporal resolutions and quantization parameters for each shot. With buffer and distortion constraints, the bitrate of unimportant shots is reduced to achieve an acceptable quality in important shots. Secondly, we introduce a cross-layer optimized video rate adaptation and scheduling scheme to achieve maximum "application layer" Quality-of-Service (QoS), maximum video throughput (video seconds per transmission slot), and QoS fairness for wireless video streaming. Using the MOO framework, these objectives are jointly optimized such ...

Ozcelebi, Tanir — Koc University


ROBUST WATERMARKING TECHNIQUES FOR SCALABLE CODED IMAGE AND VIDEO

In scalable image/video coding, high resolution content is encoded to the highest visual quality and the bit-streams are adapted to cater various communication channels, display devices and usage requirements. These content adaptations, which include quality, resolution and frame rate scaling may also affect the content protection data, such as, watermarks and are considered as a potential watermark attack. In this thesis, research on robust watermarking techniques for scalable coded image and video, are proposed and the improvements in robustness against various content adaptation attacks, such as, JPEG 2000 for image and Motion JPEG 2000, MC-EZBC and H.264/SVC for video, are reported. The spread spectrum domain, particularly wavelet-based image watermarking schemes often provides better robustness to compression attacks due to its multi-resolution decomposition and hence chosen for this work. A comprehensive and comparative analysis of the available wavelet-based watermarking schemes,is performed ...

Bhowmik, Deepayan — University of Sheffield

The current layout is optimized for mobile phones. Page previews, thumbnails, and full abstracts will remain hidden until the browser window grows in width.

The current layout is optimized for tablet devices. Page previews and some thumbnails will remain hidden until the browser window grows in width.