Motion Estimation and Compensation of Video Sequences using Affine Transforms (1999)
Image Sequence Restoration Using Gibbs Distributions
This thesis addresses a number of issues concerned with the restoration of one type of image sequence, namely archived black and white motion pictures. These are often a valuable historical record, but because of the physical nature of the film they can suffer from a variety of degradations which reduce their usefulness. The main visual defects are ‘dirt and sparkle’, due to dust and dirt becoming attached to the film or abrasion removing the emulsion, and ‘line scratches’, due to the film running against foreign bodies in the camera or projector. For an image restoration algorithm to be successful it must be based on a mathematical model of the image. A number of models have been proposed and here we explore the use of a general class of model known as Markov Random Fields (MRFs) based on Gibbs distributions by ...
Morris, Robin David — University of Cambridge
A statistical approach to motion estimation
Digital video technology has been characterized by steady growth in the last decade. New applications like video e-mail, third generation mobile phone video communications, videoconferencing, and video streaming on the web continuously push for further evolution of research in digital video coding. In order to be sent over the internet or even wireless networks, video information clearly needs compression to meet bandwidth requirements. Compression is mainly realized by exploiting the redundancy present in the data. A sequence of images contains an intrinsic, intuitive and simple idea of redundancy: two successive images are very similar. This simple concept is called temporal redundancy. The search for a proper scheme to exploit the temporal redundancy completely changes the scenario between compression of still pictures and sequences of images. It also represents the key for very high performance in image sequence coding when compared ...
Moschetti, Fulvio — Swiss Federal Institute of Technology
Techniques for improving the performance of distributed video coding
Distributed Video Coding (DVC) is a recently proposed paradigm in video communication, which is well suited to emerging applications such as wireless video surveillance, multimedia sensor networks, wireless PC cameras, and mobile camera phones. These applications require low-complexity encoding, while possibly affording high-complexity decoding. DVC presents several advantages: First, the complexity can be distributed between the encoder and the decoder. Second, DVC is robust to errors, since it uses a channel code. In DVC, Side Information (SI) is estimated at the decoder, using the available decoded frames, and used for the decoding and reconstruction of other frames. In this Ph.D. thesis, we propose new techniques in order to improve the quality of the SI. First, successive refinement of the SI is performed after each decoded DCT band, using a Partially Decoded WZF (PDWZF), along with the ...
Abou-Elailah, Abdalbassir — Telecom Paristech
Toward sparse and geometry adapted video approximations
Video signals are sequences of natural images, where images are often modeled as piecewise-smooth signals. Hence, video can be seen as a 3D piecewise-smooth signal made of piecewise-smooth regions that move through time. Based on the piecewise-smooth model and on related theoretical work on rate-distortion performance of wavelet and oracle based coding schemes, one can better analyze the appropriate coding strategies that adaptive video codecs need to implement in order to be efficient. Efficient video representations for coding purposes require the use of adaptive signal decompositions able to capture appropriately the structure and redundancy appearing in video signals. Adaptivity needs to be such that it allows for proper modeling of signals in order to represent these with the lowest possible coding cost. Video is a very structured signal with high geometric content. This includes temporal geometry (normally represented by motion ...
Divorra Escoda, Oscar — EPFL / Signal Processing Institute
PRIORITIZED 3D SCENE RECONSTRUCTION AND RATE-DISTORTION
In this dissertation, a novel scheme performing 3D reconstruction of a scene from a 2D video sequence is presented. To this aim, first, the trajectories of the salient features in the scene are determined as a sequence of displacements via a Kanade-Lucas-Tomasi tracker and a Kalman filter. Then, a tentative camera trajectory with respect to a metric reference reconstruction is estimated. All frame pairs are ordered with respect to their amenability to 3D reconstruction by a metric that utilizes the baseline distances and the number of tracked correspondences between the frames. The ordered frame pairs are processed via a sequential structure-from-motion algorithm to estimate the sparse structure and camera matrices. The metric and the associated reconstruction algorithm are shown to outperform their counterparts in the literature via experiments. Finally, a mesh-based, rate-distortion efficient representation is constructed through a novel procedure ...
Imre, Evren — Middle East Technical University, Department of Electrical and Electronics Engineering
Novel Methods in H.264/AVC (Inter Prediction, Data Hiding, Bit Rate Transcoding)
H.264 Advanced Video Coding has become the dominant video coding standard in the market, within a few years after the first version of the standard was completed by the ISO/IEC MPEG and the ITU-T VCEG groups in May 2003. That happened mainly due to the great coding efficiency of H.264. Compared to MPEG-2, the previous dominant standard, the H.264 compression ratio is about twice as high for the same video quality. That makes H.264 ideal for numerous applications, such as video broadcasting, video streaming and video conferencing. However, the H.264 efficiency is achieved at the expense of the codec's complexity. H.264 complexity is about four times that of MPEG-2. As a consequence, many video coding issues, which have been addressed in previous standards, need to be re-considered. For example the H.264 encoding of a video in real time ...
Kapotas, Spyridon — Hellenic Open University
Video Sequence Analysis for Content Description, Summarization and Content-Based Retrieval
The main research area of this Ph.D. thesis is video sequence processing and analysis for description and indexing of visual content. Its objective is to contribute to the development of a computational system with the capabilities of object-based segmentation of audiovisual material, automatic content description, summarization for preview and browsing, as well as content-based retrieval. The thesis consists of four parts. The first introduces video sequence analysis, segmentation and object extraction based on color, motion, and depth field. A fusion technique is proposed that combines individual cue segmentations and allows for reliable identification of semantic objects. The second part refers to automatic description and annotation of the visual content by means of feature vectors, summarization, implemented by optimal selection of a limited set of key frames and shots, and content-based search and retrieval. In the third part, the problem of ...
Avrithis, Yannis — National Technical University of Athens
Gliomas represent about 80% of all malignant primary brain tumors. Despite recent advancements in glioma research, patient outcome remains poor. The 5 year survival rate of the most common and most malignant subtype, i.e. glioblastoma, is about 5%. Magnetic resonance imaging (MRI) has become the imaging modality of choice in the management of brain tumor patients. Conventional MRI (cMRI) provides excellent soft tissue contrast without exposing the patient to potentially harmful ionizing radiation. Over the past decade, advanced MRI modalities, such as perfusion-weighted imaging (PWI), diffusion-weighted imaging (DWI) and magnetic resonance spectroscopic imaging (MRSI) have gained interest in the clinical field, and their added value regarding brain tumor diagnosis, treatment planning and follow-up has been recognized. Tumor segmentation involves the imaging-based delineation of a tumor and its subcompartments. In gliomas, segmentation plays an important role in treatment planning as well ...
Sauwen, Nicolas — KU Leuven
Radial Basis Function Network Robust Learning Algorithms in Computer Vision Applications
This thesis introduces new learning algorithms for Radial Basis Function (RBF) networks. An RBF network is a feed-forward two-layer neural network used for functional approximation or pattern classification applications. The proposed training algorithms are based on robust statistics. Their theoretical performance has been assessed and compared with that of classical algorithms for training RBF networks. The applications of RBF networks described in this thesis consist of simultaneously modeling moving object segmentation and optical flow estimation in image sequences and 3-D image modeling and segmentation. A Bayesian classifier model is used for the representation of the image sequence and 3-D images. This employs an energy based description of the probability functions involved. The energy functions are represented by RBF networks whose inputs are various features drawn from the images and whose outputs are objects. The hidden units embed kernel functions. Each kernel ...
Bors, Adrian G. — Aristotle University of Thessaloniki
MPEGII Video Coding For Noisy Channels
This thesis considers the performance of MPEG-II compressed video when transmitted over noisy channels, a subject of relevance to digital terrestrial television, video communication and mobile digital video. Results of bit sensitivity and resynchronisation sensitivity measurements are presented and techniques proposed for substantially improving the resilience of MPEG-II to transmission errors without the addition of any extra redundancy into the bitstream. It is errors in variable length encoded data which are found to cause the greatest artifacts as errors in these data can cause loss of bitstream synchronisation. The concept of a ‘black box transcoder’ is developed where MPEG-II is losslessly transcoded into a different structure for transmission. Bitstream resynchronisation is achieved using a technique known as error-resilient entropy coding (EREC). The error-resilience of differentially coded information is then improved by replacing the standard 1D-DPCM with a more resilient hierarchical ...
Swan, Robert — University of Cambridge
In Wireless Sensor Networks (WSN), the ability of sensor nodes to know their position is an enabler for a wide variety of applications for monitoring, control, and automation. Often, sensor data is meaningful only if its position can be determined. Many WSN are deployed indoors or in areas where Global Navigation Satellite System (GNSS) signal coverage is not available, and thus GNSS positioning cannot be guaranteed. In these scenarios, WSN may be relied upon to achieve a satisfactory degree of positioning accuracy. Typically, batteries power sensor nodes in WSN. These batteries are costly to replace. Therefore, power consumption is an important aspect, since the performance and lifetime of a WSN rely strongly on the ability to reduce it. It is crucial to design effective strategies to maximize battery lifetime. Optimization of power consumption can be made at different layers. For example, at the ...
Moragrega, Ana — Universitat Politecnica de Catalunya
This thesis presents a system for the interpretation of natural speech which serves as the input module for a spoken dialog system. It carries out the task of extracting application-specific pieces of information from the user utterance in order to pass them to the control module of the dialog system. By following the approach of integrating speech recognition and speech interpretation, the system is able to determine the spoken word sequence together with the hierarchical utterance structure that is necessary for the extraction of information directly from the recorded speech signal. The efficient implementation of the underlying decoder is based on the powerful tool of weighted finite state transducers (WFSTs). This tool allows all involved knowledge sources to be compiled into an optimized network representation of the search space which is constructed dynamically during the ongoing decoding process. In addition to the ...
Lieb, Robert — Technische Universität München
Low Complexity Image Recognition Algorithms for Handheld Devices
Content Based Image Retrieval (CBIR) has gained a lot of interest over the last two decades. The need to search and retrieve images from databases, based on information (“features”) extracted from the image itself, is becoming increasingly important. CBIR can be useful for handheld image recognition devices in which the image to be recognized is acquired with a camera, and thus there is no additional metadata associated with it. However, most CBIR systems require large computations, preventing their use in handheld devices. In this PhD work, we have developed low-complexity algorithms for content based image retrieval in handheld devices for camera acquired images. Two novel algorithms, ‘Color Density Circular Crop’ (CDCC) and ‘DCT-Phase Match’ (DCTPM), to perform image retrieval along with a two-stage image retrieval algorithm that combines CDCC and DCTPM, to achieve the low complexity required in handheld devices ...
Ayyalasomayajula, Pradyumna — EPFL
Antenna Arrays for Multipath and Interference Mitigation in GNSS Receivers
This thesis deals with the synchronization of one or several replicas of a known signal received in a scenario with multipath propagation and directional interference. A connecting theme along this work is the systematic application of the maximum likelihood (ML) principle together with a signal model in which the spatial signatures are unstructured and the noise term is Gaussian-distributed with an unknown correlation matrix. This last assumption is key in obtaining estimators that are capable of mitigating the disturbing signals that exhibit a certain structure, and this is achieved without resorting to the estimation of the parameters of those signals. On the other hand, the assumption of unstructured spatial signatures is interesting from a practical standpoint and facilitates the estimation problem since the estimates of these signatures can be obtained in closed form. This constitutes a first step towards ...
Seco-Granados, Gonzalo — Universitat Politecnica de Catalunya
The present doctoral thesis aims towards the development of new long-term, multi-channel, audio-visual processing techniques for the analysis of bioacoustics phenomena. The effort is focused on the study of the physiology of the gastrointestinal system, aiming at the support of medical research for the discovery of gastrointestinal motility patterns and the diagnosis of functional disorders. The term "processing" in this case is quite broad, incorporating the procedures of signal processing, content description, manipulation and analysis, that are applied to all the recorded bioacoustics signals, the auxiliary audio-visual surveillance information (for the monitoring of experiments and the subjects' status), and the extracted audio-video sequences describing the abdominal sound-field alterations. The thesis outline is as follows. The main objective of the thesis, which is the technological support of medical research, is presented in the first chapter. A quick problem definition is initially ...
Dimoulas, Charalampos — Department of Electrical and Computer Engineering, Faculty of Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece