Integration of human color vision models into high quality image compression

Strong academic and commercial interest in image compression has resulted in a number of sophisticated compression techniques. Some of these techniques have evolved into international standards such as JPEG. However, the widespread success of JPEG has slowed the rate of innovation in such standards. Even the most recent techniques, such as those proposed in the JPEG2000 standard, do not show significantly improved compression performance; rather, they increase the bitstream functionality. Nevertheless, the wide range of multimedia applications demands further improvements in compression quality. The problem of stagnating compression quality can be overcome by exploiting the limitations of the human visual system (HVS) for compression purposes. To do so, commonly used distortion metrics such as mean-square error (MSE) are replaced by an HVS-model-based quality metric. Thus, the "visual" quality is optimized. Due to the tremendous complexity of the physiological structures involved in ...

Nadenau, Marcus J. — Swiss Federal Institute of Technology
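
As a small illustration of the mean-square error (MSE) named in the abstract above, the following Python sketch computes MSE for two differently distributed distortions of the same image. The function name `mse` and the tiny 2x2 test images are assumptions made for illustration only; they are not taken from the thesis, and no HVS-based metric is implemented here.

```python
import numpy as np


def mse(reference, distorted):
    """Mean-square error between two equally sized images (hypothetical helper)."""
    ref = np.asarray(reference, dtype=np.float64)
    dist = np.asarray(distorted, dtype=np.float64)
    if ref.shape != dist.shape:
        raise ValueError("images must have the same shape")
    return float(np.mean((ref - dist) ** 2))


# Illustrative 2x2 "images" (assumed values, not data from the thesis).
reference = np.zeros((2, 2))

# Distortion A: every pixel is shifted by 2.
uniform_shift = reference + 2.0

# Distortion B: a single pixel is off by 4, the rest are untouched.
single_spike = reference.copy()
single_spike[0, 0] = 4.0

# Both distortions have the same MSE although the error is spread differently.
print(mse(reference, uniform_shift), mse(reference, single_spike))
```

Both distortions score identically under MSE even though the error is distributed very differently across pixels, which hints at why the abstract argues for replacing MSE with a metric based on a model of human vision.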


Vision models and quality metrics for image processing applications

Optimizing the performance of digital imaging systems with respect to the capture, display, storage and transmission of visual information represents one of the biggest challenges in the field of image and video processing. Taking into account the way humans perceive visual information can be greatly beneficial for this task. To achieve this, it is necessary to understand and model the human visual system, which is also the principal goal of this thesis. Computational models for different aspects of the visual system are developed, which can be used in a wide variety of image and video processing applications. The proposed models and metrics are shown to be consistent with human perception. The focus of this work is visual quality assessment. A perceptual distortion metric (PDM) for the evaluation of video quality is presented. It is based on a model of the ...

Winkler, Stefan — Swiss Federal Institute of Technology


Audio-visual processing and content management techniques, for the study of (human) bioacoustics phenomena

The present doctoral thesis aims towards the development of new long-term, multi-channel, audio-visual processing techniques for the analysis of bioacoustics phenomena. The effort is focused on the study of the physiology of the gastrointestinal system, aiming at the support of medical research for the discovery of gastrointestinal motility patterns and the diagnosis of functional disorders. The term "processing" in this case is quite broad, incorporating the procedures of signal processing, content description, manipulation and analysis, that are applied to all the recorded bioacoustics signals, the auxiliary audio-visual surveillance information (for the monitoring of experiments and the subjects' status), and the extracted audio-video sequences describing the abdominal sound-field alterations. The thesis outline is as follows. The main objective of the thesis, which is the technological support of medical research, is presented in the first chapter. A quick problem definition is initially ...

Dimoulas, Charalampos — Department of Electrical and Computer Engineering, Faculty of Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece


Advances in DFT-Based Single-Microphone Speech Enhancement

The interest in the field of speech enhancement emerges from the increased usage of digital speech processing applications like mobile telephony, digital hearing aids and human-machine communication systems in our daily life. The trend to make these applications mobile increases the variety of potential sources for quality degradation. Speech enhancement methods can be used to increase the quality of these speech processing devices and make them more robust under noisy conditions. The name "speech enhancement" refers to a large group of methods that are all meant to improve certain quality aspects of these devices. Examples of speech enhancement algorithms are echo control, bandwidth extension, packet loss concealment and noise reduction. In this thesis we focus on single-microphone additive noise reduction and aim at methods that work in the discrete Fourier transform (DFT) domain. The main objective of the presented research ...

Hendriks, Richard Christian — Delft University of Technology


Mixed structural models for 3D audio in virtual environments

In the world of information and communications technology (ICT), strategies for innovation and development are increasingly focusing on applications that require spatial representation and real-time interaction with and within 3D-media environments. One of the major challenges that such applications have to address is user-centricity, reflected, e.g., in the development of complexity-hiding services so that people can personalize their own delivery of services. In these terms, multimodal interfaces represent a key factor for enabling an inclusive use of new technologies by everyone. In order to achieve this, multimodal realistic models that describe our environment are needed, and in particular models that accurately describe the acoustics of the environment and communication through the auditory modality are required. Examples of currently active research directions and application areas include 3DTV and future internet, 3D visual-sound scene coding, transmission and reconstruction and teleconferencing systems, to name but ...

Geronazzo, Michele — University of Padova


Time-frequency analysis of optical and electrical cardiac signals with applications in ultra-high-field MRI

Electrocardiography (ECG) is the standard method for assessing the state of the cardiovascular system non-invasively. In the context of magnetic resonance imaging (MRI) the ECG signal is used for cardiac monitoring and triggering, i.e., the acquisition of images synchronized to the cardiac cycle. However, ECG acquisition is impeded by the static and dynamic magnetic fields which alter the measured voltages and may reduce signal-to-noise ratio (SNR), leading to false alarms during cardiac monitoring or to image artifacts during cardiac triggering. A major source of noise is the magnetohydrodynamic (MHD) effect as it is proportional to field strength and represents a key challenge in the application of ultra-high-field (UHF) MRI (≥7 T). In this work, two approaches for overcoming these limitations are proposed: i) Development of a hardware and software system based on the principle of photoplethysmography imaging (PPGi) as an optical ...

Spicher, Nicolai — University of Duisburg-Essen


MPEGII Video Coding For Noisy Channels

This thesis considers the performance of MPEG-II compressed video when transmitted over noisy channels, a subject of relevance to digital terrestrial television, video communication and mobile digital video. Results of bit sensitivity and resynchronisation sensitivity measurements are presented and techniques proposed for substantially improving the resilience of MPEG-II to transmission errors without the addition of any extra redundancy into the bitstream. It is errors in variable length encoded data which are found to cause the greatest artifacts as errors in these data can cause loss of bitstream synchronisation. The concept of a ‘black box transcoder’ is developed where MPEG-II is losslessly transcoded into a different structure for transmission. Bitstream resynchronisation is achieved using a technique known as error-resilient entropy coding (EREC). The error-resilience of differentially coded information is then improved by replacing the standard 1D-DPCM with a more resilient hierarchical ...

Swan, Robert — University of Cambridge


Adapted Fusion Schemes for Multimodal Biometric Authentication

This Thesis is focused on the combination of multiple biometric traits for automatic person authentication, in what is called a multimodal biometric system. More generally, any type of biometric information can be combined in what is called a multibiometric system. The information sources in multibiometrics include not only multiple biometric traits but also multiple sensors, multiple biometric instances (e.g., different fingers in fingerprint verification), repeated instances, and multiple algorithms. Most of the approaches found in the literature for combining these various information sources are based on the combination of the matching scores provided by individual systems built on the different biometric evidences. The combination schemes following this architecture are typically based on combination rules or trained pattern classifiers, and most of them assume that the score level fusion function is fixed at verification time. This Thesis considers the problem of ...

Fierrez, Julian — Universidad Politecnica de Madrid


Direct Pore-based Identification For Fingerprint Matching Process

The fingerprint is considered one of the most crucial scientific tools in solving criminal cases. This biometric feature is composed of unique and distinctive patterns found on the fingertips of each individual. With advancing technology and progress in forensic sciences, fingerprint analysis plays a vital role in forensic investigations and the analysis of evidence at crime scenes. The fingerprint patterns of each individual start to develop in the early stages of life and never change thereafter. This fact makes fingerprints an exceptional means of identification. In criminal cases, fingerprint analysis is used to decipher traces, evidence, and clues at crime scenes. These analyses not only provide insights into how a crime was committed but also assist in identifying the culprits or individuals involved. Computer-based fingerprint identification systems yield faster and more accurate results compared to traditional methods, making fingerprint comparisons in large databases ...

Vedat DELICAN, PhD — Istanbul Technical University


Point Cloud Quality Assessment

Nowadays, richer 3D visual representation formats are emerging, notably light fields and point clouds. These formats enable new applications in many usage domains, notably virtual and augmented reality, geographical information systems, immersive communications, and cultural heritage. Recently, following major improvements in 3D visual data acquisition, there is an increasing interest in point-based visual representation, which models real-world objects as a cloud of sampled points on their surfaces. A point cloud is a 3D representation model where the real visual world is represented by a set of 3D coordinates (the geometry) over the objects with some additional attributes such as color and normals. With the advances in 3D acquisition systems, it is now possible to capture a realistic point cloud to represent a visual scene with a very high resolution. These point clouds may have up to billions of points and, thus, ...

Javaheri, Alireza — Instituto Superior Técnico - University of Lisbon


Sparse Signal Recovery From Incomplete And Perturbed Data

Sparse signal recovery consists of algorithms that are able to recover undersampled high-dimensional signals accurately. These algorithms require fewer measurements than the traditional Shannon/Nyquist sampling theorem demands. Sparse signal recovery has found many applications including magnetic resonance imaging, electromagnetic inverse scattering, radar/sonar imaging, seismic data collection, sensor array processing and channel estimation. The focus of this thesis is on the electromagnetic inverse scattering problem and the joint estimation of the frequency offset and the channel impulse response in OFDM. In the electromagnetic inverse scattering problem, the aim is to find the electromagnetic properties of unknown targets from the measured scattered field. The reconstruction of closely placed point-like objects is investigated. The application of the greedy pursuit based sparse recovery methods, OMP and FTB-OMP, is proposed for increasing the reconstruction resolution. The performances of the proposed methods are compared against NESTA and MT-BCS methods. ...

Senyuva, Rifat Volkan — Bogazici University
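
The greedy pursuit idea behind OMP (orthogonal matching pursuit), named in the abstract above, can be sketched in a few lines of Python. This is a minimal textbook-style sketch under stated assumptions, not the thesis's implementation: the function name `omp`, its parameters `A`, `y`, `k` and `tol`, and the random test problem are all hypothetical, and the FTB-OMP, NESTA and MT-BCS methods are not shown.

```python
import numpy as np


def omp(A, y, k, tol=1e-10):
    """Greedy sparse recovery of x from y = A @ x with at most k nonzeros.

    Hypothetical helper for illustration: A is an (m, n) measurement matrix,
    y an (m,) measurement vector, k the assumed sparsity level.
    """
    A = np.asarray(A, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    n = A.shape[1]
    x = np.zeros(n)
    support = []              # indices of the columns selected so far
    coef = np.zeros(0)
    residual = y.copy()
    for _ in range(k):
        if np.linalg.norm(residual) <= tol:
            break             # measurements already explained
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0   # never reselect a column
        support.append(int(np.argmax(correlations)))
        # Least-squares fit on the selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    if support:
        x[support] = coef
    return x


if __name__ == "__main__":
    # Assumed toy problem: 30 random measurements of a 3-sparse length-80 signal.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((30, 80)) / np.sqrt(30)
    x_true = np.zeros(80)
    x_true[[5, 17, 42]] = [1.0, -2.0, 0.5]
    x_hat = omp(A, A @ x_true, k=3)
    print("recovery error:", np.linalg.norm(x_hat - x_true))
```

Each iteration adds one column to the support and re-solves a least-squares problem over all selected columns, which is what distinguishes OMP from plain matching pursuit. Exact recovery from undersampled data depends on properties of the measurement matrix, so the random example makes no claim about the error it prints.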


Dialogue Enhancement and Personalization - Contributions to Quality Assessment and Control

The production and delivery of audio for television involve many creative and technical challenges. One of them is concerned with the level balance between the foreground speech (also referred to as dialogue) and the background elements, e.g., music, sound effects, and ambient sounds. Background elements are fundamental for the narrative and for creating an engaging atmosphere, but they can mask the dialogue, which the audience wishes to follow in a comfortable way. Very different individual factors of the people in the audience clash with the creative freedom of the content creators. As a result, service providers receive regular complaints about difficulties in understanding the dialogue because of overly loud background sounds. While this has been a known issue for at least three decades, works analyzing the problem and up-to-date statistics were scarce before the contributions in this work. Enabling the ...

Torcoli, Matteo — Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)


Planar 3D Scene Representations for Depth Compression

The recent invasion of stereoscopic 3D television technologies is expected to be followed by autostereoscopic and holographic technologies. The glasses-free capability of these technologies to display multiple stereoscopic pairs will advance the 3D experience. The prospective 3D format to create the multiple views for such displays is the Multiview Video plus Depth (MVD) format based on the Depth Image Based Rendering (DIBR) techniques. The depth modality of the MVD format is an active research area whose main objective is to develop DIBR-friendly, efficient compression methods. As a part of this research, the thesis proposes novel 3D planar-based depth representations. The planar approximation of the stereo depth images is formulated as an energy-based co-segmentation problem by a Markov Random Field model. The energy terms of this problem are designed to mimic the rate-distortion tradeoff for a depth compression application. A heuristic algorithm is developed ...

Özkalaycı, Burak Oğuz — Middle East Technical University


Modeling Perceived Quality for Imaging Applications

People of all generations are making more and more use of digital imaging systems in their daily lives. The image content rendered by these digital imaging systems largely differs in perceived quality depending on the system and its applications. To be able to optimize the experience of viewers of this content, understanding and modeling perceived image quality is essential. Research on modeling image quality in a full-reference framework --- where the original content can be used as a reference --- is well established in the literature. In many current applications, however, the perceived image quality needs to be modeled in a no-reference framework in real time. As a consequence, the model needs to quantitatively predict perceived quality of a degraded image without being able to compare it to its original version, and has to achieve this with limited computational complexity in order ...

Liu, Hantao — Delft University of Technology


Audio Signal Processing for Binaural Reproduction with Improved Spatial Perception

Binaural technology aims to reproduce three-dimensional auditory scenes with a high level of realism by providing the auditory display with spatial hearing information. This technology has various applications in virtual acoustics, architectural acoustics, telecommunication and auditory science. One key element in binaural technology is the actual binaural signals, produced by filtering a sound-field with free-field head-related transfer functions (HRTFs). With the increased popularity of spherical microphone arrays for sound-field recording, methods have been developed for rendering binaural signals from these recordings. The use of spherical arrays naturally leads to processing methods that are formulated in the spherical harmonics (SH) domain. For accurate SH representation, high-order functions, of both the sound-field and the HRTF, are required. However, a limited number of microphones, on the one hand, and challenges in acquiring high-resolution individual HRTFs, on the other hand, impose limitations on ...

Ben-Hur, Zamir — Ben-Gurion University of the Negev
