Super-Resolution Image Reconstruction Using Non-Linear Filtering Techniques

Super-resolution (SR) is a filtering technique that combines a sequence of under-sampled and degraded low-resolution images to produce an image at a higher resolution. The reconstruction takes advantage of the additional spatio-temporal data available in the sequence of images portraying the same scene. The fundamental problem addressed in super-resolution is a typical example of an inverse problem, wherein multiple low-resolution (LR) images are used to solve for the original high-resolution (HR) image. Super-resolution has already proved useful in many practical cases where multiple frames of the same scene can be obtained, including medical applications, satellite imaging and astronomical observatories. The application of super-resolution filtering in consumer cameras and mobile devices should become possible in the future, especially as the computational and memory resources in these devices keep increasing. Toward that goal, several research problems need to be ...
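The multi-frame idea can be illustrated with a minimal shift-and-add sketch (a hedged toy example, not the method developed in the thesis), assuming the sub-pixel shifts between LR frames are known integer offsets on the HR grid:

```python
import numpy as np

def shift_and_add_sr(lr_frames, shifts, factor):
    """Naive shift-and-add super-resolution: place each LR pixel onto
    the HR grid at its (integer) sub-pixel offset and average overlaps."""
    h, w = lr_frames[0].shape
    hr_sum = np.zeros((h * factor, w * factor))
    hr_cnt = np.zeros_like(hr_sum)
    for frame, (dy, dx) in zip(lr_frames, shifts):
        hr_sum[dy::factor, dx::factor] += frame
        hr_cnt[dy::factor, dx::factor] += 1
    hr_cnt[hr_cnt == 0] = 1          # avoid division by zero on unfilled sites
    return hr_sum / hr_cnt

# Toy demo: 2x SR from four LR frames covering all four sub-pixel shifts.
rng = np.random.default_rng(0)
hr_true = rng.random((8, 8))
factor = 2
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
lr_frames = [hr_true[dy::factor, dx::factor] for dy, dx in shifts]
hr_est = shift_and_add_sr(lr_frames, shifts, factor)
print(np.allclose(hr_est, hr_true))  # exact in this noise-free toy case
```

Real SR must additionally estimate the (generally non-integer) motion and invert blur and noise, which is what makes it an ill-posed inverse problem.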

Trimeche, Mejdi — Tampere University of Technology


Contributions to Information Fusion: Application to Obstacle Recognition in Visible and Infrared Images

Interest in the intelligent vehicle field has increased in recent years, most probably due to the large number of road accidents. Many accidents could be avoided if a device attached to the vehicle assisted the driver with warnings when dangerous situations are about to occur. In recent years, leading car manufacturers have made significant efforts and supported research in the intelligent vehicle field, proposing solutions for existing problems, especially in the vision domain. Road detection and following, pedestrian or vehicle detection, recognition and tracking, and night vision are examples of applications that have been developed and improved recently. Still, many challenges and unsolved problems remain in the intelligent vehicle domain. Our purpose in this thesis is to design an Obstacle Recognition system for improving road security by ...

Apatean, Anca Ioana — Institut National des Sciences Appliquées de Rouen


Computational Attention: Towards attentive computers

Consciously or unconsciously, humans always pay attention to a wide variety of stimuli. Attention is part of daily life and it is the first step to understanding. The proposed thesis deals with a computational approach to the human attentional mechanism and with its possible applications, mainly in the field of computer vision. In the first stage, the text introduces a rarity-based three-level attention model handling one-dimensional signals as well as images or video sequences. The concept of attention is defined as the transformation of a huge acquired unstructured data set into a smaller structured one while preserving the information: the attentional mechanism turns raw data into intelligence. Afterwards, several applications are described in the fields of machine vision, signal coding and enhancement, medical imaging, event detection and so on. These applications not only show the applicability of the proposed computational ...
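The rarity principle can be sketched in a few lines (a deliberately simplified illustration, not the thesis's three-level model): a sample attracts attention when its value is rare, scored by the self-information of its histogram bin:

```python
import numpy as np

def rarity_map(signal, bins=16):
    """Rarity-based attention for a 1-D signal: a sample is salient when
    its value is rare, scored by self-information -log2 p(bin)."""
    hist, edges = np.histogram(signal, bins=bins)
    p = hist / hist.sum()
    idx = np.clip(np.digitize(signal, edges[1:-1]), 0, bins - 1)
    return -np.log2(p[idx] + 1e-12)

# A mostly-flat signal with one outlier: the outlier gets the top rarity.
sig = np.zeros(100)
sig[42] = 5.0
sal = rarity_map(sig)
print(int(np.argmax(sal)))  # -> 42
```

The same idea extends to images by building the histogram over local features instead of raw samples.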

Mancas, Matei — University of Mons (UMONS)


Dealing with Variability Factors and Its Application to Biometrics at a Distance

This thesis focuses on dealing with variability factors in biometric recognition and on applications of biometrics at a distance. In particular, this PhD thesis explores the problem of assessing variability factors and how to deal with them by incorporating soft biometrics information in order to improve person recognition systems working at a distance. The proposed methods, supported by experimental results, show the benefits of adapting the system to the variability of the sample at hand. Although relatively young compared to other mature and long-used security technologies, biometrics have emerged in the last decade as a compelling alternative for applications where automatic recognition of people is needed. Certainly, biometrics are very attractive and useful for video surveillance systems at a distance, widely present in our lives, and for the final user: forget about PINs and passwords, you ...

Tome, Pedro — Universidad Autónoma de Madrid


Kernel PCA and Pre-Image Iterations for Speech Enhancement

In this thesis, we present novel methods to enhance speech corrupted by noise. All methods are based on the processing of complex-valued spectral data. First, kernel principal component analysis (PCA) for speech enhancement is proposed. Subsequently, a simplification of kernel PCA, called pre-image iterations (PI), is derived. This method computes enhanced feature vectors iteratively by linear combination of noisy feature vectors. The weighting for the linear combination is found by a kernel function that measures the similarity between the feature vectors. The kernel variance is a key parameter for the degree of de-noising and has to be set according to the signal-to-noise ratio (SNR). Initially, PI were proposed for speech corrupted by additive white Gaussian noise. To be independent of knowledge about the SNR and to generalize to other stationary noise types, PI are extended by automatic determination of the ...
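The pre-image iterations described above can be sketched roughly as follows; this toy version (with hypothetical parameter choices, not the thesis's exact algorithm) iteratively replaces each vector by a Gaussian-kernel-weighted combination of the noisy vectors:

```python
import numpy as np

def pre_image_iterations(noisy, sigma2, n_iter=10):
    """Pre-image-style denoising sketch: each feature vector is replaced
    iteratively by a linear combination of all noisy vectors, weighted by
    a Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma2))."""
    z = noisy.copy()
    for _ in range(n_iter):
        # pairwise squared distances between current estimates and noisy data
        d2 = ((z[:, None, :] - noisy[None, :, :]) ** 2).sum(-1)
        w = np.exp(-d2 / (2 * sigma2))
        w /= w.sum(axis=1, keepdims=True)   # normalized combination weights
        z = w @ noisy
    return z

# Toy demo: identical clean vectors corrupted by additive Gaussian noise.
rng = np.random.default_rng(1)
clean = np.tile([1.0, -1.0], (50, 1))
noisy = clean + 0.3 * rng.standard_normal(clean.shape)
den = pre_image_iterations(noisy, sigma2=1.0)
print(np.abs(den - clean).mean() < np.abs(noisy - clean).mean())
```

The sketch makes the role of the kernel variance visible: a larger `sigma2` averages over more neighbors and denoises more aggressively, which is why it must be matched to the SNR.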

Leitner, Christina — Graz University of Technology


Improving data-driven EEG-fMRI analyses for the study of cognitive functioning

Understanding the cognitive processes going on in the human brain requires the combination of several types of observations. For this reason, neuroscience research has in recent years focused increasingly on multimodal approaches. One such multimodal approach is the combination of electroencephalography (EEG) and functional magnetic resonance imaging (fMRI). The non-invasive character of these two modalities makes their combination not only harmless and painless, but also especially suited for widespread research in both clinical and experimental applications. Moreover, the complementarity between the high temporal resolution of EEG and the high spatial resolution of fMRI makes it possible to obtain a more complete picture of the processes under study. However, the combination of EEG and fMRI is challenging, not only on the level of data acquisition, but also when it comes to extracting the activity of interest and interpreting the ...

Vanderperren, Katrien — KU Leuven


Non-Intrusive Speech Intelligibility Prediction

The ability to communicate through speech is important for social interaction. We rely on the ability to communicate with each other even in noisy conditions. Ideally, the speech is easy to understand, but this is not always the case if the speech is degraded, e.g., due to background noise, distortion or hearing impairment. One of the most important factors to consider in relation to such degradations is speech intelligibility, which is a measure of how easy or difficult it is to understand the speech. In this thesis, the focus is on the topic of speech intelligibility prediction. The thesis consists of an introduction to the field of speech intelligibility prediction and a collection of scientific papers. The introduction provides a background to the challenges with speech communication in noisy conditions, followed by an introduction to how speech is produced and ...

Sørensen, Charlotte — Aalborg University


Digital Processing Based Solutions for Life Science Engineering Recognition Problems

The field of Life Science Engineering (LSE) is rapidly expanding and predicted to grow strongly in the coming decades. It covers areas of food and medical research, plant and pest research, and environmental research. In each research area, engineers try to find equations that model a certain life science problem. Once found, they research different numerical techniques to solve for the unknown variables of these equations. Afterwards, solution improvement is examined by adopting more accurate conventional techniques, or by developing novel algorithms. In particular, signal and image processing techniques are widely used to solve those LSE problems that require pattern recognition. However, due to the continuous evolution of life science problems and their natures, these solution techniques cannot cover all aspects, and therefore demand further enhancement and improvement. The thesis presents numerical algorithms of digital signal and image processing to ...

Hussein, Walid — Technische Universität München


Vision models and quality metrics for image processing applications

Optimizing the performance of digital imaging systems with respect to the capture, display, storage and transmission of visual information represents one of the biggest challenges in the field of image and video processing. Taking into account the way humans perceive visual information can be greatly beneficial for this task. To achieve this, it is necessary to understand and model the human visual system, which is also the principal goal of this thesis. Computational models for different aspects of the visual system are developed, which can be used in a wide variety of image and video processing applications. The proposed models and metrics are shown to be consistent with human perception. The focus of this work is visual quality assessment. A perceptual distortion metric (PDM) for the evaluation of video quality is presented. It is based on a model of the ...
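The PDM itself builds on a detailed vision model; as a much simpler point of comparison, the classic PSNR below illustrates the full-reference setting in which such metrics operate (it compares reference and distorted images directly, with no perceptual modeling at all):

```python
import numpy as np

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio: a simple full-reference fidelity
    measure (unlike a perceptual metric, it ignores how errors are seen)."""
    mse = np.mean((reference.astype(float) - distorted.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

ref = np.full((8, 8), 128.0)
dist = ref + 4.0                      # uniform error of 4 grey levels
print(round(psnr(ref, dist), 2))      # 10*log10(255^2/16) = 36.09
```

The weakness PSNR exposes is exactly what motivates perceptual metrics: two distortions with identical MSE can look very different to a human observer.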

Winkler, Stefan — Swiss Federal Institute of Technology


Security/Privacy Analysis of Biometric Hashing and Template Protection for Fingerprint Minutiae

This thesis has two main parts. The first part deals with security and privacy analysis of biometric hashing. The second part introduces a method for fixed-length feature vector extraction and hash generation from fingerprint minutiae. The upsurge of interest in biometric systems has led to the development of biometric template protection methods in order to overcome security and privacy problems. Biometric hashing produces a secure binary template by combining a personal secret key and the biometric of a person, which leads to a two-factor authentication method. This dissertation analyzes biometric hashing both from a theoretical point of view and with regard to its practical application. For theoretical evaluation of biohashes, a systematic approach which uses estimated entropy based on the degrees of freedom of a binomial distribution is outlined. In addition, novel practical security and privacy attacks against face image hashing ...
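The key-plus-biometric combination can be sketched as a BioHashing-style construction (a generic illustration, not necessarily the exact scheme analyzed in the thesis): project the feature vector onto key-seeded random directions, then binarize by thresholding:

```python
import numpy as np

def biohash(feature, key, n_bits=32):
    """BioHashing sketch: project the biometric feature vector onto
    key-seeded random directions, then binarize by thresholding at 0.
    The secret key makes the template revocable: a new key yields a
    completely different template from the same biometric."""
    rng = np.random.default_rng(key)          # key acts as the personal secret
    proj = rng.standard_normal((n_bits, feature.size))
    return (proj @ feature > 0).astype(np.uint8)

# Toy demo with a random stand-in for a real face/minutiae feature vector.
rng = np.random.default_rng(7)
face_feature = rng.standard_normal(64)
h1 = biohash(face_feature, key=1234)
h2 = biohash(face_feature, key=9999)          # different key, same biometric
print(h1.size)                                # 32-bit binary template
```

The two-factor property follows directly: reproducing `h1` requires both the biometric (`face_feature`) and the secret key; the security analyses in the thesis probe what happens when one of the two leaks.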

Topcu, Berkay — Sabanci University


Camera based motion estimation and recognition for human-computer interaction

Communicating with mobile devices has become an unavoidable part of our daily life. Unfortunately, current user interface designs are mostly taken directly from desktop computers. This has resulted in devices that are sometimes hard to use. Since more processing power and new sensing technologies are already available, it is now possible to develop systems that communicate through different modalities. This thesis proposes novel computer vision approaches, including head tracking, object motion analysis and device ego-motion estimation, to allow efficient interaction with mobile devices. For head tracking, two new methods have been developed. The first method detects the face region and facial features by employing skin detection, morphology, and a geometrical face model. The second method, designed especially for mobile use, detects the face and eyes using local texture features. In both cases, Kalman filtering is applied to estimate ...
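The Kalman filtering step can be illustrated with a minimal constant-velocity tracker (a generic sketch with hypothetical noise parameters, not the thesis's exact filter): per-frame detections are treated as noisy position measurements and smoothed over time:

```python
import numpy as np

def kalman_track(measurements, dt=1.0, q=1e-3, r=0.25):
    """Constant-velocity Kalman filter for 2-D point tracking, as commonly
    used to smooth per-frame face/eye detections. State: [x, y, vx, vy]."""
    F = np.eye(4); F[0, 2] = F[1, 3] = dt          # state transition
    H = np.zeros((2, 4)); H[0, 0] = H[1, 1] = 1.0  # position is observed
    Q, R = q * np.eye(4), r * np.eye(2)
    x, P = np.zeros(4), np.eye(4)
    out = []
    for z in measurements:
        x, P = F @ x, F @ P @ F.T + Q              # predict
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
        x = x + K @ (np.asarray(z) - H @ x)        # update with measurement
        P = (np.eye(4) - K @ H) @ P
        out.append(x[:2].copy())
    return np.array(out)

# Noisy detections of a point moving right at 1 px/frame.
rng = np.random.default_rng(3)
true_xy = np.stack([np.arange(30.0), np.zeros(30)], axis=1)
meas = true_xy + 0.5 * rng.standard_normal(true_xy.shape)
est = kalman_track(meas)
print(np.abs(est[10:] - true_xy[10:]).mean()
      < np.abs(meas[10:] - true_xy[10:]).mean())
```

After a few frames of convergence, the filtered positions are closer to the true trajectory than the raw detections, which is what makes the tracked cursor usable for interaction.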

Hannuksela, Jari — University of Oulu


Functional Neuroimaging Data Characterisation Via Tensor Representations

The growing interest in neuroimaging technologies generates a massive amount of biomedical data that exhibit high dimensionality. Tensor-based analysis of brain imaging data has by now been recognized as an effective approach exploiting its inherent multi-way nature. In particular, the advantages of tensorial over matrix-based methods have previously been demonstrated in the context of functional magnetic resonance imaging (fMRI) source localization; the identification of the regions of the brain which are activated at specific time instances. However, such methods can also become ineffective in realistic challenging scenarios, involving, e.g., strong noise and/or significant overlap among the activated regions. Moreover, they commonly rely on the assumption of an underlying multilinear model generating the data. In the first part of this thesis, we aimed at investigating the possible gains from exploiting the 3-dimensional nature of the brain images, through a higher-order tensorization ...
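Why multilinear structure helps can be shown on a toy rank-1 spatio-temporal tensor (a deliberately idealized sketch, far from realistic fMRI data): for a rank-1 tensor, the first-mode factor is recovered directly from an SVD of the mode-1 unfolding:

```python
import numpy as np

def mode1_unfold(t):
    """Mode-1 unfolding of a 3-way tensor: fibers along the first mode
    become the rows of a matrix."""
    return t.reshape(t.shape[0], -1)

# A rank-1 spatio-temporal 'activation': spatial signatures (a, b) times
# a time course c (all toy stand-ins for real brain data).
a, b = np.ones(4), np.array([0.0, 1.0, 0.0])
c = np.sin(np.linspace(0, np.pi, 5))
X = np.einsum("i,j,k->ijk", a, b, c)           # 4 x 3 x 5 tensor

# For a rank-1 tensor the dominant left singular vector of the mode-1
# unfolding equals the first-mode factor up to scale and sign.
u, s, vt = np.linalg.svd(mode1_unfold(X))
a_hat = u[:, 0] * np.sign(u[0, 0])
print(np.allclose(a_hat, a / np.linalg.norm(a)))
```

Real fMRI tensors contain several overlapping activations plus strong noise, which is precisely the regime where the plain multilinear model breaks down and the thesis's higher-order tensorizations come in.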

Chatzichristos, Christos — National and Kapodistrian University of Athens


Modeling Perceived Quality for Imaging Applications

People of all generations are making more and more use of digital imaging systems in their daily lives. The image content rendered by these digital imaging systems differs largely in perceived quality depending on the system and its applications. To be able to optimize the experience of viewers of this content, understanding and modeling perceived image quality is essential. Research on modeling image quality in a full-reference framework --- where the original content can be used as a reference --- is well established in the literature. In many current applications, however, the perceived image quality needs to be modeled in a no-reference framework in real time. As a consequence, the model needs to quantitatively predict the perceived quality of a degraded image without being able to compare it to its original version, and has to achieve this with limited computational complexity in order ...
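A no-reference metric must score quality from the degraded image alone; as a hedged toy example (far simpler than the models developed in the thesis), the variance of a Laplacian response drops when an image is blurred, yielding a crude reference-free sharpness score:

```python
import numpy as np

def laplacian_sharpness(img):
    """Tiny no-reference sharpness score: variance of the discrete
    Laplacian. Blurring removes high frequencies, so the score drops,
    with no need for the original image."""
    lap = (-4 * img[1:-1, 1:-1] + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return lap.var()

def box_blur(img):
    """3x3 box blur, used here only to create a degraded version."""
    return sum(np.roll(np.roll(img, dy, 0), dx, 1)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0

rng = np.random.default_rng(5)
sharp = rng.random((32, 32))
blurred = box_blur(sharp)
print(laplacian_sharpness(blurred) < laplacian_sharpness(sharp))
```

Practical no-reference models must of course distinguish genuine distortion from legitimate smooth content, which is where perceptual modeling becomes necessary.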

Liu, Hantao — Delft University of Technology


Dialogue Enhancement and Personalization - Contributions to Quality Assessment and Control

The production and delivery of audio for television involve many creative and technical challenges. One of them is concerned with the level balance between the foreground speech (also referred to as dialogue) and the background elements, e.g., music, sound effects, and ambient sounds. Background elements are fundamental for the narrative and for creating an engaging atmosphere, but they can mask the dialogue, which the audience wishes to follow in a comfortable way. Very different individual factors of the people in the audience clash with the creative freedom of the content creators. As a result, service providers receive regular complaints about difficulties in understanding the dialogue because of too loud background sounds. While this has been a known issue for at least three decades, works analyzing the problem and up-to-date statistics were scarce before the contributions in this work. Enabling the ...
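The dialogue-to-background balance can be quantified crudely by an RMS level difference in dB (a toy stand-in; broadcast practice uses loudness models such as ITU-R BS.1770 rather than plain RMS):

```python
import numpy as np

def level_difference_db(dialogue, background):
    """Level balance between dialogue and background stems, approximated
    by their RMS ratio in dB. A small or negative value indicates that
    the background is likely to mask the dialogue."""
    rms = lambda x: np.sqrt(np.mean(np.square(x)))
    return 20 * np.log10(rms(dialogue) / rms(background))

# Synthetic one-second stems (stand-ins for real dialogue/background tracks).
sr = 8000
t = np.arange(sr) / sr
speech_stem = 0.2 * np.sin(2 * np.pi * 220 * t)
bg_stem = 0.1 * np.sin(2 * np.pi * 80 * t)
ld = level_difference_db(speech_stem, bg_stem)
print(round(ld, 1))  # amplitude ratio 2 -> ~6.0 dB
```

Measuring this difference per program segment is the kind of quantity whose control and personalization the thesis addresses.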

Torcoli, Matteo — Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
