Motion detection and human recognition in video sequences

This thesis is concerned with the design of a complete framework that allows the real-time recognition of humans in a video stream acquired by a static camera. For each stage of the processing chain, which takes as input the raw images of the stream and eventually outputs the identity of the persons, we propose an original algorithm. The first algorithm is a background subtraction technique named ViBe. The purpose of ViBe is to detect the parts of the images that contain moving objects. The second algorithm determines which moving objects correspond to individuals. The third algorithm allows the recognition of the detected individuals from their gait. Our background subtraction algorithm, ViBe, uses a collection of samples to model the history of each pixel. The current value of a pixel is classified by comparison with the closest samples that belong to ...

Olivier, Barnich — University of Liege


Motion Analysis and Modeling for Activity Recognition and 3-D Animation based on Geometrical and Video Processing Algorithms

The analysis of audiovisual data aims at extracting high-level information equivalent to what a human can extract. It is considered a fundamental problem that remains unsolved in its general form. Even though the inverse problem, audiovisual (sound and animation) synthesis, is judged easier than analysis, it also remains unsolved. Systematic research on these problems yields solutions that form the basis for a large number of continuously developing applications. In this thesis, we examine these two fundamental problems. We propose algorithms and models for the analysis and synthesis of articulated motion and undulatory (snake) locomotion, using data from video sequences. The goal of this research is multilevel information extraction from video, such as object tracking and activity recognition, and 3-D animation synthesis in virtual environments based on the results of the analysis. An ...

Panagiotakis, Costas — University of Crete


Face Recognition Robust to Occlusions

Face recognition is an important technology in computer vision, which often acts as an essential component in biometrics systems, HCI systems, access control systems, multimedia indexing applications, etc. In recent years, identification of subjects in non-controlled scenarios has received a large amount of attention from the biometrics research community. The deployment of real-time and robust face recognition systems can significantly reinforce safety and security in public places and/or private residences. However, variations due to expressions, illumination, poses, and occlusions can significantly deteriorate the performance of face recognition systems in non-controlled environments. Partial occlusion, which significantly changes the appearance of part of a face, can not only cause a large deterioration in face recognition performance but also cause severe security issues. In this thesis, we focus on the occlusion problem in automatic face recognition in non-controlled environments. Toward this goal, we propose a framework that consists ...

Min, Rui — Telecom ParisTech


Video person recognition strategies using head motion and facial appearance

In this doctoral dissertation, we principally explore the use of the temporal information available in video sequences for person and gender recognition; in particular, we focus on the analysis of head and facial motion, and their potential application as biometric identifiers. We also investigate how to exploit as much video information as possible for automatic recognition; more precisely, we examine the possibility of integrating head and mouth motion information with facial appearance into a multimodal biometric system, and we study the extraction of novel spatio-temporal facial features for recognition. We initially present a person recognition system that exploits unconstrained head motion information, extracted by tracking a few facial landmarks in the image plane. In particular, we detail how each video sequence is first pre-processed by semi-automatically detecting the face, and then automatically tracking the facial landmarks over ...

Matta, Federico — Eurécom / Multimedia communications


Improvements in Pose Invariance and Local Description for Gabor-based 2D Face Recognition

Automatic face recognition has attracted a lot of attention not only because of the large number of practical applications where human identification is needed but also due to the technical challenges involved in this problem: large variability in facial appearance, non-linearity of face manifolds and high dimensionality are some of the most critical handicaps. In order to deal with the above-mentioned challenges, there are two possible strategies. The first is to construct a “good” feature space in which the manifolds become simpler (more linear and more convex). This scheme usually comprises two levels of processing: (1) normalize images geometrically and photometrically and (2) extract features that are stable with respect to these variations (such as those based on Gabor filters). The second strategy is to use classification structures that are able to deal with non-linearities and to generalize properly. To ...

Gonzalez-Jimenez, Daniel — University of Vigo


Contactless and less-constrained palmprint recognition

Biometric systems consist of a combination of devices, algorithms, and procedures used to recognize individuals based on their physical or behavioral characteristics. These characteristics are called biometric traits. Nowadays, biometric technologies are becoming more and more widespread, and many people use biometric systems daily. However, in some cases the procedures used for the collection of the biometric traits need the cooperation of the user, controlled environments, illumination perceived as unpleasant, too strong, or harmful, or contact of the body with a sensor. For these reasons, techniques for contactless and less-constrained biometric recognition are being researched, in order to increase the usability and social acceptance of biometric systems and to broaden the fields of application of biometric technologies. In this context, the palmprint is a biometric trait whose acquisition is generally well accepted by the users. ...

Genovese, Angelo — Università degli Studi di Milano


Fusing prosodic and acoustic information for speaker recognition

Automatic speaker recognition is the use of a machine to identify an individual from a spoken sentence. Recently, this technology has seen increasing use in applications such as access control, transaction authentication, law enforcement, forensics, and system customisation, among others. One of the central questions addressed by this field is what it is in the speech signal that conveys speaker identity. Traditionally, automatic speaker recognition systems have relied mostly on short-term features related to the spectrum of the voice. However, human speaker recognition relies on other sources of information; therefore, there is reason to believe that these sources can also play an important role in the automatic speaker recognition task, adding complementary knowledge to the traditional spectrum-based recognition systems and thus improving their accuracy. The main objective of this thesis is to add prosodic information to a traditional ...

Farrus, Mireia — Universitat Politecnica de Catalunya


Dealing with Variability Factors and Its Application to Biometrics at a Distance

This Thesis is focused on dealing with the variability factors in biometric recognition and applications of biometrics at a distance. In particular, this PhD Thesis explores the problem of assessing variability factors and how to deal with them by incorporating soft biometrics information in order to improve person recognition systems working at a distance. The proposed methods, supported by experimental results, show the benefits of adapting the system to the variability of the sample at hand. Although relatively young compared to other mature and long-used security technologies, biometrics have emerged in the last decade as a compelling alternative for applications where automatic recognition of people is needed. Certainly, biometrics are very attractive and useful for video surveillance systems at a distance, widely distributed in our lives, and for the final user: forget about PINs and passwords, you ...

Tome, Pedro — Universidad Autónoma de Madrid


Contributions to practical iris biometrics on smartphones

This thesis investigates the practical adaptation of iris biometrics on smartphones. Iris recognition is a mature and widely deployed technology which will be able to provide the high security demanded by next-generation smartphones. Practical challenges in widely adopting this technology on smartphones are identified. Based on this, a number of design strategies are presented for constraint-free, high-performing iris biometrics on smartphones. A prototype device in a smartphone form factor is presented to be used as a front-facing camera. Analysis of its optical properties and iris imaging capabilities shows that such a device with improved optics and sensors could be used for implementing iris recognition in the next generation of smartphones. A novel iris liveness detection technique is presented to prevent spoofing attacks on such a system. Also, the social impact of wider adoption of this technology is discussed. Iris pattern ...

Thavalengal, Shejin — National University of Ireland Galway


Fire Detection Algorithms Using Multimodal Signal and Image Analysis

Dynamic textures are common in natural scenes. Examples of dynamic textures in video include fire, smoke, clouds, volatile organic compound (VOC) plumes in infra-red (IR) videos, trees in the wind, sea and ocean waves, etc. Researchers have extensively studied 2-D textures and related problems in the fields of image processing and computer vision. On the other hand, there is very little research on dynamic texture detection in video. In this dissertation, signal and image processing methods developed for detection of a specific set of dynamic textures are presented. Signal and image processing methods are developed for the detection of flames and smoke in open and large spaces with a range of up to 30 m to the camera in visible-range and infra-red (IR) video. Smoke is semi-transparent at the early stages of fire. Edges present in image frames with smoke start losing their sharpness ...

Toreyin, Behcet Ugur — Bilkent University


Non-rigid Registration-based Data-driven 3D Facial Action Unit Detection

Automated analysis of facial expressions has been an active area of study due to its potential applications not only for intelligent human-computer interfaces but also for human facial behavior research. To advance automatic expression analysis, this thesis proposes and empirically proves two hypotheses: (i) 3D face data is a better data modality than conventional 2D camera images, not only for being much less disturbed by illumination and head pose effects but also for capturing true facial surface information; (ii) it is possible to perform detailed face registration without resorting to any face modeling. This means that data-driven methods in automatic expression analysis can compensate for confounding effects such as pose and physiognomy differences, and can process facial features more effectively, without suffering the drawbacks of model-driven analysis. Our study is based upon the Facial Action Coding System (FACS) as this paradigm ...

Savran, Arman — Bogazici University


Mixed structural models for 3D audio in virtual environments

In the world of information and communications technology (ICT), strategies for innovation and development are increasingly focusing on applications that require spatial representation and real-time interaction with and within 3D-media environments. One of the major challenges that such applications have to address is user-centricity, reflected for example in the development of complexity-hiding services so that people can personalize their own delivery of services. In these terms, multimodal interfaces represent a key factor for enabling an inclusive use of new technologies by everyone. In order to achieve this, realistic multimodal models that describe our environment are needed, and in particular models that accurately describe the acoustics of the environment and communication through the auditory modality. Examples of currently active research directions and application areas include 3DTV and future internet, 3D visual-sound scene coding, transmission and reconstruction, and teleconferencing systems, to name but ...

Geronazzo, Michele — University of Padova


Light Field Based Biometric Recognition and Presentation Attack Detection

In a world where security issues have been gaining explosive importance, face and ear recognition systems have attracted increasing attention in multiple application areas, ranging from forensics and surveillance to commerce and entertainment. While the recognition performance has been steadily improving, there are still challenging recognition scenarios and conditions, notably when facing large variations in the biometric data characteristics. Additionally, the widespread use of face and ear recognition solutions raises new security concerns, making robustness against presentation attacks a very active field of research. Lenslet light field cameras have recently come into prominence as they are also able to capture the intensity of the light rays coming from multiple directions, thus offering a richer representation of the visual scene, notably spatio-angular information. To take advantage of this richer representation, light field cameras have recently been successfully applied, not only ...

Alireza Sepas-Moghaddam — Instituto Superior Técnico, University of Lisbon


Vulnerabilities and Attack Protection in Security Systems Based on Biometric Recognition

Absolute security does not exist: given funding, willpower and the proper technology, every security system can be compromised. However, the objective of the security community should be to develop applications such that the funding, the will, and the resources needed by the attacker to crack the system prevent him from attempting to do so. This Thesis is focused on the vulnerability assessment of biometric systems. Although relatively young compared to other mature and long-used security technologies, biometrics have emerged in the last decade as a compelling alternative for applications where automatic recognition of people is needed. Certainly, biometrics are very attractive and useful for the final user: forget about PINs and passwords, you are your own key. However, we cannot forget that, like any technology aimed at providing a security service, biometric systems are exposed to external attacks which ...

Javier Galbally — Universidad Autonoma de Madrid


Video Based Detection of Driver Fatigue

This thesis addresses the problem of drowsy driver detection using computer vision techniques applied to the human face. Specifically, we explore the possibility of discriminating drowsy from alert video segments using facial expressions automatically extracted from video. Several approaches were previously proposed for the detection and prediction of drowsiness. There has recently been increasing interest in computer vision approaches, which are promising for detecting drowsiness because of their non-invasive nature. Previous studies with vision-based approaches detect driver drowsiness primarily by making assumptions in advance about the relevant behavior, focusing on blink rate, eye closure, and yawning. Here we employ machine learning to explore, understand and exploit actual human behavior during drowsiness episodes. We have collected two datasets including facial and head movement measures. Head motion is collected through an accelerometer for the first dataset (UYAN-1) and an ...

Vural, Esra — Sabanci University
