Improvements in Pose Invariance and Local Description for Gabor-based 2D Face Recognition
Automatic face recognition has attracted a lot of attention, not only because of the large number of practical applications where human identification is needed, but also because of the technical challenges involved: large variability in facial appearance, non-linearity of face manifolds, and high dimensionality are some of the most critical handicaps. There are two possible strategies for dealing with these challenges. The first is to construct a "good" feature space in which the manifolds become simpler (more linear and more convex). This scheme usually comprises two levels of processing: (1) normalizing images geometrically and photometrically, and (2) extracting features that are stable with respect to these variations (such as those based on Gabor filters). The second strategy is to use classification structures that are able to deal with non-linearities and to generalize properly. To obtain high performance, an algorithm may need to combine both strategies. In this Thesis we have tackled several quite different problems throughout the complex face recognition process, proposing solutions that combine both schemes in the framework of Gabor-based face recognition. Together with factors such as illumination and expression, differences in viewpoint are largely responsible for the large appearance variability in face images. We have tackled the pose problem by proposing two different approaches based on a 2D linear model. These techniques take advantage of facial symmetry to overcome problems due to self-occlusion and synthesize virtual images at specific viewpoints by means of texture mapping, obtaining results comparable to those of a 3D approach based on Morphable Models for horizontal rotations up to 67.5°. Some of the most successful face recognition approaches proposed to date are those based on the extraction of Gabor features.
This choice is motivated both by biological reasons and by their optimal characterization in the space and frequency domains. Using Gabor filters as the recognition engine, we have proposed a method for extracting features from positions or regions that are subject-specific, exploiting individual face structure. This constitutes a new point of view with respect to classical methods that extract features from a pre-defined (either rectangular or face-like) graph. Continuing with Gabor-based approaches, and in order to obtain better performance, we have empirically validated different state-of-the-art tools for combining local Gabor similarities and evaluated different distance measures for comparing Gabor features. Despite the large number of papers dealing with Gabor-based recognition systems, no statistical model has been proposed or used for Gabor feature coefficients. In this Thesis we have studied the marginal statistics of coefficients extracted from face images, proposing the Generalized Gaussian distribution to model the characteristic non-normal behavior these features show. In addition, the multivariate characterization of Gabor coefficients has also been studied: a novel multivariate extension of the Generalized Gaussian has been proposed and tested with success in limited experiments. Finally, we have implemented software for tracking faces throughout video sequences, thereby laying the groundwork for developing face recognition systems from video. Recent results using this tracking module have been obtained in the context of pose-robust recognition from video and audio-video asynchrony detection.
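
As an illustration of the marginal model mentioned above, the following is a minimal sketch of the univariate Generalized Gaussian density and of a simple moment-matching estimate of its shape parameter. It is not the estimation procedure used in the Thesis, which the abstract does not specify. The function names (gg_pdf, gg_moment_ratio, shape_from_ratio, estimate_shape), the parameterization by location mu, scale alpha and shape beta, and the search interval for beta are assumptions made here for illustration. A shape of 2 gives the Gaussian and a shape of 1 gives the Laplacian, so an estimated shape well below 2 would indicate the heavy-tailed, non-normal behavior described above.

```python
import math


def gg_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Generalized Gaussian density: location mu, scale alpha, shape beta."""
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x - mu) / alpha) ** beta))


def gg_moment_ratio(beta):
    """Theoretical E|x| / sqrt(E[x^2]) of a zero-mean Generalized Gaussian."""
    return math.gamma(2.0 / beta) / math.sqrt(
        math.gamma(1.0 / beta) * math.gamma(3.0 / beta)
    )


def shape_from_ratio(ratio, lo=0.1, hi=10.0, iters=60):
    """Invert gg_moment_ratio by bisection (the ratio increases with beta).

    The interval [lo, hi] is an assumed search range; ratios outside the
    range it covers converge to the nearest endpoint.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gg_moment_ratio(mid) < ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_shape(samples):
    """Moment-matching shape estimate from a list of real-valued coefficients."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    m1 = sum(abs(c) for c in centered) / n   # mean absolute deviation
    m2 = sum(c * c for c in centered) / n    # variance
    if m2 == 0.0:
        raise ValueError("samples are constant; shape is undefined")
    return shape_from_ratio(m1 / math.sqrt(m2))
```

In practice, estimate_shape would be applied to the coefficients of one Gabor filter (for example its real or imaginary part, collected over image positions); which coefficients are modeled is again an assumption here, since the abstract does not say.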
