Visual ear detection and recognition in unconstrained environments

Automatic ear recognition systems have seen increased interest over recent years due to several desirable characteristics. Ear images used in such systems can typically be extracted from profile head shots or video footage. The acquisition procedure is contactless and non-intrusive, and it does not depend on the cooperation of the subjects. In this regard, ear recognition technology shares similarities with other image-based biometric modalities. Another appealing property of ear biometrics is its distinctiveness. Recent studies have even empirically validated existing conjectures that certain features of the ear are distinct for identical twins. This fact has significant implications for security-related applications and puts ear images on a par with epigenetic biometric modalities, such as the iris. Ear images can also supplement other biometric modalities in automatic recognition systems and provide identity cues when other information is unreliable or even unavailable. In surveillance applications, for example, where face recognition technology may struggle with profile faces, the ear can serve as a source of information on the identity of people in the surveillance footage. The importance and potential value of ear recognition technology for multi-modal biometric systems are also evidenced by the number of research studies on this topic. Today, ear recognition represents an active research area for which new techniques are developed regularly, and several datasets needed for the training and testing of the technology are publicly available. Nevertheless, despite the research efforts directed at ear biometrics, to the best of our knowledge, there exist only a few commercial systems based on ear biometrics. We conjecture that the limited availability of commercial ear recognition technology is a consequence of open challenges that have still not been appropriately addressed.
This thesis attempts to meet some of these challenges and provide the community with new solutions and insights that can be used to advance the field further. Most of the early research work on ear biometrics focused on laboratory-like settings, where the variability in ear appearance was limited and there were usually no major variations in pose, occlusion, etc. In real-life applications, this is not the case. Research and development stalled because they relied on images captured in constrained environments: methods developed for these settings were difficult to translate into real-world scenarios and would typically perform poorly. In unconstrained acquisition environments, ear recognition techniques are confronted with large pose variations, illumination changes, etc. We surmise that this significant difference in the data is the main culprit for the performance differences and the stagnation of the field. Furthermore, the lack of commercial solutions could also be attributed to the lack of available unconstrained ear data. This issue pervades both recognition and detection tasks, meaning that annotated datasets of ear images are needed for recognition, together with images with annotated ear positions for detection. Existing ear recognition solutions do not suffice and underperform in unconstrained settings. More powerful solutions are needed to address ear detection and recognition in these environments successfully. Another key issue to be addressed is identifying the most important weaknesses of existing ear recognition techniques. This can be done through a performance analysis that takes into account various covariates and can help answer the following research questions: How do ear recognition techniques perform across different image resolutions in unconstrained settings? How sensitive are existing techniques to the presence of occlusions and ear accessories?
Do existing recognition approaches exhibit a performance bias when presented with images of either male or female subjects? How do recognition techniques generalize to ear image data with different characteristics? Answering these questions may help identify open problems with existing ear recognition approaches and help focus research efforts and resources in the right direction.
To address the research problems discussed above, we make several contributions in this thesis:
1. We present novel techniques for ear detection that work in unconstrained environments and consider contextual information for the detection procedure. The techniques frame ear detection as a semantic segmentation task and are shown to yield state-of-the-art performance on public ear datasets.
2. We develop novel recognition approaches that consider local and global information and ensure robust recognition in unconstrained environments. The developed approaches are also shown to enable explainable decision-making by focusing only on the most important ear regions.
3. We introduce new ear datasets captured in unconstrained environments to train and test the developed detection and recognition techniques.
While we make contributions targeting ear recognition systems specifically, many of the solutions presented are also applicable to other biometric modalities and therefore have implications for other areas of biometrics as well.

File Type: pdf
File Size: 6 MB
Publication Year: 2021
Author: Emeršič, Žiga
Supervisors: Peter Peer, Vitomir Štruc
Institution: University of Ljubljana, Faculty of Computer and Information Science
Keywords: ear recognition, ear detection, deep neural networks, unconstrained environment