Radial Basis Function Network Robust Learning Algorithms in Computer Vision Applications

This thesis introduces new learning algorithms for Radial Basis Function (RBF) networks. An RBF network is a feed-forward two-layer neural network used for function approximation or pattern classification. The proposed training algorithms are based on robust statistics. Their theoretical performance has been assessed and compared with that of classical algorithms for training RBF networks. The applications of RBF networks described in this thesis consist of the simultaneous modeling of moving object segmentation and optical flow estimation in image sequences, and 3-D image modeling and segmentation. A Bayesian classifier model is used for the representation of the image sequence and 3-D images. This model employs an energy-based description of the probability functions involved. The energy functions are represented by RBF networks whose inputs are various features drawn from the images and whose outputs are objects. The hidden units embed kernel functions. Each kernel ...

Bors, Adrian G. — Aristotle University of Thessaloniki


Non-rigid Registration-based Data-driven 3D Facial Action Unit Detection

Automated analysis of facial expressions has been an active area of study due to its potential applications not only for intelligent human-computer interfaces but also for human facial behavior research. To advance automatic expression analysis, this thesis proposes and empirically supports two hypotheses: (i) 3D face data is a better data modality than conventional 2D camera images, not only because it is much less disturbed by illumination and head pose effects but also because it captures true facial surface information. (ii) It is possible to perform detailed face registration without resorting to any face modeling. This means that data-driven methods in automatic expression analysis can compensate for confounding effects such as pose and physiognomy differences, and can process facial features more effectively, without suffering the drawbacks of model-driven analysis. Our study is based upon the Facial Action Coding System (FACS) as this paradigm ...

Savran, Arman — Bogazici University


Unsupervised and semi-supervised Non-negative Matrix Factorization methods for brain tumor segmentation using multi-parametric MRI data

Gliomas represent about 80% of all malignant primary brain tumors. Despite recent advancements in glioma research, patient outcome remains poor. The 5-year survival rate of the most common and most malignant subtype, i.e. glioblastoma, is about 5%. Magnetic resonance imaging (MRI) has become the imaging modality of choice in the management of brain tumor patients. Conventional MRI (cMRI) provides excellent soft tissue contrast without exposing the patient to potentially harmful ionizing radiation. Over the past decade, advanced MRI modalities, such as perfusion-weighted imaging (PWI), diffusion-weighted imaging (DWI) and magnetic resonance spectroscopic imaging (MRSI), have gained interest in the clinical field, and their added value regarding brain tumor diagnosis, treatment planning and follow-up has been recognized. Tumor segmentation involves the imaging-based delineation of a tumor and its subcompartments. In gliomas, segmentation plays an important role in treatment planning as well ...

Sauwen, Nicolas — KU Leuven


Biologically Inspired 3D Face Recognition

Face recognition has been an active area of study for both computer vision and image processing communities, not only for biometrics but also for human-computer interaction applications. The purpose of the present work is to evaluate the existing 3D face recognition techniques and seek biologically motivated methods to improve them. We especially look at findings in psychophysics and cognitive science for insights. We propose a biologically motivated computational model, and focus on the earlier stages of the model, whose performance is critical for the later stages. Our emphasis is on automatic localization of facial features. We first propose a strong unsupervised learning algorithm for flexible and automatic training of Gaussian mixture models and use it in a novel feature-based algorithm for facial fiducial point localization. We also propose a novel structural correction algorithm to evaluate the quality of landmarking and ...

Salah, Albert Ali — Bogazici University


Spatial Consistency of 3D Channel Models

Developing realistic channel models is one of the greatest challenges for describing wireless communications. Their quality is crucial for accurately predicting the performance of a wireless system. While on the one hand channel models have to be accurate in describing the physical properties of wave propagation, on the other hand they have to be as simple as possible. With the recent emergence of antennas with a massive number of elements as a promising technology for a further enhancement of spectral efficiency, new channel models that characterize the propagation environment in both azimuth and elevation become necessary. While standardization bodies such as the 3rd Generation Partnership Project (3GPP) and the International Telecommunication Union (ITU) have introduced a 3-dimensional (3D) geometry-based stochastic channel model, system-level modeling has been missing to serve the purpose of further analysis and evaluations. Furthermore, with such a ...

Ademaj, Fjolla — TU Wien


Segmentation par modèle déformable surfacique localement régularisé par spline

Image segmentation through deformable models is a method that localizes object boundaries. When the segmentation context is difficult because of noise or a lack of information, the use of prior knowledge in the deformation process increases segmentation accuracy. Medical imaging often faces such contexts. Moreover, medical applications deal with large amounts of data, so robust and fast processing is mandatory. This question led us to a local regularization of the deformable model. Building heavily on the active contour framework, also known as snake, we propose a new regularization scheme. This is done by filtering the displacements at each iteration. The filter is based on a smoothing spline kernel whose aim is to approximate a set of points rather than interpolate it. We point out the consistency of the regularization parameter in such a method. ...

Velut, Jerome — INSA-Lyon / CREATIS-LRMN


Video Sequence Analysis for Content Description, Summarization and Content-Based Retrieval

The main research area of this Ph.D. thesis is video sequence processing and analysis for description and indexing of visual content. Its objective is to contribute to the development of a computational system with the capabilities of object-based segmentation of audiovisual material, automatic content description, summarization for preview and browsing, as well as content-based retrieval. The thesis consists of four parts. The first introduces video sequence analysis, segmentation and object extraction based on color, motion, and depth field. A fusion technique is proposed that combines individual cue segmentations and allows for reliable identification of semantic objects. The second part refers to automatic description and annotation of the visual content by means of feature vectors, summarization, implemented by optimal selection of a limited set of key frames and shots, and content-based search and retrieval. In the third part, the problem of ...

Avrithis, Yannis — National Technical University of Athens


New Higher-Order Active Contour Models, Shape Priors, and Multiscale Analysis - Their Application To Road Network Extraction From Very High Resolution Satellite Images

The objective of this thesis is to develop and validate robust approaches for the semi-automatic extraction of road networks in dense urban areas from very high resolution (VHR) optical satellite images. Our models are based on the recently developed higher-order active contour (HOAC) phase field framework. The problem is difficult for two main reasons: VHR images are intrinsically complex and network regions may have arbitrary topology. To tackle the complexity of the information contained in VHR images, we propose a multiresolution statistical data model and a multiresolution constrained prior model. They enable the integration of segmentation results from coarse resolution and fine resolution. Subsequently, for the particular case of road map updating, we present a specific shape prior model derived from an outdated GIS digital map. This specific prior term balances the effect of the generic prior knowledge carried by ...

Peng, Ting — Project-Team Ariana (INRIA-Sophia Antipolis, France); LIAMA (CASIA, China)


Spectral Variability in Hyperspectral Unmixing: Multiscale, Tensor, and Neural Network-based Approaches

The spectral signatures of the materials contained in hyperspectral images, also called endmembers (EMs), can be significantly affected by variations in atmospheric, illumination or environmental conditions typically occurring within an image. Traditional spectral unmixing (SU) algorithms neglect the spectral variability of the endmembers, which propagates significant mismodeling errors throughout the whole unmixing process and compromises the quality of the estimated abundances. Therefore, significant effort has recently been dedicated to mitigating the effects of spectral variability in SU. However, many challenges still remain in how to best exploit a priori information about the problem in order to improve the quality, the robustness and the efficiency of SU algorithms that account for spectral variability. In this thesis, new strategies are developed to address spectral variability in SU. First, an (over)-segmentation-based multiscale regularization strategy is proposed to exploit spatial information about the abundance ...

Borsoi, Ricardo Augusto — Université Côte d'Azur; Federal University of Santa Catarina


Object Recognition in Subspaces: Applications in Biometry and 3D Model Retrieval

Shape description is a crucial step in many computer vision applications. This thesis is an attempt to introduce various representations of two- and three-dimensional shape information. These representations are intended to be in homogeneous parametric forms in 2D or 3D space, such that subspace-based feature extraction techniques are applicable to them. We tackle three different applications: (i) Person recognition with hand biometry, (ii) Person recognition with three-dimensional face biometry, (iii) Indexing and retrieval of generic three-dimensional models. For each application, we propose various combinations of shape representation schemes and subspace-based feature extraction methods. We consider subspaces with fixed bases such as cosines, complex exponentials and tailored subspaces such as Principal Component Analysis, Independent Component Analysis and Nonnegative Matrix Factorization. Most of the descriptors we propose are dependent on the pose of the object. In this thesis we give ...

Dutagaci, Helin — Bogazici University


Mixed structural models for 3D audio in virtual environments

In the world of Information and communications technology (ICT), strategies for innovation and development are increasingly focusing on applications that require spatial representation and real-time interaction with and within 3D-media environments. One of the major challenges that such applications have to address is user-centricity, reflecting e.g. on developing complexity-hiding services so that people can personalize their own delivery of services. In these terms, multimodal interfaces represent a key factor for enabling an inclusive use of new technologies by everyone. In order to achieve this, multimodal realistic models that describe our environment are needed, and in particular models that accurately describe the acoustics of the environment and communication through the auditory modality are required. Examples of currently active research directions and application areas include 3DTV and future internet, 3D visual-sound scene coding, transmission and reconstruction and teleconferencing systems, to name but ...

Geronazzo, Michele — University of Padova


Motion Analysis and Modeling for Activity Recognition and 3-D Animation based on Geometrical and Video Processing Algorithms

The analysis of audiovisual data aims at extracting high-level information equivalent to what a human can extract. It is considered a fundamental problem that is unsolved in its general form. Even though the inverse problem, audiovisual (sound and animation) synthesis, is considered easier than the former, it also remains unsolved. The systematic research on these problems yields solutions that constitute the basis for a great number of continuously developing applications. In this thesis, we examine the two aforementioned fundamental problems. We propose algorithms and models for the analysis and synthesis of articulated motion and undulatory (snake) locomotion, using data from video sequences. The goal of this research is multilevel information extraction from video, such as object tracking and activity recognition, and 3-D animation synthesis in virtual environments based on the results of the analysis. An ...

Panagiotakis, Costas — University of Crete


Combining anatomical and spectral information to enhance MRSI resolution and quantification: Application to Multiple Sclerosis

Multiple sclerosis is a progressive autoimmune disease that affects young adults. Magnetic resonance (MR) imaging has become an integral part in monitoring multiple sclerosis disease. Conventional MR imaging sequences such as fluid attenuated inversion recovery imaging have high spatial resolution, and can visualise the presence of focal white matter brain lesions in multiple sclerosis disease. Manual delineation of these lesions on conventional MR images is time consuming and suffers from intra and inter-rater variability. Among the advanced MR imaging techniques, MR spectroscopic imaging can offer complementary information on lesion characterisation compared to conventional MR images. However, MR spectroscopic images have low spatial resolution. Therefore, the aim of this thesis is to automatically segment multiple sclerosis lesions on conventional MR images and use the information from high-resolution conventional MR images to enhance the resolution of MR spectroscopic images. Automatic single time ...

Jain, Saurabh — KU Leuven


Three-Dimensional Face Recognition

In this thesis, we attack the problem of identifying humans from their three-dimensional facial characteristics. For this purpose, a complete 3D face recognition system is developed. We divide the whole system into sub-processes. These sub-processes can be categorized as follows: 1) registration, 2) representation of faces, 3) extraction of discriminative features, and 4) fusion of matchers. For each module, we evaluate the state-of-the-art methods, and also propose novel ones. For the registration task, we propose to use a generic face model which speeds up the correspondence establishment process. We compare the benefits of rigid and non-rigid registration schemes using a generic face model. In terms of face representation schemes, we implement a diverse range of approaches such as point clouds, curvature-based descriptors, and range images. In relation to these, various feature extraction methods are used to determine the ...

Gokberk, Berk — Bogazici University


Parametric and non-parametric approaches for multisensor data fusion

Multisensor data fusion technology combines data and information from multiple sensors to achieve improved accuracies and better inference about the environment than could be achieved by the use of a single sensor alone. In this dissertation, we propose parametric and nonparametric multisensor data fusion algorithms with a broad range of applications. Image registration is a vital first step in fusing sensor data. Among the wide range of registration techniques that have been developed for various applications, mutual information based registration algorithms have been accepted as one of the most accurate and robust methods. Inspired by the mutual information based approaches, we propose to use the joint Rényi entropy as the dissimilarity metric between images. Since the Rényi entropy of an image can be estimated with the length of the minimum spanning tree over the corresponding graph, the proposed information-theoretic registration ...

Ma, Bing — University of Michigan
