Contributions to Information Fusion: Application to Obstacle Recognition in Visible and Infrared Images
Interest in the intelligent vehicle field has grown in recent years, most probably because of the large number of road accidents. Many accidents could be avoided if a device attached to the vehicle assisted the driver with warnings when dangerous situations are about to arise. In recent years, leading car manufacturers have invested significant effort in intelligent vehicles and supported research in this field, proposing solutions to existing problems, especially in the vision domain. Road detection and following; pedestrian or vehicle detection, recognition and tracking; and night vision are examples of applications that have been developed and improved recently. Still, many challenges and unsolved problems remain in the intelligent vehicle domain. Our purpose in this thesis is to design an Obstacle Recognition system that improves road safety by directing the driver's attention towards situations which may become dangerous. Many systems still encounter problems at the detection step, and since this task is still a work in progress within the LITIS laboratory (from INSA), our goal was to develop a system that continues and improves the detection task. We have focused solely on the fusion between the visible and infrared domains from the viewpoint of an Obstacle Recognition module. Our main purpose was to investigate whether combining visible and infrared information is efficient, especially when it is associated with SVM (Support Vector Machine)-based classification. The outdoor environment, the variety of obstacle appearances in the road scene (considering also the multitude of possible obstacle types), the cluttered background and the constraints imposed by a moving vehicle make the categorization of road obstacles a real challenge.
In addition, a driver assistance system must fulfil some critical requirements to be considered a viable solution for on-board use: its cost should be low enough for it to be incorporated in every production vehicle; it has to be fast enough to detect and then recognize obstacles in real time; and it has to be efficient (detecting all obstacles with very few false alarms) and robust (able to cope with difficult environmental conditions). To outline the system, we looked for sensors that could provide enough information to detect obstacles (even occluded ones) in any illumination or weather situation, to recognize them and to identify their position in the scene. In the intelligent vehicle domain no single sensor can handle all these tasks perfectly, but there are systems employing one or several different sensors to perform obstacle detection, recognition or tracking, or some combination of them. After comparing the advantages and disadvantages of passive and active technologies, we chose the appropriate sensors for our Obstacle Detection and Recognition system. Because active sensors can interfere with one another, which could be critical when a large number of vehicles move simultaneously in the same environment, we concentrate on passive, non-invasive sensors such as cameras. Our proposed system therefore employs visible-spectrum and infrared-spectrum cameras, chosen to be complementary, because the system must work well even under difficult conditions such as poor illumination or bad weather (darkness, rain, fog). Monomodal systems are adapted to a single modality, either visible or infrared, and even if they provide good recognition rates on the test set, these results could be improved by processing the visible and infrared information jointly, that is, within a bimodal system.
Bimodal systems can take different forms depending on the level at which the information is combined or fused. We therefore propose three different fusion systems: at the level of the features, at the level of the SVM kernels, or, higher still, at the level of the matching scores provided by the SVM. Each of these systems improves classification performance compared with the monomodal systems. To adapt the system to the environmental conditions, the features, the kernels and the matching scores within the fusion schemes were weighted (with a sensor weighting coefficient) according to the relative importance of each modality's sensor. This allowed for better classification performance. In matching-score fusion it is also possible to adapt the weighting coefficient to the context dynamically. To represent the obstacle images that the Obstacle Recognition system has to recognize, a set of features was chosen to encode this information. These features are computed in the feature extraction module and include wavelet features, statistical features, the coefficients of some transforms, and others. Generally, the feature extraction module is followed by a feature selection module, in which the importance of the features is estimated and only the most relevant ones are kept to represent the information. Different feature selection methods are tested and compared in order to evaluate the pertinence of each feature (and of each family of features) with respect to our objective of obstacle classification. The pertinence of each vector built with these feature selection methods was first evaluated with a KNN (k-Nearest Neighbours) classifier with k = 1, chosen for its simplicity: unlike the SVM, it does not require a parameter optimization process.
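The three fusion levels described above can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the function names, the Gaussian (RBF) kernel, and the convex weighting with a single coefficient `alpha` in [0, 1] (the weight of the visible modality, with `1 - alpha` going to the infrared one) are all assumptions made for illustration. The text above only states that a sensor weighting coefficient is applied.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel type)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def fuse_features(vis, ir, alpha):
    """Feature-level fusion: weight each modality, then concatenate the vectors."""
    return [alpha * v for v in vis] + [(1.0 - alpha) * r for r in ir]


def fused_kernel(vis_a, ir_a, vis_b, ir_b, alpha, gamma_vis, gamma_ir):
    """Kernel-level fusion: convex combination of one kernel per modality.

    A convex combination of valid kernels is itself a valid kernel, so the
    result can be handed to any kernel classifier that accepts custom kernels.
    """
    k_vis = rbf_kernel(vis_a, vis_b, gamma_vis)
    k_ir = rbf_kernel(ir_a, ir_b, gamma_ir)
    return alpha * k_vis + (1.0 - alpha) * k_ir


def fuse_scores(vis_scores, ir_scores, alpha):
    """Score-level fusion: weighted sum of per-class scores, then arg-max.

    Both inputs map class label -> matching score from a monomodal classifier.
    Returns the winning label and the dictionary of fused scores.
    """
    fused = {
        label: alpha * vis_scores[label] + (1.0 - alpha) * ir_scores[label]
        for label in vis_scores
    }
    return max(fused, key=fused.get), fused
```

In this sketch, making the coefficient depend on the context (for example lowering `alpha` at night, when the visible camera is less reliable) only requires passing a different `alpha` to `fuse_scores` for each sample. No retraining is involved at the score level, which is one plausible reason why dynamic adaptation is mentioned for that scheme in particular.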
To increase classification accuracy, and also to obtain a powerful classifier that is more easily parametrized for the proposed fusion schemes, the KNN was later replaced by an SVM classifier, once the best feature selection method had been chosen on the training set and the most relevant features had been selected. Because it is not known beforehand which combination of SVM hyper-parameters is most appropriate for a given classification problem, a model search performed by 10-fold cross-validation provides the optimized kernel for the SVM used in each fusion scheme and on each feature vector we considered. Finally, we tested our feature extraction, feature selection and proposed fusion schemes on a 4-class problem, discriminating between vehicles, pedestrians, cyclists and background obstacles. The results show that all bimodal visible-infrared systems outperform the monomodal ones; the fusion is thus efficient and robust, since it improves the recognition rates. In addition, the feature selection scheme yields a smaller vector comprising only the most relevant features for the classification process. Besides providing higher accuracy rates, this reduction of the feature-vector dimension reduces the computation time, which is crucial in this type of application.
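The model search by 10-fold cross-validation mentioned above can be sketched as a grid search over candidate hyper-parameter values. To keep the example self-contained, it uses a simple kernel-voting rule as a stand-in classifier, not an SVM. The function names, the single hyper-parameter `gamma` and the candidate grid are hypothetical. With an actual SVM the same loop would typically range over the kernel parameters and the regularization constant.

```python
import math
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    if n_samples < k:
        raise ValueError("need at least one sample per fold")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def kernel_vote_predict(train_x, train_y, x, gamma):
    """Stand-in classifier (NOT an SVM): sum RBF similarities per class."""
    votes = {}
    for xi, yi in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(xi, x))
        votes[yi] = votes.get(yi, 0.0) + math.exp(-gamma * sq_dist)
    return max(votes, key=votes.get)


def cross_val_accuracy(xs, ys, gamma, k=10, seed=0):
    """Mean held-out accuracy over k folds for one hyper-parameter value."""
    fold_accuracies = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_x = [xs[i] for i in range(len(xs)) if i not in held]
        train_y = [ys[i] for i in range(len(xs)) if i not in held]
        correct = sum(
            kernel_vote_predict(train_x, train_y, xs[i], gamma) == ys[i]
            for i in held_out
        )
        fold_accuracies.append(correct / len(held_out))
    return sum(fold_accuracies) / k


def grid_search(xs, ys, gammas, k=10, seed=0):
    """Return (best_gamma, best_accuracy); ties keep the earliest candidate."""
    scored = [(cross_val_accuracy(xs, ys, g, k, seed), g) for g in gammas]
    best_accuracy, best_gamma = max(scored, key=lambda pair: pair[0])
    return best_gamma, best_accuracy
```

Every candidate is scored on the same folds (fixed `seed`), so differences in accuracy reflect the hyper-parameter and not the split. The search would be repeated once per fusion scheme and per feature vector, as the text describes.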
