Dynamic Scheme Selection in Image Coding
This thesis deals with the coding of images with multiple coding schemes and their dynamic selection. In our society of information highways, electronic communication takes a larger place in our lives every day, and the number of transmitted images grows with it. Research on image compression therefore remains an active area. The current trend, however, is to add functionalities to the compression scheme, such as progressiveness for more comfortable browsing of web sites or databases. Classical image coding schemes have a rigid structure. They usually process an image as a whole and treat the pixels as a simple signal with no particular characteristics. Second generation schemes use the concept of objects in an image and introduce a model of the human visual system into the design of the coding scheme. Dynamic coding schemes, as their name suggests, dynamically select the best-suited scheme from a predefined set of coding schemes. The coder selection is computationally very intensive, and the optimization criterion is usually rate-distortion, which is not well correlated with the human visual system. In this work, a new approach to dynamic coding is proposed. The quality metric used for the optimization is a modified perceived quality metric. Because all existing quality metrics are applied to the full image and not to objects, an extension of such a metric is developed. This new region-based quality metric (RBQM) is then used throughout this work for quality evaluation. For its validation, a new web-based image quality rating system has been proposed and used to assess the RBQM. Both the coding scheme and the perceived quality evaluation are computationally heavy. To reduce the complexity, a different approach is presented, which models the coding algorithm and the quality evaluation together as a black box. For an image at the input of the black box, the output describes the perceived quality according to a selected metric.
This procedure is then implemented as a prediction of the perceived quality, given an input image. For the prediction itself, several nonlinear models have been evaluated for their performance. Because we adopted an object-based framework, the regions are usually not of the same size. Using the pixel values directly is impractical because it results in a varying number of parameters for the prediction black box. To overcome this, an intermediate representation is needed that keeps the characteristics of the input image but presents a constant number of inputs to the prediction system. For this purpose, we introduce decision features. These decision features represent the features of the input image as well as the coder state. This is important, because the coding quality of a given coding scheme depends heavily on coder state variables such as the bitrate. Three nonlinear predictors are then evaluated: artificial neural networks, radial basis functions and polynomial approximations. An exhaustive assessment of the best representation of the decision features for the predictor has also been carried out. A new representation, termed SlotRange, has been introduced and has improved the prediction of the quality. The quality prediction scheme is then verified by using it for coding scheme selection as well as for coding quality optimization. Several optimization criteria are proposed, including the classical rate-distortion criterion. Because the quality prediction is evaluated rapidly, it can also be used for less classical optimization criteria. Modeling the temporal aspect of the human visual system by decreasing the number of bits allocated to the regions in proportion to their speed is one of them. A scheme computing the point of focus of the eye is also used to distribute the bits in such a way that the point of focus gets a better quality than its surroundings.
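As an illustration only, the idea of decision features and prediction-based coding scheme selection can be sketched in a few lines of Python. Everything named here is an assumption made for the sketch: the function names decision_features and select_scheme, the choice of mean, variance and region size as image features, the scheme names, and the two stand-in predictors with arbitrary coefficients. None of it is taken from the thesis, whose actual features and trained predictors are not given in this abstract.

```python
from statistics import mean, pvariance


def decision_features(pixels, bitrate):
    """Map a region of any size to a fixed-length feature vector.

    The image statistics used here are placeholders chosen for
    illustration; the bitrate stands for the coder state.
    """
    return [mean(pixels), pvariance(pixels), float(len(pixels)), float(bitrate)]


def select_scheme(features, predictors):
    """Return the name of the scheme with the highest predicted quality."""
    return max(predictors, key=lambda name: predictors[name](features))


# Hypothetical stand-in predictors: arbitrary functions, not trained models.
predictors = {
    "scheme_a": lambda f: f[3] - 0.01 * f[1],  # penalised by region variance
    "scheme_b": lambda f: 0.8 * f[3],          # insensitive to variance
}

# Two regions of different sizes yield feature vectors of the same length.
small_flat = decision_features([10, 10, 10, 10], bitrate=1.0)
large_textured = decision_features([0, 255] * 50, bitrate=1.0)

print(select_scheme(small_flat, predictors))
print(select_scheme(large_textured, predictors))
```

The point of the sketch is that the predictors never see the pixels themselves, so a 4-pixel region and a 100-pixel region are handled by the same fixed-size input, and the selection reduces to comparing cheap predicted scores instead of running every coder.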
For the sake of completeness, a shape coding algorithm has been adapted to work in the object-based framework, including the handling of non-connected regions of the image. The dynamic coding scheme as developed here is therefore suitable for extensions to color models or motion pictures.
