Image & Video Analysis / Processing

Query by Pictorial Example

The appearance of real scenes depends strongly on acquisition conditions such as illumination and viewpoint, which significantly complicates automatic analysis of images of such scenes. In this thesis, we introduce novel textural features suitable for robust recognition of natural and artificial materials (textures) present in real scenes. The features are based on efficient modelling of spatial relations by a type of Markov random field (MRF) model, and we prove that they are invariant to illumination colour, cast shadows, and texture rotation. Moreover, the features are robust to changes in illumination direction and to degradation by Gaussian noise, and they are also related to human perception of textures.
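A minimal sketch of the invariance idea, not the thesis's actual features: for a causal autoregressive (AR) model, a simple special case of the MRF family, the least-squares parameter estimates are unchanged by a multiplicative brightness change, because both the predicted pixel and its neighbours scale by the same factor. The three-neighbour model and the synthetic texture below are illustrative assumptions.

```python
import numpy as np

def fit_causal_ar(img):
    """Least-squares fit of a simple causal 2D AR model: predict pixel
    (r, c) from its left, upper, and upper-left neighbours."""
    y = img[1:, 1:].ravel()
    X = np.column_stack([
        img[1:, :-1].ravel(),   # left neighbour
        img[:-1, 1:].ravel(),   # upper neighbour
        img[:-1, :-1].ravel(),  # upper-left neighbour
    ])
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

rng = np.random.default_rng(0)
tex = rng.random((64, 64))
# crude synthetic texture: smoothing introduces spatial correlation
tex = (tex + np.roll(tex, 1, 0) + np.roll(tex, 1, 1)) / 3

theta_dark = fit_causal_ar(tex)
theta_bright = fit_causal_ar(2.5 * tex)   # multiplicative brightness change
print(np.allclose(theta_dark, theta_bright))  # True: coefficients unchanged
```

The invariance is exact here because scaling the image scales both `y` and `X`, and the common factor cancels in the normal equations; the thesis builds richer invariants (to illumination colour and rotation) on the same modelling principle.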

Image Retrieval Measures Based on Illumination Invariant Textural MRF Features

Content-based image retrieval (CBIR) systems rank database images by feature similarity with respect to the query. We introduce fast and robust image retrieval measures that utilise novel illumination-invariant features extracted from three different Markov random field (MRF) based texture representations. These measures allow retrieving images of similar scenes comprising colour-textured objects viewed under different illumination brightness or spectrum. The proposed illumination-insensitive measures compare favourably with the most frequently used alternatives, such as Local Binary Patterns, steerable pyramid, and Gabor textural features. The superiority of the new illumination-invariant measures and their robustness to added noise are demonstrated empirically on illumination-invariant recognition of textures from the Outex database.
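The retrieval step itself reduces to ranking by a feature-space distance. A minimal sketch, with placeholder feature vectors standing in for the MRF-based illumination invariants (the L1 metric and the toy vectors are assumptions for illustration):

```python
import numpy as np

def l1_rank(query_feat, db_feats):
    """Rank database images by L1 distance between their feature
    vectors and the query (smaller distance = better match)."""
    d = np.abs(db_feats - query_feat).sum(axis=1)
    return np.argsort(d)

# Hypothetical feature vectors; the thesis extracts illumination
# invariants from MRF texture models instead.
query = np.array([0.2, 0.5, 0.1])
db = np.array([
    [0.9, 0.1, 0.8],    # dissimilar texture
    [0.21, 0.49, 0.1],  # near-duplicate of the query
    [0.5, 0.5, 0.5],
])
print(l1_rank(query, db))  # near-duplicate (index 1) ranks first
```

Because the features themselves are illumination invariant, a plain metric such as L1 suffices; no per-image illumination normalisation is needed at query time.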

A Fast Model-Based Restoration of Colour Movie Scratches

This work presents a new type of scratch-removal algorithm based on causal adaptive multidimensional, multitemporal prediction. The predictor exploits the spectral, temporal, and spatial correlation of video data, using available information from the neighbourhood of a missing multispectral pixel but no information from the failed pixel itself. The model assumes white Gaussian noise in each spectral layer; the layers may be mutually correlated. A significant improvement over the 3D model is obtained when temporal information is included, i.e., by using the 3.5D causal AR model. Owing to the high between-frame temporal correlation, such information is naturally obtained from the previous and/or following frame(s), for which all necessary data are known. Data from different frames (specified by the contextual neighbourhood) can therefore be treated uniformly: each data item carries information about its shift relative to the predicted pixel position. The contextual neighbourhood has to be causal (in the reconstructed frame-lattice subspace), meaning the predictor may use only data from the model history. Assuming a normal-Wishart parameter prior, the predictor then has an analytical (non-iterative) solution.
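A simplified, spatial-only sketch of the causal prediction step (the thesis's 3.5D model additionally stacks spectral layers and shifted pixels from neighbouring frames into the contextual neighbourhood, and uses a Bayesian normal-Wishart estimator rather than plain least squares):

```python
import numpy as np

def restore_scratch(img, mask):
    """Fill masked ('scratch') pixels by causal AR prediction:
    estimate AR coefficients from undamaged pixels only, then sweep
    the frame in raster order so every predicted pixel sees only
    already-known (causal) neighbours -- never the failed pixel."""
    out = img.copy()
    # design matrix from pixels whose whole neighbourhood is undamaged
    valid = ~(mask[1:, 1:] | mask[1:, :-1] | mask[:-1, 1:] | mask[:-1, :-1])
    y = img[1:, 1:][valid]
    X = np.column_stack([
        img[1:, :-1][valid],    # left neighbour
        img[:-1, 1:][valid],    # upper neighbour
        img[:-1, :-1][valid],   # upper-left neighbour
    ])
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    # raster-order sweep: causal neighbours are already known or restored
    for r in range(1, img.shape[0]):
        for c in range(1, img.shape[1]):
            if mask[r, c]:
                out[r, c] = theta @ np.array(
                    [out[r, c - 1], out[r - 1, c], out[r - 1, c - 1]])
    return out

rng = np.random.default_rng(1)
frame = rng.random((64, 64))
frame = (frame + np.roll(frame, 1, 0) + np.roll(frame, 1, 1)) / 3  # correlate

mask = np.zeros_like(frame, dtype=bool)
mask[5:60, 30] = True                      # a synthetic vertical scratch
damaged = np.where(mask, 0.0, frame)

restored = restore_scratch(damaged, mask)
err_restored = np.abs(restored[mask] - frame[mask]).mean()
err_zeroed = np.abs(damaged[mask] - frame[mask]).mean()
print(err_restored < err_zeroed)           # prediction beats leaving the hole
```

The causality constraint is what makes the raster-order sweep valid: by the time a scratch pixel is predicted, all three of its contextual neighbours are either original data or already-restored values, so no iteration over the frame is needed.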