Mixtures of Product Components Versus Mixtures of Dependence Trees

01.10.2014 - 14:00

In the probabilistic approach to practical problems, we have to estimate unknown multivariate probability density functions or distributions from large, high-dimensional databases. The underlying densities are usually strongly multimodal, so mixtures of unimodal density functions suggest themselves as a suitable approximation tool. In this respect, product mixture models are preferable because they can be estimated efficiently from data by means of the EM algorithm and have some advantageous properties. In some cases, however, the simplicity of product components may prove too restrictive, and a natural idea is to use a more complex mixture of dependence-tree distributions. Dependence-tree distributions explicitly describe the statistical relationships between pairs of variables at the level of individual components, so the approximation power of the resulting mixture may increase substantially. Nonetheless, in an application to the classification of numerals, we have found that both models perform comparably and that the contribution of the dependence-tree structures decreases in the course of the EM iterations. Especially with a large number of components, the optimal estimate of the dependence-tree mixture tends to converge to a simple product mixture model.
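
To make the product mixture model concrete, the following is a minimal sketch of EM estimation for a mixture of product components. It assumes binary data and Bernoulli marginals, which the abstract does not specify; the function name, the initialization, and the clipping constant are illustrative choices, not the authors' implementation. Each component is a product of independent univariate distributions, so the E-step reduces to sums of per-variable log-probabilities and the M-step to weighted averages.

```python
import numpy as np

def em_bernoulli_product_mixture(X, n_components, n_iter=50, seed=0):
    """EM for a mixture of products of Bernoulli distributions (illustrative)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Component weights w_m and parameters theta[m, j] = P(x_j = 1 | component m).
    w = np.full(n_components, 1.0 / n_components)
    theta = rng.uniform(0.25, 0.75, size=(n_components, d))
    log_lik = []
    for _ in range(n_iter):
        # E-step: log of w_m * prod_j theta_mj^x_j * (1 - theta_mj)^(1 - x_j).
        log_p = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(w)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        q = np.exp(log_p - log_norm)  # posterior component probabilities
        log_lik.append(float(log_norm.sum()))
        # M-step: weighted relative frequencies; the small constant guards
        # against empty components, the clipping against log(0).
        nk = q.sum(axis=0) + 1e-12
        w = nk / nk.sum()
        theta = np.clip((q.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)
    return w, theta, log_lik

# Usage on synthetic binary data (hypothetical, for illustration only).
X_demo = (np.random.default_rng(1).random((200, 8)) < 0.5).astype(float)
w_demo, theta_demo, ll_demo = em_bernoulli_product_mixture(X_demo, 3, n_iter=20)
```

A dependence-tree mixture would replace the per-component product of marginals with a tree-structured factorization over pairs of variables; that extension is not shown here.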