Publication details

A Statistical Review of the MNIST Benchmark Data Problem

Monograph Chapter

Grim Jiří, Somol Petr


serial: Advances in Pattern Recognition Research, p. 172-193, Eds: Lu T., Chao T.H.

project(s): GA17-18407S, GA ČR

keywords: MNIST benchmark, multivariate Bernoulli mixtures, EM algorithm

abstract (eng):

The recognition of MNIST numerals is discussed as a benchmark problem. Applying probabilistic neural networks to the MNIST data, we found that the training and test sets have slightly different statistical properties, with negative consequences for classifier performance. We assume that the frequently used extension of the MNIST training data by distorted patterns improves recognition accuracy by creating images similar to the atypical test-set numerals. In this way, benchmark experiments may be influenced by external knowledge about handwritten digits, and the comparative value of the benchmark becomes more or less limited to the recognition of MNIST numerals. As a more generally applicable benchmark model, we propose the recognition of artificial binary patterns generated on a chessboard by random moves of the rook and knight pieces.

RIV: IN