Publication details

Structural Poisson Mixtures for Classification of Documents

Conference Paper (international conference)

Grim Jiří, Novovičová Jana, Somol Petr


serial: Proceedings of the 19th International Conference on Pattern Recognition, p. 1324-1327

action: 19th International Conference on Pattern Recognition (Tampa, US, 07.12.2008-11.12.2008)

research: CEZ:AV0Z10750506

project(s): 1M0572, GA MŠk, 2C06019, GA MŠk, GA102/07/1594, GA ČR

keywords: classification of documents, Poisson mixtures, structural approach

abstract (eng):

For the statistical text classification problem, we approximate class-conditional probability distributions by structurally modified Poisson mixtures. The structural model lets us use different subsets of input variables to evaluate the conditional probabilities of different classes in the Bayes formula. The method is applicable to document vectors of arbitrary dimension without any preprocessing. The structural optimization can be incorporated into the EM algorithm in a statistically correct way.

abstract (cze, English translation):

Within the statistical approach to the document classification problem, documents are represented in bag-of-words form. The class-conditional distributions of documents are approximated by structural Poisson mixtures. Bayesian classification of documents is evaluated on the Reuters and 20 NEWSGROUPS data sets.

RIV: IN