Publication details

Application of finite mixtures to text document classification

Conference Paper (Czech conference)

Novovičová Jana, Malík Antonín


serial: Znalosti 2003. Sborník příspěvků 2. ročníku konference, p. 23-32, Eds: Svátek V.

publisher: VŠB, (Ostrava 2003)

action: Znalosti 2003 /2./, (Ostrava, CZ, 19.02.2003-21.02.2003)

research: CEZ:AV0Z1075907

project(s): IAA2075302, GA AV ČR, KSK1019101, GA AV ČR

keywords: text classification, mixture model

abstract (eng):

Finite mixture modelling of class-conditional distributions is a standard method in statistical pattern recognition. We propose using a mixture of multinomial distributions as the model of the class-conditional distribution in the text document classification task. Documents are represented as vectors using the bag-of-words (unigram) approach. The proposed model was compared experimentally with the standard models on the Reuters-21578 database.
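As a rough illustration of the idea in the abstract, the sketch below scores a bag-of-words count vector under a mixture of multinomial distributions per class and picks the class with the highest posterior score. It is only a minimal sketch under stated assumptions: the function names, the class labels, the three-word vocabulary and all parameter values are hypothetical and are not taken from the paper, parameter estimation (for example by EM) is omitted, and all word probabilities are assumed to be strictly positive.

```python
import math

def log_multinomial(counts, theta):
    """Log-probability of a word-count vector under one multinomial component.

    Assumes every entry of theta is strictly positive (e.g. after smoothing).
    """
    n = sum(counts)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_coef + sum(c * math.log(t) for c, t in zip(counts, theta) if c > 0)

def log_mixture(counts, weights, thetas):
    """Log of sum_m w_m * Multinomial(counts; theta_m), computed via log-sum-exp."""
    terms = [math.log(w) + log_multinomial(counts, th)
             for w, th in zip(weights, thetas)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def classify(counts, priors, models):
    """Return the class maximising log P(c) + log P(counts | c)."""
    scores = {c: math.log(priors[c]) + log_mixture(counts, *models[c])
              for c in priors}
    return max(scores, key=scores.get)

# Hypothetical toy parameters over a 3-word vocabulary (not from the paper).
# Each class maps to (mixture weights, list of per-component word probabilities).
models = {
    "grain": ([0.6, 0.4], [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]]),
    "trade": ([1.0], [[0.1, 0.2, 0.7]]),
}
priors = {"grain": 0.5, "trade": 0.5}

label = classify([3, 1, 0], priors, models)
```

A class with a single component, such as the hypothetical "trade" class here, reduces to an ordinary multinomial model, which is presumably the kind of standard baseline the mixture is compared against.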

Cosati: 09K, 12B

RIV: BB