Publication details

Conference Paper (international conference)

Application of multinomial mixture model to text classification

Novovičová Jana, Malík Antonín

: Pattern Recognition and Image Analysis, p. 646-653, Eds: Perales F. J., Campilho A. J. C.

: Springer, (Berlin 2003)


: Iberian Conference on Pattern Recognition and Image Analysis. IbPRIA 2003 /1./, (Puerto de Andratx, ES, 04.06.2003-06.06.2003)

: CEZ:AV0Z1075907

: IAA2075302, GA AV ČR, KSK1019101, GA AV ČR

: text classification, multinomial mixture model, Bhattacharyya distance

(eng): A mixture of multinomial distributions is proposed as a model for class-conditional distributions in the document classification task. Experimental results on the Reuters and Newsgroups data sets indicate the effectiveness of the multinomial mixture model. Furthermore, classification accuracy increases for small training data sets when the multiclass Bhattacharyya distance is used instead of average mutual information as the feature selection criterion.

: 09K, 12B

: BB