Publication details

Introduction to Feature Selection Toolbox 3 – The C++ Library for Subset Search, Data Modeling and Classification

Research Report

Somol Petr, Vácha Pavel, Mikeš Stanislav, Hora Jan, Pudil Pavel, Žid Pavel

publisher: ÚTIA, (Praha 2010)

edition: Research Report 2287

research: CEZ:AV0Z10750506

project(s): 1M0572, GA MŠk, 2C06019, GA MŠk

keywords: feature selection, software library, subset search, attribute selection, variable selection, optimization, machine learning, classification, pattern recognition


abstract (eng):

We introduce a new standalone, widely applicable software library for feature selection (also known as attribute or variable selection). It can reduce problem dimensionality in order to maximize the accuracy of data models and the performance of automatic decision rules, and to reduce data acquisition cost. The library can be used in both research and industry: less experienced users can experiment with the provided methods and apply them to real-life problems, while experts can implement their own criteria or search schemes within the toolbox framework. In this paper we first give a concise survey of existing feature selection approaches. We then focus on a selected group of methods with good general performance, and on tools that go beyond the limits of existing libraries. We build a feature selection framework around them and design an object-based generic software library. Finally, we describe the key design points and properties of the library.