Home Page

Papers

Submissions

News

Editorial Board

Proceedings

Open Source Software

Search

Statistics

Login

Frequently Asked Questions

Contact Us



RSS Feed

Consistent Feature Selection for Pattern Recognition in Polynomial Time

Roland Nilsson, José M. Peña, Johan Björkegren, Jesper Tegnér; 8(21):589−612, 2007.

Abstract

We analyze two different feature selection problems: finding a minimal feature set optimal for classification (MINIMAL-OPTIMAL) vs. finding all features relevant to the target variable (ALL-RELEVANT). The latter problem is motivated by recent applications within bioinformatics, particularly gene expression analysis. For both problems, we identify classes of data distributions for which there exist consistent, polynomial-time algorithms. We also prove that ALL-RELEVANT is much harder than MINIMAL-OPTIMAL and propose two consistent, polynomial-time algorithms. We argue that the distribution classes considered are reasonable in many practical cases, so that our results simplify feature selection in a wide range of machine learning tasks.

[abs][pdf][bib]       
© JMLR 2007. (edit, beta)