Are Random Forests Truly the Best Classifiers?

Michael Wainberg; Babak Alipanahi; Brendan J. Frey

The JMLR study “Do we need hundreds of classifiers to solve real world classification problems?” benchmarks 179 classifiers in 17 families on 121 data sets from the UCI repository and claims that “the random forest is clearly the best family of classifier”. In this response, we show that the study's results are biased by the lack of a held-out test set and the exclusion of trials with errors. Further, the study's own statistical tests indicate that random forests do not have significantly higher percent accuracy than support vector machines and neural networks, calling into question the conclusion that random forests are the best classifiers.

Abstract