
That's not thoroughly ridiculous.

For many problems in machine learning, k-nearest-neighbors with a large dataset is very hard to beat in terms of error rate. Of course, the time to run a query is beyond atrocious, so other models are favored even when kNN has a lower error rate.
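A minimal sketch of that trade-off, using scikit-learn on a synthetic dataset (the dataset, models, and sizes here are my own choices, not anything from the thread): kNN can match a simple parametric model on accuracy, but its prediction time scales with the size of the training set because every query scans or searches the stored examples.

  import time

  from sklearn.datasets import make_classification
  from sklearn.linear_model import LogisticRegression
  from sklearn.model_selection import train_test_split
  from sklearn.neighbors import KNeighborsClassifier

  # Synthetic stand-in for a "large dataset"; real gaps show up at much larger scales.
  X, y = make_classification(n_samples=50_000, n_features=20, random_state=0)
  X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

  for name, model in [("k-NN (k=5)", KNeighborsClassifier(n_neighbors=5)),
                      ("logistic regression", LogisticRegression(max_iter=1000))]:
      model.fit(X_train, y_train)
      start = time.perf_counter()
      acc = model.score(X_test, y_test)  # for k-NN this runs the expensive neighbor queries
      elapsed = time.perf_counter() - start
      print(f"{name}: accuracy={acc:.3f}, prediction time={elapsed:.2f}s")

In practice the query cost is usually tamed with approximate nearest-neighbor indexes rather than brute force, at some cost in error rate.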



According to [1], k-NN is pretty far from being the top general classifier. Admittedly, all the data sets are small, with no more than 130,000 instances. When does it start becoming "very hard to beat", and what are you basing that on?

1. http://jmlr.org/papers/volume15/delgado14a/delgado14a.pdf


The vast majority of the data sets are very small (on the order of 100 to 1000 examples!). In fact, in the paper they discarded some of the larger UCI datasets.

It's not surprising that they found random forests perform better, even though the conventional wisdom is that boosting outperforms them: random forests are much harder to overfit, which matters on data sets this small.
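A quick way to check that intuition yourself is a cross-validated comparison on a small, UCI-sized dataset; the sketch below uses scikit-learn's breast-cancer data and default-ish hyperparameters, all of which are my own assumptions rather than the setup in the paper.

  from sklearn.datasets import load_breast_cancer
  from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
  from sklearn.model_selection import cross_val_score

  # 569 examples -- "small" by the standards discussed in this thread.
  X, y = load_breast_cancer(return_X_y=True)

  models = [
      ("random forest", RandomForestClassifier(n_estimators=500, random_state=0)),
      ("gradient boosting", GradientBoostingClassifier(n_estimators=500, random_state=0)),
  ]

  for name, model in models:
      scores = cross_val_score(model, X, y, cv=10)
      print(f"{name}: mean accuracy={scores.mean():.3f} (+/- {scores.std():.3f})")

On tiny datasets like this, boosting's edge tends to depend heavily on tuning (learning rate, depth, number of rounds), while a forest with defaults is usually close to its best performance, which is consistent with the paper's ranking.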



