
You have to account for the fact that you used so many algorithms, though. Using ten different algorithms makes it roughly ten times more likely that you'll fit your dataset well just by chance.
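A quick toy simulation (mine, not from the paper) makes the effect concrete: even when the labels are pure noise and every "classifier" is just a coin flip, reporting the best of ten runs inflates the apparent accuracy well above chance.

    import numpy as np

    rng = np.random.default_rng(0)
    n_samples, n_algorithms, n_trials = 100, 10, 2000

    best_accs, single_accs = [], []
    for _ in range(n_trials):
        # Labels are pure noise, so every "algorithm" (a random
        # predictor here) has a test accuracy that is just a draw
        # around 50%.
        y = rng.integers(0, 2, n_samples)
        accs = [np.mean(rng.integers(0, 2, n_samples) == y)
                for _ in range(n_algorithms)]
        single_accs.append(accs[0])
        best_accs.append(max(accs))

    print(f"one algorithm, mean accuracy:  {np.mean(single_accs):.3f}")
    print(f"best of {n_algorithms}, mean accuracy:     {np.mean(best_accs):.3f}")

With 100 test samples this prints roughly 0.50 for a single algorithm but around 0.57 for the best of ten, purely from selection.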



I acknowledge that the reported accuracy of a system will be higher if you take the max accuracy of 10 methods that have the same 'true' accuracy ± some noise. But in my opinion (as someone currently working in ML who has also worked in computational neuroscience), the results presented in Table 5 of the paper are very unlikely to be due solely to randomly trying different ML techniques, with no noteworthy difference between the target classes in the underlying data.
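To put a rough number on that: for 10 estimates with the same true accuracy plus independent Gaussian noise, the expected maximum sits only about 1.5 noise standard deviations above the true value. A small sketch (the 0.70 accuracy and 0.03 noise SD are made-up values for illustration, not figures from the paper):

    import numpy as np

    rng = np.random.default_rng(1)
    true_acc, noise_sd, n_methods = 0.70, 0.03, 10

    # Draw 10 noisy accuracy estimates around the same true value
    # many times and see how far the reported maximum drifts up.
    estimates = rng.normal(true_acc, noise_sd, size=(100_000, n_methods))
    print(f"true accuracy:              {true_acc:.3f}")
    print(f"mean reported (best of {n_methods}): {estimates.max(axis=1).mean():.3f}")
    # approx. true_acc + 1.54 * noise_sd for 10 iid normal draws

So selection over 10 methods buys you about +0.046 here, a modest bump rather than a large effect.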

If these results are independently replicated on a different dataset, the degree to which the method has been oversold will become clear. I just don't think it makes sense to doubt the results (i.e. that with a grid of EEG sensors and bandpower features it is possible to identify a portion of autism cases) based on this factor alone.


>"The results presented in table 5 of the paper are very unlikely... to be solely due to randomly trying different ML techniques"

This is a strawman; no one is arguing that their methods didn't pick up on some real correlations. The objection is that selecting the best of many methods inflates the reported accuracy.



