
I can't answer with certainty, but I think any fixed set of tests can be cheated by overfitting to it.



I do think that's a material risk. More broadly, though, do you think such a scheme would make the ecosystem better or worse? If you made a hypothetical JSON implementation in your language of choice, would you use it?


> I do think that's a material risk. More broadly, though, do you think such a scheme would make the ecosystem better or worse?

I don't think it would be easy to cheat if:

  - the tested implementation is open source: cheating would be too obvious,

  - the tests are constantly updated: cheating would be too cumbersome, and

  - the tests include randomization: cheating would not always work.

So satisfying these points would drastically increase trust in both the test corpus and the tested program.
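
To make the randomization point concrete, here is a minimal sketch, assuming Python's standard `json` module serves as the trusted reference. The names `random_json_value` and `differential_check` are hypothetical, invented for this illustration; they do not come from any existing test suite. A fresh seed yields a fresh corpus, so an implementation tuned to one fixed set of inputs gets no advantage.

```python
import json
import random
import string


def random_json_value(rng, depth=0):
    """Build a random JSON-compatible value (hypothetical generator)."""
    kinds = ["null", "bool", "int", "str"]
    if depth < 3:  # cap nesting so documents stay small
        kinds += ["list", "dict"]
    kind = rng.choice(kinds)
    if kind == "null":
        return None
    if kind == "bool":
        return rng.choice([True, False])
    if kind == "int":
        return rng.randint(-10**6, 10**6)
    if kind == "str":
        length = rng.randint(0, 8)
        return "".join(rng.choice(string.printable) for _ in range(length))
    if kind == "list":
        return [random_json_value(rng, depth + 1) for _ in range(rng.randint(0, 4))]
    result = {}
    for _ in range(rng.randint(0, 4)):
        key = "".join(rng.choice(string.ascii_letters) for _ in range(rng.randint(1, 5)))
        result[key] = random_json_value(rng, depth + 1)
    return result


def differential_check(parse_under_test, seed, rounds=200):
    """Return the documents on which parse_under_test disagrees with json.loads."""
    rng = random.Random(seed)  # same seed reproduces the same corpus
    failures = []
    for _ in range(rounds):
        text = json.dumps(random_json_value(rng))
        try:
            ok = parse_under_test(text) == json.loads(text)
        except Exception:
            ok = False  # a crash on valid input counts as a failure
        if not ok:
            failures.append(text)
    return failures
```

Calling `differential_check(my_parser, seed=...)` with a seed chosen at test time, and publishing the seed afterwards, would keep failures reproducible while leaving the corpus unpredictable in advance. This only covers valid documents; rejecting malformed input would need a separate generator.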

> If you made a hypothetical JSON implementation in your language of choice, would you use it?

On my machine? I use my own hacked kernel on my machine! In production? Only if tests indicate my implementation is as good as the best ones available.



