
In my experience, when a data set is very small it is almost always also biased by how easy it was to gather, which makes it non-representative as well. Think about it: if it were as easy to collect n=5000 as n=25, you would always pick 5000. You only pick n=25 because of the low effort involved, which often means sampling whatever is nearby.

A very common example is when a software feature is A/B tested only internally, or even only on the team that developed it. That biases the sample in users’ technical competence, their understanding of (and willingness to understand) the new behavior, how their environment is set up, etc.
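
To make the point concrete, here is a rough simulation sketch. Everything in it is an assumption made up for illustration: the population, the "competence" score, the rule that a user's chance of succeeding with the feature equals their competence, and the idea that the dev team is drawn from the top 10% of competence. It only shows how a small sample of nearby users can land far from the true rate while a large random sample does not.

```python
import random


def simulate(seed=0, population_size=100_000, team_size=25, random_n=5000):
    """Compare a small convenience sample with a large random sample.

    All modeling choices here are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)

    # Hypothetical population: each user has a competence score in [0, 1).
    competence = [rng.random() for _ in range(population_size)]

    # Assumed model: probability of succeeding with the feature == competence.
    succeeded = [rng.random() < c for c in competence]
    true_rate = sum(succeeded) / population_size

    # Convenience sample: 25 users from the "team", assumed to be the most
    # technically competent 10% of the population.
    insiders = [i for i, c in enumerate(competence) if c > 0.9]
    team = rng.sample(insiders, team_size)
    team_rate = sum(succeeded[i] for i in team) / team_size

    # Large sample drawn uniformly at random from everyone.
    sampled = rng.sample(range(population_size), random_n)
    random_rate = sum(succeeded[i] for i in sampled) / random_n

    return true_rate, team_rate, random_rate


if __name__ == "__main__":
    true_rate, team_rate, random_rate = simulate()
    print(f"true success rate:        {true_rate:.3f}")
    print(f"team-only sample (n=25):  {team_rate:.3f}")
    print(f"random sample (n=5000):   {random_rate:.3f}")
```

Under these assumptions the team-only estimate sits well above the true rate. That gap comes from who was sampled, not just from n being small: drawing 5000 users from the same insider pool would give a tighter estimate of the wrong number.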



