
There are better approaches to this problem, with asymptotically zero regret (the average regret per round goes to zero). Take a look at the UCB (Upper Confidence Bound) algorithm. You can do even more if you assume some continuity, e.g. a common choice is to assume that the reward function over the arms is drawn from a Gaussian process. Many interesting ideas in the literature indeed :)
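
For anyone curious what UCB looks like in practice, here is a minimal sketch of the UCB1 variant, assuming rewards in [0, 1]. The names `ucb1`, `pull`, and `probs`, and the three Bernoulli arms, are made up for illustration and are not from the article being discussed.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds; `pull(arm)` returns a reward in [0, 1]."""
    counts = [0] * n_arms    # number of times each arm was played
    means = [0.0] * n_arms   # running mean reward of each arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1      # play every arm once to initialize
        else:
            # Optimism under uncertainty: empirical mean plus an
            # exploration bonus that shrinks as the arm is played more.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means

# Hypothetical setup: three Bernoulli arms with made-up success rates.
rng = random.Random(0)
probs = [0.2, 0.5, 0.8]
counts, means = ucb1(lambda a: 1.0 if rng.random() < probs[a] else 0.0,
                     len(probs), 5000)
```

The bonus term is what keeps every arm from being abandoned too early: a rarely played arm has a large bonus, so it gets retried until the data says it is really worse. Over a long run, most pulls should end up on the best arm.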


