I agree with you (and love your blog, btw), but I think you're skipping over at least a few benefits you get from a mature, well-built A/B framework that are hard to build into a bandit approach. The biggest one I've found personally useful is days-in analysis: for example, quantifying the impact of a signup-time experiment on one-week retention. This doesn't really apply to learning ranking functions or other transactional (short-feedback-loop) optimization.
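By days-in analysis I mean something like the sketch below: bucket users by the variant they were assigned at signup, then measure a delayed outcome such as activity at day 7. This is purely illustrative; the field names (variant, signup_ts, last_active_ts) are placeholders, not from any real system.

```python
from datetime import timedelta

def one_week_retention(users):
    """users: iterable of dicts with 'variant', 'signup_ts', 'last_active_ts'."""
    stats = {}  # variant -> [retained_count, total_count]
    for u in users:
        # crude proxy for one-week retention: the user was seen
        # at least seven days after they signed up
        retained = u["last_active_ts"] >= u["signup_ts"] + timedelta(days=7)
        bucket = stats.setdefault(u["variant"], [0, 0])
        bucket[0] += retained
        bucket[1] += 1
    return {v: retained / total for v, (retained, total) in stats.items()}
```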
That being said, building a "proper" a/b harness is really hard and will be a constant source of bugs / FUD around decision-making (don't believe me? try running an a/a experiment and see how many false positives you get). I've personally built a dead-simple bandit system when starting greenfield and would recommend the same to anyone else.
Probably worth mentioning that the Google Content Experiments framework is in the process of being replaced by Google Optimize (currently in a private beta), which does NOT make use of multi-armed bandits, much to my confusion and disappointment.
Huh. So do you know if they do anything to help with repeat testing/peeking?
Optimizely takes an interesting approach: they apply repeat-testing methods, segmenting the tests by user views of the results. It's like 30x more complicated than a multi-armed bandit, but they don't need a feedback mechanism.
That being said, building a "proper" a/b harness is really hard and will be a constant source of bugs / FUD around decision-making (don't believe me? try running an a/a experiment and see how many false positives you get). I've personally built a dead-simple bandit system when starting greenfield and would recommend the same to anyone else.