I definitely disagree about curveballs. Coming across stuff you don't like is part of discovering what you do like.
On your first trip to the record store, you won't have any idea what you're buying. It might be good, it might not be good. After a while, you get to know the owner of the store, and they learn your tastes, and you learn a bit more about music and start to know what will be good and what won't be good. Sometimes you'll both make a mistake and you'll still buy something you don't like. And you'll learn from that.
Why shouldn't our learning algorithms work the same way?