You don't lose local knowledge, no. You're basically running a set of parallel stochastic hill climbs that can share information. In fact, with suitable parameters, a GA reduces to a stochastic hill climber.
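To make that reduction concrete, here's a minimal sketch (mine, not from any particular textbook): a GA with population size 1, mutation as the only variation operator, and elitist selection is exactly the classic (1+1) EA, i.e. a stochastic hill climber. OneMax (count the 1-bits) is just a toy fitness function for illustration.

```python
import random

random.seed(0)  # for reproducibility of this sketch

def one_plus_one_ea(fitness, genome_len=20, steps=2000, mut_rate=0.05):
    """A GA degenerated to population size 1 with mutation-only variation
    and elitist selection: a stochastic hill climber (the (1+1) EA)."""
    parent = [random.randint(0, 1) for _ in range(genome_len)]
    for _ in range(steps):
        # Mutation is the only operator: with one individual, crossover
        # has nothing to recombine.
        child = [1 - g if random.random() < mut_rate else g for g in parent]
        # Elitist selection: accept the child only if it is at least as fit,
        # which is precisely a stochastic hill-climbing acceptance rule.
        if fitness(child) >= fitness(parent):
            parent = child
    return parent

# OneMax: fitness is simply the number of 1-bits in the genome.
best = one_plus_one_ea(sum)
print(sum(best))
```

With a larger population and crossover switched back on, the same loop becomes a full GA; the hill-climbing behaviour is the degenerate case, not a different algorithm.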
You're right that the problem structure is key; the No Free Lunch Theorem for Search says that this will always be the case. But on search problems with many related local maxima, evolution will often outperform a series of stochastic gradient ascents, because information gained on one hill can be applied to another.
Of course, search and knowledge are always two sides of the same coin: the less knowledge you encode, the more search you'll have to do. But GAs allow you to encode knowledge too, particularly in their operators.
There's been quite a bit of work done on the kinds of fitness landscapes that evolutionary algorithms are better suited for.
[NB: I used 'ascent' terminology to avoid confusion - it's a convention only, of course; energy minimization and fitness maximization describe the same search.]