It's not just "some" caveats: `then` auto-unwrapping means that you cannot reliably use `then` as fmap. In JS there is no such thing as a Promise<Promise<Number>>: a promise resolved with another promise adopts that promise's state. So you cannot await/then the outer promise to get hold of the inner promise and do something with it.
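A minimal sketch of the flattening, for anyone who wants to see it. The names `inner`, `outer`, and `demo` are made up for illustration:

```javascript
// A promise that is already fulfilled with a plain number.
const inner = Promise.resolve(42);

// The then() callback returns a promise. If then() behaved like fmap,
// outer would be a Promise<Promise<number>>.
const outer = Promise.resolve(1).then(() => inner);

async function demo() {
  const value = await outer;
  // The nesting was flattened: value is the number, not the inner promise.
  return { value, isPromise: value instanceof Promise };
}

demo().then(console.log); // logs { value: 42, isPromise: false }
```

There is no step at which the inner promise can be observed as a value, which is why `then` cannot stand in for fmap here.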
In practice it's not too much of a problem, but it does mean that if you use some FP library, you cannot trivially wrap a promise to make it a monadic type that will work with the library's monad utilities.
So, just out of curiosity: I know in ECMAScript this is probably mainly for the purpose of giving you a convenient way to lift either (α → β) or (α → Promise β) into (Promise α → Promise β) without static typing or having two "then" functions, by forcing β and Promise β to be disjoint. It also allows looser composition underneath, since you can choose which kind of return to provide at runtime without having to re-wrap the bare case manually. But are there any other reasons for Promise of Promise to be squashed away like that?
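To make the lifting concrete, here is a rough sketch. `lift`, `double`, and `doubleLater` are hypothetical names for illustration, not part of any library:

```javascript
// α → β: returns a bare value.
const double = (n) => n * 2;

// α → Promise β: returns a promise of the value.
const doubleLater = (n) => Promise.resolve(n * 2);

// Hypothetical helper: lifts either kind of function into
// (Promise α → Promise β) using the single built-in then().
const lift = (f) => (p) => p.then(f);

async function liftDemo() {
  const a = await lift(double)(Promise.resolve(21));
  const b = await lift(doubleLater)(Promise.resolve(21));
  return [a, b];
}

liftDemo().then(console.log); // logs [ 42, 42 ]
```

Both calls yield a promise of a number, so the caller never needs to know which kind of function was passed in.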
Apart from simplicity, it was designed to be used with "thenables". If the value that a Promise resolved with had a `then()` member, the value would be treated like a first-party Promise. So even if there was a separate fmap-then and bind-then, the fmap-then would have to deal with thenables and act like bind-then anyway.
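A small sketch of the thenable behaviour, assuming a hand-written object with a `then()` method (`thenable` and `thenableDemo` are illustrative names):

```javascript
// A plain object with a then() method. It was not created by the
// Promise constructor and is not an instance of Promise.
const thenable = {
  then(onFulfilled) {
    onFulfilled("from thenable");
  },
};

async function thenableDemo() {
  // The callback returns the thenable. Promise resolution calls its
  // then() and adopts the value it passes to onFulfilled.
  return Promise.resolve("start").then(() => thenable);
}

thenableDemo().then(console.log); // logs "from thenable"
```

Since any object with a `then()` method gets this treatment, an fmap-style `then` would still have to special-case such values.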
There's a bunch of history in [1], particularly [2] and [3], if you care to read them.