I always thought the problem with Bulldozer was that marketing considered a module to be 2 cores, while someone in engineering meant for a module to compete with a single Intel core. If you go by the latter definition, Bulldozer kicked Intel's ass in multithreaded performance and can still compete. That makes sense, because they touted a different variant of hyperthreading that they claimed was better. The problem, of course, is that this makes for a huge, expensive core and only half as many per chip for marketing purposes. It also had shit for single-thread performance either way. But I've always felt this difference in interpretation of the architecture was a big part of the failure.
Even if you compare a 4-module Bulldozer/Piledriver against a 4-core Intel Sandy/Ivy Bridge, AMD's die size and power consumption were much higher to deliver similar or worse performance.
They were also stuck at /least/ a process node back due to earlier anti-competitive practices that Intel didn't get sufficiently penalized for. (Those practices literally forced AMD out of the same /business/ as Intel, which is what cemented the situation they've ended up in.) A fair settlement would have broken up Intel into isolated fabrication and pattern-design businesses.