What went wrong with Intel's MIC (Xeon Phi) project? I can't find a comprehensive account of this from HPC people. The idea seemed pretty sound: large die, simpler circuit, and much more parallelism, in the x86 line.
I vaguely remember that at the dawn of deep learning (2013 to 2014), there was talk and hope that Xeon Phi would smash the performance of Nvidia GPUs. However, the samples people got arrived too late (I believe at the end of 2014), and the performance figures were disappointing. It might just be that the software was not there yet, unfortunately. But then the wheels moved forward and everyone started buying Nvidia chips for their datacenters.
They didn't really go that wrong; Cori and Trinity are still useful machines. But I can say that for computational science, programming models have gotten way better in the last few years. Now it's as easy to get a new sparse algorithm to high occupancy on a GPU as it is to scale on a manycore CPU. So GPUs just look better now considering cost, power efficiency and software support.