As I understand it, the projects and efforts you mention didn't "fail fail" - the computers didn't explode, incinerating the buildings and the programmers involved.
The ideas failed in a relative fashion. They failed to yield many results relative to the effort and resources that were put into them. And they especially failed to yield as many results as you could get by just increasing raw clock speed. And their cost was not just their raw complexity but the training required for programmers to understand parallelism (the low cost of today's "entry level" programmer is a huge boon to the IT industry. If companies had to spend a year on system-specific training, the cost would be vast).
But given that such projects were only failures relative to the alternative of just coming up with a simple architecture with a higher clock speed, if that alternative is going away, there's no reason they can't become relative successes. Watson was a relative success - if you could have Watson-level processing power on a single chip programmable with Ruby, the effort IBM put into the project would look silly. But since it looks like you can't, Watson seems like a productive use of resources.
In a lot of ways, the last thirty years have involved substituting low-cost, high-power chips for high-price, high-skill programmers, yielding a huge, de-skilled programming workforce. The end of Moore's-law-for-speed would seem to mean things will move differently in the future.
I won't address the particular architecture points you make, which all sound good but are unrelated to the general claim of previous parallelism efforts "failing".