Invocations of massive parallelism are always exciting and often convincing because of the huge speedups you can get for niche applications. But historically, SIMD has not been successful (http://en.wikipedia.org/wiki/SIMD). Or, as a NASA program manager in the area of parallel computing once remarked more colorfully in the mid-90s, "SIMD is dead".
You could argue that GPUs have been a successful back door pathway for SIMD. This seems analogous to the offboard processors that are used in medical imaging or military signal processing (radar) -- there is a SIMD unit, but it is controlled by a general-purpose computer, and the whole thing is in a peculiar niche.
Then why does Intel keep adding wider, more capable SIMD instruction sets with every new CPU?
Why did ARM invent NEON and put it on all of its high-end chips, where it takes up more die area than the entire main ALU?
Why are so many DSPs, for so many purposes, basically dumb SIMD machines? Broadcom's VideoCore, for example, is just a specialized 512-bit SIMD chip with insanely flexible instructions and a gigantic, transposable register file.
OpenCL/CUDA are SIMD languages -- controlling them with a "general purpose computer" isn't any different from your typical Intel chip, which is also a big SIMD unit controlled by a general-purpose computer.
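To make that concrete, here is a minimal sketch of an OpenCL C kernel (the name `saxpy` and its arguments are just illustrative). Every work-item runs the same instruction stream on its own slice of the data, and a host program on a general-purpose CPU compiles and launches it -- exactly the SIMD-unit-plus-controller arrangement described above:

```c
/* OpenCL C kernel: one instruction stream, fanned out across
   many data elements.  The host CPU compiles this source and
   enqueues it over an N-element range. */
__kernel void saxpy(const float a,
                    __global const float *x,
                    __global float *y)
{
    size_t i = get_global_id(0);  /* this work-item's lane index */
    y[i] = a * x[i] + y[i];       /* same operation, different data */
}
```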
It's rather difficult to call something that is an integral part of billions of CPUs "not successful". Without SIMD, much of modern computing would be an order of magnitude slower -- more for GPUs, whose shader units are basically SIMD engines and little more.
- Because SIMD is 'easy' from the processor's point of view: the compiler is tasked with bundling parallel computations together (see the sketch after this list). SIMD died out back in the 90s because no one could get that magical parallelizing compiler to work. Similar story with VLIW ISAs.
- Modern GPUs are not your '80s SIMD. Programs are written at very fine granularity, and lots of processor real estate is dedicated to scheduling and task switching -- basically bundling fine-grained computations wherever possible. Modern GPUs are more like the '80s experimental dataflow machines than the SIMD machines of that era.
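As a sketch of what the first bullet means by "bundling": a loop like the one below (the name `scale_add` is hypothetical) has independent, unit-stride iterations with no aliasing, which is roughly the only shape auto-vectorizers handle reliably -- one reason the "magical" general-purpose parallelizing compiler never materialized:

```c
/* A loop a vectorizing compiler can "bundle": independent
   iterations, unit stride, no aliasing (restrict).  gcc/clang
   at -O3 will typically emit SSE/AVX or NEON for this; add a
   cross-iteration dependence and the bundling quietly fails. */
void scale_add(float *restrict y, const float *restrict x,
               float a, int n)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```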
SIMD died out back in the 90s because no one could get that magical parallelizing compiler to work.
That "magical parallelizing compiler" does work, and is known as a "human being". This curious type of compiler has been used for over 20 years to produce countless lines of SIMD code, ranging from Altivec to SSE.
The list of examples is beside my point, and the point of the original article.
The original article seems to be talking about SIMD as a general-purpose computer, with general-purpose applications, not as a secondary unit to another system.
My comment acknowledged the uses you list above, especially for GPUs, in which SIMD is a secondary processor to a controlling general-purpose computer.
I programmed Conway's Life on a 32x32 DAP in 1986, and I guess I feel like I've seen these claims once already.