While true in many cases, you can bet your bottom dollar that Intel did not spend $16.7bn to enter the embedded glue-logic market. It is very much the compute aspect they are after, as a weapon against AMD's HSA and other GPGPU-style solutions.
I designed the first commercial Bitcoin mining FPGAs, and though for a while the FPGAs were not competitive with GPUs overall (they did beat them on power and usability), they eventually were, with the advent of Kintex and similar. Of course, that only lasted briefly, as the rapid growth of the market led to an influx of VC money to fund ASICs.
And that's where FPGAs shine; that small to medium volume market where small companies are doing innovative things but don't have the millions required to risk building an ASIC.
Bitcoin mining was probably close to a best-case scenario for FPGA compute, though: it was computing a fixed function, designed for easy hardware implementation, at full capacity 24/7. And even then, actually implementing it and making it competitive was a colossal pain.
It was completely compute bound. Communicating with the host using just an RS232C UART was sufficient to keep the FPGA busy while computing Bitcoin's 2xSHA256 hashes.
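For reference, the function being ground out is tiny. Here's a minimal Python sketch of the double-SHA256 nonce search (the header bytes and target are dummy values for illustration; only the structure matters):

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Bitcoin's proof-of-work hash: SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(header_prefix: bytes, target: int) -> int | None:
    """Grind nonces until the digest, read as a little-endian
    256-bit integer, falls below the target."""
    for nonce in range(2**32):
        candidate = header_prefix + nonce.to_bytes(4, "little")
        if int.from_bytes(double_sha256(candidate), "little") < target:
            return nonce
    return None

# Dummy 76-byte header prefix and a deliberately easy target.
print(mine(bytes(76), 1 << 240))
```

Each nonce is independent, which is why the search mapped so well onto as many parallel hash cores as the fabric could fit.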
> And that's where FPGAs shine; that small to medium volume market where small companies are doing innovative things but don't have the millions required to risk building an ASIC
I don't disagree with you on that count. Since in most cases a processor does the job with a better cost/benefit ratio, FPGAs especially shine on very specialized, computation-heavy tasks.
Not only that, but I'd say they could be the best place to learn about the very bottom of the computing stack, and to experiment with chip design and wacky ideas.
> on-the-fly reconfiguration for specific computations
I always wondered, given that Intel's processors already have a pretty large gap between their instruction set and their real microcode, whether it would make sense to have a nominal "CPU" that, when fed an instruction stream, executes it normally on general-purpose cores, but also runs a tracing+profiling JIT over it to eventually generate a VHDL gate-equivalent program to jam into an on-core FPGA. "Hardware JIT", basically, with no explicit programming step needed.
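A toy sketch of just the dispatch policy, in Python; `synthesize_to_fabric` is entirely hypothetical and stands in for the whole hard part (tracing, lowering to RTL, place-and-route, partial reconfiguration):

```python
from collections import Counter

HOT_THRESHOLD = 10_000   # executions before synthesis is worth attempting

def synthesize_to_fabric(trace_id, software_fn):
    """Hypothetical stub. The real thing would lower the traced
    instructions to RTL, place-and-route them, and return a handle
    to a configured fabric slot; here it just hands back the
    software version so the sketch runs."""
    print(f"synthesizing trace {trace_id} to fabric")
    return software_fn

trace_counts = Counter()   # profiling counters, per instruction trace
fabric_slots = {}          # trace id -> "circuit" handle

def execute(trace_id, software_fn, *args):
    """Run a trace in software, promoting it to the fabric once hot."""
    if trace_id in fabric_slots:
        return fabric_slots[trace_id](*args)   # dispatch to "hardware"
    trace_counts[trace_id] += 1
    if trace_counts[trace_id] >= HOT_THRESHOLD:
        fabric_slots[trace_id] = synthesize_to_fabric(trace_id, software_fn)
    return software_fn(*args)
```

The policy is the easy part; whether the synthesis step could ever be fast enough to pay for itself is the open question.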
Programming a CPU is becoming more and more a problem of fitting as much as possible into the data caches. Bandwidth is the problem, not the speed of the execution units. I don't see the huge benefit of an FPGA here.
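To put rough numbers on that, here's a back-of-the-envelope roofline calculation; the bandwidth and peak-FLOP figures are assumptions for illustration, not specs of any particular part:

```python
# Back-of-the-envelope roofline; every figure here is assumed, not measured.
mem_bw_gb_s = 50.0     # sustained DRAM bandwidth in GB/s (assumption)
peak_gflops = 500.0    # peak FP64 throughput in GFLOP/s (assumption)

# A streaming kernel like y[i] = a*x[i] + y[i] moves 24 bytes per 2 FLOPs:
bytes_per_elem = 24    # load x[i], load y[i], store y[i]; 8-byte doubles
flops_per_elem = 2     # one multiply, one add

intensity = flops_per_elem / bytes_per_elem              # FLOP per byte
achievable = min(peak_gflops, mem_bw_gb_s * intensity)   # roofline bound
print(f"arithmetic intensity: {intensity:.3f} FLOP/byte")
print(f"achievable: {achievable:.1f} of {peak_gflops:.0f} GFLOP/s peak")
```

At those made-up but plausible numbers the execution units sit idle over 99% of the time waiting on memory, and an FPGA hanging off the same memory bus would be starved just the same.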
> But an FPGA (as it is today) cannot compete with a GPGPU
I read an interesting quip somewhere about software/hardware development: 'civil engineering would look very different if the properties of concrete changed every four years.'
If at some point we stop scaling chip performance, and many-core integration on a single chip/die stops making sense, then glue logic starts to look like a key differentiator, and control over glue logic starts to look like control over profits.
Intel ate the chipset for performance reasons and so they could shape their own destiny.
If there aren't fundamental breakthroughs to preserve performance scaling as we know it, then I see this as more of the same.
That's the logical assumption. Of course, to make this work, the tooling would have to be wildly different; at the very least, it would need an open bitstream format.