To put the mind-boggling amount of logic in a contemporary FPGA into perspective, the smallest device in the Kintex-7 family seems to be the XC7K70T...
What does one even use that much FPGA for? Any organization that could afford that could probably afford making an ASIC of comparable or better performance.
One example that comes to mind is data analysis for radio astronomy. With arrays of radio telescopes, as far as I understand it, you downconvert an incoming frequency band spanning a few GHz, which gives you a few GByte/sec of data for each antenna. Then you might have an array of 50 telescopes (the ALMA array is planned to have 50 or so, I think).
This stream of maybe a TByte/sec of data will then be filtered and decimated/downconverted in real time by racks full of DSP/FPGA boards. Here's a picture of one board used for an Australian facility:
5x Virtex II XC2VP50 (23,616 slices, 2 PowerPC CPU blocks, $1700 each)
5x Virtex 4 XC4VSX55 (15,360 slices, $1300 each)
Yes, an ASIC might be more energy efficient and could be made faster, but FPGAs give you the flexibility to adapt your algorithms and filter topologies. And an ASIC run might cost you half a million dollars, whereas with FPGAs you only spend ~$15,000 per board.
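To show where the ~$15,000 comes from, here is a quick sketch that just adds up the chip prices listed above. It counts the FPGAs only (PCB, memory, connectors and so on are ignored), and the half-million ASIC figure is the rough guess from this comment, not a quote.

```python
# Chip counts and unit prices as listed above (FPGAs only, nothing else on the board).
chips = [
    ("Virtex II XC2VP50", 5, 1700),
    ("Virtex 4 XC4VSX55", 5, 1300),
]

board_cost = sum(count * price for _, count, price in chips)

# Rough ASIC run cost mentioned above; an assumption, not a quoted price.
asic_run_cost = 500_000
boards_per_asic_run = asic_run_cost / board_cost

print(f"FPGA cost per board: ${board_cost:,}")
print(f"Boards you could populate for one ASIC run: {boards_per_asic_run:.1f}")
```

So on these numbers you could populate about 33 boards with FPGAs before reaching the cost of a single ASIC run.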
If you're going to build and sell a hundred £20,000 devices for some very specific market niche, it is still more justified than building hundreds of thousands of £10 devices and only selling one hundred.
http://www.xilinx.com/products/silicon-devices/fpga/kintex-7... (<-- HTML overview page at Xilinx)
...which contains 10,250 slices where one slice contains four 6-in-2-out lookup tables and 8 flipflops:
http://www.xilinx.com/support/documentation/user_guides/ug47... (<-- family user-guide) http://imgur.com/viEdQUv (<-- png of page 19 with "schematic" of one logic slice)
On page 19: The four boxes on the left are the lookup tables implementing combinational logic (A/W/O5/O6/...), and the eight squares in the middle and on the right (D/CE/CK/SR/...) are flip-flops, each storing one bit of data. There's a bunch of random multiplexers (the trapezoids, each selects one of its X inputs as its output) scattered around. This "schematic" is of course simplified ;-).
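If the building blocks sound abstract, here is a toy software model of the three of them. It is only a sketch: the names (`make_lut6`, `lut_read`, `FlipFlop`, `mux`) are made up for illustration, the second LUT output (O5) is not modelled, and the flip-flop is assumed to have a synchronous reset to 0, which is only one of the modes the real one supports.

```python
def make_lut6(func):
    """Precompute the 64-entry truth table of a 6-input boolean function as an int."""
    init = 0
    for addr in range(64):
        bits = [(addr >> i) & 1 for i in range(6)]
        if func(*bits):
            init |= 1 << addr
    return init


def lut_read(init, inputs):
    """Look up the output bit: the six inputs simply form the table address."""
    addr = sum(bit << i for i, bit in enumerate(inputs))
    return (init >> addr) & 1


class FlipFlop:
    """One bit of storage, updated once per clock edge."""

    def __init__(self):
        self.q = 0

    def clock(self, d, ce=1, sr=0):
        if sr:          # set/reset input wins (assumed: reset to 0)
            self.q = 0
        elif ce:        # clock enable: capture D only when enabled
            self.q = d
        return self.q


def mux(inputs, select):
    """Multiplexer: pass exactly one of the inputs through."""
    return inputs[select]


# Example: configure a LUT as a 6-input parity (XOR) function.
parity = make_lut6(lambda a, b, c, d, e, f: a ^ b ^ c ^ d ^ e ^ f)
ff = FlipFlop()
ff.clock(lut_read(parity, [1, 0, 0, 0, 0, 0]))
print(ff.q)
```

The point is that a LUT does not "compute" anything; any function of six inputs is just 64 stored bits, which is why the same silicon can be configured as whatever logic you want.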
So, 10250*6/308=199.7, i.e. the 8088 "IP" fits about 200 times. Of course this is a very naive calculation that ignores routing between cores and any peripherals needed to make them do anything useful, and one would use one such 8-bit CPU for easy housekeeping tasks, not 200 of them. But it shows nicely how incredibly dense current FPGAs are.
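For anyone who wants to redo the arithmetic, here is the same back-of-the-envelope calculation as a small sketch. The factor 6 and the 308 are taken as given from the formula above; the variable names are mine. The per-slice LUT and flip-flop counts are the ones quoted from the user guide.

```python
slices = 10_250          # slices in the XC7K70T
luts_per_slice = 4       # four lookup tables per slice
ffs_per_slice = 8        # eight flip-flops per slice

total_luts = slices * luts_per_slice
total_ffs = slices * ffs_per_slice

# Numbers from the formula above, taken as given.
factor = 6
core_size = 308

copies = slices * factor / core_size

print(f"LUTs: {total_luts:,}, flip-flops: {total_ffs:,}")
print(f"8088 cores that naively fit: {copies:.1f}")
```

As said, this ignores routing and peripherals entirely, so treat the result as an upper bound on paper rather than something you could actually place and route.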
Of course, just one bare chip will set you back around $120. https://octopart.com/search?q=XC7K70T (<-- part search)