It is not clear when a flat address space larger than 64 bits will be required. At the time of writing, the fastest supercomputer in the world as measured by the Top500 benchmark had over 1 PB of DRAM, and would require over 50 bits of address space if all the DRAM resided in a single address space. Some warehouse-scale computers already contain even larger quantities of DRAM, and new dense solid-state non-volatile memories and fast interconnect technologies might drive a demand for even larger memory spaces. Exascale systems research is targeting 100 PB memory systems, which occupy 57 bits of address space. At historic rates of growth, it is possible that greater than 64 bits of address space might be required before 2030.
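For anyone who wants to check the arithmetic in that quote, here is a minimal sketch. It assumes byte addressing and reads "PB" as 2^50 bytes; the helper name `address_bits` is made up for illustration and is not from the quoted text.

```python
def address_bits(num_bytes: int) -> int:
    """Smallest number of address bits that gives every byte a distinct address.

    Assumes a flat, byte-addressed space (an assumption for this sketch).
    """
    if num_bytes < 1:
        raise ValueError("num_bytes must be positive")
    # The highest address is num_bytes - 1; its bit length is the width needed.
    return max(1, (num_bytes - 1).bit_length())


PIB = 2**50  # one "PB", read here as a binary petabyte

print(address_bits(PIB))        # exactly 1 PB fits in 50 bits
print(address_bits(PIB + 1))    # anything over 1 PB needs 51
print(address_bits(100 * PIB))  # 100 PB needs 57
```

Reading "PB" as 10^15 bytes instead gives the same 50 and 57, so the quote's figures hold under either convention.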
Key point, and that's a huge "if". All large systems are NUMA, and trying to treat one like a uniform address space will be absolutely horrible because of the extreme latencies that arise.
You're correct, of course, but I wanted to point out that people shouldn't dismiss everything wider than 64 bits out of hand based on this reasoning. There are architectures that can make use of those extra bits, and not just for addressing a universe of RAM. The Transmeta processors weren't perfect or anything, but there's merit in the VLIW approach. The first-generation Crusoe chip was a 128-bit part, and the second generation was 256-bit; this was a decade ago, in chips designed for ultra-light consumer laptops. As I understand it, some key people from Transmeta ended up at P.A. Semi, which of course was acquired by Apple in 2008. I wouldn't be at all surprised if we're talking about VLIW architectures again by 2016, or at least by 2020.
2) Yes, Transmeta was VLIW internally, but I see that as an implementation detail compared with other forms of superscalar; either way you have a linear stream of instructions generated by the compiler, with hardware turning that into parallel execution at runtime. Calling that "VLIW" is about as interesting as calling a modern x86 "RISC."
Most NUMA machines have a single address space, and it's a very handy thing to have. You can run general-purpose code on the machine for things that are not performance-critical but still need to get done.