The entire address space was available at all times: 64 address bits, addressing individual bits, so "only" 61-bit addresses for bytes, plus 3 "kind" bits separating code, data, control, etc.
The crucial point is that the RAM wasn't laid out as a linear array, but as a page-cache.
In a flat memory space, allocating a single page far away from all others requires four additional pages for the page tables (the fifth, top level is always present), and four memory accesses to look them up before you get to the data.
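To make the cost concrete, here is a minimal sketch of a five-level radix page-table walk in the x86-64 style (4 KiB pages, 9 index bits per level). The names and the dict-based table representation are purely illustrative, not any real OS's or CPU's data structures:

```python
PAGE_SHIFT = 12      # 4 KiB pages
BITS_PER_LEVEL = 9   # 512 entries per table
LEVELS = 5           # five translation levels

def walk(tables, vaddr):
    """Translate vaddr by indexing one table per level.

    `tables` maps a table id to {index: next_table_id_or_frame}.
    Returns the physical frame, or None on a page fault.
    Each loop iteration models one memory access to a table page.
    """
    node = "root"  # the top-level table is always present
    for level in reversed(range(LEVELS)):
        index = (vaddr >> (PAGE_SHIFT + level * BITS_PER_LEVEL)) & 0x1FF
        node = tables.get(node, {}).get(index)
        if node is None:
            return None  # fault: a table page or the data page is missing
    return node

# Mapping one page far from all others forces the allocation of one new
# table page at each of the four lower levels (the root already exists):
tables = {
    "root": {511: "L4"},
    "L4":   {0: "L3"},
    "L3":   {0: "L2"},
    "L2":   {0: "L1"},
    "L1":   {0: 0xABC},   # leaf entry: physical frame number
}
vaddr = 511 << (PAGE_SHIFT + 4 * BITS_PER_LEVEL)
assert walk(tables, vaddr) == 0xABC
```

Each level of the walk is a dependent memory access, which is exactly why the flat layout pays a per-translation cost that grows with the depth of the tree.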
In the R1000, you present the address you want on the bus, the memory board (think: DIMM) looks that address up in its tag-ram to see if it is present, completes the transaction or generates a memory fault, all in one single memory cycle, always taking the same time.
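The tag-RAM idea can be sketched as physical RAM acting as a big direct-mapped page cache: hash the page number to a frame, compare the stored tag, and either hit or fault, all in one step. The sizes and the hash below are illustrative guesses, not the R1000's actual scheme:

```python
NUM_FRAMES = 8  # tiny "memory board" for the example

class TagRam:
    """Physical RAM as a page cache keyed by a tag RAM (direct-mapped)."""

    def __init__(self):
        self.tags = [None] * NUM_FRAMES    # which page occupies each frame
        self.frames = [None] * NUM_FRAMES  # the page contents

    def install(self, page, data):
        self.tags[page % NUM_FRAMES] = page
        self.frames[page % NUM_FRAMES] = data

    def lookup(self, page):
        """One constant-time lookup: hash to a slot, compare the tag,
        hit or fault. No multi-level walk, no variable latency."""
        slot = page % NUM_FRAMES
        if self.tags[slot] == page:
            return self.frames[slot]       # hit: complete the transaction
        raise MemoryError("page fault")    # miss: raise a memory fault

ram = TagRam()
ram.install(0x1234, b"hello")
assert ram.lookup(0x1234) == b"hello"
```

The point of the design is that the lookup cost is flat regardless of where in the huge address space the page lives, which is what the multi-level walk above cannot offer.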
The problem is that there is a limit on how fast you can make the lookup in hardware. Today, with large physical memories and high CPU clock frequencies, a single-cycle lookup would be impossible. In fact, today's CPUs already have such lookup tables in the form of TLBs, because hitting the page table on every access would add far too much latency; even so, TLBs cannot cover the whole address space and still need multi-level structures for the subset they do cover.
Single-address-space OSes are an option, but they restrict you to memory-safe languages, they are very vulnerable to Spectre-like attacks, and any bug in the runtime means game over.