FUN FACT (if my math was right):
Suppose one computer "BYTE" == 1 JOULE of energy. How does the theoretical address space of a 16-bit computer compare with that of a 64-bit one?
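16-bit: 2^16 = 65,536 J, about 15.7 kcal, or roughly 0.2% of what one sumo wrestler eats per day (assuming ~7,000 kcal/day)
64-bit: 2^64 ≈ 1.8 x 10^19 J, about what the sun puts out in ~48 nanoseconds, or what sunlight delivers to the whole Earth in ~1.8 minutes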
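For anyone who wants to redo the arithmetic, here is a rough sketch in Python. The constants are my assumptions, not part of the original comparison: 4,184 J per food calorie (kcal), a guessed ~7,000 kcal/day for a sumo wrestler, the nominal solar luminosity of 3.828e26 W, and a solar constant of 1,361 W/m^2 falling on the Earth's cross-section.

```python
import math

# Assumed constants (illustrative, not from the original post)
JOULES_PER_KCAL = 4184.0         # 1 food Calorie (kcal) in joules
SUMO_KCAL_PER_DAY = 7000.0       # rough guess at a sumo wrestler's daily intake
SOLAR_LUMINOSITY_W = 3.828e26    # total power radiated by the sun
SOLAR_CONSTANT_W_M2 = 1361.0     # sunlight intensity at Earth's distance
EARTH_RADIUS_M = 6.371e6


def address_space_joules(bits):
    """Energy of a full address space if every byte is worth 1 joule."""
    return float(2 ** bits)


e16 = address_space_joules(16)
e64 = address_space_joules(64)

# 16-bit: convert to food calories, then to "sumo wrestler days"
kcal16 = e16 / JOULES_PER_KCAL
sumo_days = kcal16 / SUMO_KCAL_PER_DAY

# 64-bit: seconds of total solar output, and seconds of sunlight hitting Earth
sun_seconds = e64 / SOLAR_LUMINOSITY_W
earth_intercept_w = SOLAR_CONSTANT_W_M2 * math.pi * EARTH_RADIUS_M ** 2
earth_seconds = e64 / earth_intercept_w

print(f"16-bit: {e16:.0f} J = {kcal16:.1f} kcal = {sumo_days:.4f} sumo-days")
print(f"64-bit: {e64:.3e} J = {sun_seconds:.2e} s of total solar output")
print(f"64-bit: {earth_seconds / 60:.2f} min of sunlight reaching Earth")
```

The result is very sensitive to units: reading "calories" as small calories instead of kcal shifts the 16-bit answer by a factor of 1,000, and "the sun" meaning total output versus what reaches Earth shifts the 64-bit answer by about nine orders of magnitude.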
In many instances a 16-bit computer would need more instructions to perform the same work.
To address anything beyond 64K, it has to issue extra commands that say where to find the various parts. I once worked in an environment where we were maintaining software that ran natively on a 16-bit machine. The CPU did have separate instruction and data spaces. The addressable memory was broken up into 8K chunks. To reference an 8K chunk that was not already mapped, you had to call a special routine to swap it in for one of your 8K chunks. So you put your main routines in one 8K chunk and your common subroutines in another, then you managed everything else in the 6 chunks in between. You would typically have several overlays, and they could get very messy when you tried to sort out what could call what.
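To make the remapping idea concrete, here is a toy model in Python. It is only a sketch of the general technique (8 windows of 8K covering a 64K address space), not the actual machine or its routine; `BankedMemory`, `map_window`, and the 256K physical size are names and numbers I made up for illustration.

```python
WINDOW_SIZE = 8 * 1024                    # 8K per chunk
NUM_WINDOWS = (64 * 1024) // WINDOW_SIZE  # 8 windows fill the 64K address space


class BankedMemory:
    """Toy model: a 16-bit address space built from remappable 8K windows."""

    def __init__(self, physical_size):
        self.physical = bytearray(physical_size)
        self.num_chunks = physical_size // WINDOW_SIZE
        # Initially window i shows physical chunk i.
        self.window_map = list(range(NUM_WINDOWS))

    def map_window(self, window, chunk):
        """Stand-in for the 'special routine': point a window at another chunk."""
        if not 0 <= window < NUM_WINDOWS:
            raise ValueError("no such window")
        if not 0 <= chunk < self.num_chunks:
            raise ValueError("no such physical chunk")
        self.window_map[window] = chunk

    def _translate(self, addr16):
        """Turn a 16-bit address into a physical address via the window map."""
        if not 0 <= addr16 <= 0xFFFF:
            raise ValueError("address does not fit in 16 bits")
        window, offset = divmod(addr16, WINDOW_SIZE)
        return self.window_map[window] * WINDOW_SIZE + offset

    def read(self, addr16):
        return self.physical[self._translate(addr16)]

    def write(self, addr16, value):
        self.physical[self._translate(addr16)] = value & 0xFF


mem = BankedMemory(256 * 1024)   # hypothetical 256K of physical memory
mem.write(0x2000, 0x11)          # window 1 currently shows chunk 1
mem.map_window(1, 20)            # swap chunk 20 into window 1
mem.write(0x2000, 0x22)          # same 16-bit address, different physical byte
mem.map_window(1, 1)             # swap the original chunk back in
first_value = mem.read(0x2000)
```

The same 16-bit address lands on different physical bytes depending on what is mapped, which is why code in one overlay cannot safely call into a chunk that might have been swapped out from under it.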