
Nice piece of history.

From what I remember reading, JVM is a stack machine, while Dis is a register machine, right? Is there any other significant difference that would make Dis less "future proof"?




Why is a stack more future proof?


It’s less work to port it to a future machine whose number and types of registers you don’t know yet.

If a future CPU has more registers than your VM, you have to either cripple your VM by not using all registers of the new hardware or write code to detect and remove register spilling that was necessary on the smaller architecture.

If, on the other hand, it has fewer or a different mix, you have to change your VM to add register spilling.

Either way, your byte code compiler has done register assignment work that it almost certainly has to discard once it starts running on the target architecture.

If you start at the extreme end of “no registers”, you only ever have to handle the first case, and can do that once and for all (sort of: things will be ‘a bit’ more complex in reality, certainly now that CPUs have vector registers, may have float8 or float16 hardware, etc.)

You can also start at the extreme end of an infinite number of registers. That’s what LLVM does. I think one reason that’s less popular with VMs is that it makes it harder to get a proof-of-concept VM running on a system.
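
To make that concrete, here's a rough sketch of the same expression in both styles, with made-up opcodes in C (an illustration only, nothing to do with the real JVM or Dis encodings):

    #include <stdio.h>

    /* The expression  x = a*b + c  in two made-up bytecode styles. */

    /* Stack style: no register names anywhere, so the encoding carries no
     * assumption about how many registers the eventual target machine has. */
    enum { PUSH_A, PUSH_B, PUSH_C, MUL, ADD, STORE_X };
    static const unsigned char stack_code[] =
        { PUSH_A, PUSH_B, MUL, PUSH_C, ADD, STORE_X };

    /* Register style: the bytecode compiler has already committed to r0..r3.
     * A target with fewer registers has to add spills back in; a target with
     * more (or LLVM-style infinite virtual registers) redoes the assignment
     * anyway. */
    struct reg_insn { unsigned char op, dst, src1, src2; };
    static const struct reg_insn reg_code[] = {
        { 0 /* MUL */, 2 /* r2 */, 0 /* r0 = a */, 1 /* r1 = b */ },
        { 1 /* ADD */, 2 /* r2 */, 2 /* r2     */, 3 /* r3 = c */ },
    };

    int main(void) {
        printf("stack form: %zu bytes, register form: %zu bytes\n",
               sizeof stack_code, sizeof reg_code);
        return 0;
    }

The register form usually needs fewer instructions, but each one is wider, and it bakes in an allocation decision the eventual target may not share.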


It isn't, unless you consider SPARC the future. SPARC had an interesting register architecture, where making a function call would renumber registers 9-16 to 1-8, and give the function a fresh set of 9-16; IIRC most SPARC CPUs had 128 or so registers total, with this "sliding window" that went up and down as you called and returned, which essentially gave you something like a stack.

The rest of the world's CPUs have normal registers, which is one aspect of what makes register-based VM bytecode easier to JIT to the target architecture, which was one of Dis' original design goals (with an interpreter being a fallback). It also happens that we know a lot more about actually optimising register-based instructions (rather than stack-based), so even if you had to fall back on an interpreter, the bytecode could've gone thru an actual proper optimisation pass.
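
As a toy illustration of the JIT point (made-up opcodes, not actual Dis): once operands carry names, even a naive translator can emit roughly one target instruction per VM instruction.

    #include <stdio.h>

    /* Toy register-VM instruction: operands already have names, so a naive
     * JIT can emit about one target instruction per VM instruction just by
     * renaming r<N> to a hardware register. */
    struct vm_insn { const char *op; int dst, a, b; };

    static void naive_jit(const struct vm_insn *code, int n) {
        for (int i = 0; i < n; i++)   /* near 1:1 mapping, no stack to model */
            printf("  %s x%d, x%d, x%d\n",
                   code[i].op, code[i].dst, code[i].a, code[i].b);
    }

    int main(void) {
        /* x = a*b + c, with a = r0, b = r1, c = r3, result in r2 */
        const struct vm_insn prog[] = { { "mul", 2, 0, 1 }, { "add", 2, 2, 3 } };
        naive_jit(prog, 2);           /* prints AArch64-flavoured pseudo-assembly */
        return 0;
    }

The equivalent stack bytecode would need the JIT to model the operand stack first, just to work out which values those xN names should hold.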


Not an expert but I'd argue a stack is somewhat more future proof in the sense that it's less tied to a particular number of physical registers, whereas a register machine must be. Which is exactly what makes it a bit harder to optimise for, but that's what abstractions tend to do :-)

The sparc sliding register window turned out to be a very bad idea, but I guess you already know that.


I don't think the concept of register windows is necessarily a bad idea. IMHO, SPARC was flawed in that every activation frame needed to also have a save area for a register window just in case the processor ran out of internal registers.

I think the Itanium did register windows right: allocate only as many registers as the function needs, and overflow into a separate "safe stack". Also, the return address register was among them, never on the regular stack, so a buffer overrun couldn't overwrite it.

There is a third option besides stack and registers: the upcoming Mill CPU has a "Belt": like a stack, but one you only push onto. An instruction or function call takes belt indices as parameters. Except for the result pushed onto the end, the belt is restored after a function call – like a register window sliding back. It also uses a separate safe stack for storing overflows and return addresses. Long ago, I invented a very similar scheme for a virtual machine ... except for the important detail of the separate stack, so it got too complicated for me and I abandoned it.
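
For anyone curious, the belt idea fits in a few lines of C -- this is only a toy ring-buffer model of the concept, not how the real Mill works internally:

    #include <stdint.h>
    #include <stdio.h>

    /* Toy "belt": results are only ever pushed; operands are named by how
     * far back on the belt they sit (position 0 = newest). */
    #define BELT_LEN 8

    struct belt { int64_t slot[BELT_LEN]; unsigned head; };

    static void belt_push(struct belt *b, int64_t v) {
        b->head = (b->head + 1) % BELT_LEN;   /* oldest value silently falls off */
        b->slot[b->head] = v;
    }

    static int64_t belt_get(const struct belt *b, unsigned pos) {
        return b->slot[(b->head + BELT_LEN - pos) % BELT_LEN];
    }

    int main(void) {
        struct belt b = {0};
        belt_push(&b, 6);                                   /* b0 = 6 */
        belt_push(&b, 7);                                   /* b0 = 7, b1 = 6 */
        belt_push(&b, belt_get(&b, 0) * belt_get(&b, 1));   /* "mul b0, b1" pushes 42 */
        printf("%lld\n", (long long)belt_get(&b, 0));
        return 0;
    }

Operands are addressed by age rather than by name, so there are no register names to allocate or spill in the instruction encoding itself.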


> The sparc sliding register window turned out to be a very bad idea, but I guess you already know that.

Yikes, that triggered flashbacks of 8086 segmented memory.


It wasn't really like that AIUI. Using a stack introduced hardware complexity while also serialising instruction processing (because you only work from the top of the stack, unlike a register set, where you can access any part of it at any time), which caused the chip not to be the raging speed demon the designers thought it was going to be.

I'd very much like to understand what was going through the SPARC designers' minds when they did that. Looking back on it with my own current understanding of CPU designs and all that, they seem to have made some incredibly basic mistakes, including designing the hardware without talking to the compiler writers (a cockup the Alpha designers very definitely didn't make). It's all very odd.

Another mistake they made was apparently deciding which instructions to leave out by counting how often they appeared in code – if an instruction didn't appear very often, they omitted it. Sounds reasonable, but that meant they initially left out the multiply instruction, which might not have appeared often in the code but was actually executed quite often (e.g. in array lookups), and there were complaints that the new SPARCstations, with their new superior chip, were slower than the 68000-based machines that preceded them. Hardware multiply was added later.


Among all the other reasons stated, like independence from platform registers, stack-based VMs are really easy to implement -- you don't need to worry about register allocation in your VM code generator; you can leave that to the stage of the VM that generates native code, which would need to do register allocation anyway, even for a register-based VM.
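
For example, here's a minimal sketch of such an interpreter core (made-up opcodes, no bounds or error checking) -- note that there's no register allocation anywhere in it:

    #include <stdint.h>
    #include <stdio.h>

    /* Minimal stack-VM core: the code generator just emits pushes and ops
     * in evaluation order; registers never appear in the bytecode. */
    enum { OP_PUSH, OP_ADD, OP_MUL, OP_PRINT, OP_HALT };

    static void run(const int64_t *code) {
        int64_t stack[256];
        int sp = 0;                              /* next free slot */
        for (size_t pc = 0;; ) {
            switch (code[pc++]) {
            case OP_PUSH:  stack[sp++] = code[pc++]; break;
            case OP_ADD:   sp--; stack[sp-1] += stack[sp]; break;
            case OP_MUL:   sp--; stack[sp-1] *= stack[sp]; break;
            case OP_PRINT: printf("%lld\n", (long long)stack[--sp]); break;
            case OP_HALT:  return;
            }
        }
    }

    int main(void) {
        /* 6*7 + 2 */
        const int64_t prog[] = { OP_PUSH, 6, OP_PUSH, 7, OP_MUL,
                                 OP_PUSH, 2, OP_ADD, OP_PRINT, OP_HALT };
        run(prog);                               /* prints 44 */
        return 0;
    }

The front end only has to walk the expression tree and emit operations in order; register assignment can be deferred entirely to an optional native-code backend.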



