
The author claims that if you only write to the lower 32 bits of the v0 register, which could be 1024 bits wide, the hardware somehow has to allocate a 1024-bit-wide register to back it, and then makes some "locality" arguments.

The hardware can back the 1024-bit register with a pool of 32-bit registers. If you only wrote to the first 32 bits and all the others are zero, it can use a single 32-bit register to back it, making this "as good" as the single mask register solution, which the author thinks is good.
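To make the idea concrete, here is a minimal software sketch of that backing scheme. It is only a toy model, not a description of any real microarchitecture: the class name `ChunkedRegisterFile`, the pool size, and the rule "an all-zero chunk needs no physical storage" are all assumptions made for illustration.

```python
CHUNK_BITS = 32
VLEN_BITS = 1024                          # assumed architectural vector width
CHUNKS_PER_REG = VLEN_BITS // CHUNK_BITS  # 32 chunks per architectural register


class ChunkedRegisterFile:
    """Toy model: wide registers backed by a shared pool of 32-bit chunks."""

    def __init__(self, pool_size):
        self.pool = [0] * pool_size         # physical 32-bit chunks
        self.free = list(range(pool_size))  # indices of unallocated chunks
        self.mapping = {}                   # (reg, chunk index) -> physical index

    def write_chunk(self, reg, idx, value):
        if not 0 <= idx < CHUNKS_PER_REG:
            raise IndexError("chunk index out of range")
        value &= 0xFFFFFFFF
        key = (reg, idx)
        if value == 0:
            # Zero chunks need no storage: release the physical chunk, if any.
            phys = self.mapping.pop(key, None)
            if phys is not None:
                self.free.append(phys)
            return
        if key not in self.mapping:
            if not self.free:
                raise RuntimeError("physical chunk pool exhausted")
            self.mapping[key] = self.free.pop()
        self.pool[self.mapping[key]] = value

    def read_chunk(self, reg, idx):
        # Unmapped chunks read as zero.
        phys = self.mapping.get((reg, idx))
        return 0 if phys is None else self.pool[phys]

    def chunks_in_use(self, reg):
        return sum(1 for (r, _) in self.mapping if r == reg)


rf = ChunkedRegisterFile(pool_size=8)
rf.write_chunk("v0", 0, 0xDEADBEEF)  # only the low 32 bits of v0 are written
print(rf.chunks_in_use("v0"))        # physical chunks backing v0
```

In this model a 1024-bit v0 whose upper 31 chunks are zero occupies a single physical chunk, which is the cost the comment compares to a single mask register.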




Determining that you only wrote to the bottom 32 bits of the register being copied from is hard for the hardware to do; and even if the compiler can see it, it has no way to tell the hardware.



