
That strictness of x86 execution order has been substantially relaxed over the last two decades, and a novice writing multithreaded code can find it a bit of a pain to deal with. The Pentium III added SFENCE (with SSE) and the Pentium 4 added LFENCE and MFENCE (with SSE2). I believe that prior to that, only the LOCK prefix was available.



From a memory-model point of view, LFENCE and SFENCE are only relevant for SSE non-temporal loads and stores. A novice is never going to stumble on them by mistake.

MFENCE was added for convenience, but the same effect can be had with any locked instruction on a dummy memory location. XCHG with a memory operand (which is implicitly locked) is often still faster than MFENCE.

If anything, the x86 memory model has been strengthened over the last couple of decades: some reorderings that were theoretically permitted (but never implemented in any actual hardware) have finally been documented as impossible now that TSO has been embraced.
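
To make the point concrete: under TSO the one reordering a program can still observe is a store followed by a load from a different address, and that is the case a full barrier (MFENCE or a locked instruction) exists for. Below is a minimal sketch of the classic store-buffering litmus test in portable C++. The names `count_both_zero` and `use_fence` are made up for illustration, and how `std::atomic_thread_fence` is lowered (MFENCE, a locked read-modify-write on a dummy location, or something else) is up to the compiler and target, so treat that part as an assumption rather than a guarantee.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Store-buffering litmus test (hypothetical helper, for illustration only).
// Each thread stores 1 to its own flag, then loads the other thread's flag.
// Returns the number of runs in which BOTH loads observed 0.
// With use_fence == true, a sequentially consistent fence sits between the
// store and the load, which forbids the "both zero" outcome.
int count_both_zero(int iterations, bool use_fence) {
    assert(iterations > 0);
    int both_zero = 0;

    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0}, go{0};
        int r1 = -1, r2 = -1;

        auto run = [&](std::atomic<int>& mine, std::atomic<int>& other, int& out) {
            // Rendezvous so both threads start the racy part close together.
            go.fetch_add(1);
            while (go.load() < 2) {
                std::this_thread::yield();
            }

            mine.store(1, std::memory_order_relaxed);
            if (use_fence) {
                // Full barrier: the store above may not be reordered
                // after the load below.
                std::atomic_thread_fence(std::memory_order_seq_cst);
            }
            out = other.load(std::memory_order_relaxed);
        };

        std::thread t1([&] { run(x, y, r1); });
        std::thread t2([&] { run(y, x, r2); });
        t1.join();
        t2.join();

        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

Calling `count_both_zero(n, true)` should always return 0, because the C++ memory model forbids the both-zero outcome when a seq_cst fence separates the store from the load on both threads. With `use_fence` set to `false` the result may or may not be nonzero on any given run, since it depends on timing, the compiler, and the hardware.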



