To somewhat clarify, x87 is the instruction set of the Intel 8087 and its succes...

brudgers · on Aug 11, 2018

The integration of the x87 FPU (floating point unit) occurred with the 80486 series chips.

shawnz · on Aug 11, 2018

Why wasn't the x87 instruction updated to use the implementation of the SSE2 instruction?

DiabloD3 · on Aug 11, 2018

Because then it would no longer comply with the x87 specification.

acqq · on Aug 11, 2018

No. Neither SSE nor SSE2 has any instruction to calculate sine, not even one limited to a reduced range as x87's fsin is.

microcolonel · on Aug 11, 2018

They could microcode it; it would just be a fairly large sequence, and it might depend on a table.

acqq · on Aug 11, 2018

What I wanted to point out is that neither the SSE nor the SSE2 instruction set has a "sine" function at all, which, as I understood it, was implied by the question asked ("Why wasn't the x87 instruction updated to use the implementation of the SSE2 instruction?")

What exist are various libraries that calculate sine more accurately than fsin from the x87 instruction set, and for all of them it does not actually matter whether they use SSE or SSE2: the better sine algorithms can equally be implemented with the basic x87 instructions, or on anything that lacks SSE and SSE2. The effect of SSE and SSE2 having no sine instruction at all is that if you decide to use only those instructions in your x86_64 library, you have to build everything from the basic operations available, so you need library code even in the range in which fsin would suffice.

However, if the question was actually "why wasn't the implementation of the x87 fsin instruction updated to simply be better" (which could certainly be done in microcode), the answer is that AMD apparently tried exactly that with the K5 (1996) and then, for later processors, had to revert to the "worse" behavior to stay compatible with existing programs. This is described in the original article or in its comments.
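To make the accuracy point concrete, here is a rough sketch in Python of why argument reduction dominates the error near pi. It is not how fsin is implemented; it only models the effect of reducing with a pi constant of limited width. The 66-bit width is an assumption taken from Intel's description of fsin's range reduction, and `PI_REF`, `round_to_bits`, and `sine_near_pi` are hypothetical names made up for this illustration.

```python
from fractions import Fraction
import math

# First 50 decimal digits of pi, used here as a stand-in for the "exact" value.
PI_REF = Fraction("3.14159265358979323846264338327950288419716939937510")

def round_to_bits(value, bits):
    """Round a positive Fraction to `bits` significant binary digits."""
    _, exp = math.frexp(float(value))   # value is roughly m * 2**exp, 0.5 <= m < 1
    scale = Fraction(2) ** (bits - exp)
    return Fraction(round(value * scale)) / scale

def sine_near_pi(x, pi_value):
    """For x very close to pi, sin(x) = sin(pi - x), which is pi - x to first order."""
    return pi_value - x

x = Fraction(math.pi)                   # the double nearest to pi, taken exactly
pi66 = round_to_bits(PI_REF, 66)        # ASSUMPTION: 66-bit internal pi constant

reference = sine_near_pi(x, PI_REF)     # reduction with a (nearly) exact pi
reduced_66 = sine_near_pi(x, pi66)      # reduction with the truncated constant
rel_err = abs(reduced_66 - reference) / reference

print(f"reference sin(x)   ~ {float(reference):.17e}")
print(f"with 66-bit pi     ~ {float(reduced_66):.17e}")
print(f"relative error     ~ {float(rel_err):.3e}")
print(f"libm math.sin(x)   = {math.sin(math.pi):.17e}")
```

The subtraction pi - x cancels almost all of the leading bits, so whatever error the pi constant carries is magnified relative to the tiny result. A software library can avoid this by reducing with a much longer pi, and that works with plain x87 arithmetic just as well as with SSE2.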

microcolonel · on Aug 11, 2018

For that to work well, I have a hunch they would need to do some very dirty tricks with register renaming and add an 80-bit ("long double") mode to the SSE hardware. I suspect that's why the two are not as fast as each other in practice, whether as a result of a fully separate implementation or of very conservative microcode (possibly involving fencing and swapping registers, and emulating the arithmetic quirks of x87).

Though, disclaimer: I am not even really an amateur at this level of analysis, so take what I've said with its very own salt lick.

brudgers · on Aug 11, 2018

The x87 instructions are both single and double precision, 32 and 64 bits respectively. The design motivation was around engineering/scientific calculations. SSE instructions are mostly single precision floating point motivated by audio and graphics for multi-media.

acqq · on Aug 11, 2018

> The x87 instructions are both single and double precision, 32 and 64 bits respectively.

x87 supports 80-bit calculations; the shorter widths are the result of setting the precision-control field in its control word.

That's one of its advantages over both SSE and SSE2. There are still some use cases where it's reasonable to use x87.

SSE has single-precision (32-bit) floating-point instructions only.

SSE2 adds double-precision (64-bit) instructions.
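As a small illustration of what the three widths buy you, the sketch below rounds 1/3 to the significand size of each format and prints the relative error. It is a pure-Python model built on exact fractions, not a measurement of any FPU. The significand widths (24, 53, and 64 bits) are the standard ones for single, double, and x87 extended precision; `round_to_bits` and `FORMATS` are hypothetical names for this example.

```python
from fractions import Fraction
import math

def round_to_bits(value, bits):
    """Round a positive Fraction to `bits` significant binary digits."""
    _, exp = math.frexp(float(value))   # value is roughly m * 2**exp, 0.5 <= m < 1
    scale = Fraction(2) ** (bits - exp)
    return Fraction(round(value * scale)) / scale

# Significand widths in bits, counting the leading 1.
FORMATS = {
    "single (SSE)": 24,
    "double (SSE2)": 53,
    "x87 extended": 64,
}

third = Fraction(1, 3)
errors = {}
for name, bits in FORMATS.items():
    rounded = round_to_bits(third, bits)
    errors[name] = abs(rounded - third) / third
    print(f"{name:14s} {bits:2d} bits  relative error ~ {float(errors[name]):.3e}")
```

The extra 11 significand bits of the 80-bit format over double are what make x87 useful as a higher-precision intermediate in some numerical code.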

brudgers · on Aug 11, 2018

Sorry for not being clear. I was giving an historic rationale for why x87 didn't change to use SSE. The x87 instruction set has a much longer history than SSE and came out of Intel. SSE started at AMD and the multimedia focus was a competitive advantage at introduction while x87 was becoming IEEE 754.

acqq · on Aug 12, 2018

> I was giving an historic rationale for why x87 didn't change to use SSE.

Stated as "why x87 didn't change to use SSE", the question is like asking "why didn't a dog change to use a cat": x87 and SSE are both instruction sets, defined differently from the start, so one can't "use" the other.

The original question, however, referred to the fsin instruction of the x87 instruction set, and it also reflected some confusion, as neither SSE nor SSE2 ever had an instruction to calculate sine.

And the answer in which you apparently gave a "historic rationale" contained incorrect statements. The corrections are: 1) x87 didn't "have 32 and 64 bits"; it was ambitiously designed to do calculations with 80-bit precision, with shorter results available as additional modes. 2) SSE was 32-bit only, but SSE2 added 64-bit instructions too. Still, x87 could not "use the implementation" of fsin from SSE2, as SSE2 doesn't provide a sine function. Finally, if the question was why "fsin" was never improved, please see the other responses here, including mine.

brudgers · on Aug 12, 2018

By "SSE" I mean SSE. Sorry for the confusion.