Copy paste: Fun fact, the Mill's proposed method for hiding that latency, "defer...

jimrandomh · on July 19, 2017

IIRC the Mill's presentation about deferred loads predates this paper, though the paper is a lot more detailed and has simulations. It's not clear how the Mill's gain from deferred loads would compare (it differs in a lot of other ways that would interact).

legulere · on July 19, 2017

8% seems huge for an architectural change to me.

hvidgaard · on July 19, 2017

8% for an unoptimized PoC is pretty substantial, or it could be nothing once all the details are accounted for. If it comes with simpler silicon as well, that's even better, and it's surely worth evaluating for GP processors.

mistercow · on July 19, 2017

I'm pretty out of my depth here, but I thought the deferred-load latency was part of the 12% gained from dynamic scheduling, not the 88% that the Mill claims to tackle with its phasing thing. In that case, 8% doesn't sound measly at all.

But like I said, I wouldn't be surprised if I'm just not understanding what I'm reading/watching.