Hacker News

Really? I don't know about GPUs? That's news to me! Did you know that the precision of the fast reciprocal square root on NVIDIA GPUs is 1 ulp out of 23 bits? That's a world of difference away from 1 ulp out of 7 bits. I wouldn't touch a 7-bit floating point processor. Life is too damned short for that.
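To put a rough number on that gap, here is a small Python sketch. The function name `one_ulp_relative_error` is made up for illustration, and it assumes "N bits" means N fraction bits, so that 1 ulp is at most about 2^-N relative to the value.

```python
def one_ulp_relative_error(fraction_bits: int) -> float:
    """Upper bound on the relative size of 1 ulp for a given fraction width.

    Assumption: the value is normalized, so 1 ulp is at most 2**-fraction_bits
    of the value itself.
    """
    return 2.0 ** -fraction_bits


err_23 = one_ulp_relative_error(23)  # roughly 1.2e-7
err_7 = one_ulp_relative_error(7)    # roughly 7.8e-3

# How many times coarser the 7-bit result is than the 23-bit one.
ratio = err_7 / err_23

if __name__ == "__main__":
    print(f"23 bits: {err_23:.3e}, 7 bits: {err_7:.3e}, ratio: {ratio:.0f}x")
```

Under that assumption the 7-bit result is 2^16 = 65536 times coarser.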

And that's because I have spent days chasing and correcting dynamic range errors that doomed HPC applications that tried to dump 64-bit double-precision for 32-bit floating point. It turns out in the end that while you can do this, you often need to accumulate 32-bit quantities into a 64-bit accumulator. Technically, D.E. Shaw demonstrated you can do it with 48 bits, but who makes 48-bit double precision units?
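For what it's worth, here is a minimal Python sketch of that accumulator point. It is not taken from any of the applications mentioned: the helper names (`to_f32`, `sum_with_f32_accumulator`, `sum_with_f64_accumulator`) and the test data are made up for illustration, and 32-bit floats are emulated by round-tripping through `struct`.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_with_f32_accumulator(values):
    """Sum 32-bit inputs, rounding the running total to 32 bits every step."""
    acc = to_f32(0.0)
    for v in values:
        acc = to_f32(acc + to_f32(v))
    return acc


def sum_with_f64_accumulator(values):
    """Sum the same 32-bit inputs, but keep the running total in 64 bits."""
    acc = 0.0
    for v in values:
        acc += to_f32(v)
    return acc


# Hypothetical data: one large term followed by many tiny ones.
# Each tiny term is below half an ulp of 1.0 in 32-bit floating point.
values = [1.0] + [1e-8] * 100_000

narrow = sum_with_f32_accumulator(values)
wide = sum_with_f64_accumulator(values)

if __name__ == "__main__":
    print("32-bit accumulator:", narrow)
    print("64-bit accumulator:", wide)
```

With the 32-bit accumulator every small term is rounded away and the total never moves off 1.0. With the 64-bit accumulator the same 32-bit inputs add up to roughly 1.001.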

I stand by the computational tripe definition (with the caveat that Hershel has now posted an app where this architecture is possibly optimal). My objections are to the broad, extraordinary claims made in the presentation above.

And hey, you're a game developer, so let me give you an analogy: would you develop a software renderer these days if you were 100% constrained to relying on mathematical operations on signed chars? It's doable, but would you bother? Start with Chris Hecker's texture mapper from Game Developer back in the 1990s; I'm guessing madness would ensue shortly thereafter. Evidence: HPC apps on GPUs that rely entirely on 9-bit subtexel precision to get crazy 1000x speedups over traditional CPU interpolation do not generally produce the same results as the CPU. If the result is visual, it's usually OK. If it's quantitative, no way.




> who makes 48-bit double precision units?

IIRC Cray did (but they called it “single”). =)

Snark aside, I agree broadly with the points you're making here. This isn’t especially groundbreaking; this is using the fact that logarithmic number representations don’t require much area to implement if you don’t need high accuracy and are willing to trade latency for throughput (something that FPGA programmers have been taking advantage of since forever), and then going shopping for algorithms that can still run correctly in such an environment.
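As a rough illustration of why a low-accuracy logarithmic representation is cheap, here is a toy Python sketch. It is not the design from the presentation: `FRAC_BITS`, `encode`, `decode`, and `lns_mul` are hypothetical names, and it only handles positive values. The point is that a multiply turns into a small integer add.

```python
import math

# Assumption for illustration: 4 fractional bits in the stored log2 value.
FRAC_BITS = 4
SCALE = 1 << FRAC_BITS


def encode(x: float) -> int:
    """Store a positive value as a fixed-point log2, rounded to FRAC_BITS."""
    return round(math.log2(x) * SCALE)


def decode(e: int) -> float:
    """Recover an approximate value from its fixed-point log2."""
    return 2.0 ** (e / SCALE)


def lns_mul(a: int, b: int) -> int:
    """Multiply two encoded values: adding logs multiplies the originals."""
    return a + b


exact_case = decode(lns_mul(encode(2.0), encode(8.0)))   # powers of two encode exactly
approx_case = decode(lns_mul(encode(3.0), encode(5.0)))  # close to 15, but not exact

if __name__ == "__main__":
    print(exact_case, approx_case)
```

Powers of two round-trip exactly, while 3 * 5 comes back a couple of percent off, which is the accuracy being traded away for the cheap multiplier.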



