Hacker News
Exploring the native use of 64-bit posit arithmetic in scientific computing (arxiv.org)
79 points by PaulHoule on May 12, 2023 | 24 comments



The real hope is that 32-bit posits (with 512-bit quire for exact dot products and exact sums) can replace 64-bit floats where users hope 15-decimal accuracy in every variable means they don't have to learn numerical analysis. When you can do all your linear algebra to 8-decimal accuracy with 32-bit posits, the need for 64-bit representation starts to look expensive and unnecessary.

Also, please note that all traditional algorithms are wary of the disasters of overflow to infinity and underflow to zero, so they tend to manage the magnitudes of numbers to prevent that. Posits take advantage of that by decreasing relative error when the exponent scaling is not extreme. Standard 64-bit posits (2 exponent bits) have 60-bit significands, versus 53-bit significands for IEEE floats, for values between 1/16 and 16 in magnitude. And floats do not have anything like the quire, since an exact dot product accumulator for 64-bit IEEE floats has to be something like 4,664 bits wide (an ugly number) and has no provisions for infinities and NaN values.



That interactive diagram is nice but I don't think it should say floats only have a length. Floats as a general data format let you customize the size of the exponent field too.

As far as explaining posits, the key point I'd say to people is that a posit is almost the same as a float, but it stores the exponent in a different (variable-length) way.
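To make the variable-length exponent concrete, here is a rough decoder sketch in Python (my own toy code, following the 2022 Posit Standard layout with es = 2 exponent bits; rounding and encoding are omitted):

```python
def decode_posit(bits: int, n: int = 8, es: int = 2) -> float:
    """Decode the raw n-bit pattern of a posit into a float.
    Sketch only: assumes n is small enough for a float to hold the result."""
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):                 # NaR ("Not a Real")
        return float("nan")
    sign = -1.0 if bits >> (n - 1) else 1.0
    if sign < 0:                             # negatives are two's complement
        bits = -bits & ((1 << n) - 1)
    body = bits & ((1 << (n - 1)) - 1)       # everything after the sign bit
    # Regime: a run of identical bits, terminated by the opposite bit.
    first = (body >> (n - 2)) & 1
    run = 0
    for i in range(n - 2, -1, -1):
        if (body >> i) & 1 == first:
            run += 1
        else:
            break
    k = run - 1 if first else -run           # regime value
    rest_len = max(n - 2 - run, 0)           # bits left after the terminator
    rest = body & ((1 << rest_len) - 1)
    # Exponent: up to es fixed bits; missing low bits are implicitly zero.
    e_bits = min(es, rest_len)
    e = (rest >> (rest_len - e_bits)) << (es - e_bits)
    # Fraction: whatever is left, with a hidden leading 1, as in a float.
    f_len = rest_len - e_bits
    f = rest & ((1 << f_len) - 1)
    frac = 1.0 + f / (1 << f_len)
    return sign * 2.0 ** (k * (1 << es) + e) * frac
```

For example, 0b01000000 decodes to 1.0 and 0b01100000 (regime run of two) to 16.0: each extra regime bit multiplies the scale by useed = 2^(2^es) = 16, at the cost of one fraction bit.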


Does "common" hardware support nonstandard exponent lengths for floats?


Only a few, like bfloat16, but common hardware doesn't support nonstandard total lengths either.



Here's that paper's pdf[1]. ACM... sigh.

[1] http://www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf


That paywall is symptomatic of the ACM being a professional society for computer science professors rather than software practitioners. If you go to the editorial section of CACM, it is frequently stuck in a loop: every decade or so there are violent ups and downs in CS program enrollment, driven by the alternating realizations that "CS students are in demand" and then "IT jobs suck", the latter of which could be addressed by having a professional society for software practitioners…


Lots of free information on posits at www.posithub.org.


Slides from this year's CoNGA (Conference on Next Generation Arithmetic) are available here [1]. Not sure if there are video recordings to come.

[1] https://posithub.org/conga/2023/#schedules


I can't tell from the paper, but are they using maximally precise double-precision libraries? Implementations may have errors much larger than 1ULP for transcendental functions and roots.


Unum (number format) > Posit (Type III Unum) https://en.wikipedia.org/wiki/Unum_(number_format) :

> Posits have superior accuracy in the range near one, where most computations occur. This makes it very attractive to the current trend in deep learning to minimise the number of bits used. It potentially helps any application to accelerate by enabling the use of fewer bits (since it has more fraction bits for accuracy) reducing network and memory bandwidth and power requirements.

> [...] Note: 32-bit posit is expected to be sufficient to solve almost all classes of applications [citation needed]


All the claims made in this paper about the precision of the results obtained when computing with posits are completely meaningless, because there is not a single word about the most important factor that determines the accuracy of the operations with posits: the ranges of the values chosen for the input data used in their benchmarks.

Floating-point numbers are designed to have an almost constant relative error over their entire range of representable numbers, and no other numeric format can improve on that.

Posits are the result of a different trade-off: improved precision for numbers close to 1 in magnitude is obtained by reducing the precision of very small and very large numbers.

When evaluating the accuracy of benchmark results, with floating-point numbers the input values do not matter, unless they have values that cause underflows or overflows. On the other hand, with posits the input values matter a lot, because with some values the accuracy will be excellent, much better than with floating-point, while with other values the accuracy will be much worse than with floating-point.
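To put a number on that taper (a back-of-envelope Python sketch of my own, assuming the standard 64-bit posit with es = 2, not anything from the paper): counting the significand bits left over after the regime field takes its share shows where posits beat the constant 53 bits of IEEE binary64 and where they fall behind.

```python
import math

def posit_significand_bits(x: float, n: int = 64, es: int = 2) -> int:
    """Significand bits (hidden bit included) that a posit<n,es> has
    available at magnitude x. Sketch: positive, non-saturated x only."""
    scale = math.floor(math.log2(x))           # binary exponent of x
    k = scale >> es                            # regime value (floor div by 2^es)
    regime_len = k + 2 if k >= 0 else -k + 1   # run length + terminator bit
    return max(n - 1 - regime_len - es, 0) + 1
```

For magnitudes in [1/16, 16) this gives 60 bits (versus 53 for binary64), but at a magnitude like 2^100 it drops to 35, which is exactly the trade-off described above.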

There are many problems for which low precision, i.e. up to 32 bits, is adequate and where posits can be better than floating-point numbers. Nevertheless, for each such problem a careful numeric error analysis must be made, because using them blindly can produce unexpected results. Floating-point numbers are more foolproof.

However, for scientific computing problems that require precision of at least 64 bits, I have never seen one that could benefit from using posits. All the problems that arise from simulating the physics of sufficiently complex systems, e.g. the simulation of electronic circuits, require computations with both small numbers and big numbers, where the precision of posits drops dramatically.

In theory, it would be possible to use posits in many 64-bit applications as well, if an analysis of the problem were done to determine a large number of constant scaling factors, which would be inserted into various formulae to bring the operands into the range where posits are more accurate than floating-point numbers.

Nevertheless, such an analysis requires a huge amount of work, which is never worthwhile just to replace FP numbers with posits. If such a search for optimal scaling factors were done, it would be better to implement the computations with fixed-point numbers, obtaining even more accurate results than with posits.


Posits are pretty cool. I hacked a relatively high-performance implementation in ocaml.

They're really elegant.


It's not just that the representation is much better than IEEE 754, which is awful (e.g. having negative zero and wasting lots of bit combinations on encoding NaN when a single one would do).

And not just that: it seems they also standardized the arithmetic itself? Which is a big deal, because IEEE 754 is unusable in heterogeneous distributed systems, as every hardware implementation does something different.


Yes, that's also an important selling point: sane and *consistent* rounding.


What is quire? I couldn't find anything that made sense in the way it is referred to in the paper.


Simply put: an accumulator (in fixed-point format) with enough bits to calculate dot products with up to 2^31 terms with NO loss of precision.
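A toy emulation in Python of what that buys you (using Fraction as a stand-in for the wide fixed-point register, which works because every float is a dyadic rational; real quire hardware uses a plain integer accumulator instead):

```python
from fractions import Fraction

def quire_dot(xs, ys):
    """Dot product with a quire-style accumulator: every product is
    added exactly, and rounding happens once, at the very end."""
    acc = Fraction(0)
    for x, y in zip(xs, ys):
        acc += Fraction(x) * Fraction(y)   # exact, no intermediate rounding
    return float(acc)                      # the single final rounding

xs   = [1e16, 1.0, -1e16]
ones = [1.0, 1.0, 1.0]
naive = sum(x * y for x, y in zip(xs, ones))  # 0.0: the 1.0 is absorbed
exact = quire_dot(xs, ones)                   # 1.0
```

The naive float accumulation returns 0.0 because adding 1.0 to 1e16 rounds back to 1e16; the quire-style accumulation keeps it and returns 1.0.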


I believe it's something like an accumulator that has higher precision than the data types being used. It helps reduce rounding errors by maintaining extra bits until you store a final result.


Specifically they are using a 1024 bit accumulator register to hold intermediate results for 64-bit posit operations. You can get more precision for IEEE float operations as well if you're willing to just add loads of extra bits!

It's expensive though:

  Synthesis results of the 64-bit PAU in Big-PERCIVAL
  have shown that it requires 2.5× as many resources as
  the double-precision FPNew FPU. Moreover, we studied
  the impact of the corresponding 1024-bit quire accumulator
  register, which increased the total hardware cost to a third
  of the area of the core. Detailed area results illustrated how
  the hardware resources are distributed among the different
  operations. In particular, the most resource-hungry elements
  are the quire-related units and the posit division and square
  root units.
I don't think this is a particularly positive result for posits.


Yeah, I have had a tough time thinking posits are worth it in hardware - posit ops of length n seem to take almost as much hardware as floating point ops of length 2n.

Quad-precision float seems more general-purpose and honestly more promising for scientific computing, since the error analysis is easier.


I think Gustafson would argue that it doesn't matter, since the storage cost impacts power more than the FPU computation cost. (Not that I would agree with him).

But in general, it seems that the strongest features of posits are basically recognizing that being strategic with where you need extra precision is advantageous, and if you apply the same techniques to IEEE 754 floats, you lose most of the seeming advantage of posits.


IEEE 754 is just the codification of the Intel 8087 coprocessor design that John Palmer and Bruce Ravenel came up with. They brought in William Kahan as a consultant, and Kahan disagreed with almost every aspect of their design (he wanted decimal representation, not binary; 128-bit extended precision instead of 80-bit; and bitwise reproducibility, not 'better answers on Intel') but he lost every argument. Kahan's clout helped Intel's design become the IEEE Std 754, and John Palmer chortled over the fact that they'd foisted that on the world. I used to work for him, so this is first-hand info.

IEEE 754 is not a mathematical design, and the exception conditions are a complete mess, which is why it takes almost 90 pages to describe the Standard. The Posit Standard (2022) is only 12 pages long.

The #1 issue in computer performance is The Memory Wall... it is orders of magnitude more expensive to move data between external DRAM and the processor than it is to do operations within the processor. The solution is to increase information-per-bit so that real numbers can be represented in 32-bit precision with sufficient accuracy. That more than doubles the performance over 64-bit floats since it allows more data to fit in cache at every level of the memory hierarchy.


The 754 spec has somewhat moved past the 8087 at this point (3 revisions later). A lot of things have been fixed, including the whole language around exceptions (which used to define "traps" - a very processor-specific idea rather than an arithmetic-centered one). I am hoping we can be free of (required) exceptions in 2028.

As I understand it, your other complaints tend to center around overflow to infinity and precise summation of vectors. For applications that really care about that precision, there are ways to do it in floating point without a quire register - sorting before summing is the naive approach, but look into ReproBLAS for some better algorithms.
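For instance, Neumaier's variant of Kahan summation (one of the well-known float-only compensated schemes; ReproBLAS goes further and also makes the result reproducible across machines) recovers the low-order bits a naive sum throws away:

```python
def neumaier_sum(xs):
    """Compensated summation (Neumaier's improved Kahan): carry the
    rounding error of each addition in a separate correction term."""
    s = 0.0
    c = 0.0                           # accumulated rounding errors
    for x in xs:
        t = s + x
        if abs(s) >= abs(x):
            c += (s - t) + x          # low bits of x were lost in s + x
        else:
            c += (x - t) + s          # low bits of s were lost in s + x
        s = t
    return s + c

vals = [1.0, 1e100, 1.0, -1e100]
print(sum(vals), neumaier_sum(vals))  # naive sum loses both 1.0 terms
```

Here the naive sum returns 0.0 while the compensated sum returns the exact 2.0, using nothing but ordinary doubles.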

Also, I can't help but wonder if the memory wall idea here is centered only around synthetic benchmarks like gigantic dot products. A lot of code leans heavily on caches these days, which make the energy cost of operations a lot lower, and pretty much everything short of massive dot products uses them. I imagine you would have to make a very nuanced argument about why a 1k fixed point sum is saving energy here. Even matmuls are pretty cache-efficient now.

Elsewhere in computing, we are actually generally moving away from tightly-packed structs in performance-sensitive code despite the memory retrieval cost, because they are just easier to deal with in both hardware and software, and locality picks up all the slack.



