
The number of bits does seem to affect neural network training, and it really kills it when you get very low (like 8 to 12 bits). The problem is that very small values get rounded down to zero, and then when they are summed together the result is badly inaccurate.
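
A rough illustration of the underflow issue (the quantization step and values here are made up, just to show the effect of round-to-nearest on a coarse grid):

    # Round-to-nearest on a hypothetical coarse fixed-point grid.
    step = 1.0 / 128          # smallest representable increment (illustrative)
    updates = [0.002] * 1000  # many small gradient-like contributions

    def round_nearest(x, step):
        return round(x / step) * step

    # Each update is below step/2, so it rounds to zero before summing.
    total = sum(round_nearest(u, step) for u in updates)
    print(total)              # 0.0, even though the true sum is 2.0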

Ideally you could do some kind of stochastic rounding, where the fractional part represents its probability of being rounded up rather than down. On average the totals would still be the same. You could probably get the required number of bits really low then, perhaps even only one or two!
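
A minimal sketch of that idea (not any particular hardware scheme, just the expectation argument, using the same made-up grid as above):

    import math
    import random

    step = 1.0 / 128          # same hypothetical grid as above

    def round_stochastic(x, step):
        # Round up with probability equal to the fractional position
        # within the quantization interval, down otherwise.
        scaled = x / step
        lower = math.floor(scaled)
        frac = scaled - lower
        return (lower + (1 if random.random() < frac else 0)) * step

    updates = [0.002] * 1000
    total = sum(round_stochastic(u, step) for u in updates)
    print(total)              # ~2.0 on average instead of collapsing to 0.0

The individual rounded values are noisier, but the sum is unbiased, which is the point.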

Something to look into when these algorithms are put into FPGAs and ASICs, which is starting to happen.




> The problem is that really small numbers are rounded down to zero, and then when summed together the result is really inaccurate

Use log values instead?


Using the log domain is cool, but log summation still has the same problem. Logs just map numbers into a different range; they don't magically add more bits of precision. That is, adding a very small number in the log domain to a much larger number in the log domain will just get rounded away.
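
For example (illustrative numbers; float16 is just a stand-in for a low-precision format):

    import numpy as np

    # Adding a small linear-domain value to a large one via the log domain:
    # log(exp(a) + exp(b)) = a + log1p(exp(b - a)) for a >= b.
    a = np.float64(10.0)      # log of the large term
    b = np.float64(0.0)       # log of the much smaller term

    correction = np.log1p(np.exp(b - a))   # ~4.54e-5
    exact = a + correction                 # 10.0000454... in float64

    # At float16 precision the spacing near 10.0 is ~0.0078, so the
    # correction is far below half a ulp and rounds away entirely.
    print(np.float16(exact) == np.float16(a))   # True: the small term is lost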




