
yes, but your gain is still an integer number. the non-integer ratio is just a statistic (gain per number of units).



Why is it an integer number?

Entropy (in bits) is defined as H(X) = -\sum_x p(x) \log_2 p(x)

There is no reason this has to be an integer, since probabilities are not restricted to being reciprocals of powers of 2.
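A quick sketch in Python (illustrative only) makes the point: plug any non-dyadic probabilities into the entropy formula and you get a non-integer number of bits.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: H = -sum over x of p(x) * log2 p(x)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin gives exactly 1 bit, but a biased coin does not:
print(entropy_bits([0.5, 0.5]))          # 1.0
print(round(entropy_bits([0.9, 0.1]), 3))  # 0.469
```

So a 90/10 coin flip carries about 0.469 bits of information per flip, not 1.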

Consider also that you can simply use a different logarithm base to get a different unit (e.g. use the natural logarithm to obtain the entropy in nats). It would be bizarre if the arbitrary choice of 2 as the base gave a unit that was indivisible.

I think this whole confusion comes down to the difference between a bit as a "unit of information in the sense of information theory" [divisible] and a bit as a "single physical one or zero" [not divisible]. The relationship between the two is that the entropy of a random variable is a lower bound on the average number of bits required to represent it.
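The "lower bound" relationship is easiest to see with a dyadic distribution (every probability a power of 1/2), where a prefix code hits the entropy exactly. A toy sketch, with a hypothetical three-symbol alphabet and the code A=0, B=10, C=11:

```python
import math

# dyadic distribution: every probability is a power of 1/2
probs = {"A": 0.5, "B": 0.25, "C": 0.25}
# a matching prefix code assigns each symbol -log2(p) binary digits
code_lengths = {"A": 1, "B": 2, "C": 2}  # A=0, B=10, C=11

entropy = -sum(p * math.log2(p) for p in probs.values())
avg_len = sum(probs[s] * code_lengths[s] for s in probs)
print(entropy, avg_len)  # both 1.5: the bound is met with equality
```

For non-dyadic distributions the average code length stays strictly above the entropy, which is exactly the gap between the two senses of "bit".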


yes, you are quite right -- i was referring to the binary digit rather than the shannon


only when you consider bits to be the final, indivisible, fundamental unit of information.

which they aren't.

if you have a data storage thingy that can store any of three values, a ternary digit, it is exactly equivalent to log2(3) = ln(3) / ln(2) ~= 1.585 bits.
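To make the equivalence concrete, here's a small sketch: n ternary digits distinguish 3^n values, which is n * log2(3) bits of information, even though you'd need a whole number of binary digits to actually store it.

```python
import math

BITS_PER_TRIT = math.log2(3)  # ~1.585 bits of information per ternary digit

n_trits = 10
n_values = 3 ** n_trits                       # 59049 distinct values
exact_bits = n_trits * BITS_PER_TRIT          # ~15.85 bits of information
whole_bits = math.ceil(math.log2(n_values))   # 16 binary digits to store them
print(n_values, round(exact_bits, 2), whole_bits)  # 59049 15.85 16
```

The information content (15.85 bits) is perfectly well defined; only the physical storage gets rounded up to 16.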

kind of like US pop-science articles like to say stuff like "a volume of 1.5 olympic-size swimming pools" (because a megagallon is just weird), even though obviously, you can never have half of such a pool or it would empty.

(ok after some consideration, you could have the bottom half)


you are obviously right, but i think that in the specific case described above -- computer code -- we have binary digits as final and indivisible units.


Filling the top half would be more fun

https://what-if.xkcd.com/6/



