
I somewhat agree. I think Shannon's source coding theorem (the entropy limit) comes into play here, and AFAIK we are already very close to it, so improvements are bound to be small as we push toward the theoretical limit.
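To make "theoretical limit" concrete, here is a rough sketch of my own (not from the parent) that computes the per-byte Shannon entropy of some data and compares it with what a general-purpose compressor achieves; "some_file.bin" is just a placeholder path:

    import math, zlib
    from collections import Counter

    def entropy_bits_per_byte(data: bytes) -> float:
        # order-0 Shannon entropy: H = -sum(p * log2(p)) over byte frequencies
        counts = Counter(data)
        n = len(data)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    data = open("some_file.bin", "rb").read()   # placeholder input file
    h = entropy_bits_per_byte(data)
    print(f"order-0 entropy bound: {h * len(data) / 8:.0f} bytes")
    print(f"zlib output          : {len(zlib.compress(data, 9))} bytes")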

Still, I am reminded of Huffman coding: back then, researchers thought they had already reached the optimum. They were proven phenomenally wrong when it was realized that one should compress sequences rather than individual bytes (I believe LZ77 or LZW was the first such algorithm).
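A toy illustration of that jump (my own sketch, not from the thread): for data made of a repeated sequence, a byte-by-byte coder like Huffman is stuck at the per-symbol entropy, while an LZ-based compressor such as zlib (LZ77 + Huffman) exploits the repetition:

    import math, zlib

    data = b"abc" * 10000                 # 30,000 bytes, three equally frequent symbols
    per_symbol_bound = math.log2(3)       # ~1.58 bits/byte: best any byte-by-byte coder can do
    print(f"order-0 entropy bound: {per_symbol_bound * len(data) / 8:.0f} bytes")
    print(f"zlib (LZ77 + Huffman): {len(zlib.compress(data, 9))} bytes")

The first number comes out to roughly 5.9 KB, while zlib squeezes the same input into a few dozen bytes, precisely because it looks at sequences rather than individual symbols.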

Following the same line of thought, improvements may still be possible if we compress against a larger body of data. For example, headers of files of the same type contain the same common information; in theory, that information could be stored only once on the disk. Deduplicated storage is already trying to achieve this, so nothing new here, just pointing to one avenue that could still yield future improvements :)
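For the curious, the basic idea behind block-level dedup is simple enough to sketch; this is just an illustrative toy of mine, with an in-memory dict standing in for the chunk store and fixed-size chunks instead of the content-defined chunking real systems often use:

    import hashlib

    CHUNK_SIZE = 4096          # fixed-size chunks for simplicity
    store = {}                 # hash -> chunk: each unique block is kept once

    def dedup_write(data: bytes) -> list[str]:
        """Store a file as a list of chunk hashes; duplicate chunks cost nothing extra."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)   # only the first copy of a chunk is stored
            recipe.append(digest)
        return recipe

    def dedup_read(recipe: list[str]) -> bytes:
        return b"".join(store[d] for d in recipe)

Two files sharing the same header blocks would then reference the same stored chunks, which is exactly the "store the common information once" idea above.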



