Hacker News new | past | comments | ask | show | jobs | submit login

Although the family of algorithms is called LempeLl-Ziv, Ziv was the primary author, so it should really be called ziv-Lempel



I understand there is a tradition in papers to put the primary author first but in the end, it is the letter Z almost became synonymous with compression.

.Z, .zip, .7z, .gz, .xz, .bz2. Of course, many use LZ77 or LZMA, LZ stands for Lempel–Ziv, but only the Z remained in the format.


To be pendantic: Nobody today uses LZ77. LZ77 is quite primitive and not very efficient. However, nearly everyone today uses an algorithm that is descended from LZ77, so "LZ77" is just a sort of shorthand for "any algorithm descended from LZ77". LZMA and Zip's deflate are both examples of LZ77 descendants.

(To be even more pedantic, all modern algorithms are descendants of LZSS, which was one of the first improvements on LZ77.)

Out of the ones you listed, only .Z and .bz2 are not LZ77-LZSS descendants.

.Z is an implementation of LZW, which is a descendant of LZ78, which is a significantly different (but still related) algorithm to LZ77. It never really caught on beyond a few implementations of LZW, though.

.bz2 is the only one completely unrelated to Ziv's work. It uses the Burrows-Wheeler transform, instead.


I've been trying to figure out where the first z comes from. At least I tend to think "z" is for compression because of ZIP, but it predates that.

.Z was used for compress(1), probably because pack(1) used .z, but where did that come from?

My best theory is that it's from Steve Zucker who wrote pack(1) at RAND. There were other versions of pack(1) too, but if his was the first, it seems possible that he choose the z extension for Zucker.

Does anyone have more info or other theories?


I don't believe bzip2 is based on lz at all.


It uses arithmetic coding, I believe.

EDIT: wikipedia says it did but changed to use Huffman encoding due to patents? But those patents have expired.


It uses the Burrows-Wheeler transform and Huffman coding. It is one of the few lossless compressions algorithms that do not actually build on the work of Ziv.


You can also add zstandard to the list.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: