Stateless ZIP library – SLZ (1wt.eu)
129 points by marcopolis on Sept 3, 2015 | 20 comments



Every DEFLATE compression block can either supply its own Huffman table or compress using a fixed table. You can emit a DEFLATE-compatible compression stream without ever emitting a custom Huffman table or depending on previous data. And you'll still do better than nothing for many types of data.
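As an aside, you can see the same trick with plain zlib: the Z_FIXED strategy forces the fixed/static Huffman tables so no custom table is ever emitted. A minimal sketch (this is zlib's API, not SLZ's, and zlib will still use its LZ77 window unless you also flush):

    #include <zlib.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        const char *msg = "hello, hello, hello";
        unsigned char out[256];
        z_stream strm;
        memset(&strm, 0, sizeof(strm));

        /* 15+16 = gzip wrapper, Z_FIXED = use only the fixed Huffman tables, never dynamic ones */
        if (deflateInit2(&strm, Z_BEST_SPEED, Z_DEFLATED, 15 + 16, 8, Z_FIXED) != Z_OK)
            return 1;

        strm.next_in   = (unsigned char *)msg;
        strm.avail_in  = strlen(msg);
        strm.next_out  = out;
        strm.avail_out = sizeof(out);
        deflate(&strm, Z_FINISH);

        printf("%lu -> %lu bytes\n", (unsigned long)strlen(msg),
               (unsigned long)(sizeof(out) - strm.avail_out));
        deflateEnd(&strm);
        return 0;
    }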

Given this, I wonder why HPACK invented a new compression scheme rather than just carefully using DEFLATE.


The HPACK spec [1] remarks on this. Google carefully used DEFLATE in earlier SPDY versions [2] but HTTP/2 decided to be less clever.

Interestingly, SLZ's page comments on lack of history as a way to avoid CRIME attacks, which is one of the things the HTTP/2 people worried about. That seems like something it'd be easy for an implementation to screw up though (just add some buffering somewhere before the compressor...)

[1] https://http2.github.io/http2-spec/compression.html#rfc.sect... [2] https://www.chromium.org/spdy/spdy-protocol/spdy-protocol-dr...


Looks like SLZ hit 1.0 back in April [1]. Has anyone used it in production? Are packaged binaries available to get it up and running quickly?

[1]: http://git.1wt.eu/web/libslz.git/


Very interesting. This appears to do, far better, what I've been trying to accomplish with a PHP wrapper for a while now. For now I've ended up opting out of compression entirely and just using STORE zips to send multiple files at once without having to keep the data in memory. This is absolutely worth looking into. It's been a long time since something on HN has gotten me this excited.


Absolutely incredible. I had no idea this was even possible. Defeating CRIME while keeping compression? I can hardly believe it.


AFAICT this only avoids CRIME if there's never both attacker-controlled data and a secret in the same SLZ compress call.


But I think the idea is that each compress call takes such a small amount of data that it's far less likely for a secret and attacker-controlled data to end up in the same call.


It goes beyond that, and suggests explicitly dividing up the data into sensitive and potentially attacker-controlled pieces, and processing those pieces separately. In short (and I could be wrong, but this seems to be the basic idea), if you always start a new chunk after you're done sending headers and before you start sending content, you're safe.
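A rough sketch of that split, using plain zlib rather than SLZ's API (names and buffer sizes here are just for illustration): a full flush between the secret chunk and the attacker-influenced chunk empties the history window, so the second chunk can never back-reference the first.

    #include <zlib.h>
    #include <string.h>

    /* Compress one chunk; assumes out is large enough for this sketch. */
    static unsigned long compress_chunk(z_stream *strm, const char *data,
                                        unsigned char *out, unsigned int outsz, int flush) {
        strm->next_in   = (unsigned char *)data;
        strm->avail_in  = strlen(data);
        strm->next_out  = out;
        strm->avail_out = outsz;
        deflate(strm, flush);
        return outsz - strm->avail_out;   /* bytes produced for this chunk */
    }

    int main(void) {
        unsigned char out[1024];
        z_stream strm;
        memset(&strm, 0, sizeof(strm));
        deflateInit2(&strm, Z_BEST_SPEED, Z_DEFLATED, 15 + 16, 8, Z_DEFAULT_STRATEGY);

        /* Secret part first, then a full flush to drop the LZ77 history... */
        unsigned long n1 = compress_chunk(&strm, "Set-Cookie: session=secret\r\n\r\n",
                                          out, sizeof(out), Z_FULL_FLUSH);
        /* ...so the attacker-influenced part cannot reference the secret. */
        unsigned long n2 = compress_chunk(&strm, "attacker-influenced body",
                                          out + n1, sizeof(out) - n1, Z_FINISH);

        deflateEnd(&strm);
        (void)n2;
        return 0;
    }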


You can separate the calls while keeping the streams intact.


How does it address quines?

http://research.swtch.com/zip


It's just a compressor, it doesn't provide any decompression support.


I love the fact that there is a ZIP quine too, and I'll admit I immediately thought of it, but it's not clear to me what responsibility the library has to 'address' quines in any particular way, or what the user would even expect it to do about them.


So would this be useful for compressed virtual memory? It only does compression, so it would only halve the memory usage, but that's something.


If you're a webserver and want to serve clients which support gzip encoding, you need to be DEFLATE compatible.

For something like virtual memory where you'd control both the compressor and decompressor, there's little point in clever DEFLATE-compatible compressors. Instead, just pick any compressor which has the compression/CPU tradeoff you want.


Ah yes, for that elusive "malloc-safe" spec requirement.


Having a fast malloc-less compression library is extremely useful in the resource-constrained embedded systems world, where either memory usage must be minimized (you can still run modern Linux kernels on systems with 32MB of RAM, probably less), or where traditional memory management isn't available at all, such as on a microcontroller.

In the microcontroller case you can get a 32-bit ARM Cortex-M0 for $2.00 in single quantities these days, which is enough for the bit-twiddling you find in compression libraries to compile and run successfully. Having a fast, malloc-free compression library there is extremely useful. lz4 is an example of one you can run on a microcontroller successfully, but it isn't as widely compatible as zlib.
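For what it's worth, a minimal no-heap sketch with lz4's extState API (buffer sizes here are made up; on a real MCU you'd size them to your transfer unit):

    #include <lz4.h>
    #include <string.h>

    static LZ4_stream_t lz4_state;              /* ~16KB of state in .bss, no malloc */
    static char dst[LZ4_COMPRESSBOUND(256)];    /* worst-case output for a 256-byte block */

    /* Compress one block using only statically allocated memory. */
    int compress_block(const char *src, int src_len) {
        return LZ4_compress_fast_extState(&lz4_state, src, dst,
                                          src_len, (int)sizeof(dst), 1 /* acceleration */);
    }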


Hey, this answer was really interesting! I was actually making a (poor) joke about the quixotic idea of requiring a desktop program to be just fine with malloc failing, no biggie. Your answer is much more interesting.


Large embedded systems also like malloc-less solutions. In such systems you essentially have an explicit use for all the memory you have, and you don't want to buy extra memory that sits unused (it reduces your margins). When you need to account for all memory needs and you use allocation schemes, you have to create memory pools that are neither too large (wasteful) nor too small (a performance bottleneck or deadlock source).

One such example is large SAN storage systems. We can have gigabytes or even a terabyte of RAM and still have very little spare (little by our standards: < 100MB for diagnostics and general OS nuisance).


May not be useful to me, but it is still cool either way.


How would one use this with HAProxy?



