What kind of overhead does gzip have? I'd be interested to know how many characters you could fit into 80 compressed characters. Some mapping that packs roughly three lowercase letters into each (2-byte?) Unicode character could be effective. Is there any standard scheme like that?
This is an interesting question. I haven't researched the overheads.
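For a rough feel, though, here's an untested sketch off the top of my head (stdlib Python, made-up example string) comparing the gzip container against plain zlib and a raw DEFLATE stream; gzip's fixed ~10-byte header plus 8-byte trailer is most of the overhead at these sizes:

```python
import gzip
import zlib

# Hypothetical example string, just to have something to measure.
text = "new study shows that eating chocolate reduces anxiety in mice"
raw = text.encode("utf-8")

gz = gzip.compress(raw)   # gzip container: ~10-byte header + 8-byte CRC/size trailer
zl = zlib.compress(raw)   # zlib container: 2-byte header + 4-byte Adler-32 checksum
co = zlib.compressobj(9, zlib.DEFLATED, -15)  # negative wbits = raw DEFLATE, no container
bare = co.compress(raw) + co.flush()

print(f"original: {len(raw):3d} bytes")
print(f"gzip:     {len(gz):3d} bytes")
print(f"zlib:     {len(zl):3d} bytes")
print(f"deflate:  {len(bare):3d} bytes")
```

For the packing-into-characters part, there are schemes like base65536 that stuff roughly 2 bytes of binary into each Unicode code point so that character-limited channels carry more data, though I wouldn't call them standard.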
Separately, I feel like it should be possible to eliminate certain "filler" words in a lossy fashion and add them back later based on grammatical rules. For example, you don't need to say "in mice", you could just say "mice", the meaning is obvious, and "in" could be fixed in post-processing at the client end.
You could also quite possibly eliminate all vowels and still reconstruct everything accurately.
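Something like this toy sketch (hypothetical word list, not a real model) is what I have in mind: strip the vowels on the way out, then map the consonant skeletons back against a lexicon on the client.

```python
import re

VOWELS = re.compile(r"[aeiou]+", re.IGNORECASE)

def strip_vowels(word: str) -> str:
    # Lossy step: keep the first letter, drop every vowel after it
    # (so "in" stays distinguishable instead of collapsing to "n").
    return word[0] + VOWELS.sub("", word[1:]) if word else word

# Toy restore step: a tiny made-up lexicon mapping skeletons back to words.
# A real client would need a proper dictionary or language model here.
LEXICON = ["new", "study", "shows", "effect", "in", "mice"]
SKELETONS = {strip_vowels(w): w for w in LEXICON}

def restore(text: str) -> str:
    return " ".join(SKELETONS.get(tok, tok) for tok in text.split())

squeezed = " ".join(strip_vowels(w) for w in "new study shows effect in mice".split())
print(squeezed)           # nw stdy shws effct in mc
print(restore(squeezed))  # new study shows effect in mice
```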
> For example, you don't need to say "in mice", you could just say "mice", the meaning is obvious, and "in" could be fixed in post-processing at the client end.
> You could also quite possibly eliminate all vowels and still reconstruct everything accurately.
My guess is that 10 minutes after that is rolled out, someone will have found a collision that decompresses to some kind of dirty joke.