Card Trick Leads to New Bound on Data Compression (2010) (technologyreview.com)
80 points by lun4r on Aug 17, 2014 | 23 comments



What a surprisingly content-free article from MIT.

What bound did they break? By how much? In what domain does the bound apply?

(There is no one "compression bound" for lossless compression; for any length there exists a lossless scheme that compresses at least one sequence of that length to a single bit, yet every lossless scheme, averaged over all possible inputs, cannot do better than 1:1. Hence knowing the domain constraints is important!)
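
To make that concrete, here is a toy scheme (purely illustrative; the FAVOURITE constant and the byte-level framing are my own simplification): one chosen input shrinks to a single symbol, and everything else grows by one.

    # Hypothetical toy scheme: one chosen input (of any length) maps to a
    # single byte; every other input gets a one-byte flag prefix. Lossless,
    # but averaged over all inputs it cannot beat 1:1, since everything
    # except the favourite grows.
    FAVOURITE = b"\x00" * 1_000_000   # any one sequence, as long as you like

    def compress(data: bytes) -> bytes:
        return b"0" if data == FAVOURITE else b"1" + data

    def decompress(blob: bytes) -> bytes:
        return FAVOURITE if blob == b"0" else blob[1:]

    assert decompress(compress(FAVOURITE)) == FAVOURITE
    assert decompress(compress(b"hello")) == b"hello"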


It's not surprising; it's technologyreview, which spun out from a student newspaper into a linkbaity web tabloid a few years ago.

In this case, however, the original paper (already linked in the HN comments) itself has a linkbaity title and a content-free abstract.


Surprisingly "empty" article. "Well folks we've just lowered the bound on data compression - there you have it."

No reference, no paper, not even an example of how it applies. I hope an editor notices it and pulls it - pretty embarrassing.


> Ref: arxiv.org/abs/1011.4609 Bounds from a Card Trick

At the bottom of the article


Yeah, I mean, if you ignore domain constraints, the following is a great compression algorithm:

    # "compress": never read the input, just emit an empty output file
    infile=$1
    outfile=$2
    touch "$outfile"
This was one of my favorite examples in Data Structures and Algorithms in college.


Except that's not lossless like I qualified.


The card trick referred to in the article is likely the one developed by Alex Stone and described in his book Fooling Houdini [1].

He knew the most important part of any act is showmanship, and that performing well at top-level magic contests requires both showmanship and lesser-known tricks - tricks that an expert audience wouldn't be able to puzzle out quickly or easily.

It took him weeks (months? been a while since I read the book) to memorize the sequence of cards in his prepared deck (sequence generated by computer) and weeks (months?) more to go from being able to do the trick to being able to perform the trick.

It's a good read - and what I just described only comes in at the end, after his intense and occasionally life-threatening research into the world of magic.

[1] http://foolinghoudini.com/


The actual paper with the theorems: http://arxiv.org/abs/1011.4609


It says "submitted to a journal" in 2010,

Does that mean the paper was accepted, or that it was rejected 4 years ago and never published?


It looks like it was published in 2012, but the author hasn't updated its arXiv metadata: http://www.sciencedirect.com/science/article/pii/S1570866711...


De Bruijn sequences are cool. The last time I came across them was as a fast way to do bit scans. On the older Intel Atoms, an optimised De Bruijn routine to find the index of the lowest set bit in a word is competitive with the x86 intrinsic.

https://chessprogramming.wikispaces.com/BitScan
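
For reference, a minimal sketch of the technique (Python here for brevity; 0x03F79D71B4CB0A89 is a commonly used 64-bit de Bruijn constant, and the table, multiply, and shift translate directly to C):

    # Sketch of a De Bruijn bit scan: every 6-bit window of the constant is
    # distinct, so multiplying an isolated low bit (a power of two) by it
    # and keeping the top 6 bits yields a perfect 64-entry lookup index.
    DEBRUIJN64 = 0x03F79D71B4CB0A89
    MASK64 = (1 << 64) - 1

    INDEX64 = [0] * 64
    for i in range(64):
        INDEX64[((DEBRUIJN64 << i) & MASK64) >> 58] = i

    def bit_scan_forward(x: int) -> int:
        """Index of the lowest set bit of a nonzero 64-bit integer."""
        lsb = x & -x                      # isolate the lowest set bit
        return INDEX64[((lsb * DEBRUIJN64) & MASK64) >> 58]

    print(bit_scan_forward(0b101000))     # -> 3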


I found out about de Bruijn sequences while brute forcing a digital door lock (it was my own door, my landlord changes the code sometimes). It only works when you are able to keep entering digits until you hit the correct sequence. If the thing resets every four digits, it's useless of course.

A four-digit code yields a 10,000-digit de Bruijn sequence (10,003 key presses when entered linearly), so at one key press every 2.5 seconds it will take you at most 7 hours to hit the correct code. Keying in all codes in sequence 0000, 0001, ..., 9999 would take at most 28 hours.
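
For the curious, a minimal sketch of generating such a sequence (the standard Lyndon-word / FKM construction; the names and framing are mine):

    # de Bruijn sequence B(k, n) over {0, ..., k-1} via the standard
    # Lyndon-word (FKM) construction.
    def de_bruijn(k: int, n: int) -> list:
        a = [0] * (k * n)
        seq = []

        def db(t, p):
            if t > n:
                if n % p == 0:
                    seq.extend(a[1:p + 1])
            else:
                a[t] = a[t - p]
                db(t + 1, p)
                for j in range(a[t - p] + 1, k):
                    a[t] = j
                    db(t + 1, t)

        db(1, 1)
        return seq

    seq = de_bruijn(10, 4)
    presses = seq + seq[:3]            # linear entry needs 10,000 + 3 digits
    print(len(seq), len(presses))      # 10000 10003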

Did I find the code? Yes, I asked the neighbor.


If the card trick analogy is accurate and it depends on probability bounds, that would mean this new data compression algorithm is lossy, and so better suited to video compression etc. than to general-purpose data, right?


No, algorithms people don't do lossy compression (Human/Computer Interaction people tend to do that). However, you can have algorithms that fail with a certain probability; you just repeat them until they succeed and calculate the expected cost. Especially within information theory, there are a lot of strange techniques that depend on things like this.


What a weird way to create artificial boundaries in the science community.


I think Thomas meant that designing a lossy compression scheme inherently involves making decisions about what parts of the data it is OK to lose -- e.g., in audio processing it might be acceptable to drop frequencies outside the range of human hearing from the compressed output.


But you can't "just repeat it until it succeeds", because there's a fixed chance (1/128 in the case of the cards) that a De Bruijn cycle occurs more than once in the same sequence. So to just repeat until it succeeds you would need to keep rearranging the sequence until you eliminate those redundant cycles... which obviously doesn't work cleanly, because now you have to encode those rearrangements somehow.



Really the subject of the article was the article itself.


[2010]


I liked Pied Piper's backstory better.


He probably made the card analogy to cover up the real way he came up with the algorithm.


One weird card trick that will change the way you compress your data!



