Card Trick Leads to New Bound on Data Compression (2010) (technologyreview.com)
80 points by lun4r on Aug 17, 2014 | 23 comments



What a surprisingly content-free article from MIT.

What bound did they break? By how much? In what domain does the bound apply?

(There is no one "compression bound" for lossless compression; for any length there exists a lossless scheme that compresses at least one sequence of that length to a single bit, yet every lossless scheme, averaged over all possible inputs, cannot do better than 1:1. Hence knowing the domain constraints is important!)
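
To make that concrete, here is a toy scheme (purely illustrative; the FAVOURITE constant and the byte-level framing are my own simplification): one chosen input shrinks to a single symbol, and everything else grows by one.

    # Hypothetical toy scheme: one chosen input (of any length) maps to a
    # single byte; every other input gets a one-byte flag prefix. Lossless,
    # but averaged over all inputs it cannot beat 1:1, since everything
    # except the favourite grows.
    FAVOURITE = b"\x00" * 1_000_000   # any one sequence, as long as you like

    def compress(data: bytes) -> bytes:
        return b"0" if data == FAVOURITE else b"1" + data

    def decompress(blob: bytes) -> bytes:
        return FAVOURITE if blob == b"0" else blob[1:]

    assert decompress(compress(FAVOURITE)) == FAVOURITE
    assert decompress(compress(b"hello")) == b"hello"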


It's not surprising; it's technologyreview, which spun out from a student newspaper into a linkbaity web tabloid a few years ago.

In this case, however, the original paper (already linked in the HN comments) itself has a linkbaity title and a content-free abstract.


Surprisingly "empty" article. "Well folks we've just lowered the bound on data compression - there you have it."

No reference, no paper, not even an example of how it applies. I hope an editor notices it and pulls it - pretty embarrassing.


> Ref: arxiv.org/abs/1011.4609 Bounds from a Card Trick

At the bottom of the article


Yeah, I mean, if you ignore domain constraints, the following is a great compression algorithm:

    # "compress": never read the input, just emit an empty output file
    infile=$1
    outfile=$2
    touch "$outfile"
This was one of my favorite examples in Data Structures and Algorithms in college.


Except that's not lossless like I qualified.


The card trick referred to in the article is likely the one developed by Alex Stone and described in his book Fooling Houdini [1].

He knew the most important part of any act is showmanship, and that performing well at top-level magic contests requires both showmanship and lesser-known tricks - tricks that an expert audience wouldn't be able to puzzle out quickly or easily.

It took him weeks (months? been a while since I read the book) to memorize the sequence of cards in his prepared deck (sequence generated by computer) and weeks (months?) more to go from being able to do the trick to being able to perform the trick.

It's a good read - and what I just described only comes in at the end, after his intense and occasionally life-threatening research into the world of magic.

[1] http://foolinghoudini.com/


The actual paper with the theorems: http://arxiv.org/abs/1011.4609


It says "submitted to a journal" in 2010,

Does that mean the paper was accepted, or that it was rejected 4 years ago and never published?


It looks like it was published in 2012, but the author hasn't updated its arXiv metadata: http://www.sciencedirect.com/science/article/pii/S1570866711...


De Bruijn sequences are cool. The last time I came across them was as a fast way to do bit scans. On the older Intel Atoms, an optimised De Bruijn routine to find the index of the lowest set bit in a word is competitive with the x86 intrinsic.

https://chessprogramming.wikispaces.com/BitScan
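
For reference, a minimal sketch of the technique (Python here for brevity; 0x03F79D71B4CB0A89 is a commonly used 64-bit de Bruijn constant, and the table, multiply, and shift translate directly to C):

    # Sketch of a De Bruijn bit scan: every 6-bit window of the constant is
    # distinct, so multiplying an isolated low bit (a power of two) by it
    # and keeping the top 6 bits yields a perfect 64-entry lookup index.
    DEBRUIJN64 = 0x03F79D71B4CB0A89
    MASK64 = (1 << 64) - 1

    INDEX64 = [0] * 64
    for i in range(64):
        INDEX64[((DEBRUIJN64 << i) & MASK64) >> 58] = i

    def bit_scan_forward(x: int) -> int:
        """Index of the lowest set bit of a nonzero 64-bit integer."""
        lsb = x & -x                      # isolate the lowest set bit
        return INDEX64[((lsb * DEBRUIJN64) & MASK64) >> 58]

    print(bit_scan_forward(0b101000))     # -> 3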


I found out about de Bruijn sequences while brute forcing a digital door lock (it was my own door, my landlord changes the code sometimes). It only works when you are able to keep entering digits until you hit the correct sequence. If the thing resets every four digits, it's useless of course.

A four-digit code yields a 10,000-digit de Bruijn sequence (10,003 key presses when entered linearly), so at one key press every 2.5 seconds it will take you at most 7 hours to hit the correct code. Keying in all codes in sequence 0000, 0001, ..., 9999 would take at most 28 hours.
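
For the curious, a minimal sketch of generating such a sequence (the standard Lyndon-word / FKM construction; the names and framing are mine):

    # de Bruijn sequence B(k, n) over {0, ..., k-1} via the standard
    # Lyndon-word (FKM) construction.
    def de_bruijn(k: int, n: int) -> list:
        a = [0] * (k * n)
        seq = []

        def db(t, p):
            if t > n:
                if n % p == 0:
                    seq.extend(a[1:p + 1])
            else:
                a[t] = a[t - p]
                db(t + 1, p)
                for j in range(a[t - p] + 1, k):
                    a[t] = j
                    db(t + 1, t)

        db(1, 1)
        return seq

    seq = de_bruijn(10, 4)
    presses = seq + seq[:3]            # linear entry needs 10,000 + 3 digits
    print(len(seq), len(presses))      # 10000 10003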

Did I find the code? Yes, I asked the neighbor.


If the card trick analogy is accurate and it depends on probability bounds, that would mean this new data compression algorithm is lossy, and so better suited to video compression etc. than to general-purpose data, right?


No, algorithms people don't do lossy compression (Human/Computer Interaction people tend to do that). However, you can have algorithms that fail with a certain probability; you just repeat them until they succeed and calculate the expected cost. Especially within information theory, there are a lot of strange techniques that depend on things like this.


What a weird way to create artificial boundaries in the science community.


I think Thomas meant that designing a lossy compression scheme inherently involves making decisions about what parts of the data it is OK to lose -- e.g., in audio processing it might be acceptable to drop frequencies outside the range of human hearing from the compressed output.


But you can't "just repeat it until it succeeds", because there's a fixed chance (1/128 in the case of the cards) that a De Bruijn cycle occurs more than once in the same sequence. So to just repeat until it succeeds you would need to keep rearranging the sequence until you eliminate those redundant cycles... which obviously doesn't work cleanly, because now you have to encode those rearrangements somehow.



Really the subject of the article was the article itself.


[2010]


I liked Pied Piper's backstory better.


He probably made the card analogy to cover up the real way he came up with the algorithm.


One weird card trick that will change the way you compress your data!



