Hacker News new | past | comments | ask | show | jobs | submit login
Shuffling (datagenetics.com)
52 points by stansmith on Nov 28, 2014 | hide | past | favorite | 15 comments



While understanding the how/why of algorithms is great, a word of advice: If this were a real project and your standard library has a shuffle() function, just use that one and get on with life.

Save yourself from having to worry about your hand-crafted one being subtly wrong. Save your peers from having to debug your hand-crafted one years after you've left the company. This tip goes for basically any routine that already comes built-into your language or libraries. Spend your time writing code that doesn't already exist. I'm looking at you, co-worker from 5 or so years ago who felt he had to re-implement strings, linked lists, and sorting.


This is great, I never had intuitive understanding of why the naive shuffle was bad other than a vague notion of the fallibility of pseudo-random number generators. Seeing a math equation is one thing but seeing a tree of all possible outcomes visualized in front of you not only makes the bias blatant, but also makes it obvious why the bias is occurring.


Thanks for kind comments.


I love the idea of combining this with the idea of proving code.

Like, "how can I prove my approach 'shuffles' an array?" ... I mean, just because it looks right... doesn't mean it is.


Really useful to remind yourself of this!


Here is a demo of shuffling using a combination of Hindu & Riffle shuffle http://karmadude.github.io/CardShuffler/

I heard someone say once that we have to shuffle a deck of cards 7 times to get a good shuffle. This demo was setup to try to simulate how we shuffle cards, and turns out shuffling around 5 times gives a pretty good shuffle.


To paraphrase, 7 is reasonably able, 14 is guaranteed.

Knuth cites this in TAOVP vol 2, from Aldous And Diaconis, AMM 1986, pp 333-348.


There was a Google Code Jam problem this year precisely about detecting the difference between the fair shuffle and the "terribly, terribly wrong" shuffle: https://code.google.com/codejam/contest/2984486/dashboard#s=...


Coincidentally, shuffling was the subject of this week's puzzle on "Interview Cake": https://www.interviewcake.com/question/shuffle


What motivated that choice of psudocode? I found it very hard to read.


It's looks very close to Visual Basic to me ... :)


Yes, it is VB6.

I did actually put quite a bit of thought into how I was going to put in the code samples. I elected to use BASIC like syntax because IMHO, this is close to as common denominator as possible and most universally understood (this is not an invite to start a flame war!)

I also elected to start by arrays at 1, rather than 0. The purest in my wanted to use zero, but to make it more universally understood, I elected to use one (because it also relates to the real world and card #1). I think more people will understand it this way.

The other thing I almost did not do (but changed my mind at the last minute), was to in-line the swap function and enumerate all three steps. I almost wrote it with Swap(i,s) function instead. Half of me thinks using a function is easier to understand, the other half likes the in-line for explaining it.

It's not as if the logic is very hard, there's one loop.

I'm curious what others think. Was it hard to understand? I'm always keen to learn more. Could I have explained it easier with the changes above?


As a civilian (i.e. non-programmer) I found the example code understandable. I too am wondering what alternative presentation method the post above yours would suggest.

Nice article.


Coming from python and not knowing VB I found the Do Loop syntax tricky at first. I am used to a loop explaining how long it will run for at the top (similar to the for loop here).

Being pseudocode I found it odd to mix loop syntax. One loop has its conditions at the top and the other at the bottom. I guess that is just taste though.


numbering should start at 0. http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EW...

He also made comments about using Basic. Seems you made the wrong choices...




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: