How can a computer deal a poker hand? (fermatslibrary.com)
76 points by johnaston on Sept 30, 2016 | 30 comments



> After solving a challenging problem, I solve it again from scratch, retracing only the insight of the earlier solution. I repeat this until the solution is as clear and direct as I can hope for. Then I look for a general rule for attacking similar problems, that would have led me to approach the given problem in the most efficient way the first time. Often, such a rule is of permanent value.

This has been one of my principles all my life, and I was just thinking recently about the fact that I've never seen it written down anywhere. (Also, about the fact that it really is a question of personal style, because I've read through rough work/notebooks from very successful people who don't seem to think that way at all.)


Isn't that a bit like "Plan to throw the first one away"?

https://en.wikiquote.org/wiki/Fred_Brooks


Seems more like: "next time, how can I not have to throw the first one away?"

Maybe you never get to that point, but you could at least throw away less, or be closer to a final version the first time around.


It's crazy that a software engineering book from the '70s is still pretty relevant.


True wisdom is timeless. Assuming we're still around for another two and a half millennia, I imagine the Art of War will still be read and studied.


Buy the cheapest one the first time. Then keep a list of all your regrets and wishes, and refer to this when buying the next one.


I solve every programming problem this way. The first few iterations have lumps that seem necessary, but still lumpy. The final iteration is nothing but a small set of aesthetically pleasing rules that seem self-evident in hindsight.


I'm the same. You're not wrong about hindsight; why does it look so painfully obvious after I've gone through 3-4 iterations, haha.


> It is easy to generate a random sequence of integers in the range 1..N, so long as we don't mind duplicates

Is it? I was surprised to learn of modulo bias [1]. Does the "Randint(1, J)" implementation take it into account? It's a very easy mistake to make and can have an impact on the uniformity of the shuffle.

If you haven't heard of it, it has to do with the relationship between your system's RAND_MAX and J. If RAND_MAX + 1 is not a multiple of J, the topmost sliver of the generator's range maps onto only some of the J values under the usual modulo reduction, so those values come up slightly more often than the rest.

[1] https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#M...
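
To make it concrete, here's a minimal rejection-sampling sketch in Python (the names and the assumed 15-bit RAND_MAX are mine, not from the paper): redraw whenever the raw value lands in the partial block at the top of the generator's range, so every returned value is equally likely.

    import random

    RAND_MAX = 2**15 - 1                 # assumed range of the raw generator

    def raw_rand():
        # Stand-in for a C-style rand() returning 0..RAND_MAX.
        return random.getrandbits(15)

    def randint_unbiased(lo, hi):
        # Uniform integer in [lo, hi] via rejection sampling.
        j = hi - lo + 1
        limit = (RAND_MAX + 1) - (RAND_MAX + 1) % j  # largest multiple of j
        while True:
            r = raw_rand()
            if r < limit:                # r fell in a full block: unbiased
                return lo + r % j
            # r fell in the partial block at the top: discard and retry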


Any random number generator library worth its salt should have a properly implemented RandInt(1, J) function available, instead of just a RandInt(1, RAND_MAX) primitive.

Behind the scenes, mind you, it's easy to implement such things from random number generators of a fixed range. For example, it could be done as follows (for convenience, I'll speak as though our fundamental randomness primitive just produces random bits, though all sorts of things would work just as well):

We ought to keep track, internally to our random number generator library, of some persistent interval [Low, High) in [0, 1) (initialized to all of [0, 1) itself). Any time a RandInt(1, J) is required, we partition [0, 1) into J equal intervals, find which one [Low, High) lies entirely within, return the corresponding number, and modify [Low, High) by applying the affine transformation that takes that particular subinterval to all of [0, 1). If, while doing this, we find [Low, High) spans multiple subintervals, we first do the following until it does not: generate a random bit and replace [Low, High) with either its lower or higher half accordingly.

Essentially, we are using [Low, High) to keep track of an underspecified value uniformly distributed throughout [0, 1), and then pulling out leading "digits" in arbitrary bases of this value as required by the user, zooming our window in accordingly after doing so. Random bits are used to further and further specify this value, and thus no randomness ever goes to waste. At all times, we will have made the minimum number of invocations of random bits necessary to power the amount of random integers of whatever varying sizes asked for so far.
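
Here's a minimal Python sketch of that scheme (the class and method names are mine; exact rational arithmetic via fractions stands in for whatever bookkeeping a real library would use, and the refinement loop terminates with probability 1):

    import math
    import random
    from fractions import Fraction

    class IntervalRandInt:
        # Keep a window [low, high) inside [0, 1); spend random bits only
        # when the window must shrink, and peel off one base-J "digit"
        # per randint request.
        def __init__(self, randbit=lambda: random.getrandbits(1)):
            self.randbit = randbit       # the fundamental random-bit source
            self.low, self.high = Fraction(0), Fraction(1)

        def randint(self, lo, hi):
            # Uniform integer in [lo, hi] inclusive.
            j = hi - lo + 1
            while True:
                k = math.floor(self.low * j)     # subinterval holding low
                if self.high * j <= k + 1:       # window fits inside it
                    # Zoom [k/j, (k+1)/j) back out to [0, 1) and return.
                    self.low = self.low * j - k
                    self.high = self.high * j - k
                    return lo + k
                # Window straddles a boundary: halve it with one fresh bit.
                mid = (self.low + self.high) / 2
                if self.randbit():
                    self.low = mid
                else:
                    self.high = mid

    rng = IntervalRandInt()
    print([rng.randint(1, 6) for _ in range(10)])  # ten fair die rolls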


> pulling out leading "digits" in arbitrary bases of this value

Generally this bookkeeping isn't worth the complexity. Modern PRNGs are super efficient: PCG is about 3x faster than Mersenne Twister.


Well, supposing one wanted "true" randomness and had access to a source of such, but still cared to be efficient about using it.


Sure, but in practice I think we'd be hard pressed to find someone with such stringent requirements.

For all practical purposes, using your entropy stream to seed ChaCha is more than good enough. Want something with a proof? Seed a Blum Blum Shub generator.
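
For the curious, Blum Blum Shub is tiny to sketch. This toy version uses the classic textbook demo primes, which are far too small for any real use; it's only meant to show the mechanics:

    # x_{n+1} = x_n^2 mod M, with M = p*q and p, q prime, both = 3 (mod 4).
    def bbs_bits(seed, n, p=11, q=23):
        m = p * q
        x = seed % m             # seed must be coprime to m, not 0 or 1
        bits = []
        for _ in range(n):
            x = (x * x) % m
            bits.append(x & 1)   # emit the least significant bit each step
        return bits

    print(bbs_bits(3, 16))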


That article is very interesting, in a strange way. It points out so many simple, obvious things about card shuffling that aren't visible to a programmer from a high-level view.


This looks very familiar. I think I read part of it back in the 80's when I was first learning programming. I think it was in reference to "how to truly 'shuffle' a deck of cards using a computer algorithm." Lots of discussion on techniques for doing close-to-true random shuffling of ordered decks.


It's Bentley's _Programming pearls_ column from the _Communications of the ACM_, which was collected into a series of books, such as _Programming Pearls_ and _More Programming Pearls_, where you probably saw it.


Is this a mathematically sound shuffle? It removes a random element from a sorted deck and then adds the element to a new deck. Python code below:

    import random

    def shuffle():
        random.seed()
        sorted_deck = list(range(52))  # [0..51]; list() so del works in Python 3
        shuffled_deck = []
        slots_avail = 52
        while slots_avail:
            idx = random.randint(0, slots_avail - 1)
            shuffled_deck.append(sorted_deck[idx])
            del sorted_deck[idx]
            slots_avail -= 1
        return shuffled_deck


What you've described is essentially the Fisher-Yates shuffle, implemented slowly (O(n²)). The O(n) implementation would be:

  import random

  deck = list(range(52))
  for i in range(len(deck) - 1, 0, -1):
      j = random.randint(0, i)
      deck[i], deck[j] = deck[j], deck[i]

Intuitively, what the latter implementation is doing is keeping the sorted_deck and shuffled_deck variables in the same array, moving the partition each iteration. The big problem with this method is that it is very easy to make a mistake in its implementation, and any slight mistake will lead to horribly biased outputs.
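
To see how fragile it is, here's a small sketch (mine, not from the article) comparing the correct loop against the classic mistake of drawing j from the whole deck instead of 0..i; for n = 3 the buggy version is visibly non-uniform:

    import random
    from collections import Counter

    def fisher_yates(n):
        deck = list(range(n))
        for i in range(n - 1, 0, -1):
            j = random.randint(0, i)         # correct: j in 0..i
            deck[i], deck[j] = deck[j], deck[i]
        return tuple(deck)

    def naive_swap(n):
        deck = list(range(n))
        for i in range(n - 1, 0, -1):
            j = random.randint(0, n - 1)     # the slight mistake
            deck[i], deck[j] = deck[j], deck[i]
        return tuple(deck)

    for f in (fisher_yates, naive_swap):
        counts = Counter(f(3) for _ in range(60000))
        print(f.__name__, sorted(counts.values()))
    # fisher_yates gives six counts near 10000; naive_swap splits the
    # permutations into groups near 6700 and 13300.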


It looks correct to me, but keep in mind that deleting elements from the middle of an array is O(n), so overall your algorithm is O(n²). You could fix this by always swapping your chosen element with the last element of sorted_deck before removing it, because removing from the end of an array is O(1).
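
A sketch of that fix applied to the code above (which, incidentally, turns it into the in-place Fisher-Yates from the sibling comment):

    import random

    def shuffle_linear():
        sorted_deck = list(range(52))
        shuffled_deck = []
        slots_avail = 52
        while slots_avail:
            idx = random.randint(0, slots_avail - 1)
            # Swap the chosen card to the end, then pop: both steps are O(1).
            sorted_deck[idx], sorted_deck[-1] = sorted_deck[-1], sorted_deck[idx]
            shuffled_deck.append(sorted_deck.pop())
            slots_avail -= 1
        return shuffled_deck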


Try your algorithm with a smaller n first, instead of n=52, such as n=2 or n=3, and slots_avail=n.

For n=3 it generates the permutation {1,2,3} with probability that is 11% higher than the correct probability.


Would you mind sharing your proof of this? I don't think you are correct. It seems to me that OP's algorithm is correct and will yield all permutations with equal probability.

All he's doing is picking a random card from the sorted deck and moving it to the top of the un-sorted deck. The sorted deck then becomes one card smaller.

For N = 2, in position 1 you choose card 1 with 50% probability and card 2 with 50% probability, and position 2 just gets the remaining card. Thus

    p(12) = 50%
    p(21) = 50%

For N = 3, in position 1 you choose card 1 with p=1/3, card 2 with p=1/3, and card 3 with p=1/3:

    p(1xx) = 33% = 1/3
    p(2xx) = 33%
    p(3xx) = 33%

Then using the remaining 2 cards you just do the N=2 case from before, and so you have:

    p(123) = 50% x 33% = 1/6
    p(132) = 50% x 33%
    etc
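
You can also check this exhaustively rather than statistically; here's a small sketch enumerating every sequence of index choices the algorithm could make for n = 3:

    from itertools import product
    from collections import Counter

    n = 3
    counts = Counter()
    # idx is drawn uniformly from the cards still left, so each choice
    # sequence below is equally likely: 1/3 * 1/2 * 1/1 = 1/6.
    for choices in product(*(range(k) for k in range(n, 0, -1))):
        deck = list(range(n))
        dealt = tuple(deck.pop(idx) for idx in choices)
        counts[dealt] += 1

    print(counts)  # each of the 6 permutations appears exactly once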


You're right, I think I misread the algorithm.


So I collected data for 100,000 runs, where delta = (abs(count - mean) / mean) * 100.0 and sd is the standard deviation:

    >>> shuffle.test(100000, 2)
    (0, 1) count: 50127, delta: 0.3%
    (1, 0) count: 49873, delta: 0.3%
    mean: 50000.0, sd: 179.6, sd/mean 0.4%

    >>> shuffle.test(100000, 3)
    (0, 1, 2) count: 16873, delta: 1.2%
    (0, 2, 1) count: 16506, delta: 1.0%
    (1, 0, 2) count: 16667, delta: 0.0%
    (1, 2, 0) count: 16761, delta: 0.6%
    (2, 0, 1) count: 16498, delta: 1.0%
    (2, 1, 0) count: 16695, delta: 0.2%
    mean: 16666.7, sd: 146.0, sd/mean 0.9%


random.sample(range(52), 52)


Great article from a classic series that every programmer should read.

Here's an odd thing -- when I copied and pasted the awk implementation, the pasted copy had errors reminiscent of OCR. For example, the (j became Cj, and ARGV[2] became ARGV121.

What's going on here? Is the website detecting the copy and making the browser get something else? Is it Chrome, or something else, doing some guessing?


I think the document is scanned from a paper copy. The pixels shown on the screen are from the scanned image, but the text for copy-and-paste was created by OCR when the document was scanned and prepared. You often see this with PDF files; I guess the software used to create them has OCR functionality built in.

If you use "inspect element" in your web browser you can see how it is displayed. Each page is a <div> which contains first a big image (the scanned document page), and then a <span> for each word, positioned absolutely over the corresponding word in the image. The <span>s each have "color: transparent", so you can't see them but you can still select them.


Been reading on Fermat's Library for a while now. Love the product!


Does anyone know if this paper is part of the book Programming Pearls also by John Bentley?


It's in "More Programming Pearls" from the same author (who, by the way, is called Jon Bentley, not John Bentley)


Thanks!



