Norvig: The Odds of Finding a Set in The Card Game SET

keenerd · on April 29, 2010

Peter, we need to talk.

http://docs.python.org/library/itertools.html

"combinations()" is a built in. No need to re-implement it. Also, nested [... for ... for ... for] is silly. Just use "product()", documented on the same page.

[''.join(card) for card in product('123', 'RGP', '@O=', 'OSD')]

They say that six months after learning the itertools module, people give up on Python for Lisp. Which would be hilarious, in your case.

abyx · on April 29, 2010

I've asked him about this, here is his reply:

"[...] about itertools -- I guess I got the impression that itertools is deprecated in 3.0, or at least that Guido doesn't much like it, so I have stayed away from it. And itertools.combinations is only in 2.6, so I think I will leave things as is, for those still stuck in 2.5."

keenerd · on April 29, 2010

Combinations is also in +3.0. Otherwise a valid concern. (Been burned by that one myself.)

I can't find the source, but there might be some confusion between map/filter/reduce and itertools. Guido does not like map/filter/reduce and wants to get rid of them.

Itertools is a huge part of Python 3.0. Everything is an iterator in the language. And the docs underwent a major rewrite with the 3.0 release. It is not going to be deprecated.

JeffJenkins · on April 29, 2010

It's entirely possible he wrote this before itertools existed. Like in the IAQ where he says things like:

"Better to stick with x += 1, which was added in Python 2.0. "

( http://norvig.com/python-iaq.html )

rkowalick · on April 29, 2010

The really interesting thing about set is that it has a very mathematical interpretation. You can think of sets as points in affine 4-space over Z_3. There is actually a very interesting paper on the whole thing:

http://www.math.rutgers.edu/~maclagan/papers/set.pdf

ableal · on April 29, 2010

That URL was stuck for a bit for me, this newer one worked better: http://www.warwick.ac.uk/staff/D.Maclagan/papers/set.pdf

Thanks for the tip. Paper from 2003; amusing bit in the tail-end Acknowledgments:

We would never have spend so many hours thinking about the “set-Problem” without Joe Buhler’s goading remark that it was shocking that two Berkeley students couldn’t work out the answer. Josh Levenberg helped then by writing our original brute-force program, before we came up with a non-computer proof of “20”.

vitolds · on April 29, 2010

super relevant for me. I have been playing this game with my 6 yo daughter and was frustrated many times that we could not find a set even though the odds said that we should. Finally the mystery is solved.

dmd · on April 29, 2010

There's a bunch of Set solvers online (like http://www.stevenolte.com/set/set.html ) for when you get really stuck.

Also, http://3e.org/set/ is great for practice.

tshaddox · on April 29, 2010

Is brute force the only way to solve the game? I only glanced over the rules so I'm not sure, but it seems like a solution would involve finding subgraph isomorphisms (which is NP-complete) or something along those lines.

rubyrescue · on April 29, 2010

Agreed, this totally validates our experience of the game.

He mentions in the article that the game has 205 5-star reviews out of 238. I heartily recommend this game - it's one of the most mentally challenging games that is equally accessible to children from about 6 all the way to adults - pretty much everyone competes at the same level.

jarin · on April 29, 2010

You just need to stick it out. There's always a set there somewhere.

bdr · on April 29, 2010

There's a factor he doesn't mention, which is which set is removed from the table. Norvig's code removes the first set it finds. Another strategy would be to identity all sets and remove one at random. And who knows how humans play? It's not obvious whether these are equivalent regarding the rate at which the table gets "worse".

keenerd · on April 29, 2010

I've seen moderately serious folk play. They will recognize multiple sets simultaneously and either go for the one closest to them or the one with three cards tightly clustered. And they move fast! Watching them play is like watching a mathematically inclined cobra fight.

bdr · on April 29, 2010

Some more thought (can't edit below due to noprocrast :). As he suggests at the end, the distance between points (cards) on the table seems to be important. If we have some set with a given mean distance between point (or some better measure of clustering), and we choose a set at random, since there are more sets made of distant points, the remaining points will be on average closer together. This is our badness measure (closeness of points on the table) probabilistically getting worse. I'm sure he got that far. The hard part is choosing a distance measure that's amenable to analysis under random-set-removal, but you could probably get a weak result just on mean distance, no? Can't sleep, sorry for rambling.

bdr · on April 29, 2010

Should read "If we have some table".

chipsy · on April 29, 2010

The general reasoning for why the variation is so big isn't that hard.

Set asks you to look for cards that sit at the edges of the bell curve. Most combinations will be "mostly similar" or "mostly different" but not entirely one way or the other.

When you start with a 12-card no-set combination, that means that your first 12 cards were already of a low "quality" and sit in the middle of the curve; the 3 cards added can't push the distribution to an extreme by themselves.

Edit: to think of it another way, inverse the problem; Of the usable 12-card combinations, any number of cards could be added and they'd still have at least one set. Hence you're filtering most good 15-card sets when you do the opposite.

zach · on April 29, 2010

In this kind of "tough cards" situation it seems that there are two overrepresented categories (here, doubles and ovals). So the cards that counter those attributes (e.g. half-shaded singles) would be able to make many sets. But since a card can only make one set and take off two other cards, the ability of those cards to rebalance the tableau is limited.

So when you get in one of these situations where the next three cards are feast or famine, it tends to persist. Having a persistent cluster like that gives you a lot more chance to run into the no-set case.

kwantam · on April 29, 2010

A while ago I was bored on a car trip and coded up a Set game in Haskell. Later, I converted it to a really simple CGI game. I had grand plans for a scoreboard and head-to-head play that fell by the wayside when I moved on to another project, but the original version is still playable:

http://web.jfet.org/~kwantam/cgi-bin/HSetHTML.cgi

If anyone wants the code for all this I can post it...

Sukotto · on April 29, 2010

I find the first example confusing.

The author says "A set is defined as three cards in which each of the four features of the card (color, shape of symbols, number of symbols, and shading) have either all the same value or all different values."

Then the first example of what is a set shows color/shape/number different and shading as the same. Which contradicts the rule he just quoted.

What am I missing here?

mtinkerhess · on April 29, 2010

For each feature, the cards have to be either all different or all the same. What you can't have is two cards of one color and one of another, or two cards of one shade and one of another.

Sukotto · on April 29, 2010

Ah so, for each quanta, the quanta must be identical across all three cards, or different on each card.

All red, or only one red. all solid, or only one solid.

Thanks for explaining that me.

JustAGeek · on April 29, 2010

I love Peter Norvig's posts on stuff like this and I'm looking forward to checking out the code he wrote to do this analysis.

zackattack · on April 29, 2010

I'm going to study that python code. It's great.