Rock-Paper-Scissors: You vs. the Computer

Natsu · on March 5, 2011

It's interesting. I played on "veteran" and lost quite badly over ~20 rounds in spite of trying my human best to pick randomly. Then I enlisted my computer's help.

perl -e '@a = qw/rock paper scissors/; print $a[int rand 3], "\n";'

34 wins, 31 ties, 35 losses

I had assumed there would be more bias in it if it were trying to exploit human patterns and that choosing randomly like this would in turn allow me to exploit it as it looked for nonexistent patterns in my choices.

Apparently there wasn't that much bias to take advantage of or it learned that I was choosing randomly and compensated by doing the same. Or maybe I just didn't choose a large enough sample and got unlucky, I'm not too sure. I will say that I was winning a lot more during the first half, though.

akjj · on March 5, 2011

I don't understand what you were hoping for with your random choices: If you choose randomly, you'll win, lose, and tie each one third of the time, no matter what your opponent's strategy is. Your results are consistent with that prediction.
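A quick way to check that claim is to compute the exact outcome probabilities for a uniform player against a couple of opponent mixes. This is only a sketch: `outcome_probs` and the two opponent mixes are made up for illustration and have nothing to do with the game's actual code.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}  # key beats value

def outcome_probs(mine, theirs):
    """Return exact (win, tie, loss) probabilities for two mixed strategies."""
    win = tie = loss = Fraction(0)
    for a in MOVES:
        for b in MOVES:
            p = mine[a] * theirs[b]
            if a == b:
                tie += p
            elif BEATS[a] == b:
                win += p
            else:
                loss += p
    return win, tie, loss

uniform = {m: Fraction(1, 3) for m in MOVES}
# Hypothetical opponent mixes, chosen only for illustration.
always_rock = {"rock": Fraction(1), "paper": Fraction(0), "scissors": Fraction(0)}
skewed = {"rock": Fraction(1, 2), "paper": Fraction(3, 10), "scissors": Fraction(1, 5)}

for name, opp in (("always_rock", always_rock), ("skewed", skewed)):
    print(name, outcome_probs(uniform, opp))
```

Both lines come out as 1/3 win, 1/3 tie, 1/3 loss, because each of your three equally likely throws wins against exactly one of the opponent's throws.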

Natsu · on March 5, 2011

I was thinking that there should be a non-random variance to take advantage of if they're trying to exploit human non-randomness.

But maybe I committed a statistical fallacy somewhere in that line of reasoning, because it sure didn't win.

lkozma · on March 5, 2011

There is, but you are not taking advantage of it as long as you choose randomly.

Imagine the other player tries to exploit your imagined human non-randomness and plays rock all the time. You play totally randomly, so in the long run you win/lose/tie roughly the same number of times.

You would be taking advantage if you played paper all the time.
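To make the "play paper" point concrete, here is a small sketch that scores each pure throw against a known opponent mix (+1 for a win, -1 for a loss, 0 for a tie) and picks the best one. The names `expected_score` and `best_response` are mine, and the scoring convention is an assumption.

```python
MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}  # key beats value

def expected_score(move, opponent_mix):
    """Expected score of a pure move: +1 per win, -1 per loss, 0 per tie."""
    loses_to = next(m for m in MOVES if BEATS[m] == move)
    return opponent_mix[BEATS[move]] - opponent_mix[loses_to]

def best_response(opponent_mix):
    """Pure move with the highest expected score against the given mix."""
    return max(MOVES, key=lambda m: expected_score(m, opponent_mix))

always_rock = {"rock": 1.0, "paper": 0.0, "scissors": 0.0}
for move in MOVES:
    print(move, expected_score(move, always_rock))
print("best:", best_response(always_rock))  # best: paper
```

The three scores always sum to zero, which is why averaging over them (playing uniformly at random) nets nothing even against a badly biased opponent.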

Natsu · on March 5, 2011

> There is, but you are not taking advantage of it as long as you choose randomly.

You're right. I was just thinking that 1/3, 1/3, 1/3 was the Nash equilibrium and not really thinking the whole thing through. Once I gave it a little more thought, I came up with the same rock example you gave.

heyitsnick · on March 5, 2011

In most toy games you would be right; playing the GTO/Nash equilibrium solution would naturally profit against an opponent's imbalanced/exploitive style. It's only in games like rock-paper-scissors where this isn't the case.

kronusaturn · on March 5, 2011

Yes, you're misunderstanding the statistics a bit. If you're choosing your moves randomly, your win rate will always be 1/3 no matter how non-random the computer's moves are... even if it keeps choosing the same thing over and over.

nostrademons · on March 5, 2011

I just played 30 rounds. 10-12-8. As far as I can tell, it's pretty much random.

nicpottier · on March 5, 2011

If you play a few rounds an option appears to let you see how it is picking.

It seems to use a history depth of 5, both for you and it. So it creates a 'fingerprint' from your past 5 moves and its past 5 moves, and then picks what is the most winning move in its history given that same fingerprint.

That strategy seems to work reasonably well. I tried to just go all paper for a while and it responded with all scissors. Ouch.
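For anyone who wants to poke at this, here is a rough sketch of the scheme as described above. It is my guess at the idea, not the game's code: `FingerprintBot`, the fallback throw for an unseen fingerprint, and reading "most winning move" as "the throw that beats the player's most common follow-up" are all assumptions.

```python
from collections import Counter, defaultdict

BEATS = {"R": "S", "P": "R", "S": "P"}  # key beats value
COUNTER = {loser: winner for winner, loser in BEATS.items()}  # throw that beats key

class FingerprintBot:
    """Keys on the last `depth` throws of both sides (hypothetical sketch)."""

    def __init__(self, depth=5):
        self.depth = depth
        self.player_moves = []
        self.bot_moves = []
        self.seen = defaultdict(Counter)  # fingerprint -> counts of player's next throw

    def _fingerprint(self):
        d = self.depth
        return ("".join(self.player_moves[-d:]), "".join(self.bot_moves[-d:]))

    def choose(self):
        counts = self.seen.get(self._fingerprint())
        if not counts:
            return "R"  # arbitrary fallback when this fingerprint is new
        likely = counts.most_common(1)[0][0]
        return COUNTER[likely]

    def record(self, player_move, bot_move):
        # Store what the player threw after the current fingerprint, then advance.
        self.seen[self._fingerprint()][player_move] += 1
        self.player_moves.append(player_move)
        self.bot_moves.append(bot_move)

bot = FingerprintBot()
replies = []
for _ in range(300):
    reply = bot.choose()
    bot.record("P", reply)  # the player throws paper every round
    replies.append(reply)
print("".join(replies[:10]), "...", "".join(replies[-10:]))
```

In this sketch the first scissors shows up on round 7, and the replies settle into all scissors once every fingerprint that comes up has been seen at least once. The real game evidently gets there faster, so it presumably does something smarter with unseen fingerprints.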

heyitsnick · on March 5, 2011

It's more complex than that; it can pick up on simple patterns. If you pick a pattern such as R>P>S and continue the cycle, by the end of the 2nd cycle it will have learnt the pattern and will beat you every time.

nicpottier · on March 5, 2011

Hrmm... How would that pattern not be captured by looking at the previous five plays?

Seems like it would act exactly as you've just described, no?

heyitsnick · on March 5, 2011

Not the way you describe it, no. Say my last 5 plays are "RPSRP" and my next play is scissors. Looking only at those five plays, both P and S would be the best plays against this range (2 wins out of 5). However, the correct next response, predicting the pattern, is R (which does the worst against this range with only 1 win in 5).
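The counts in that example are easy to verify with a few lines. This is just the plain frequency count being argued against here, with `wins_against` as a made-up helper name:

```python
BEATS = {"R": "S", "P": "R", "S": "P"}  # key beats value

def wins_against(history):
    """Count how many throws in `history` each response would have beaten."""
    return {move: sum(1 for h in history if BEATS[move] == h) for move in "RPS"}

print(wins_against("RPSRP"))  # {'R': 1, 'P': 2, 'S': 2}
```

So a bot that only counts frequencies over the window would pick P or S, while the throw that actually beats the upcoming scissors is R.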

Edit: OK, re-reading your initial post I think I misunderstood you. However, it still doesn't explain how it could pick up the pattern quickly enough: there are 3 possible responses each time, so until it's tried all 3 responses it won't know which is best given the fingerprint. Yet it picks up the pattern by the 2nd cycle.

I think it's using a more sophisticated learning strategy than you suggest.

KeithMajhor · on March 5, 2011

The "fingerprint" is used to search a "history" of games against other humans. Based on the move that most commonly follows that "fingerprint" the computer selects the winning move. I'm assuming the "history" comes from within the game itself which is probably played by lots of people like you who want to test it with one of the two possible cycles (RPS or RSP). I suspect that the reason it picks up on the cycle so quickly is because all the curious tinkerers have biased the data it uses to make moves.

heyitsnick · on March 6, 2011

All my tests were done on the 'naive' bot that had no prior knowledge of play.

nicpottier · on March 5, 2011

I'm sure it uses some simple heuristics when first starting out, perhaps even guessing that you play at an even distribution.

But once it builds up a database, I imagine that is far more effective.

justincormack · on March 5, 2011

18-5-2 using two windows, one to see what it would do and then choosing what beats that in the other. The moves drop off the bottom so this strategy gradually loses efficiency, but it's enough to beat it...
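The two-window trick works against anything deterministic. Here is an idealized sketch of it, where `DeterministicBot` is a made-up stand-in (it simply counters your most frequent throw so far), not the game's algorithm:

```python
BEATS = {"R": "S", "P": "R", "S": "P"}  # key beats value
COUNTER = {loser: winner for winner, loser in BEATS.items()}  # throw that beats key

class DeterministicBot:
    """Hypothetical stand-in: counters the player's most frequent throw so far."""

    def __init__(self):
        self.counts = {"R": 0, "P": 0, "S": 0}

    def choose(self):
        likely = max("RPS", key=lambda m: self.counts[m])
        return COUNTER[likely]

    def record(self, player_move):
        self.counts[player_move] += 1

main, mirror = DeterministicBot(), DeterministicBot()
wins = 0
for _ in range(25):
    peek = mirror.choose()   # "second window": see what the bot is about to throw
    mine = COUNTER[peek]     # throw whatever beats it
    reply = main.choose()    # the real game makes the same choice
    wins += BEATS[mine] == reply
    main.record(mine)
    mirror.record(mine)
print(wins)  # 25
```

The sketch wins every round because both copies see identical input. The real result was less clean, which fits the remark about moves dropping off the bottom and the two windows drifting apart.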

mathgladiator · on March 5, 2011

Great way to beat a deterministic machine!

vampirical · on March 5, 2011

Hmm interesting, I tried to play as I would against a human and got some interesting results.

Over 20 games I ended up tied in wins/losses with the untrained version, with more than half the games being straight ties. It seemed I should be able to add one more layer of "predict" to my crazy rock-paper-scissors reasoning and win solidly, but it didn't seem to change the results at all. I wonder if it just progressed evenly with me.

On the other hand, I firmly beat the trained version with an 80-90% win rate and mostly ties making up the remainder.

I'd be interested to see the algorithm behind it as I get the feeling the untrained version was making good choices for all the wrong reasons and with a slightly more complex game it would show.

NinetyNine · on March 5, 2011

The algorithm for "veteran" is really simplistic. It takes your last x throws, searches its list of throw histories from other people, and determines what the most likely next throw is. I'm currently trying to duplicate this in one line of Ruby.
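Not one line and not Ruby, but the lookup described above can be sketched like this. The `histories` strings are invented for illustration (they are not data from the game), and `predict_next` is a hypothetical name:

```python
from collections import Counter

COUNTER = {"R": "P", "P": "S", "S": "R"}  # value beats key

def predict_next(recent, histories):
    """Most common throw following `recent` in the stored histories, or None."""
    followers = Counter()
    for game in histories:
        start = game.find(recent)
        while start != -1:
            nxt = start + len(recent)
            if nxt < len(game):
                followers[game[nxt]] += 1
            start = game.find(recent, start + 1)  # allow overlapping matches
    return followers.most_common(1)[0][0] if followers else None

# Made-up histories for illustration only.
histories = ["RPSRPSRPS", "RRPSSP", "PSRPSR"]
guess = predict_next("RPS", histories)
print(guess, COUNTER[guess])  # R P
```

Here "RPS" is followed by R three times and S once across the sample histories, so the predicted throw is R and the bot would answer with P.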

hardy263 · on March 6, 2011

A statistics teacher from another class in my high school had a rock-paper-scissors game at the end of the year. But the catch was, you had to bet your mark in the game. If you won, he added 1% to your final mark. If you lost, you lost 1% from your final mark. Of course, there was a statistics lesson to be learned from it at the end.

Even though it appears that you both have an equal chance to win (both of you have 33%), why would someone offer you an equal game unless he had some sort of advantage that you aren't aware of?

The trick was to look at the opponent's hand while it was still going down. If you're throwing rock or scissors, your hand will be clenched the whole way down. If you're throwing paper, you will probably open your hand halfway down your throw.

So when he sees you opening your hand, midway he will change his to scissors. And if it doesn't stay open, then he will use rock, since between rock and scissors, rock will either win or tie. It required a bit of hand dexterity, but he practiced wing-chun, so it wouldn't have been a problem for him. Of course, the trick doesn't work 100% of the time, so you were allowed a maximum of two tries discounting ties, preventing kids from getting winning streaks.
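The decision rule itself is tiny. A sketch, assuming only a paper throw opens early and that the read is never wrong (which, as noted, isn't true in practice); the function names are made up:

```python
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}  # key beats value

def teacher_throw(hand_opens_early):
    """Scissors if the student's hand opens on the way down, otherwise rock."""
    return "scissors" if hand_opens_early else "rock"

def outcome_for_teacher(teacher, student):
    if teacher == student:
        return "tie"
    return "win" if BEATS[teacher] == student else "loss"

for student in ("rock", "paper", "scissors"):
    # Assumption: only a paper throw opens early, and the read is perfect.
    teacher = teacher_throw(hand_opens_early=(student == "paper"))
    print(student, "->", teacher, outcome_for_teacher(teacher, student))
```

With a perfect read the teacher never loses: he ties against rock and wins against paper and scissors.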

It was a pretty interesting lesson, and I was glad that I skipped my statistics class to go to his.

athom · on March 5, 2011

I'd like to take this opportunity to present the coolest take on RPS I've EVER seen: a board game, invented by oddball web cartoonist Tailsteak! He presents the rules in this strip:

http://leftoversoup.com/archive.php?num=34

and you can play the game online at an older site, here: http://tailsteak.com/archive.php?num=198

Unfortunately, he doesn't seem to have come up with a computer player engine that I've seen, so you're stuck playing against yourself. Maybe if someone's looking for a challenge... ;)

Now that I think of it, this guy should get more attention here. The creator of webcomic 1/0 (http://www.undefined.net/1/0/) is a man of many interesting ideas in a variety of fields, well worth a moment of the young entrepreneur's time.

Go check him out!

[EDIT: Before anyone asks, NO! He is NOT me!]

JCB_K · on March 5, 2011

And this is exactly why we all should be playing Rock-Paper-Scissors-Lizard-Spock.

nostrademons · on March 5, 2011

Or RPS-101:

http://www.umop.com/rps101/rps101chart.html

JCB_K · on March 5, 2011

Ok. This wins.

fjh · on March 5, 2011

Neat. Does anyone know if there are data sets for games of rock paper scissors available, the kind the computer seems to be using? I would really like to play around with one.

gurtwo · on March 5, 2011

I picked the seconds on my digital wrist watch, modulo 3, as a 'random' choice. 16-16-20. Far better than when I made the choice myself.
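The same trick in code, for the watchless. A sketch only: `pick_from_clock` is a made-up helper, and the mapping of 0/1/2 to rock/paper/scissors is arbitrary.

```python
import time

MOVES = ("rock", "paper", "scissors")

def pick_from_clock(seconds=None):
    """Pick a throw from the seconds value modulo 3 (defaults to the current time)."""
    if seconds is None:
        seconds = time.localtime().tm_sec
    return MOVES[seconds % 3]

print(pick_from_clock())
```

Since 60 is divisible by 3, each throw corresponds to exactly 20 of the 60 possible seconds values, so the choice is unbiased as long as the moment you glance at the watch is unrelated to the game.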

edanm · on March 5, 2011

After playing a few rounds, you can click on "See what the computer is thinking" to see the reasoning behind the next choice.

SeanDav · on March 5, 2011

Played Veteran - First few rounds were very even 8-4-8 then I kind of figured out the algorithm and stopped on 19-10-10.

Nice little program.

BasDirks · on March 5, 2011

5-14-1 veteran.

zerd · on March 5, 2011

13-5-6 veteran :)

KeithMajhor · on March 5, 2011

I started rationalizing how you could have won by so much and then I thought about the old Apple ad campaign: http://www.premisemarketing.com/images/uploads/apple-think-d...

BasDirks · on March 8, 2011

Typical game:

+SCISSORS =PAPER +PAPER +ROCK =SCISSORS =ROCK +SCISSORS +PAPER -SCISSORS =SCISSORS =PAPER +PAPER +PAPER -ROCK +ROCK +SCISSORS +SCISSORS

win @ repeat: 5, win @ change: 5, loss @ repeat: 0, loss @ change: 2, draw @ repeat: 1, draw @ change: 4

The first 5 throws are 100% predictable, and their results are used by the program for subsequent predictions. It's too easy to set up traps.
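The repeat/change tally for that game log can be reproduced with a short script. Two assumptions on my part: "+", "=" and "-" mark a win, draw and loss, and the very first throw counts as a "change" (that is the reading under which the totals match).

```python
from collections import Counter

SYMBOLS = {"+": "win", "=": "draw", "-": "loss"}

def tally(log):
    """Count outcomes split by whether the throw repeats the previous one."""
    counts = Counter()
    previous = None
    for entry in log.split():
        outcome, throw = SYMBOLS[entry[0]], entry[1:]
        kind = "repeat" if throw == previous else "change"  # first throw -> change
        counts[(outcome, kind)] += 1
        previous = throw
    return counts

game = ("+SCISSORS =PAPER +PAPER +ROCK =SCISSORS =ROCK +SCISSORS +PAPER "
        "-SCISSORS =SCISSORS =PAPER +PAPER +PAPER -ROCK +ROCK +SCISSORS +SCISSORS")
for key, n in sorted(tally(game).items()):
    print(key, n)
```

Run on the log above, it gives 5 wins on a repeat, 5 wins on a change, no losses on a repeat, 2 losses on a change, 1 draw on a repeat and 4 draws on a change.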

CognitiveLens · on March 5, 2011

10-6-4 veteran, first attempt.

tybris · on March 5, 2011

8-1-1 against Veteran. Now what?