I've seen a couple of oblique references to this in other comments, but I'll go a little deeper: if you want to truly understand the prisoner's dilemma scenario and how it relates to human, or more generally, animal behavior, there is incredible coverage of the topic in Richard Dawkins' The Selfish Gene. He pioneered the idea of running PD "competitions" in an attempt to discover the best evolutionary strategy for PD, and how these strategies might predict large-scale behavior in animal populations. Each competitor (other scientists and researchers) submitted a strategy for handling the PD in the form of written code that could face off against other strategies. Each round of competition consisted of thousands of iterations of the game. Some strategies were simple and some were incredibly complex. Across two competitions, the second with 60 competing strategies, the simple and "non-cynical" strategy called tit-for-tat won handily. (Tit-for-tat is mentioned briefly in the original article without any explanation.) This is the strategy:
- Unless provoked, the agent will always cooperate.
- If provoked, the agent will retaliate.
- The agent is quick to forgive.
- The agent must have a good chance of competing against the opponent more than once.
The fact that this strategy was a consistent winner in his competitions has led Dawkins to argue that "nice guys finish first" and could be a partial explanation for certain forms of so-called "altruistic" animal behavior -- behavior that seemingly gives no benefit to the altruist, except that they too may benefit when it is their turn to be helped. See also:
https://en.wikipedia.org/wiki/Nice_Guys_Finish_First
Thank you for the correction. I read the book about 10 years ago and completely forgot that he wasn't talking about himself. Finally had a chance to look at the chapter again now and indeed he is discussing Axelrod's competitions in great depth.
> - The agent must have a good chance of competing against the opponent more than once.
I was actually thinking about this the other day. I was wondering how much the population size and matching algorithm in the iterated Prisoner's Dilemma matter to the strategy outcome. I suspect it's quite a lot: if you never play the same opponent again, strategies that defect often should come out quite well, but if you always play the same opponent, strategies that cooperate should do well. If grouping is brought into the system, such that there are multiple levels of subgroups and the smaller the subgroup two players share, the higher their chance of being matched again, you might see interesting emergent behavior.
I also wonder if that might allow for special elements (the "sociopath") to have interesting divergent strategies that work well as long as they don't grow beyond a small fraction of the total population and keep a relatively large share of their interactions with strangers.
http://ncase.me/trust/ is a great explainer that explores that question: how different strategies compare depending on the number of repeat interactions, population diversity and noise.
Thanks, I think someone linked to that here the other week, but I didn't have a chance to go through it at the time. It's very well done, so I'm glad you supplied the impetus to revisit it. :)
That seems a bit extreme for a browser app. Are you using a docking station, and have your laptop closed? I find mine sometimes overheats and shuts down when it's docked and closed and I'm doing something extreme like playing a game, but propping the lid open about an inch or two helped immensely with airflow.
Yes, if "sociopaths" are rare then most of their games will be against opponents who haven't played against them before. One suspects this also means that if the number of games goes up dramatically (say we move to the city and enter an occupation with lots of daily transactions), initial cooperation will decrease as sociopaths are seen more often.
It's not crucial that you meet the same partner (opponent) again. It's enough for you to be in a population where there is a good chance that the partner will cooperate if you do. For some definition of "good chance" the right strategy is to try to cooperate yourself and then fall back on something else if it fails.
And this is interesting because it gives us a toy model of moralistic "Do unto others" (as opposed to rational exchange).
Can you explain why cooperation still works when you don't repeat partners? (Assuming I've interpreted you correctly.)
The reason cooperative strategies can work is that actions depend on previous actions. If you are not meeting the same partner, then your partner's actions will not depend on your previous actions, and you will always do better by defecting.
Now that I think about it, you are right that for PD, there has to be some one-on-one iteration too. I think my automatic mental model was that agents came together and played a few rounds with one partner, and then switch to some other partner.
In this case you can afford to be nice in the first round, and then cut your losses. But even that strategy is only optimal if most of the rest of the community is nice in the first round too. If it is full of defectors, then you should defect in the first round too.
"From Rusticus... I learned... with respect to those who have offended me by words, or done me wrong, to be easily disposed to be pacified and reconciled, as soon as they have shown a readiness to be reconciled."
I thought Tit-for-tat was the strategy that always repeated the opponent's last move. "Tit-for-tat with forgiveness" was the strategy that would occasionally cooperate even after an opponent's defection.
I copied those bullet points from the wikipedia article but yeah the wording is a little inexact. I think "quick to forgive" just means "retaliation only has a memory of one move".
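To make the one-move memory concrete, here's a minimal sketch in Python (my own toy code, not taken from the tournaments):

    # Hypothetical sketch: tit-for-tat only needs the opponent's previous move.
    def tit_for_tat(opponent_history):
        """Cooperate on the first move, then mirror the opponent's last move."""
        if not opponent_history:        # first round: be nice
            return "C"
        return opponent_history[-1]     # retaliate once, forgive as soon as they cooperate

    # Example: opponent defects once, then returns to cooperating.
    history = []
    my_moves = []
    for their_move in ["C", "D", "C", "C"]:
        my_moves.append(tit_for_tat(history))
        history.append(their_move)
    print(my_moves)   # ['C', 'C', 'D', 'C'] -- a single retaliation, then back to cooperation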
Two people can hunt deer together, but if they are alone, they can only hunt rabbits. The person belonging to the Envious group will choose to hunt rabbits because he or she will be at least equal to the other hunter, or maybe even better; the Optimist will choose to hunt deer because that is the best option for both hunters; the Pessimist will go for rabbits because that way he or she is sure to catch something; and the hunter who belongs to the Trusting group will cooperate and choose to hunt deer, without a second thought.
The largest group, accounting for 30%, are the Envious -- those who don't actually mind what they achieve, as long as they're better than everyone else;
Next are the Optimists -- who believe that they and their partner will make the best choice for both of them -- on 20%.
Also on 20% are the Pessimists -- who select the option which they see as the lesser of two evils -- and the Trusting group -- who are born collaborators and who will always cooperate and who don't really mind if they win or lose.
There is a fifth, undefined group, representing 10%,
Note that the SD is not actually a dilemma. The PD is a dilemma because you ALWAYS do better for yourself in any given iteration by defecting no matter what your opponent does. In the SD, if your opponent defects, you still do better for yourself by cooperating. So of course people will cooperate more under those circumstances. The only thing left to argue about is what the payoff matrix for real-world situations looks like.
You're correct in most of your description, but it's STILL a dilemma. It's a game of chicken. Each player wants to wait and see what the other decides, so the dilemma is that everyone ends up refusing to decide.
You're right. In fact, upon reflection, it is the PD which is not really a dilemma but rather a paradox. The logically correct move in PD (assuming a purely utilitarian quality metric) is always clear: if the game is non-iterated (or iterated with a known horizon) then defect. Otherwise play tit-for-tat. This is true no matter what your opponent does. In the SD the logically correct move depends on your prior on what your opponent is likely to do.
In fact the strategy is simpler if the other player is exactly as rational as you: simply cooperate. The other player, being exactly as rational as you, will follow the same reasoning as you do to come to the same conclusion, and will do exactly the same thing as you. I would always cooperate with a copy of myself, for instance.
There are variants of the prisoner's dilemma where you can check the 'rationality' of your opponent: the prisoners' decisions are chosen by computer programs, and each program is given the other program's source code as an argument. This is often referred to as "program equilibrium", and an optimal strategy is to cooperate iff running the other program with our own source code as an argument results in cooperation.
This runs into problems of computability though: if we run the opponent's code to see what it does, and it turns out that their program runs our code to see what we do, we can get stuck in an infinite loop.
Like most incomputable problems, we can take a conservative approach: try to prove that it cooperates, using some incomplete computable algorithm; defect if we can't prove it. One simple algorithm is to check if the source code is the same as ours; more sophisticated algorithms would perform static analysis of the code.
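As a toy illustration of the simplest version (exact source comparison), here's a sketch assuming a made-up convention where each player is a Python function, defined in an ordinary source file, that receives the other player's source as a string:

    import inspect

    def clique_bot(opponent_source: str) -> str:
        """Toy 'program equilibrium' player: cooperate iff the opponent's source
        is literally identical to our own.

        Nothing is ever executed, we only compare text, so the infinite-regress
        problem never arises. A more sophisticated player would instead try to
        *prove* (with some incomplete static analysis) that the opponent cooperates
        against us, and defect whenever no proof is found.
        """
        my_source = inspect.getsource(clique_bot)
        return "C" if opponent_source == my_source else "D"

    # Two copies of clique_bot cooperate with each other; anything else gets "D".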
I never really got my head around the Parametric Bounded Löb paper at https://arxiv.org/abs/1602.04184 . Clever people have persuaded me both that it was wrong (for a certain reason related to the existence of nonstandard models of PA) and that it was true.
TLDR: In Prisoner's Dilemma, defecting is a dominant strategy, and both players defecting is the only Nash equilibrium. In Snowdrift, there is no dominant strategy, and no symmetric pure-strategy Nash equilibrium exists: the pure equilibria are the asymmetric outcomes where one player defects and the other cooperates (plus a mixed equilibrium).
By design, the Snowdrift game rewards cooperation more than Prisoner's Dilemma, so it's no surprise that people cooperate more in Snowdrift. The real question is which game better models reality.
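To make the difference concrete, here's a small sketch with illustrative payoff numbers (not the ones from the study) showing why defection dominates in PD but not in Snowdrift:

    # Row player's payoffs, indexed as payoff[my_move][their_move].
    # Moves: "C" = cooperate, "D" = defect.
    PD = {"C": {"C": 3, "D": 0}, "D": {"C": 5, "D": 1}}   # T > R > P > S
    SD = {"C": {"C": 3, "D": 1}, "D": {"C": 5, "D": 0}}   # T > R > S > P

    def best_response(payoff, their_move):
        return max(("C", "D"), key=lambda my_move: payoff[my_move][their_move])

    for name, game in (("PD", PD), ("SD", SD)):
        print(name, {their: best_response(game, their) for their in ("C", "D")})

    # PD: best response is "D" whatever the opponent does (D is dominant).
    # SD: best response is "D" against a cooperator but "C" against a defector,
    #     so the pure equilibria are the asymmetric (C, D) / (D, C) outcomes.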
The Axelrod library is a Python implementation of the Prisoner's dilemma (tournaments, evolutionary dynamics etc) with over 200 strategies. It can also be used to study other games (http://axelrod.readthedocs.io/en/stable/tutorials/advanced/g...) so it would be straightforward to study the snowdrift game using it :)
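For anyone curious, a rough sketch of what a small tournament with it looks like (strategy and method names as I remember them from the docs; check the current API before relying on this):

    import axelrod as axl

    # A handful of the library's built-in strategies.
    players = [axl.TitForTat(), axl.Cooperator(), axl.Defector(), axl.Grudger()]

    # 200 turns per match, repeated to smooth out randomness.
    tournament = axl.Tournament(players, turns=200, repetitions=10)
    results = tournament.play()

    print(results.ranked_names)   # strategies ordered by average score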
Re some of the other comments about Robert Axelrod's tournament and evolution:
https://snowdrift.coop is an in-progress fundraising platform for public goods (particularly free/libre/open software and cultural works) based specifically on addressing the cooperation needed to solve the snowdrift dilemma…
I just invented a different snowdrift game, modeled on something that happened to me a couple of years ago.
You live in a townhouse next to another family that shares a driveway which turns onto a medium traffic density one-way street (there are traffic lights a few hundred feet away which control flow). Gaps in traffic tend to exist around when the lights change.
The frequency of snowstorms this year is higher than normal, and the overworked snowplows have made a snow bank too tall and wide to see around in your car. The only way onto the street is to turn blindly and hope that either no one is coming, or that they see you in time to stop.
Your salary is $x/hour, and it would take two hours to manually trim the snowbank to the point where you can see above it. Thus, it would cost $2x to remove the snowbank (assuming you could work overtime and earn the money).
Each party chooses whether to shovel the bank alone, with the other's help, or not at all. If they both do it, they each lose $1x in OT wages. If one does it alone, that party loses $2x. But if neither does it, the hazard remains and each party rolls a 6-sided die. Anybody rolling a 1 loses $kx (where k is large, so losing the car costs far more than shoveling) and no longer has a car to participate with.
That covers one day. It will take 7 days for the city to remove the snow bank if no one else does. Any party that still has a car repeats this up to 6 more times, until the snow bank is removed.
Then rank the dollars lost and award real world prizes for the top positions. If there are 10 people and the top 5 get the exact same prize, shoveling might become attractive (through co-operation, as it is extremely likely at least one of the two parties per trial will be eliminated).
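Out of curiosity, a rough Monte Carlo sketch of my reading of those rules (x, k, the day count and the strategies are all placeholders):

    import random

    def play_week(strategy_a, strategy_b, x=1.0, k=20.0, days=7):
        """Return (loss_a, loss_b) in multiples of x for one week of the game above."""
        loss = [0.0, 0.0]
        has_car = [True, True]
        strategies = (strategy_a, strategy_b)
        for day in range(days):
            shovels = [has_car[i] and strategies[i](day) for i in range(2)]
            if all(shovels):                      # both shovel: split the two hours
                return (loss[0] + x, loss[1] + x)
            if any(shovels):                      # one does it alone
                loss[shovels.index(True)] += 2 * x
                return tuple(loss)
            for i in range(2):                    # nobody shovels: blind turn, roll the die
                if has_car[i] and random.randint(1, 6) == 1:
                    loss[i] += k * x
                    has_car[i] = False
        return tuple(loss)

    always = lambda day: True      # shovel every day
    never = lambda day: False      # wait the week out for the city

    trials = 10_000
    print(sum(play_week(always, always)[0] for _ in range(trials)) / trials)  # exactly 1.0 (both shovel day one)
    print(sum(play_week(never, never)[0] for _ in range(trials)) / trials)    # roughly k * (1 - (5/6)**7), about 14.4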
First, the point of these discussions is explicitly to stay within the lines. We all know there's a billion elaborations that you could make, but those are different problems. Worthy and interesting problems, even, but different problems.
Second, there isn't always a great option for that, no.
Not when applied without wisdom, but it's an excellent predictor of subsets of human behaviour, and a subset of those generalise well enough to satisfactorily explain existing phenomena. Simply because one has to be selective doesn't make it unworkable. It does make for misleading headlines, though.
Simple laziness theory dictates that one should open a window and use a mirror on a telescoping pole (aka "inspection mirror") to better see around the snowdrift, and make liberal use of your vehicle's horn as you make your somewhat-less-than-blind turn.
If you really want to cooperate, buy two, and sell one to your neighbor at cost.
I came across an interactive applet last month (on Reddit maybe?) at http://ncase.me/trust/
It explores IPD with a few different strategies, and I found it quite interesting. I'd be curious to see how the various mechanics of it would work with ISD.
As mentioned elsewhere in this thread, the payoff for these games never takes into consideration reputation effects. That is, what's the result of other participants observing your actions? Are they more likely to trust you or not trust you when you interact?
I mention this because it's clear that we're changing the way we trust and interact with each other online. Whether it's virtue signaling or trolling, we're adopting new strategies for "winning"... whatever that prize may be.
See my other comment. In the competitions discussed in The Selfish Gene, PD strategies are allowed to have memory and the simulation is run thousands of times per round. However it is pretty consistently the case that long-term memory is not an outright win for either the competitor or the opponent, compared to the simple "tit-for-tat" strategy which only remembers the previous iteration.
Of course the PD only models certain kinds of human/animal interactions. Knowing that a particular person is a cheater or is vindictive or whatever is obviously important for evolutionary strategy as a whole. But the question is more about heterogeneous populations. Is there an optimal strategy that wins more than any other across a wide range of opposing strategies, including if everyone else is actually using the same strategy?
The benefits of cooperation can't be modeled so simplistically, since the concepts of both long-term reputation and long-term retribution need to be taken into account.
You don't need any model to explain inconsistency. People are not consistent because they are learning the rules and results of the game as they play. The players' first move might be somewhat random, but every subsequent move is based upon their previous experiences with the game. That is why you do these things iteratively.
Sometimes I wish life was just like that Black Mirror episode with reputation scores for everybody. Nobody wants to cooperate because there is hardly any long-term reputation to worry about when interacting with most people (i.e. cutting off that car on the highway, being rude to that stranger next to you).
The point here was that credit scores do not only rate your own behavior, but also those of people around you. If your neighborhood is bad, your credit score might reflect that instead of your own performance as a debtor. And if your neighborhood becomes hip and gains a better reputation, your credit score might improve even though you're still the same person.
Capitalism sometimes rewards those who provide value, but more often than not has no problem rewarding people for spouting self-righteousness.
If you think that everyone who gets rewarded in a capitalist society does so because they provide value, then you have to ask: what value do scam artists provide?
Society, and even biology, has invented all sorts of mechanisms to reward/punish you for actions. The emotion of shame, for example, is a strong incentive to behave as expected, even among strangers. Shame (and pride) are probably evolutionary adaptations to make the connection between an action and the resulting change in your reputation more immediate.
On the other end of the scale, criminal law punishes you for large transgressions, and sometimes prizes reward you for good deeds (although the latter is mostly achieved via market mechanisms).
Both criminal law and awards also work via reputation, by publicising your actions to a wide audience.
It's also interesting to contrast the discussion in this thread with any number of HN threads on freedom of speech in recent weeks. There, the idea that someone's reputation could be tarnished by their participation in a neo-nazi torch parade was generally considered to be the end of freedom.
I don't want to be gratuitously negative here. So let's start with the fact that it's great to have a model of cooperation as something out there to be improved upon, or refuted, or defaulted to in the absence of obviously better models.
However, if the idea is that a model is good at "explaining" cooperation by setting up a hypothetical where people choose it a lot, then I guess the "best" model would make people choose cooperate all the time.
So you can set up the points for cooperation to be infinity and the points for all non-cooperation to negative infinity. And you can wrap it all up in a hypothetical story about what the choices mean. Now you have a perfect model where people choose cooperate 100% of the time. But I don't feel like I'm any closer to understanding how cooperation evolved in humans.
> However, if the idea is that a model is good at "explaining" cooperation by setting up a hypothetical where people choose it a lot...
I don't think that's the point. I think the point is that SD is closer to the average case for cooperation than the PD, so it might be a better tool, in general, for talking about and reasoning about cooperation. (Both PD and SD have the nice property that they are simple and straightforward, and both have the problem that they are only approximations.)
The article suggests that a higher percentage of people choosing the "cooperate" option means that the model in question does a better job explaining. Explaining is the article author's chosen word. But maybe you are right that the author is missing the point, I'm not sure.
But even that's a tough sell for me, since the ideas you suggest as substitutes for the term "explain" are things I would put under the umbrella of what is meant by explaining.
These experiments are about trying to explain what you mean by "cooperate". Your points feel right, but even if they're true they aren't very helpful in actually explaining behavior. For example, your explanation seems to directly apply to adoption, but most people don't choose to adopt despite the enormous number of orphans in the world. Why not?
> These experiments are about trying to explain what you mean by "cooperate".
It's not clear to me what you mean by this.
In the experiment in the article, cooperate means the snow is cleared faster.
In my example cooperate means raise the young together.
> but most people don't choose to adopt
Because our biology drives us to want to have our own offspring. It seems obvious that if our biology drove us to want to raise other people's offspring we wouldn't be having this conversation. Although this has been an extraordinarily successful strategy for dogs and cats.
> Because our biology drives us to want to have our own offspring.
But just a few minutes ago you said our biology drives us to "cooperate to raise the young." Do you see how this topic might deserve some more study? Basically, my only issue with your comment is "The end."
You explain why but not how. Is it a spandrel from some other evolutionary trait that eventually became something important? Is it a process of higher-order brain functions that we have? Is there an alternative method that brought this about (stuck in a proverbial room with only a fruit basket and no way out)?
You can't talk about explaining Prisoner's Dilemma these days without mentioning Nicky Case's excellent "Evolution of Trust". My 5yo simply loves the game.
Btw, if you are into data viz or simply enjoy Bret Victor-inspired interactive programming, Nicky has tons of similar experiments worth checking out [2].
I get why game theory is an attractive model to explain human behavior. But I always wonder why they invent the game, and try to explain behavior in terms of a particular model, rather than using real behavior to fit model parameters, so you could get (pseudo-) empirical numbers for "payoffs" etc.
Or maybe another way of putting it is that both rational and actual human behavior clearly vary with the payoff structure, so it would make sense to include that as another variable.
Uhhh... what made PD important was that it was, precisely, the extreme of non-cooperation; a situation where cooperation was logically possible, but not rational. EVERY other game where cooperation is at least possible is better at promoting cooperation, not just one. If peeps have thought iterated PDs are the acme of cooperation, that's pretty crazy, too.
Incredible that only 48% of people got out of their car to shovel the snow... how can you feel good about yourself, sitting in your car, watching somebody else clear a path for you, and not helping?
"... which involved 96 participants ... Each pair repeated (“iterated”) both games 12 times, though were initially told the number of repetitions was randomly determined. The researchers created global competition by revealing that the players with the four highest pay-offs would receive monetary awards."
What people say they would do in a simulation might be different from what they would actually do. Maybe they didn't bring a shovel. Maybe there's already 96 people at the snow face and you'd only be getting in the way.
Exactly. In the real world, this is likely iterated dozens of times every winter, against neighbors you'll see repeatedly.
In a college classroom, where someone tells you you'll get fifty bucks if you score high on a game? If they tell me I get bonus points for stealing their shovel, why wouldn't I choose that option? Nothing is at stake.
Also, the real world is rarely so symmetric. Most of the time, one party has diminished ability to retaliate, and so gets defected against much more often.
People have been doing Prisoner's Dilemma with different payoff amounts since the dawn of PD, and of course this can result in qualitatively different strategies.
But sticking to just the rules, rather than the results: is the snowdrift game just a payoff score tweak? Or is there some structural difference that I missed?
For me the point of Prisoner's Dilemma is that math alone can't give you the answer. The answer lies in your ability to trust the other side and to believe they will trust you.
I always considered the pay-offs in PD to be variables with various positive or negative values. That would make SD just a subset of PD with different pay-offs.
Firstly, from the article: “In principle, natural selection predicts individuals to behave selfishly” is a faulty premise. Obviously humans who cooperate to raise children to sexual maturity will have more descendants.
Secondly, in these simulated games it seems likely to me the participants would be, at least to some extent, randomly selecting cooperate / defect because a potential monetary reward for participation isn't the same as "cooperate or everyone freezes to death in their car tonight", or whatever real-world consequences might apply where you don't get "12 iterations".
The way I see it is: while our worldview is "everything is a competition", that is how we will interpret what we see. Evolution and economics are the most strident examples. Anything that doesn't automatically fit the worldview needs explaining by science, which is code for "publish or perish". If we had a different worldview I'm sure we would shoehorn everything to fit.
"while our worldview is "everything is a competition" that is how we will interpret what we see."
If you think that the idea that competition is rife is merely a product of our world view, stop competing for a while and see what happens. You may have to think carefully about what all constitutes "competition"; you come from a very long line of survivors and there's a lot of competitive behaviors that come naturally to you. Squeezing them all out may take some work.
Competition is a second-order effect; the primary cause is the limited nature of desirable resources, and the ability of resource consumers to step up their rate of consumption exponentially in the face of an increase of resources. Unless you can prove that resources are not limited here and now, you're going to get competition.
Scarce resources can be coped with by both cooperation and competition (and other modes?). If there's limited water, we can share it - thus enabling us to have more people to hunt/gather resources. If you keep all the water you'll be stronger, but perhaps you'll hunt less effectively, struggle to gather enough resources, etc..
Competition isn't always the best way to cope with limited resources.
Competition between different groups within which there is cooperation, that's another option, of course.
>> “In principle, natural selection predicts individuals to behave selfishly” is a faulty premise.
I'm fairly certain that cooperating to raise your own offspring is still considered selfish behavior. Something like taking resources from your offspring to give to someone else would be more in line with the definition of selfless (e.g. taking food from your own malnourished child to give to another child).
If you want to think clearly about these issues, you need to be very careful to define "altruistic" and "selfishness" for yourself properly. It seems to be a common temptation to assume, as you implicitly do, that an altruistic or selfishness-free act must net harm the person so acting at all time scales, but if you try to map that definition back to the real world, you find that it isn't as useful a model as you'd like for the behaviors you care about. I have phrased that very carefully; it is not as if it is an objectively wrong definition, because it's pretty hard for such a thing to exist. It's just that it's not exactly what we're trying to figure out here.
> I'm fairly certain that cooperating to raise your own offspring is still considered selfish behavior.
Cooperation is selfish.
I think this strengthens my point about how the dominant worldview colours our thoughts.
It could be argued that even a sports game, the epitome of competition, requires orders of magnitude more cooperation just for the game to come about and be played at all.
But then it could be argued that it is the selfish desire of the individual to want to play that drives the cooperation to compete.
This reminds me how all(?) Wikipedia articles can lead to philosophy.
The "individual" in evolutionary game theory is generally considered the gene, not the human. A lot of facts of life - like raising offspring at all, or death - make zero sense if you consider the actor to be a person, but make a lot of sense if you consider the actor to be a gene.
I would agree with that but it doesn't appear to be what Kümmerli is saying:
> “In principle, natural selection predicts individuals to behave selfishly,” Rolf Kümmerli, co-author of the study, told PhysOrg.com. “However, we observe cooperation in humans and other organisms, where cooperation is costly for the actor but benefits another individual.
I think that's the "faulty premise" that TheSpiceOfLife is referring to. He's saying that building a study to explain cooperation between humans is modeling a phenomenon that's a social construct anyway, when the real question is "what genes lead to behavior that ensures the survival and propagation of those genes?" Natural selection operates on the genetic level, not on the human level, so Kümmerli made an error by trying to inject natural selection arguments into a model where the primary actors are humans.
> A lot of facts of life - like raising offspring at all, or death - make zero sense if you consider the actor to be a person, but make a lot of sense if you consider the actor to be a gene.
Raising offspring presents a huge cost to the parents, but every living organism does it because it's the only way to ensure that your genes will propagate on to the next generation after your death. Those that don't are washed out of the gene pool, and irrelevant after one lifespan.
Similarly, death is obviously bad for an individual - it's the termination of your existence. However, it frees up resources for new individuals who may be better adapted to conditions of the time. Species where individuals rarely die may see the whole species die off at once as they get outcompeted by other species whose individuals are better adapted to the environment.
(There's a societal analogue as well here: historically, societies that hold tight to tradition and preserve the internal firms & institutions within them end up being conquered, en masse, by more competitive societies where the individual firms within them either adapt or die.)
I'm aware of the gene-centered view of evolution, but I think that interpretation of the sentence in question is quite a stretch. If that was the intended meaning, then I apologize.
Your first point is pretty reductive. The idea of the selfish gene has been very dominant until recently. Your very example can be reframed as selfish behavior.
> Firstly, from the article: “In principle, natural selection predicts individuals to behave selfishly” is a faulty premise. Obviously humans who cooperate to raise children to sexual maturity will have more descendants.
I think there's an enormous gulf between the way logic-dorks think people behave, and how people actually behave.
All this game theory stuff needs to go in the trash, frankly.
The article is from 2007, but the research provides real value even if the fundamental concept isn't novel. It goes beyond the hypothesis that this adjustment has a particular impact by providing data from real research that tests that hypothesis.
This does not at all explain human behaviour: Most people would not shovel all the snow themselves while their counterpart chills in their car. And many people cooperate in prisoner's dilemma situations.
What explains human behaviour is that with every action we program ourselves to follow certain strategies and rules and that others can judge to some extent which strategies we programmed ourselves to follow. Thus we have an incentive to follow strategies that lead others to trust us, so that we'll have more opportunities for cooperation in the future.
Always start by cooperating and then do the same as the other party: cooperate if he does, retaliate if he doesn't. It is the most effective strategy in both games.
No it's not. What needs to be explained is why you cooperate if you know that you will never play with the same person again. And the answer is that by doing so you get in the habit of automatically cooperating in similar situations in the future and other people recognize this habit and trust you.
In my opinion the PD has always been worthless because it is based on a series of false paradigms.
Most importantly, this scenario involves only a binary decision. In reality, there are no binary decisions. Every single possible decision involves countless contingencies to consider, which are limited only by the awareness and imagination of the person making the decision. Trying to apply this false paradigm to reality is nonsense. You might as well try to discern real-world data about the number of angels that people believe will fit on the end of a pin.
>Compare this with the Prisoner’s Dilemma. For a quick synopsis, two prisoners being questioned each have the choice to either defend the other’s innocence or betray the other’s guilt.
The underlying assumption is that innocence and guilt are the only factors driving the prisoners' decision. What if the prisoner's primary motivation is the rejection of coercion by his captors? Making assumptions about motivation completely nullifies any potentially valuable insights about human behavior.