J. Bradford DeLong

December 1999

An extended passage from William Poundstone's (1992) marvelous book Prisoner's Dilemma (New York: Doubleday; ISBN 038541580X). Economists will find it hilarious and thought-provoking. Others will probably find it bizarre and weird. It comes from pp. 106-118.

Flood and Dresher devised a simple game where [the Nash equilibrium wasn't such a good outcome for the players].... The researchers wondered if real people playing the game--especially, people who had never heard of Nash or equilibrium points--would be drawn mysteriously to the equilibrium strategy. Flood and Dresher doubted it.

The two researchers ran an experiment that very afternoon. They recruited two friends as guinea pigs, Armen Alchian of UCLA ("AA" below), and RAND's John D. Williams ("JW"). The game was presented purely as a payoff table. The payoffs were:

(AA's payoff, JW's payoff)     JW's Strategy 1 [Defect]     JW's Strategy 2 [Cooperate]
AA's Strategy 1 [Cooperate]    (-1, 2)                      (1/2, 1)
AA's Strategy 2 [Defect]       (0, 1/2)                     (1, -1)

[The Nash equilibrium is defect-defect, of course]... Alchian is always better off choosing his strategy 2, and Williams is better off choosing his strategy 1. But when both players choose their "better" strategy, both do relatively poorly. They actually do better choosing their "worse" strategies--provided both do it.

The Nash theory suggests the lower left cell as the rational outcome. Neither player can do any better by switching unilaterally: a player is always better off defecting, no matter what the other does....
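The dominance argument can be checked mechanically. The following is a minimal Python sketch, not anything taken from Flood's paper or Poundstone's book: the payoffs are copied from the table above, while the names `PAYOFFS`, `payoff`, `strictly_dominant_move`, and `is_nash` are invented here for illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
MOVES = ("C", "D")

# PAYOFFS[(aa_move, jw_move)] = (AA's payoff, JW's payoff), copied from the table.
PAYOFFS = {
    ("C", "C"): (HALF, Fraction(1)),
    ("C", "D"): (Fraction(-1), Fraction(2)),
    ("D", "C"): (Fraction(1), Fraction(-1)),
    ("D", "D"): (Fraction(0), HALF),
}


def payoff(player, own, other):
    """Payoff to `player` ("AA" or "JW") for playing `own` against `other`."""
    if player == "AA":
        return PAYOFFS[(own, other)][0]
    return PAYOFFS[(other, own)][1]


def strictly_dominant_move(player):
    """Return the move that beats every alternative against every reply, or None."""
    for move in MOVES:
        alternatives = [m for m in MOVES if m != move]
        if all(payoff(player, move, reply) > payoff(player, alt, reply)
               for alt in alternatives for reply in MOVES):
            return move
    return None


def is_nash(aa_move, jw_move):
    """True if neither player gains by switching unilaterally."""
    aa_ok = all(payoff("AA", aa_move, jw_move) >= payoff("AA", alt, jw_move)
                for alt in MOVES)
    jw_ok = all(payoff("JW", jw_move, aa_move) >= payoff("JW", alt, aa_move)
                for alt in MOVES)
    return aa_ok and jw_ok


nash_cells = [(a, j) for a in MOVES for j in MOVES if is_nash(a, j)]

print("AA's dominant move:", strictly_dominant_move("AA"))
print("JW's dominant move:", strictly_dominant_move("JW"))
print("Pure-strategy Nash cells (AA, JW):", nash_cells)
```

Under these payoffs, defecting is strictly dominant for both players and defect-defect is the only pure-strategy equilibrium cell, even though both players earn more in the cooperate-cooperate cell.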

In the RAND experiment, Alchian and Williams played the game 100 times in succession.... Flood's 1952 paper reports not only the strategies the two players chose but a log of running comments.... The comments were written after each player had decided on a strategy in that particular game, but before the other player's choice was known. Some comments thus refer to the other player's choice on the previous game. For clarity, strategy numbers have been replaced with [C] for "cooperate" and [D] for "defect."

Round AA JW AA's Comments JW's Comments
1 D C JW will play D--sure win. Hence if I play C--I lose. Hope he's bright.
2 D C What is he doing?!! He isn't, but maybe he'll wise up.
3 D D Trying mixed? OK, dope.
4 D D Has he settled on D? OK, dope.
5 C D Perverse! It isn't the best of all possible worlds.
6 D C I'm sticking to D since he will mix for at least 4 more times. Oh ho! Guess I'll have to give him another chance.
7 D C   Cagey, isn't he? Well...
8 D D   In time he could learn, but not in ten moves or so.
9 D D If I mix occasionally, he will switch--but why will he ever switch from D?  
10 D D Prediction. He will stick with D until I change from D. I feel like DuPont.

I can guarantee myself a gain of 5, and guarantee that player AA breaks even. On the other hand, with nominal assistance from AA, I can transfer the guarantee of 5 to player AA and make 10 for myself, too. This means I have control of the game to a large extent, so player AA had better appreciate this and get on the bandwagon.

With small amounts of money at stake, I would (as above) try (by using C) to coax AA into mutually profitable actions. With large amounts of money at stake I would play D until AA displayed some initiative and a willingness to invest in his own future. One play of C by AA would change me from D to C, where I would remain until bitten.

On the last play, it would be conservative for me to switch to D, but I wouldn't do so if the evidence suggested that AA was a nice stable personality and not in critical need of just a little extra cash.

11 D C   Probably learned by now.
12 C C   I'll be damned. But I'll try again.
13 C C   That's better.
14 C C   Ha!
15 C C   (bliss)
16 D C    
17 C D   The stinker.
18 C D   He's crazy. I'll teach him the hard way.
19 D D I'm completely confused. Is he trying to convey information to me? Let him suffer.
20 D D    
21 D C   Maybe he'll be a good boy now.
22 C C   Always takes time to learn.
23 C C   Time.
24 C C    
25 C C    
26 D C    
27 C D   Same old story.
28 D D He wants more C's by me than I'm giving. To hell with him.
29 D D    
30 D D    
31 D C Some start. Once again.
32 C C JW is bent on sticking to D. He will not share at all as a price of getting me to stick to C. ---, he learns slow.
33 C C   On the beam again.
34 C C    
35 C C    
36 C C    
37 C C    
38 D C    
39 C D   The ---.
40 D D    
41 D C   Always try to be virtuous.
42 C C   Old stuff.
43 C C    
44 C C    
45 C C    
46 C C    
47 C C    
48 C C    
49 D C He will not share.  
50 C D   He's a shady character and doesn't realize we are playing a 3rd party, not each other.
51 D C    
52 C C   He requires great virtue, but he doesn't have it himself.
53 C C    
54 C C    
55 C C    
56 C C    
57 C C    
58 C C He will not share.  
59 C C He does not want to trick me. He is satisfied. I must teach him to share.  
60 D C   A shiftless individual--opportunist, naive
61 C C    
62 C C   Goodness me! Friendly!
63 C C    
64 C C    
65 C C    
66 C C    
67 D C He won't share.  
68 C D He'll punish me for trying! He can't stand success.
69 D D    
70 D D I'll try once more to share--by taking.  
71 D C   This is like toilet training a child--you have to be very patient.
72 C C    
73 C C    
74 C C    
75 C C    
76 C C    
77 C C    
78 C C  
79 C C    
80 C C    
81 D C    
82 C D   He needs to be taught about that.
83 C C    
84 C C    
85 C C    
86 C C    
87 C C    
88 C C    
89 C C    
90 C C    
91 C C When will he switch as a last minute grab of D? Can I beat him to it as late as possible?  
92 C C   Good.
93 C C    
94 C C    
95 C C  
96 C C    
97 C C    
98 C C  
99 D C    
100 D C    

For all the confusion, mutual cooperation was the most common outcome (sixty of the 100 games). Had Flood and Dresher used a "fair" [i.e., symmetric] payoff table, the cooperation rate might have been higher yet.

Flood and Dresher wondered what John Nash would make of this. Mutual defection, the Nash equilibrium, occurred only fourteen times. When they showed their results to Nash, he objected that "the flaw in the experiment as a test of equilibrium point theory is that the experiment really amounts to having players play one large multi-move game. One cannot just as well think of the thing as a sequence of independent games.... There is too much interaction, which is obvious in the results of the experiment."

This is true enough. However, if you work it out, you find that the Nash equilibrium strategy for the multi-move "supergame" is for both players to defect in each of the 100 trials. They didn't do that.

What Poundstone means is that, since both players know that the supergame is going to last for 100 periods, there is no reason for people to cooperate in round 100 to induce subsequent cooperation. Hence--whatever else people do--the Nash equilibrium strategy must be to defect in period 100.

But once you know that the other player will defect in period 100 no matter what you do, the same argument applies to period 99: whatever else people do, the Nash equilibrium strategy must be to defect in period 99.

Thus the situation "unravels." As long as there is a known, certain last period, the only Nash equilibrium is to defect, always, from the first period.
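The unraveling argument can be sketched as a backward-induction loop. This is an illustrative Python sketch under stated assumptions, not Poundstone's or Nash's own calculation. It works from the last round to the first and treats whatever happens in later rounds as already fixed, so that future payoffs enter each round's table only as a constant. Strictly speaking, this computes the backward-induction (subgame-perfect) path. The names `STAGE`, `pure_nash`, and `backward_induction` are made up for this example.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
MOVES = ("C", "D")

# STAGE[(aa_move, jw_move)] = (AA's payoff, JW's payoff) for a single round.
STAGE = {
    ("C", "C"): (HALF, Fraction(1)),
    ("C", "D"): (Fraction(-1), Fraction(2)),
    ("D", "C"): (Fraction(1), Fraction(-1)),
    ("D", "D"): (Fraction(0), HALF),
}


def pure_nash(payoffs):
    """List the cells where neither player gains by switching unilaterally."""
    cells = []
    for a in MOVES:
        for j in MOVES:
            aa_pay, jw_pay = payoffs[(a, j)]
            aa_ok = all(payoffs[(alt, j)][0] <= aa_pay for alt in MOVES)
            jw_ok = all(payoffs[(a, alt)][1] <= jw_pay for alt in MOVES)
            if aa_ok and jw_ok:
                cells.append((a, j))
    return cells


def backward_induction(rounds):
    """Solve from the last round back, adding the fixed value of later rounds."""
    continuation = (Fraction(0), Fraction(0))  # nothing follows the last round
    plan = []
    for _ in range(rounds):
        # Later play is already determined, so it shifts every cell equally.
        augmented = {cell: (p[0] + continuation[0], p[1] + continuation[1])
                     for cell, p in STAGE.items()}
        equilibria = pure_nash(augmented)
        if len(equilibria) != 1:
            raise ValueError("this sketch assumes a unique equilibrium per round")
        cell = equilibria[0]
        plan.append(cell)
        continuation = augmented[cell]
    plan.reverse()  # the loop ran from the last round to the first
    return plan, continuation


plan, totals = backward_induction(100)
print("Rounds with mutual defection:", plan.count(("D", "D")))
print("Totals (AA, JW):", totals)
```

Adding the same constant to every cell never changes a best response, which is why the loop picks defect-defect in every one of the 100 rounds.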

And real people don't do that--at least not unless they are John von Neumann or John Nash.

Alchian wound up with +40.

Williams wound up with +63.

The full 100-round C,C outcome is +50, +100; the full 100-round D,D outcome is 0, +50.

So even though we identify with Williams--as the smart one, the one trying to induce cooperation, the one understanding that it was the two of them playing against the umpire--nevertheless, Alchian "won" in that he got much closer to the total possible value of the game for his payoff matrix...
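The comparison can be made concrete with a little arithmetic. The sketch below is a hedged illustration: it takes "total possible value" to mean each player's total from 100 rounds of mutual cooperation (an assumption about the intended benchmark) and uses the final scores exactly as quoted above. The variable names are invented for this example.

```python
from fractions import Fraction

ROUNDS = 100
CC = (Fraction(1, 2), Fraction(1))  # per-round (AA, JW) payoff, both cooperate
DD = (Fraction(0), Fraction(1, 2))  # per-round (AA, JW) payoff, both defect

all_cc = tuple(ROUNDS * p for p in CC)  # totals if every round were C,C
all_dd = tuple(ROUNDS * p for p in DD)  # totals if every round were D,D

# Final scores as quoted in the text above (not recomputed from the log).
actual = {"AA": Fraction(40), "JW": Fraction(63)}

# Assumed benchmark: share of the all-cooperation total each player achieved.
share = {
    "AA": actual["AA"] / all_cc[0],
    "JW": actual["JW"] / all_cc[1],
}

print("All C,C totals (AA, JW):", all_cc)
print("All D,D totals (AA, JW):", all_dd)
for name, value in share.items():
    print(f"{name} reached {float(value):.0%} of his all-cooperation total")
```

On this benchmark Alchian reached four fifths of his all-cooperation total and Williams a little under two thirds of his.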

I did indeed find the passage that you quoted from _Prisoner's Dilemma_ to be hilarious. All the more so because Armen Alchian was one of the players. One wonders what a game between, say, Milton Friedman and Paul Samuelson would've been like. Axelrod's computer tournaments were fascinating to read about, but it was even more fun to read about those human players.

Contributed by Mike Tamada <tamada@oxy.edu> on March 27, 2001

It would be interesting to know why Flood and Dresher chose the asymmetrical pay-off that they did, for this introduces an extraneous complication into the game. JW appears to be trying to optimise the joint pay-off, using a strategy somewhat akin to tit-for-tat. On the other hand, AA appears to be trying to maximise his comparative score against JW. A CD outcome (AA cooperates, JW defects) puts AA 3 units behind JW, and occurs 7 times. However, a DC outcome (AA defects, JW cooperates) puts AA 2 units ahead, and is the only outcome that is better for AA than for JW. This occurs 18 times, including games 99 and 100, where it follows a long string of CC outcomes. It could be argued therefore that the two are actually playing different games, and indeed, in one sense, it could be argued that AA won the game that he was playing, even if it wasn't the game that JW was playing.

Of course the Nash equilibrium is still a poor outcome for both games, but Nash's analysis assumes that the players do not co-operate. When the game is iterated, there is room for the players to test for co-operation, as has happened here.

There is plenty more on this subject at http://directory.google.com/Top/Computers/Artificial_Life/Iterated_Prisoner_Dilemma/.

Contributed by Anonymous on January 1, 2001.

I thought that Flood and Dresher were trying to test John Nash's belief that the equilibrium should be invariant to monotonic transformations--that what mattered was the best-response nature of the equilibrium strategy, and not the values of alternative payoffs.

But I don't know. It's just a guess.

Contributed by Brad DeLong (delong@econ.berkeley.edu) on January 1, 2001.