Problems
The Prisoner's Dilemma

This may be called the classical problem of game theory. Shortly after being taken up by the RAND Corporation, it was stated in its now familiar form by Al Tucker in 1950. An enormous literature (e.g., Arrow, Gauthier, Axelrod) has since grown up around it. It sets up a situation in which the rules of the game plus the assumption of self-interest lead to choices by the players which, as even they perceive them, are not in their own best interest. The problem is then to introduce rules or strategies (such as Axelrod's "tit for tat" iterative strategy) that will after all lead to the agreed better outcome. These calculations are felt to be important for the question of altruism in evolutionary theory, or for the genesis of morality (or "enforced co-operation") in society. We find the problem flawed as a model for ethical computation, and we find its implications dubious as social strategy. Some of these objections are noted below.

Problem

Two Sicilian brothers, Angelo and Benigno, have been arrested for a crime carrying a penalty of six years in prison. They are being held in separate cells, and cannot communicate. The rules of the game are these: (1) If both confess, each will receive the prescribed six years. (2) If Angelo alone confesses, he will go free, and Benigno will receive an extended sentence of ten years; the same (mutatis mutandis) for Benigno. (3) If neither confesses, each will receive a two-year sentence on a lesser charge. In tabular form, with each brother's prison time in years and the combined total in parentheses:

                 B Confesses                B Does Not
A Confesses      A = 6, B = 6 (12 yrs)      A = 0, B = 10 (10 yrs)
A Does Not       A = 10, B = 0 (10 yrs)     A = 2, B = 2 (4 yrs)

Results

Outcome 1. In the game as stated, both brothers know everything except how the other brother will decide. On that assumption, plus the assumption that Angelo is considering only his own interest, Angelo's best defensive strategy is to confess, since whatever Benigno decides (look at the A values in both of the B columns), this gives the shorter sentence for Angelo. The same calculation applies if we look at the options from Benigno's end. If both brothers make the same calculation and pursue the same defensive strategy, the result is that neither brother is set free, and both serve six years. The total number of years served is twelve.
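
The defensive calculation in Outcome 1 can be checked mechanically. What follows is a minimal sketch, not part of the problem as usually stated: the names YEARS and angelo_best_reply are our own, and "silent" is our label for not confessing. It looks only at Angelo's years, as the self-interest assumption requires.

```python
# Sentences in years, indexed by (Angelo's choice, Benigno's choice).
# Each value is (Angelo's years, Benigno's years), taken from rules (1)-(3).
# The labels and names here are illustrative assumptions, not from the source.
CONFESS, SILENT = "confess", "silent"
YEARS = {
    (CONFESS, CONFESS): (6, 6),
    (CONFESS, SILENT): (0, 10),
    (SILENT, CONFESS): (10, 0),
    (SILENT, SILENT): (2, 2),
}


def angelo_best_reply(benigno_choice):
    """Return the choice that minimizes Angelo's own sentence."""
    return min((CONFESS, SILENT), key=lambda a: YEARS[(a, benigno_choice)][0])


for b in (CONFESS, SILENT):
    print(f"If Benigno chooses {b!r}, Angelo's best reply is {angelo_best_reply(b)!r}")
```

Whichever column Benigno picks, the reply comes back as confession, which is what makes it Angelo's defensive strategy. By the symmetry of the table, the same holds for Benigno.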

Outcome 2. Better than this would be for neither to confess. The outcome is that neither is set free, but both serve only a reduced two years. The total number of years served is four, one-third of the total in Outcome 1. Put another way, their mother will see both sons again in two years, not six (as in Outcome 1) or ten (when one brother confesses, and the other serves an extended term).

The Dilemma lies in the fact that the admittedly best available outcome (sentences of two years for each) is not the one that the brothers reach when each separately makes the "rational choice" (six years for each).
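
The gap between the two outcomes can be made explicit with a short calculation. This is a sketch under our own assumptions: the names CHOICES, YEARS, and is_stable are illustrative, "silent" stands for not confessing, and "stable" means only that neither brother could shorten his own sentence by changing his choice alone.

```python
from itertools import product

# (Angelo's choice, Benigno's choice) -> (Angelo's years, Benigno's years),
# copied from rules (1)-(3). Names and labels are illustrative assumptions.
CHOICES = ("confess", "silent")
YEARS = {
    ("confess", "confess"): (6, 6),
    ("confess", "silent"): (0, 10),
    ("silent", "confess"): (10, 0),
    ("silent", "silent"): (2, 2),
}


def is_stable(a, b):
    """True if neither brother can shorten his own sentence by switching alone."""
    a_ok = all(YEARS[(a, b)][0] <= YEARS[(alt, b)][0] for alt in CHOICES)
    b_ok = all(YEARS[(a, b)][1] <= YEARS[(a, alt)][1] for alt in CHOICES)
    return a_ok and b_ok


stable_cells = [cell for cell in product(CHOICES, repeat=2) if is_stable(*cell)]
totals = {cell: sum(years) for cell, years in YEARS.items()}
lowest_total = min(totals, key=totals.get)

print("Stable under separate self-interest:", stable_cells)
print("Lowest combined prison time:", lowest_total, totals[lowest_total], "years")
```

The only stable cell is mutual confession, at twelve years in total, while the lowest total is mutual silence, at four. The two computations pick different cells of the same table, and that difference is the whole of the Dilemma.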

Comment

There are many possible objections to the cogency and importance of the problem as here posed. Here are some of them.

1. Culture. The dominant fact, which gives the problem its particular character, is the chance that the other brother may choose Option 1 (confession). Given that possible choice, Option 1 is then the best defense of either against the choice of Option 1 by the other. It is thus fear that enforces the choice of Option 1, and leads to the less than maximal outcome. If the two trust each other, and if they realize that Option 1 is self-defeating, they will prefer Option 2 (not confessing). We thus note that factors of culture (trust among brothers) can supply the missing information, and thus give the better outcome. See Fukuyama on the variable factor of trust in culture.

2. Trust. In the problem as stated, Angelo and Benigno are assumed not to trust each other, but they are assumed to trust whoever is telling them the rules of the game, especially the clause that offers lesser sentences if neither confesses. Anyone who has read the Han Feidz or the Arthashastra, or who has spent time on the street, will be alert for biased, malicious, incomplete, or asymmetrically transmitted information in general, but especially in court and prison situations. If Angelo and Benigno are capable of suspecting each other, why do they not also suspect a trap in the rules as stated? The problem presumes an unrealistically narrow range of possible doubt, and to that extent operates with humanly unreal protagonists.

3. Law. Angelo might imaginably ask, Who benefits if I accept these rules as true? The obvious answer is: the prosecutor. In Outcome 1 the prosecutor gets two convictions without the need for a trial; in Outcome 2 he gets two convictions in a trial on a minor charge, which should not take much court time. If by some failure of reason on the part of one brother, only one is convicted, the total years of prison time that the prosecutor achieves is still more (ten) than in the milder of the other options (four), and he can always claim that he has convicted the more important of the two. It is presumed that Angelo accepts the legal system as an automatic mechanism for carrying out the stated game rules, and not as a system in its own right, with social expectations to fulfill, and overhead costs to amortize. This presumption is unreal in general terms, and unlikely to be made by any empirically obtainable Angelo. As far as that goes, the presumptions of the problem as a whole are also implausible as describing the operation of any imaginable legal system.

4. Communication. If we agree to assume that the prison officials can be trusted, we may turn to the information aspect. It is a necessary condition of the problem that communication between the brothers is impossible; otherwise they would quickly agree on their mutually best strategy, and the imagined dilemma would vanish. But the situation as described actually permits information transfer. Thus, offers of release in return for confession are usually made to the lesser of two accused, in the hope of securing the conviction of the other. Angelo, knowing this, and knowing himself to be the ringleader, may reason as follows.

"I am the big fish, so hoping to convict me, they would first have made this offer to Benigno. If he accepted, then they already have enough to convict me, and they would not now be offering me a chance of going free. But they are offering me this chance. So Benigno, bless his faithful heart, has refused the offer, and will not confess. Then their last hope is to offer me the chance to betray him, so that they can convict at least one of us. I spit on their hope."

Benigno has thus adopted Option 2, and Angelo now knows all he needs to know in order to adopt Option 2 as well, leading to the supposedly unobtainable Outcome 2. Benigno in real life might be too stupid to figure this out, or too malicious to refuse the chance to harm his brother. But in the problem, Benigno is assumed to be as rational and (but for the fear element) as well-disposed as Angelo. So he does figure it out, and he does seek his own best interest. On those assumptions, he knows that his refusal to confess sends a signal to Angelo. The two can thus after all communicate (communication in one direction suffices), and the supposedly elusive best outcome can be reached.

5. Betrayal. The problem as stated does not optimize self-interest; it optimizes treachery. The best personal outcome (zero years) can be had only by betraying the other brother (Benigno gets not merely the regular sentence, but a longer sentence, if Angelo confesses). Here is the connection with game theory, which assumes a hostile opponent (John Nash showed that even cooperative games can be theoretically reduced to hostile games), and which assumes that maximum advantage comes from cheating the opponent. But the problem also penalizes treachery, with its rule that if both confess, both receive a middle penalty. Reformulating the problem to eliminate this treachery premium, we might have:

Angelo and Benigno have been arrested for a crime with a penalty of six years. If either confesses, implying contrition, he will be given a reduced sentence of two years. Both brothers thus rationally confess (the contrition need not be sincere), and both get the best outcome, namely two years.
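
The reformulated rules can be set out in the same way as the original ones. This is a sketch only, resting on one assumption that the reformulation implies but does not state: a brother who does not confess serves the full six years. The names reformulated_years and REFORMULATED, and the label "silent", are ours.

```python
CHOICES = ("confess", "silent")


def reformulated_years(own_choice):
    """Sentence under the reformulated rules; it depends only on one's own choice.

    Assumption: a brother who does not confess serves the full six years.
    """
    return 2 if own_choice == "confess" else 6


# (Angelo's choice, Benigno's choice) -> (Angelo's years, Benigno's years)
REFORMULATED = {
    (a, b): (reformulated_years(a), reformulated_years(b))
    for a in CHOICES
    for b in CHOICES
}

best_for_self = min(CHOICES, key=reformulated_years)
best_for_both = min(REFORMULATED, key=lambda cell: sum(REFORMULATED[cell]))

print("Best choice for either brother alone:", best_for_self)
print("Lowest combined prison time:", best_for_both, REFORMULATED[best_for_both])
```

Here the self-interested choice and the jointly best cell coincide: each brother confesses, and each serves two years. With no entry in the table that rewards one brother at the other's expense, nothing is left for fear to work on.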

Rather than Axelrod's iterative approach, for which, in the problem as stated, individual lifetimes are not long enough, these straightforward rules lead to the supposed best outcome for the individuals. Getting the lesser sentence also sets a precedent, in Angelo's and Benigno's future thinking, for accepting social favors. It is thus itself a device of socialization. The socialization effect (in this version) has been built into the rules of the system itself. More generally, this version of the problem abandons the idea that moral systems are generated by individual choices (a common Western postulate, but a risky one). It recognizes that moral systems are established from the top down (the standard and better documented Chinese view).

6. Crime. The culture of crime is not recognized in the problem. But we may note that the Mafia code of silence would have produced Outcome 2 without either brother reflecting for a single moment about any complicated rules at all, by whomever conveyed. For the problem's "rational actor" presumption, we may thus substitute a "rational subculture" presumption, and the dilemma vanishes.

7. Society. Apart from its individual consequences, which are the only bookkeeping which the problem is aware of, there are wider implications. Thus, beyond any secondary benefits for the individuals we have been considering, the code of silence will also have the result of restoring the brothers to full criminal activity in the minimum aggregate time, a result which maximally benefits criminal society as against civil society. Our spectatorship of the problem needs to allow for the possibility that the brothers are guilty as charged, that their crimes are serious, and that civil society's best benefit lies not in their release but in their maximum incarceration. The problem as posed excludes such aspects as these, and in so doing renders itself merely suburban, and specifically, American suburban. Such a problem lacks cultural generality; that is, it lacks any very wide predictive capacity.

Question for the Americans: Of two interpenetrating societies, if the bad society has a higher index of cooperation than the good society, is the bad society after all better than the good society?

8. Higher Principle. Also overlooked in the usual accounting of the problem is the possibility of higher individual motivation in the prisoners. In 1937 the mathematician Gnedenko was denounced by a colleague for anti-Soviet opinions. His jailers tried to force him, not to confess those opinions, but to denounce another colleague, Kolmogorov, as the ringleader of a subversive movement in the mathematics faculty. Despite daily and brutal interrogations lasting over months, Gnedenko refused to even entertain that possibility. It would be better, for those computing human behavior, to consider what we wish henceforth to be known as the G or Gnedenko Factor.

It is not material to the present argument whether Gnedenko's rejection, of what an econometrician would consider the smart move, was successful for him or not. The point is the existence of that defiance, as an empirical social fact. As a matter of curious interest, however: after months of failure in their attempt to make a case against Kolmogorov, the jailers released Gnedenko, who resumed his mathematical career, supported at times by Kolmogorov. Gnedenko eventually outlived the Soviet system itself.

The Prisoner's Dilemma shares with many other social science "models" a socially remote set of suppositions and nonsuppositions. It contains, and excludes, just enough data to enshrine the axioms of the discipline in question (in the case of economists, a predatory version of "economic" or "rational" man), but not enough to plausibly represent any real social situation. Thus the prisoners in the problem are assumed to have enough social reality to prefer freedom to prison, but not enough to feel brotherly devotion, or to be concerned for the effect on their mother if one betrays the other, or to take any account of the culture of crime in which both brothers have previously lived. Such "models" are emblems of disciplinary assumption rather than algorithms of viable computation, and computations made with them are nugatory.

This general point has been made repeatedly, over the years, but Sapir's classic 1939 indictment of the general position ("Psychiatric and Cultural Pitfalls in the Business of Getting a Living") may usefully be extracted here:

All special sciences of man's physical and cultural nature tend to create a framework of tacit assumptions which enable their practitioners to work with maximum economy and generality. The classical example of this unavoidable tendency is the science of economics, which is too intent on working out a general theory of value, production, flow of commodities, demand, price, to take time to inquire seriously into the nature and variability of those fundamental biological and psychological determinants of behavior which make these economic terms meaningful in the first place. The sum total of the tacit assumptions of a biological and psychological nature which economics makes get petrified into a standardized conception of "economic man," who is endowed with just those motivations which make the known facts of economic behavior in our society seem natural and inevitable. In this way, the economist gradually develops a peculiarly powerful insensitiveness to actual motivations, substituting lifelike fictions for the troublesome contours of life itself.

The very terminology which is used by the many kinds of segmental scientists of man indicates how remote man has himself become as a necessary concept in the methodology of the respective sciences.

The phrase segmental scientists of man really says it all.
