Letters to the editors

Vol. 6, No. 2 / September 2021

To the editors:

In his essay, William Press retells the story of how he, together with the late mathematician and physicist Freeman Dyson, discovered that there are nefarious strategies within the prisoner’s dilemma game that allow a player to control the opponent’s winnings. As Press discusses, the game creates a scenario in which two agents are arrested on suspicion of committing a crime and are interrogated separately. They are each offered a deal: “Rat out your fellow conspirator and walk away with a light sentence, or rot in jail for a prolonged period.” The dilemma here is that cooperating with your fellow suspect is irrational because the temptation to defect, by giving up the accused counterpart, is too great. But if a cooperative act is not rewarded, how could cooperation have evolved to be so ubiquitous in biology?

It turns out that the answer to this question is already contained in the description of the prisoner’s dilemma, where it is prescribed that the defendants are interrogated separately, to prevent them from communicating with each other. If they could, they surely would reach a pact where they pledge to each other that they will stay quiet; maybe they would seal the pact by swearing that if one should break the accord, the other would retaliate in the worst way once released. And indeed, when acts of cooperation between people and animals are analyzed in detail, it is clear that the parties are cooperating only when they can communicate. Robert Trivers described how a tiny cleaner fish rids the gills of a much larger fish of ectoparasites, which is a dangerous undertaking for the one doing the cleaning, as the temptation for a quick meal always looms large for the bigger fish. But there clearly is an understanding between the two: the smaller one will not clean just anyone but only the type it is associated with. Furthermore, the cleaner fish signals to the host fish when the cleaning is finished, so the host can continue swallowing other small fish.1 There is no doubt that this act of cooperation, and likely all others, relies on two-way communication.

It turns out that the nefarious zero determinant (ZD) strategies that Press and Dyson discovered also involve communication.2 In this iterated version of the prisoner’s dilemma game, players decide whether to cooperate or defect only after examining the evidence of the preceding play: What did my opponent play, given what I played at the same time? Using the information gleaned from the previous round of play, an agent can make better decisions by only cooperating if there is evidence that the opponent might also cooperate. Of course, basing decisions on only the last play does not guarantee future success, but this bit of information is quite valuable as it allows a player to discriminate: “I will cooperate with you, but not you.” Even better, using more than just the last play increases the chance for a positive payoff even more, because the player’s predictions become more accurate.3
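The memory-one setting described above can be sketched in a few lines of Python. The payoff values, function names, and the convention that both players “remember” a virtual cooperative first round are illustrative assumptions, not taken from Press and Dyson’s essay:

```python
import random

# Conventional prisoner's dilemma payoffs (illustrative values):
# each entry gives the (row player, column player) score for a joint move.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play(p, q, rounds=10000, seed=0):
    """Iterated game between two memory-one strategies.

    p and q map the previous joint outcome (my_move, their_move) to the
    probability of cooperating on the next round. Returns the average
    payoff per round for each player.
    """
    rng = random.Random(seed)
    x, y = "C", "C"  # virtual previous round; only shapes the first move
    sx = sy = 0
    for _ in range(rounds):
        nx = "C" if rng.random() < p[(x, y)] else "D"
        ny = "C" if rng.random() < q[(y, x)] else "D"
        ax, ay = PAYOFF[(nx, ny)]
        sx += ax
        sy += ay
        x, y = nx, ny
    return sx / rounds, sy / rounds

# Tit-for-tat: cooperate exactly when the opponent cooperated last round.
tft = {("C", "C"): 1, ("C", "D"): 0, ("D", "C"): 1, ("D", "D"): 0}
alld = {k: 0 for k in tft}  # unconditional defector

print(play(tft, tft))   # (3.0, 3.0): stable mutual cooperation
print(play(tft, alld))  # tit-for-tat loses one round, then both defect
```

Even this one bit of memory lets a strategy condition its play on the opponent’s last move, which is the discrimination described above: tit-for-tat sustains cooperation with a cooperator but is not exploited indefinitely by a defector.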

It is worth pointing out that not all ZD strategies use information in an unfair manner to manipulate the opponent. There are so-called “generous” ZD strategies, not discussed by Press and Dyson, that use the same ZD trick, but make sure that the opponent obtains the highest payoff possible: they will cooperate with another cooperating player, but if either of the two players deviates from the mutual pact, the generous player loses more than the opponent.4 Evolution favors such altruistic information-driven behavior, but certainly does not favor the mean coercive strategies of Press and Dyson. As I discovered with Arend Hintze soon after Press and Dyson’s article appeared, a ZD strategy may win every game played against another non-ZD strategy, but ultimately lose the war.5 The reason for this seemingly contradictory outcome is that in an evolving population, you have to play not only against agents that are different from you, but against versions of yourself: your own offspring. Gradually, the short-term benefit of exploiting others backfires, as you dispense the same punishment to your own kin. In the meantime, your opponent that cooperates with others of their ilk increases in number, to the detriment of the misers that wage ZD war against each other.
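The population-level argument can be illustrated with a toy replicator-dynamics calculation. The payoff entries below are illustrative assumptions, not the values from Adami and Hintze’s simulations: they are chosen only so that the ZD strategy beats a cooperator head to head, while cooperators score better among themselves than ZD players do among each other:

```python
# Two-strategy replicator dynamics: extortionate ZD vs. a cooperator.
# Illustrative average payoffs (assumed, not from the cited article):
a_zz, a_zc = 1.0, 2.5   # ZD vs ZD, ZD vs cooperator
a_cz, a_cc = 2.0, 3.0   # cooperator vs ZD, cooperator vs cooperator
# Note a_zc > a_cz: ZD "wins" every pairwise encounter.

x = 0.9                 # initial fraction of ZD players
for _ in range(3000):
    f_z = a_zz * x + a_zc * (1 - x)   # mean ZD payoff
    f_c = a_cz * x + a_cc * (1 - x)   # mean cooperator payoff
    x += 0.01 * x * (1 - x) * (f_z - f_c)  # discrete replicator step

print(x)  # ZD fraction decays toward zero despite winning every game
```

Because ZD players meet their own kind ever more often as they spread, their low mutual payoff (a_zz) drags their average fitness below that of the cooperators, and they are driven out even from a 90 percent majority.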

I am glad that Dyson liked this outcome. It certainly has the whiff of comeuppance, and the word karma was almost inevitably used in the media stories written about our article, “Evolutionary Instability of Zero-Determinant Strategies Demonstrates that Winning Is Not Everything.” Indeed, the fundamental result is that selfish strategies are not stable in an evolutionary sense: they are doomed to ultimate extinction.6

Why is it then that in the evolutionary arena mean ZD strategies lose, but generous ZD strategies succeed? The answer lies again in communication theory: ZD strategies that engage in extortion do not acquire enough information to gainfully predict the opponent’s behavior. I have derived a general theorem that states that the amount of information that is necessary for cooperation to be evolutionarily stable is

$$\frac{I\left( X:Y \right)}{H\left( X \right)} \ge \frac{c}{b},$$

given a benefit b and a cost c for a cooperative move, where b ≥ c ≥ 0.

In this equation, I(X : Y) is the average information extracted by player X about Y’s strategy, and H(X) is how much there is to know about an opponent—that is, it is the uncertainty about the identity of any strategy in the population. Incidentally, this rule mirrors David Queller’s rule for inclusive fitness.7 But unlike Queller’s rule, which has been criticized because it cannot deal with non-additive payoffs and for arbitrarily mapping decisions to numerical values,8 the information-theoretic counterpart has neither of these problems since information is probabilistic and non-additive. The famous tit-for-tat rule, for example—which, as it happens, is a communicating ZD strategy—has an information capacity of 1 bit and will dominate most populations, in particular in its generous form, without winning a single battle.

Press and Dyson’s serendipitous and seminal work on ZD strategies has thus shined a spotlight on how to escape from the trap that is set by dangling a reward for selfish behavior. A judicious use of information—gathered from any source in general, but from previous play in particular in the memory-one version of the iterated prisoner’s dilemma—can ensure that cooperation pays, as long as the fraction of information, normalized by how much there is to know, exceeds the cost-to-benefit ratio. Using this information in a selfish and mean manner might allow the player to predict the actions of a subset of the population—i.e., those that are unlike him—but fails to account for running into his own selfish clones. Using information generously, however, while remaining vigilant about noncooperators, ultimately pays off.


  1. Robert Trivers, “Evolution of Reciprocal Altruism,” The Quarterly Review of Biology 46, no. 1 (1971): 35–57, doi:10.1086/406755. 
  2. In all fairness, strategies of the ZD type were discovered earlier in an article by Maarten Boerlijst, Martin Nowak, and Karl Sigmund, “Equal Pay for All Prisoners,” The American Mathematical Monthly 104, no. 4 (1997): 303–305, doi:10.1080/00029890.1997.11990641. Here the strategy was called the “equalizer,” since both players that use this strategy force each other to take the same payoff, but the fundamental determinant condition, and the fact that a player using this strategy can force another player to accept a lower payoff, was new. 
  3. Dimitris Iliopoulos, Arend Hintze, and Christoph Adami, “Critical Dynamics in the Evolution of Stochastic Strategies for the Iterated Prisoner’s Dilemma,” PLOS Computational Biology 6 (2010): e1000948, doi:10.1371/journal.pcbi.1000948. 
  4. Alexander Stewart and Joshua Plotkin, “From Extortion to Generosity, the Evolution of Zero-Determinant Strategies in the Prisoner’s Dilemma,” Proceedings of the National Academy of Sciences 110, no. 38 (2013): 15348–53, doi:10.1073/pnas.1306246110. 
  5. Christoph Adami and Arend Hintze, “Evolutionary Instability of Zero-Determinant Strategies Demonstrates That Winning Is Not Everything,” Nature Communications 4 (2013): 2193, doi:10.1038/ncomms3193. 
  6. Our original article carried the title “Winning Isn’t Everything: Evolutionary Instability of Zero Determinant Strategies,” but the editors of Nature Communications vetoed the title as too whimsical. 
  7. David Queller, “Kinship, Reciprocity and Synergism in the Evolution of Social Behavior,” Nature 318 (1985): 366–67, doi:10.1038/318366a0; and David Queller, “Quantitative Genetics, Inclusive Fitness, and Group Selection,” American Naturalist 139, no. 3 (1992): 540–58, doi:10.1086/285343. 
  8. Martin Nowak, Corina Tarnita, and Edward Wilson, “The Evolution of Eusociality,” Nature 466 (2010): 1057–62, doi:10.1038/nature09205. 

Christoph Adami is a Professor of Microbiology, Molecular Genetics, Physics, and Astronomy at Michigan State University.


Copyright © Inference 2024

ISSN 2576-4403