How Models Fail
A Critical Look at the History of Computer Simulations of the Evolution of Cooperation

Eckhart Arnold

1 Introduction
2 The empirical failure of simulations of the evolution of cooperation
3 Justificatory narratives
    3.1 Axelrod's narrative
    3.2 Schüßler's narrative
    3.3 The story of “slip stream altruism”
    3.4 The social learning strategies tournament
4 Bad excuses for bad methods and why they are wrong
5 History repeats itself: Comparison with similar criticisms of naturalistic or scientistic approaches

3.4 The social learning strategies tournament

The last example of a justificatory narrative does not concern the RPD model, but a simulation enterprise that is similar in spirit to Axelrod's. The authors of this study explicitly refer to Axelrod for the justification of their approach (Rendell et al. 2010a, 208-209). The model at the basis of the “Social Learning Strategies tournament” is a 100-armed bandit model (Rendell et al. 2010b, 30ff.). Just like the RPD it is a highly stylized and very sparse model: The model assumes an environment with 100 cells representing foraging opportunities. The payoff from foraging is distributed exponentially: few high payoffs, many low or even zero payoffs. In each round of the game the players can choose between three possible moves: INNOVATE where they receive information about the payoff opportunity in a randomly picked cell; EXPLOIT where players forage one of their known cells to receive a payoff; OBSERVE where a player receives slightly imprecise information about the foraging opportunities that other players are exploiting. Arbitrarily many players can occupy one cell. The resources never expire, but the environment changes over time so that the players’ information about good foraging opportunities gets outdated after a while. The payoffs drive a population dynamical model where players live and die and are replaced by new players depending on the success of the existing players.
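To make the structure of the model concrete, the three moves can be sketched in a few lines of Python. This is only an illustrative reading of the description above, not the tournament's actual code: the class names, the rate of the exponential distribution, the per-round change probability `P_CHANGE`, the Gaussian form and size of the observation error `OBSERVE_NOISE`, and the rule that EXPLOIT always picks the best known cell are all assumptions introduced here. The population dynamics (birth and death of players) are left out.

```python
import random

N_CELLS = 100        # number of foraging opportunities (from the text)
P_CHANGE = 0.05      # ASSUMPTION: per-round chance that a cell's payoff changes
OBSERVE_NOISE = 1.0  # ASSUMPTION: std. dev. of the error in observed payoffs


def draw_payoff(rng):
    """Exponentially distributed payoff: few high values, many low ones."""
    return rng.expovariate(1.0)  # ASSUMPTION: rate parameter 1.0


class Environment:
    def __init__(self, rng):
        self.rng = rng
        self.payoffs = [draw_payoff(rng) for _ in range(N_CELLS)]

    def step(self):
        """Let payoffs drift so that players' knowledge becomes outdated."""
        for cell in range(N_CELLS):
            if self.rng.random() < P_CHANGE:
                self.payoffs[cell] = draw_payoff(self.rng)


class Player:
    def __init__(self):
        self.known = {}             # cell index -> last payoff estimate
        self.last_exploited = None  # cell other players can observe
        self.total = 0.0            # accumulated payoff

    def innovate(self, env):
        """Learn the exact payoff of one randomly picked cell."""
        cell = env.rng.randrange(N_CELLS)
        self.known[cell] = env.payoffs[cell]

    def observe(self, env, others):
        """Learn, with some error, the payoff of a cell another player exploits."""
        models = [p for p in others if p.last_exploited is not None]
        if not models:
            return
        cell = env.rng.choice(models).last_exploited
        noisy = env.payoffs[cell] + env.rng.gauss(0.0, OBSERVE_NOISE)
        self.known[cell] = max(0.0, noisy)

    def exploit(self, env):
        """Forage the best known cell; several players may share a cell."""
        if not self.known:
            return 0.0
        cell = max(self.known, key=self.known.get)
        payoff = env.payoffs[cell]  # the true current payoff, not the estimate
        self.known[cell] = payoff
        self.last_exploited = cell
        self.total += payoff
        return payoff
```

Even in this reduced form, only EXPLOIT yields a payoff, while INNOVATE and OBSERVE merely update what a player believes about the cells. Any ranking of strategies therefore depends on the chosen change rate and error size, which are free parameters of the sketch.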

The most important result of the tournament was that, under the conditions of this specific model, the best strategies relied almost entirely on social learning, i.e. playing OBSERVE. Playing INNOVATE hardly ever paid off.[5] Beyond that, the ratio between OBSERVE moves and EXPLOIT moves was crucial to success. Too few OBSERVE moves would leave a player stuck with poor payoffs. Too many OBSERVE moves would mean that payoffs were not gathered often enough, which results in a lower average payoff. Finally, the right estimate of expected payoffs was important. The winning strategy and the second best strategy used the same probabilistic standard formula to estimate the expected payoff values (Rendell et al. 2010a, 211).

The authors themselves make every effort to present their findings as a sort of scientific novelty. For that purpose they employ a framing narrative that links their model with an important research question, prior research and successful (or believed to be successful) past role models. The broader research question, mentioned in the beginning of the paper, to which the model is related is how cultural learning has contributed to the success of humans as a species: “Cultural processes facilitate the spread of adaptive knowledge, accumulated over generations, allowing individuals to acquire vital life skills. One of the foundations of culture is social learning,...” (Rendell et al. 2010a, 208). Surely, this is a worthwhile scientific question.

As to prior research, they refer to theoretical studies. These, however, only “have explored a small number of plausible learning strategies” (Rendell et al. 2010a). The tournament was therefore conducted to gather a contingent but large selection of strategies. The tournament’s results are then described as “surprising results, given that the error-prone nature of social learning is widely thought to be a weakness of this form of learning ... These findings are particularly unexpected in the light of previous theoretical analyses ..., virtually all of which have posited some structural cost to asocial learning and errors in social learning.” (Rendell et al. 2010a, 212).

Thus, the results of the tournament constitute a novelty, even a surprising novelty. The surprising character of the results is strongly underlined by the authors of the study: “The most important outcome of the tournament is the remarkable success of strategies that rely heavily on copying when learning in spite of the absence of a structural cost to asocial learning, an observation evocative of human culture. This outcome was not anticipated by the tournament organizers, nor by the committee of experts established to oversee the tournament, nor, judging by the high variance in reliance on social learning ..., by most of the tournament entrants.” (Rendell et al. 2010a, 212) Again, however, it is not surprising but to be expected that one reaches results that differ from previous research if one uses a different model.

Axelrod’s tournament plays an important role as historical paragon in the framing narrative: “The organization of similar tournaments by Robert Axelrod in the 1980s proved an extremely effective means for investigating the evolution of cooperation and is widely credited with invigorating that field.” (Rendell et al. 2010a, 208). But as mentioned earlier, the general conclusions that Axelrod drew from his tournament had already turned out not to be tenable, and the research tradition he initiated did not really yield any empirically applicable simulation models. Nonetheless, the authors seem to consider it an advantage that “Axelrod’s cooperation tournaments were based on a widely accepted theoretical framework for the study of cooperation: the Prisoner’s Dilemma.” (Rendell et al. 2010a, 209). However, the wide acceptance of the Prisoner’s Dilemma model says more about fashions in science than about the explanatory power of this model. Although the multi-armed bandit model is not as widely accepted as the Prisoner’s Dilemma, the authors are confident that “the basic generality of the multi-armed bandit problem we posed lends confidence that the insights derived from the tournament may be quite general.” (Rendell et al. 2010a, 212). But the generality of the problem does not guarantee that the conclusions are generalizable beyond the particular model that was used to describe the problem. Quite the contrary, the highly stylized and abstract character of the model raises doubts whether it will be applicable without ambiguity in many empirical instances. The generality of the model does not imply that it is of general relevance for the explanation of empirical instances of social and asocial learning, nor should it, as I believe, lend any confidence in that direction to the cautious scientist. This simply remains to be seen. If anything, it is the model’s robustness with respect to changes of the parameter values that lends some confidence in the applicability of the tournament’s results. Robustness is, of course, only one of several necessary prerequisites for the empirical applicability of a model.

To sum up, it is mostly by virtue of its framing narrative that the tournament’s results appear as a novel, important or surprising theoretical achievement. If one follows the line of argument given here, however, then the model, being hardly empirically grounded and not at all empirically validated, represents just one among many possible ways of modeling social learning. In this respect it is merely another grain of dust in the inexhaustible space of logical possibilities.

[5] This was partly due to an oversight in the design of the model: because of random errors, OBSERVE moves could serve much the same function as INNOVATE moves. The authors of the study did, however, verify that their results are not just due to this particular effect (Rendell et al. 2010b, 21f.).
