What's wrong with social simulations?

Eckhart Arnold

3 How a model works that works: Schelling’s neighborhood segregation model

Moving from the general finding to particular examples, I now turn to Thomas Schelling’s neighborhood segregation model. The model (Schelling 1971) is widely known and has been amply discussed, not only among economists but also among philosophers of science, as a role model for linking micro-motives with macro-outcomes. I will therefore say little about the model itself and concentrate instead on the question of whether and, if so, how it fulfills my criteria for epistemically valuable simulations.

Schelling’s model was meant to investigate the role of individual choice in bringing about the segregation of neighborhoods that are predominantly inhabited either by blacks or by whites. Schelling considered preference-based individual choice to be one of many possible causes of this phenomenon, and probably not even the most important one, at least not in comparison to organized action and economic factors as two other possible causes (Schelling 1971, 144).

In order to investigate the phenomenon, Schelling used a checkerboard model in which the fields of the checkerboard represent houses. The skin color of the inhabitants can be represented, for example, by pennies that are turned either heads or tails.[4] Schelling assumed that each household has a certain tolerance threshold concerning the number of differently colored inhabitants in its neighborhood; once the threshold is passed, the household moves to another place. A result that was relatively stable across the different variants of the model he examined was that segregated neighborhoods would emerge even if the threshold preference for equally colored neighbors was far below 50%. In other words, segregation emerged even if the inhabitants would have been perfectly happy to live in an integrated environment with a mixed population. As Aydinonat (2007) reports, the robustness of this result has been confirmed by many subsequent studies that employed variants of Schelling’s model. At the end of his paper, Schelling discusses “tipping”, which occurs when the entrance of a new minority starts to cause the evacuation of an area by its former inhabitants. In this connection Schelling also mentions an alternative hypothesis, according to which inhabitants do not react to the frequency of similarly or differently colored neighbors but to their own expectations about the future ratio of differently colored inhabitants. He assumes that this would aggravate the segregation process, but he does not investigate this hypothesis further (Schelling 1971, 185-186), and his model is built on the assumption that individuals react to the actual and not the future ratio of skin colors.
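
The mechanism just described can be made concrete in a few lines of code. The following Python sketch is an illustration under stated assumptions, not a reconstruction of Schelling’s own procedure: the board size, the share of empty fields, the eight-field neighborhood, the rule that discontent households move to a randomly chosen empty field, and all function and parameter names (such as `min_like`, the minimum share of like-colored neighbors a household demands) are choices made for this example.

```python
import random

EMPTY = None  # an unoccupied field of the checkerboard


def make_grid(size, empty_share, rng):
    """Randomly fill a size x size board with 'H' (heads), 'T' (tails) and empty fields."""
    n_cells = size * size
    n_empty = int(n_cells * empty_share)
    n_heads = (n_cells - n_empty) // 2
    cells = ['H'] * n_heads + ['T'] * (n_cells - n_empty - n_heads) + [EMPTY] * n_empty
    rng.shuffle(cells)
    return [cells[r * size:(r + 1) * size] for r in range(size)]


def like_share(grid, r, c):
    """Share of the occupied adjacent fields whose color equals that of field (r, c)."""
    size = len(grid)
    same = other = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) == (0, 0) or not (0 <= rr < size and 0 <= cc < size):
                continue
            if grid[rr][cc] is EMPTY:
                continue
            if grid[rr][cc] == grid[r][c]:
                same += 1
            else:
                other += 1
    total = same + other
    return 1.0 if total == 0 else same / total  # assumption: no neighbors means content


def mean_like_share(grid):
    """Average like_share over all occupied fields (a crude segregation index)."""
    shares = [like_share(grid, r, c)
              for r in range(len(grid)) for c in range(len(grid))
              if grid[r][c] is not EMPTY]
    return sum(shares) / len(shares)


def step(grid, min_like, rng):
    """Let every discontent household move to a random empty field; return the move count."""
    size = len(grid)
    cells = [(r, c) for r in range(size) for c in range(size)]
    empties = [(r, c) for r, c in cells if grid[r][c] is EMPTY]
    movers = [(r, c) for r, c in cells
              if grid[r][c] is not EMPTY and like_share(grid, r, c) < min_like]
    if not empties:
        return 0
    rng.shuffle(movers)
    moves = 0
    for r, c in movers:
        if like_share(grid, r, c) >= min_like:  # content by now because others moved
            continue
        i = rng.randrange(len(empties))
        er, ec = empties[i]
        grid[er][ec], grid[r][c] = grid[r][c], EMPTY
        empties[i] = (r, c)  # the vacated field becomes available
        moves += 1
    return moves


def run(size=20, empty_share=0.1, min_like=0.3, max_steps=200, seed=0):
    """Return the segregation index before and after letting households relocate."""
    rng = random.Random(seed)
    grid = make_grid(size, empty_share, rng)
    before = mean_like_share(grid)
    for _ in range(max_steps):
        if step(grid, min_like, rng) == 0:  # nobody wants to move any more
            break
    return before, mean_like_share(grid)


if __name__ == "__main__":
    before, after = run()
    print(f"mean share of like-colored neighbors: {before:.2f} -> {after:.2f}")
```

With `min_like` set to 0.3, i.e. far below 50%, one can compare the average share of like-colored neighbors at the start and at the end of a run. The exact figures depend on the seed and on the assumptions listed above.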

Is this model scientifically valuable? Can we draw conclusions from this model with respect to empirical reality and can we check whether these conclusions are true? Concerning these questions the following features of this model are important:

The assumptions on which the model rests can be tested empirically. The most important assumption is that individuals have a threshold for how many neighbors of a different color they tolerate and that they move to another neighborhood if this threshold is passed. This assumption can be tested empirically with the usual methods of empirical social research (and, of course, within the limits of these methods). Also, the question of whether people base their decision to move on the frequency of differently colored neighbors or on their own expectations concerning future changes of the neighborhood can be tested empirically.
The model is highly robust. Changes to the basic setting and even fairly large variations of its input parameters (e.g. tolerance threshold, population size) do not lead to a significantly different outcome. Therefore, even if the empirical measurement of, say, the tolerance threshold is inaccurate, the model can still be applied. Robustness in this sense is directly linked to empirical testability. It is best understood as a relational property between the measurement (in-)accuracy of the input parameters and the stability of the output values of a simulation.[5]
The model captures only one of many possible causes of neighborhood segregation. Before one can claim that the model explains or, rather, contributes to an explanation of neighborhood segregation, it is necessary to identify the modeled mechanism empirically and to estimate its relative weight in comparison with other actual causes. While the model shows that even a preference for integrated neighborhoods (if still combined with a tolerance limit) can lead to segregation, it may in reality still be the case that latent or manifest racism causes segregation. The model alone is not an explanation. (Schelling was aware of this.)
Besides empirical explanation another possible use of the model would be policy advice. In this respect the model could be useful even if it does not capture an actual cause. For public policy must also be concerned about possible future causes.
Assume, for example, that manifest racism was a cause of neighborhood segregation, but that due to increasing public awareness racism is on the decline. Then the model can demonstrate that even if all further possible causes, e.g. economic causes, were removed as well, this might still not result in desegregated neighborhoods[6], provided, of course, that the basic assumption about a tolerance threshold is true.
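
The relational concept of robustness mentioned in the list above can be spelled out as a small check: given an estimate of an input parameter and the size of its measurement error, how much does the output vary across the whole error interval? The Python sketch below is only an illustration. The function names, the `tolerance` parameter and the two toy models are hypothetical stand-ins introduced for this example; in practice `model` would be a run of the actual simulation.

```python
def output_spread(model, estimate, error, n_points=11):
    """Largest difference in model output over [estimate - error, estimate + error]."""
    low = estimate - error
    width = 2 * error / (n_points - 1)
    outputs = [model(low + i * width) for i in range(n_points)]
    return max(outputs) - min(outputs)


def is_robust(model, estimate, error, tolerance):
    """Robust relative to this measurement error if the output spread stays small."""
    return output_spread(model, estimate, error) <= tolerance


# Hypothetical toy models, standing in for simulation runs:

def plateau_model(x):
    """The output hardly depends on the input parameter."""
    return 0.9 + 0.01 * x


def knife_edge_model(x):
    """The output flips as soon as the input parameter crosses 0.3."""
    return 1.0 if x >= 0.3 else 0.0


if __name__ == "__main__":
    # Same measurement error, different models:
    print(is_robust(plateau_model, estimate=0.3, error=0.1, tolerance=0.05))
    print(is_robust(knife_edge_model, estimate=0.3, error=0.1, tolerance=0.05))
    # Same model, but a precise measurement far away from the critical value:
    print(is_robust(knife_edge_model, estimate=0.5, error=0.01, tolerance=0.05))
```

The last call makes the relational character visible: whether a model counts as robust depends not on the model alone but on how accurately its input parameters can be measured.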

Thus, for the purpose of policy advice a model does not need to capture actual causes. It can be counter-factual, but it must still be realistic in the sense that its basic assumptions can be empirically validated. Therefore, while the purpose of policy advice justifies certain counter-factual assumptions in a model, it cannot justify unrealistic and unvalidated models. This generally holds for models that are meant to describe possible instead of actual scenarios.

Schelling did not validate his model empirically. But for classifying the model as useful it is sufficient that it can be validated. Now, the interesting question is: Can the model be validated, and is it valid? Recent empirical research on neighborhood segregation suggests that inhabitants react to anticipated future changes in the frequency of differently colored neighbors rather than to the frequency itself (Ellen 2000, 124-125). An important role is played by the fear of whites that they might end up in an all-black neighborhood. Thus, the basic assumption of the model, that individuals react to the ratio of differently colored inhabitants in their neighborhood, is wrong, and one can say that the model is in this sense falsified.[7]

The strong emphasis that is placed on empirical validation here stands in contrast to some of the epistemological literature on simulations and models. Robert Sugden, noticing that “authors typically say very little about how their models relate to the real world”, treats models like that of Schelling (which is one of his examples (Sugden 2000, 6-8)) as “credible counterfactual worlds” (Sugden 2009, 3) which are not intended to raise any particular empirical claims. Even though the particular relation to the real world is not clear, Sugden believes that such models can inform us about the real world. His account suffers from the fact that he remains unclear about how we can tell a counter-factual world that is credible from one that is incredible, if there is no empirical validation.

A possible candidate for filling this gap in Sugden’s account is Kuorikoski’s and Lehtinen’s concept of “derivational robustness analysis” (Kuorikoski/Lehtinen 2009). According to this concept, conclusions from unrealistic models to reality might be vindicated if the model remains robust under variations of its unrealistic assumptions. For example, in Schelling’s model the checkerboard topography could be replaced by other topographies (Aydinonat 2007, 441). If the model still yields the same results about segregation, we are, if we follow the idea of “derivational robustness analysis”, entitled to draw the inductive conclusion that the model’s results would still be the same if the unrealistic topographies were replaced by the topography of some real city, even though we have not tested it with a real topography. A problem with this account is that it requires an inductive leap of a potentially dangerous kind: How can we be sure that the inductive conclusion derived from varying unrealistic assumptions holds for the conditions in reality, which differ from any of these assumptions?
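
What varying the topography amounts to can be sketched in code. The Python example below runs one and the same threshold rule on three stylized topographies: a board with eight-field neighborhoods, a board with four-field neighborhoods, and a closed line. It is a minimal illustration under assumptions of its own: the three topographies, the random relocation rule, the parameter values and all names are chosen for this example and are not taken from the cited studies.

```python
import random

MOORE = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def grid_topography(size, offsets):
    """Map every field of a size x size board to the list of its neighboring fields."""
    return {r * size + c: [(r + dr) * size + (c + dc) for dr, dc in offsets
                           if 0 <= r + dr < size and 0 <= c + dc < size]
            for r in range(size) for c in range(size)}


def ring_topography(n, reach):
    """Fields on a closed line; neighbors are the `reach` fields on either side."""
    return {i: [(i + d) % n for d in range(-reach, reach + 1) if d != 0]
            for i in range(n)}


def like_share(state, nbrs, i):
    """Share of the occupied neighbors of field i that have the same color as i."""
    occupied = [state[j] for j in nbrs[i] if state[j] is not None]
    if not occupied:
        return 1.0  # assumption: a household without neighbors is content
    return sum(1 for color in occupied if color == state[i]) / len(occupied)


def mean_like_share(state, nbrs):
    """Average like_share over all occupied fields."""
    shares = [like_share(state, nbrs, i) for i in nbrs if state[i] is not None]
    return sum(shares) / len(shares)


def simulate(nbrs, min_like=0.3, empty_share=0.1, max_sweeps=200, seed=0):
    """Run the threshold rule on a given topography; return the index before and after."""
    rng = random.Random(seed)
    n = len(nbrs)
    n_empty = int(n * empty_share)
    n_heads = (n - n_empty) // 2
    state = ['H'] * n_heads + ['T'] * (n - n_empty - n_heads) + [None] * n_empty
    rng.shuffle(state)
    before = mean_like_share(state, nbrs)
    for _ in range(max_sweeps):
        empties = [i for i in range(n) if state[i] is None]
        movers = [i for i in range(n)
                  if state[i] is not None and like_share(state, nbrs, i) < min_like]
        if not movers or not empties:
            break
        rng.shuffle(movers)
        for i in movers:
            if like_share(state, nbrs, i) >= min_like:  # content by now
                continue
            k = rng.randrange(len(empties))
            j = empties[k]
            state[j], state[i] = state[i], None  # move to a random empty field
            empties[k] = i
    return before, mean_like_share(state, nbrs)


def compare_topographies(seed=0):
    """Apply the same rule to three stylized (and equally unrealistic) topographies."""
    topographies = {
        "board, 8 neighbors": grid_topography(20, MOORE),
        "board, 4 neighbors": grid_topography(20, VON_NEUMANN),
        "closed line, 8 neighbors": ring_topography(400, 4),
    }
    return {name: simulate(nbrs, seed=seed) for name, nbrs in topographies.items()}


if __name__ == "__main__":
    for name, (before, after) in compare_topographies().items():
        print(f"{name}: {before:.2f} -> {after:.2f}")
```

Agreement across such variants is exactly the kind of evidence that derivational robustness analysis relies on. Note that every topography in the sketch is stylized; the code itself says nothing about how the rule would behave on the topography of a real city.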

Some philosophers also dwell on the analogy between simulations and experiments and consider simulations as “isolating devices” similar to experiments (Maeki 2009). But the analogy between simulations and experiments is rather fragile because, unlike experiments, simulations are not empirical and do not allow us to learn anything about the world apart from what is implied in the premises of the simulation. In particular, without some kind of empirical validation we can never be sure whether the causal mechanism modeled in the simulation represents a real cause isolated in the model or does not exist in reality at all.

To sum up, it is difficult, if not impossible, to claim that models can inform us about reality without any kind of empirical validation. Schelling’s model, however, appears to be a scientifically useful model, at least in the sense that it can be validated (or falsified, for that matter). The most decisive features of the model in this respect are its robustness and the practical feasibility of identifying the modeled cause in empirical reality. Next we will see how models fare when these features are not present.

[4] Schelling’s article was published before personal computers existed. Today one would of course use a computer. A simple version of Schelling’s model can be found in the NetLogo models library (Wilensky 1999).

[5] There are of course different concepts of robustness. I consider this relational concept of robustness to be the most important one. An important non-relational concept of robustness is that of derivational robustness analysis (Kuorikoski/Lehtinen 2009). See below.

[6] But then, would we really worry about segregated neighborhoods if the issue wasn't tied to racial discrimination and social injustice? After all, ethnic or religious groups in Canada also often live in segregated areas (“Canadian mosaic”), but unlike in the U.S. this is hardly an issue there. Therefore, Schelling's model, for all its epistemological merits that are discussed here, really seems to miss the point in terms of scientific relevance. Discrimination is the important point here, not segregation. But Schelling's model induces us to frame the question in a way that makes us miss the point. (This comment has been added later as the result of some discussions I had on this point. E.A., March 25th 2016.)

[7] There are two senses in which a model (or, more precisely, a model-based explanation) can be falsified: a) if the model’s assumptions are not empirically valid, as in this case, and b) if the causes the model captures (i) are blocked by factors not taken into account in the model, (ii) cannot be disentangled from other possible causes, or (iii) turn out to be irrelevant in comparison with other, stronger or otherwise more important causes of the same phenomenon. The connection between the model’s assumptions and its output, being a logical one, can of course not be empirically falsified.