Tools or Toys?
On Specific Challenges for Modeling and the Epistemology of Models in the Social Sciences

Eckhart Arnold

1 Introduction
2 The role of models in science
3 Why computer simulations are merely models and not experiments
4 The epistemology of simulations at work: How simulations are used to study chemical reactions in the ribosome
5 How do models explain in the social sciences?
6 Common obstacles for modeling in the social sciences
7 Conclusions

4 The epistemology of simulations at work: How simulations are used to study chemical reactions in the ribosome

The example from simulations in the natural sciences that I am going to discuss comes from the field of biochemistry. It concerns ongoing research about how peptide bonds between amino acids are formed in the ribosome molecule.[6] The ribosome is a macromolecule in the cells of living organisms which assembles amino acids into proteins according to the information on the messenger RNA (mRNA), which in turn is a copy of the genetic information stored in the cell's DNA. The process proceeds roughly as follows: The ribosome receives a new transfer RNA (tRNA) molecule with an attached amino acid at a specific location called the ribosome's amino site. At the amino site the tRNA molecule is bound to the chunk of mRNA that is currently “read” by the ribosome. (Which kind of tRNA molecule, and therefore which amino acid, can enter the amino site depends on the chunk of mRNA that occupies the amino site during this step of the whole process of protein formation.) The amino site is spatially close to the peptide site of the ribosome, where another tRNA molecule is located, the amino acid of which is already attached to the evolving protein. In a process called peptide bond formation the amino acid of the “new” tRNA at the amino site is connected to the amino acid of the “old” tRNA at the peptide site. Finally, the tRNA at the peptide site is released (having given away its amino acid) and the ribosome moves forward along the mRNA chain so that the tRNA that was received at the amino site now occupies the peptide site. This whole process is catalysed by the ribosome. Just how the peptide bond formation is mediated is a question that researchers are currently investigating. With the means available today it is extremely difficult, if not impossible, to investigate this question experimentally. Experimental data is only available on certain features of the reaction, most notably on the reaction barriers (i.e. the differences in energy that must be overcome for the reaction to take place). Therefore, molecular dynamics simulations are used to study how the peptide bond formation is catalysed by the ribosome.
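To make the notion of a reaction barrier a little more concrete, the following is a minimal sketch of how lowering a barrier translates into a faster reaction. It is not taken from Kaestner/Sherwood (2010). It assumes the Eyring equation of transition state theory, k = (k_B T / h) exp(-ΔG‡ / RT), as a rough textbook estimate, and the function names and barrier values are hypothetical illustrations rather than results of the study.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K
H = 6.62607015e-34    # Planck constant in J*s
R = 8.314462618       # molar gas constant in J/(mol*K)


def eyring_rate(barrier_kj_per_mol: float, temperature_k: float = 298.15) -> float:
    """Rate constant (1/s) for a given free-energy barrier, via the Eyring equation."""
    exponent = -barrier_kj_per_mol * 1000.0 / (R * temperature_k)
    return (K_B * temperature_k / H) * math.exp(exponent)


def rate_enhancement(uncatalysed_kj: float, catalysed_kj: float,
                     temperature_k: float = 298.15) -> float:
    """Factor by which the reaction speeds up when the barrier is lowered."""
    return eyring_rate(catalysed_kj, temperature_k) / eyring_rate(uncatalysed_kj, temperature_k)


# Hypothetical barriers, chosen only for illustration (not values from the study).
speed_up = rate_enhancement(uncatalysed_kj=80.0, catalysed_kj=60.0)
```

Under these assumptions, lowering a barrier by 20 kJ/mol at room temperature speeds the reaction up by a factor on the order of a few thousand, which illustrates why the height of the barrier is the quantity on which both the experiments and the simulations focus.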

In the following I am going to look at one such simulation study (Kaestner/Sherwood 2010). The questions that concern me here are under what kind of “epistemic situation” these simulation studies take place and whether the previously established epistemological categories can roughly capture this situation. In order to answer these questions, we shall work our way backwards from the results that were obtained in this study to how these results were obtained.

The results that were found in the study are:

  1. The ribosome performs its catalytic function of reducing the energy barrier of the peptide bond formation reaction “by the electrostatic influence of the environment rather than just a favorable positioning of the reactants. The high concentration of mobile ions in the ribosome was found to be the key to the catalytic activity of the ribosome” (Kaestner/Sherwood 2010, p. 304). The conclusion was reached by comparing the simulations of the reaction in the ribosome with simulations of the reaction in the gas phase.[7] The average reaction barrier that was found in the simulations was “in good agreement with experimental data” (Kaestner/Sherwood 2010, p. 304).
  2. Both of the two different reaction mechanisms (“direct proton transfer” and “proton shuttle”) that were studied may indeed account for the proton transfer. “Both were found to have similar activation energies. They may compete in the real system.” (Kaestner/Sherwood 2010, p. 304) At least for the time being, the simulation results do not allow either of these mechanisms to be definitively excluded.
  3. The possible occurrence of a certain “tetrahedral intermediate” in the course of the reaction is “irrelevant for the reaction mechanism”. This conclusion could be drawn from the simulations, because “no minimum corresponding to a tetrahedral intermediate was found on the free-energy surface”, as should have been the case if it played a vital role in the reaction. The diagnosis of irrelevancy is furthermore strengthened by results in the literature (Kaestner/Sherwood 2010, p. 300).
  4. For one scenario a discrepancy between simulation results and experimental data occurred: “The free-energy simulations for the direct proton-transfer mechanism resulted in a significantly higher free energy of activation than the potential energy barrier.” (Kaestner/Sherwood 2010, p. 304)

How were these results arrived at and how do the above-mentioned “sources of credibility” come into play here? In order to investigate the process of peptide bond formation the researchers conducted a series of computer simulations of the ribosome of the bacterium Thermus thermophilus. The fundamental scientific theory upon which these simulations rest is quantum mechanics. Needless to say, quantum mechanics is a scientific theory that is extremely well confirmed both quantitatively and qualitatively and that has no competitors in the applicable areas of physics and chemistry. Researchers believe this theory to describe realistically, on the most basic level, just how things happen in physics. Unfortunately, quantum mechanics is computationally much too expensive to simulate a whole ribosome molecule. (The ribosome of Thermus contains roughly 2.6 million atoms.) A feasible approach to keep computational costs in check is, therefore, to use combined quantum mechanics and molecular mechanics simulations (QM/MM simulations), where only the crucial parts of the reaction are rendered with quantum mechanics. This was also done here. Molecular mechanics is not a fundamental theory but can be considered a simplified theoretical approximation that works well enough for some purposes. Just like quantum mechanics, it has demonstrated its suitability in many applications. Thus, as far as the credibility of the background theories goes, we have here the ideal case of extremely powerful and at the same time very well-confirmed background theories that cover the phenomenon under study. This situation seems to be typical for some areas of the natural sciences, though not for all of them (climate simulations being one exception).

Apart from the background theories, there is quite a bit of factual background knowledge that enters into the simulations. The basic function of the ribosome has been understood since the middle of the 20th century and its structure has been known since the 1970s. Thus, if scientists simulate the ribosome today they can draw on a wealth of more or less reliable background knowledge that has already been collected. Just how reliable some parts of this knowledge are is almost impossible to judge for a non-expert. A non-expert can at best rely on the general standards of trustworthiness of the science concerned. In order to do so, some general knowledge of the science concerned is still necessary. The only alternative for assessing the reliability of scientific knowledge as a complete non-expert would be to wait for technical applications of this knowledge, the success or failure of which is obvious even to the most ignorant and uneducated person. It suffices, however, if at least experts, when in doubt, are able to trace the assumed background knowledge back to its sources.[8] The simulation study discussed here could not have been done if a model of the ribosome did not already exist. Also, the background knowledge was important for deciding which alternative mechanisms of peptide bond formation (“direct transfer”, “proton shuttle”) to examine in the first place. And it was used as a source of credibility by noting agreement with results in the literature.

The application of a theory to a particular problem is by no means a trivial task and often requires no less inventiveness than the development of a new theory. In the example case discussed here, numerous different problems had to be solved and quite a range of different technologies had to be applied. This is where what Eric Winsberg calls the “tricks of trade” (Winsberg 2001, p. 444) come into play and where the simulation relies on what I have earlier termed the credibility of simulation techniques (see point 2.2). I am going to point out just a few of these:

  1. In order to build a “hybrid” QM/MM simulation it must be decided which parts of the reaction are to be included in the quantum mechanics part, which are calculated with molecular mechanics, and how these parts are to be linked (Kaestner/Sherwood 2010, p. 295). The scientists still consider this choice to be rather straightforward, though.
  2. While the ribosome model used already existed, the simulation system needed to undergo a complicated procedure of preparation and equilibration (Kaestner/Sherwood 2010, p. 295).
  3. The simulations made use of the so-called density functional theory (DFT), an approximation to quantum mechanics. For some parts of the simulation the respective calculations would have required far too much time. For these parts, the simplified “semi-empirical” SCC-DFTB method had to be used. Although it has been compared to DFT and found to deliver similar results, the use of this less exact method is considered one possible explanation for the discrepancy with empirical data which was detected at one point (Kaestner/Sherwood 2010, p. 304).
  4. Finally, the simulation is realized within the “Chemshell” simulation framework (Chemshell), which of course also falls under the heading of “simulation techniques”.

Summing up, the simulation makes use of well-reputed techniques, and where in doubt (as in the case of SCC-DFTB), further testing is done.

The simulation study discussed here does not exclusively rely on background theory, background knowledge and simulation techniques. Where possible, and in so far as it is possible, its results are compared to experimental data. (Experimental data does introduce questions of reliability of its own, which it would lead too far afield to go into here. But even if it is not totally reliable, the comparison with empirical data is meaningful, because the experimental results are generated independently, and if they match the simulation results then this does at least add some mutual “holistic” credibility to both of them.) The fact that the experimentally determined reaction barrier matches the barrier found in the simulations (within the error bar) strengthens the credibility of the first two above-mentioned results concerning the role of mobile ions and the relative importance of the two alternative mechanisms of proton transfer in the peptide bond formation. The third result, concerning a “tetrahedral intermediate”, seems to be more or less a purely theoretical result. The fourth result in turn is obviously due to empirical testing. Just as if done by the book (and as it would please philosophers of science such as Karl Popper or Imre Lakatos), the contradictory empirical evidence is taken as a discrepancy (though for good reasons not already a total disconfirmation) that demands explanation and gives rise to new research questions.
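What it means for a simulated and a measured barrier to match “within the error bar” can be sketched as follows. This is only an illustration under stated assumptions: the criterion (the difference must not exceed the two uncertainties combined in quadrature) is one common convention and not necessarily the one applied in the study, and the type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """A value with its uncertainty, both in the same unit (e.g. kJ/mol)."""
    value: float
    uncertainty: float


def agrees_within_error(simulated: Estimate, experimental: Estimate) -> bool:
    """True if the two estimates differ by no more than their combined uncertainty.

    Assumption: independent uncertainties, combined in quadrature.
    """
    combined = (simulated.uncertainty ** 2 + experimental.uncertainty ** 2) ** 0.5
    return abs(simulated.value - experimental.value) <= combined
```

For example, with placeholder numbers, `agrees_within_error(Estimate(10.0, 1.0), Estimate(10.5, 1.0))` counts as agreement, whereas two estimates of 10.0 and 12.0 with uncertainties of 0.1 each do not. A match of this kind lends credibility to the simulation, while a mismatch is treated as a discrepancy that calls for explanation.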

Where does this all leave us? First of all, the example (hopefully) shows that the above-stated categories (“sources of credibility”) allow by and large for an analysis of the epistemic situation of a typical computer simulation. All three sources of credibility come into play here, and at the same time nothing important seems to have been left out. The example, furthermore, seems to support the contention that even when empirical data is too sparse to conclude what mechanisms are at work in the target system from empirical data alone, it may still be good enough for the validation of computer simulations that serve as a tool to identify these mechanisms. If this can be granted then the example provides evidence for the “synergy of sources of credibility” as stated above (point 2.2).

This is not to say that merely on the basis of such an analysis an evaluation of the credibility of the simulation would be possible. (This would require expert knowledge of the field under study and of the technologies employed in order to evaluate each of the sources of credibility in this particular case.) The claim is merely that these categories help us to understand the general research logic underlying simulation studies such as this one. Although the simulation study itself is quite complicated, its research logic seems to be very straightforward: The simulation is meant to simulate more or less “realistically” how the process of peptide bond formation takes place. It is built upon powerful and empirically confirmed background theories as well as on background knowledge. It employs well-reputed or otherwise tested simulation techniques. Where possible and so far as possible, the results are compared to empirical data. Discrepancies with the empirical data are properly taken care of. In fact, the basic research logic is so clear that there does not even need to be much debate about it. In the following we will see that quite the opposite is true for the research logic of many models and simulations in the social sciences.

[6] I am greatly indebted to Professor Johannes Kästner from the Institute of Theoretical Chemistry at the University of Stuttgart for explaining this fascinating area of research to me. Needless to say, what I write here is my own summary of this research, for which I take full responsibility.

[7] Drawing a rough analogy, one could say that the reaction in the gas phase amounts to what in the social sciences would be termed the “null hypothesis”. As to how far the comparison is warranted, the authors state: “Of course, here the comparison was done with respect to the gas phase. A fairer comparison may be with the reaction in water. However, we doubt the validity of common continuum solvation models for this system, as many of the interactions are hydrogen bonds and interaction with ions (see below) that cannot be covered by continuum solvation models. Taking solvation in water or salt solutions into account explicitly is computationally rather demanding and, therefore, outside the scope of this work.” (Kaestner/Sherwood 2010, p. 8)

[8] Still, the problem of assessing reliability should not be taken lightly. In the social sciences there exist whole simulation-traditions which at no point seem to have a secure foundation in reality (see Arnold (2008)).
