can we trust computer simulations?
How can one validate the outcome of a simulation model? Can we even imagine a validation of this outcome? This was the starting question for the conference I attended in Hannover. Which obviously engaged me to the utmost. Relating to past experiences like advising a student working on accelerated tests for fighter electronics. And failing to agree with him on validating a model to transfer those accelerated tests to a realistic setting. Or reviewing this book on climate simulation three years ago while visiting Monash University. Since I discuss most talks of the day in detail below, here is an opportunity to opt out!
Bill Oberkampf, co-author with C. Roy (who could not deliver a second keynote lecture on the same topic due to travel issues) of Verification and Validation in Scientific Computing (2010), presented his principles as skepticism, testing and assessment, and pragmatism. But what is evidence? This would need to be defined before speaking of verification or validation, the latter being hard to buy since we can never truly validate a model but at best fail to invalidate it. I agree with his statement that the most important issue is predictive accuracy and the associated predictive uncertainty, for a model is almost always wrong and hence it makes little sense to try to validate it as “truth”. Oberkampf tried to draw a distinction between statistical inference and physical inference, apparently because statistics (a) is more concerned with variance than bias and (b) cannot extrapolate! Which was somewhat contradicted by his following statement that engineers have to use expertise and knowledge to build this extrapolation, since this means using prior knowledge, essentially… He then discussed Tony O’Hagan’s views on model uncertainty but did not seem to acknowledge the potential of non-parametric models, with statements like increasing the number of parameters would decrease the impact of the choice of a model. Or not, since increasing that number to infinity would indeed lead to non-parametric models. When advocating imprecise probabilities as a necessity, Oberkampf also attacked the Bayesian treatment of model uncertainty as random. (On a general basis, I am not convinced by imprecise probabilities, as briefly discussed in The B Choice.) Another issue: calibration was opposed to validation (or estimation?) in the talk and endowed with a negative value, a stigma, which I did not fully understand. Is it because it seems more “subjective” than inference?
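To make concrete the Bayesian treatment of model uncertainty that Oberkampf objected to, here is a minimal sketch of my own (not from the talk): model choice is itself treated as random, with posterior model weights updated from the data. The toy data and the use of BIC as a crude stand-in for the marginal likelihood are both my assumptions.

```python
# A hedged toy illustration of Bayesian model uncertainty: the model index
# gets posterior weights, rather than one model being declared "valid".
import math

def gaussian_loglik(data, mean, sd):
    # log-likelihood of iid Normal(mean, sd) observations
    return sum(-0.5 * math.log(2 * math.pi * sd**2)
               - (x - mean)**2 / (2 * sd**2) for x in data)

data = [1.9, 2.1, 2.0, 2.2, 1.8, 2.05]   # made-up measurements
n = len(data)

# Model 1: mean fixed at 0 (no free parameter)
ll1 = gaussian_loglik(data, 0.0, 1.0)
# Model 2: mean fitted from the data (one free parameter)
mle = sum(data) / n
ll2 = gaussian_loglik(data, mle, 1.0)

# BIC = k*log(n) - 2*loglik (lower is better); crude marginal-likelihood proxy
bic1 = 0 * math.log(n) - 2 * ll1
bic2 = 1 * math.log(n) - 2 * ll2

# Posterior model probabilities under equal prior weights
w1, w2 = math.exp(-0.5 * bic1), math.exp(-0.5 * bic2)
p2 = w2 / (w1 + w2)
print(f"posterior probability of the fitted-mean model: {p2:.4f}")
```

The point of the sketch is that neither model is “validated”: both survive with a posterior weight, which is precisely the randomness over models that the talk criticised.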
“Computer simulation has a distinct epistemology. In other words, the techniques simulationists use to attempt to justify simulation are unlike anything that usually passes for epistemology in the philosophy of science literature. I would like to focus on three of the unusual features of this epistemology: it is downward, it is autonomous due to a scarcity of data, and it is motley.” Eric Winsberg (2001)
As a philosopher, Claus Beisbart, one of the organisers of this workshop (and editor of the book Probabilities in Physics a few years ago), started the philosophical debate about the notion of validation per se. First, is there a truth behind the outcome of a computer simulation? The answer is somewhat frequentist in that repeating a computer experiment brings a representation of the system behind those experiments. (Truth is anyway such a loaded term that it had better be avoided.) The next step is moving from “truth” to “confirmation”, referring to Hempel’s Bayesian epistemology. Versus Popper’s falsificationism. Then even more narrowly to models (and simulations) being useful for making inferences about a system. Interesting quote from Eric Winsberg above, even though I feel it drifts away from the central question. Back to the theme, Claus estimates that a general theory of validation is not possible, but also that we can go beyond mere testing. Again, this leaves the validity of a prediction open. In that it seems to restrict validation to a model rather than to the real phenomenon.
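The frequentist reading above can be made concrete with a trivial sketch of mine (not Beisbart’s): repeating a stochastic computer experiment yields an empirical distribution that stands in for the “truth” of the simulated system. The toy simulator and its hidden constant are my assumptions.

```python
# Repeating a stochastic simulation: the empirical average over runs
# represents the system behind the experiments.
import random

random.seed(1)  # reproducible runs

def simulate_once():
    # toy stochastic simulator: noisy measurement of a hidden constant 3.0
    return 3.0 + random.gauss(0.0, 1.0)

runs = [simulate_once() for _ in range(10_000)]
estimate = sum(runs) / len(runs)
print(f"mean over repeated runs: {estimate:.2f}")  # close to the hidden 3.0
```

Of course this only “validates” the simulator against itself; whether the hidden constant matches the real phenomenon is exactly the question the workshop left open.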
“Verification can be thought of as solving the chosen equations correctly, while validation is choosing the correct equation in the first place.” C. Roy (2005)
Eric Winsberg (great artwork on his book cover!) came back to his quote after lunch. With a claim that validation and verification cannot be completely separated. Opposing statements like Roy’s above. (The same Roy who co-authored Verification and Validation in Scientific Computing.) He then went into a fairly realistic representation of model building under computational constraints. And then questioned the mathematical nature of verification and the physical nature of validation in such a complex construct. This considerably weakens the validation of a theoretical model. And reinforces his argument in favour of a specific epistemology of computer experiments. The interesting point for me is how he pointed out the gap between justifying the computer model versus justifying the analytical model.
In the application session, Aki Lehtinen, working on voting models, started his talk by trying to explain why economists shun simulation (and hence why his papers keep being rejected for this very reason!). With added difficulties of his own making (!) due to working on supercomputers and in Fortran, which seriously shrinks the pool of reviewers able to reproduce his results and check his code. Which also raises the question of how journals can validate big computer codes. And brings us back to the topic of this workshop.

The next talk was by Alan Calder, about uncertainty quantification in astrophysics. Which was interesting (for me) as a reminder of star evolution from cloud to star to red giant to white dwarf, with the three branches in the temperature-luminosity (HR) diagram. But less clear from a simulation-validation perspective, except for showing that the MESA simulator was poorly reproducing the HR diagram.

Wilfred van Gunsteren gave an overview of validating simulations in molecular dynamics. With the specific feature of offering several levels of granularity. And a neat picture where simulation results followed experimental ones over the years, showing a bias in most of those studies. And a list of the seven sins of academic publication. A nice talk that however drifted somewhat away from validation towards reproducibility.

Heinke Schlünzen gave the last talk of the day, on atmospheric models. Giving us more reasons for evaluating computer models and a generic protocol for doing so. Also mentioning the issue of multiple granularity (in scale, resolution, scope, range).