## Topological sensitivity analysis for systems biology

Michael Stumpf sent me Topological sensitivity analysis for systems biology, written by Ann Babtie and Paul Kirk, *en avant-première* before it came out in PNAS, and I read it during the trip to NIPS in Montréal. (The paper is published in open access, so everyone can read it now!) The topic is quite central to a lot of debates about climate change, economics, ecology, finance, &tc., namely to assess the impact of using the wrong model to draw conclusions and make decisions about a real phenomenon. (Which reminded me of the distinction between mechanical and phenomenological models stressed by Michael Blum in his NIPS talk.) And it is of much interest from a Bayesian point of view, since assessing the worth of a model requires modelling the "outside" of a model, using for instance Gaussian processes as in the talk Tony O'Hagan gave in Warwick earlier this term. I would even go as far as saying that the issue of assessing [and compensating for] how wrong a model is, given available data, may be the (single) most under-assessed issue in statistics. We (statisticians) have yet to reach our Boxian era.

In Babtie et al., the space or universe of models is represented by network topologies, each defining the set of "parents" in a semi-Markov representation of the (dynamic) model, at which stage Gaussian processes are also called in for help. Alternative models are then ranked in terms of fit according to a distance between the data simulated from the original model and from each alternative (which sounds like a form of ABC?!). Obviously, there is a limitation in the number and variety of models considered this way: assumptions are still made about the possible models, even though the number of candidates grows quickly with the number of nodes. As pointed out in the paper (see, e.g., Fig. 4), the method has, to some extent, a parametric bootstrap flavour.
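To make this ranking idea concrete, here is a minimal sketch of scoring candidate topologies by the distance between their simulated trajectories and data simulated from a reference model. Everything below (the linear dynamics, the three candidate networks, the Euclidean distance) is invented for illustration and not taken from the paper:

```python
import numpy as np

def simulate(adjacency, x0, steps=50, dt=0.1):
    """Simulate simple linear dynamics dx/dt = A @ x on a given topology."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(steps):
        x = x + dt * adjacency @ x
        traj.append(x.copy())
    return np.array(traj)

# Reference topology: a node 0 -> node 1 -> node 2 cascade with decay.
A_true = np.array([[-0.5, 0.0, 0.0],
                   [ 0.4, -0.5, 0.0],
                   [ 0.0,  0.4, -0.5]])
data = simulate(A_true, [1.0, 0.0, 0.0])

# Candidate topologies: single-edge alterations of the reference.
candidates = {
    "true":       A_true,
    "no_12_edge": np.array([[-0.5, 0, 0], [0.4, -0.5, 0], [0, 0, -0.5]]),
    "rewired":    np.array([[-0.5, 0, 0], [0, -0.5, 0], [0.4, 0, -0.5]]),
}

# Rank by Euclidean distance between simulated trajectories and the data.
scores = {name: np.linalg.norm(simulate(A, [1.0, 0.0, 0.0]) - data)
          for name, A in candidates.items()}
for name in sorted(scores, key=scores.get):
    print(f"{name}: {scores[name]:.3f}")
```

In this noiseless caricature the reference topology scores a distance of zero and the altered networks rank behind it; with noisy data the distance would be computed against observed rather than simulated trajectories.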

What is unclear is how one can conduct Bayesian inference with such a collection of models, unless all models share the same "real" parameters, which sounds unlikely. The paper mentions using a uniform prior on all parameters, but this is difficult to advocate in a general setting. Another point concerns the quantification of how much one can trust a given model: since models do not seem to be penalised by a prior probability, they are all treated identically. This is a limitation of the approach (or an indication that it is only a preliminary step in the evaluation of models), in that, within a large enough collection, some models will eventually produce estimates that differ from those of the other models, and the assessment may become altogether highly pessimistic for this very reason.
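For illustration of what penalising models by a prior would do (this is not in the paper, and the evidence values are invented), per-model evidences Z_k can be combined with prior model probabilities π_k into posterior model weights P(M_k | data) ∝ Z_k π_k:

```python
import numpy as np

log_evidence = np.array([-10.2, -11.0, -14.5])   # toy log Z_k values

def model_posterior(log_z, log_prior):
    """P(M_k | data) proportional to Z_k * pi_k, computed in log space."""
    log_post = log_z + log_prior
    log_post = log_post - log_post.max()         # avoid underflow
    w = np.exp(log_post)
    return w / w.sum()

uniform = np.full(3, np.log(1 / 3))              # all models treated identically
penalised = np.log(np.array([0.6, 0.3, 0.1]))    # e.g. favouring simpler models

print(model_posterior(log_evidence, uniform))
print(model_posterior(log_evidence, penalised))
```

Under the uniform prior the weights are driven entirely by the evidences, which is effectively the identical treatment discussed above; a non-uniform prior would be the natural place to encode scepticism about more elaborate topologies.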

“If our parameters have a real, biophysical interpretation, we therefore need to be very careful not to assert that we know the true values of these quantities in the underlying system, just because–for a given model–we can pin them down with relative certainty.”

In addition to its relevance for moving towards approximate models and approximate inference, and in continuation of yesterday's theme, the paper calls for nested sampling to generate samples from the posterior(s) and to compute the evidence associated with each model. (I realised I had missed this earlier paper by Michael and co-authors on nested sampling for systems biology.) There is no discussion in the paper of why nested sampling was selected over, say, a random walk Metropolis-Hastings algorithm, unless it is used in a fully automated way, but the paper is rather terse on that issue… And running either approach on 10⁷ models sounds like an awful lot of work!!! Using importance [sampling] nested sampling, as we proposed with Nicolas Chopin, could be a way to speed up this exploration if all parameters are identical between all or most models.
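For readers unfamiliar with the technique, the core of vanilla nested sampling fits in a few lines. This toy version, on a one-dimensional Gaussian likelihood with a uniform prior on [-5, 5], is purely illustrative and unrelated to the efficient SYSBIONS implementation the paper relies on:

```python
import numpy as np

rng = np.random.default_rng(1)

def loglike(theta):
    """Toy Gaussian log-likelihood N(theta; 0, 1)."""
    return -0.5 * theta**2 - 0.5 * np.log(2 * np.pi)

N, iters = 100, 600                       # live points, iterations
live = rng.uniform(-5, 5, size=N)         # draws from the uniform prior
live_logL = loglike(live)

logZ, log_X_prev = -np.inf, 0.0           # log evidence, log prior volume
for i in range(1, iters + 1):
    worst = np.argmin(live_logL)
    log_X = -i / N                        # deterministic volume shrinkage
    log_w = np.log(np.exp(log_X_prev) - np.exp(log_X))
    logZ = np.logaddexp(logZ, live_logL[worst] + log_w)
    # Replace the worst point by a prior draw with higher likelihood
    # (naive rejection; real implementations use smarter moves).
    while True:
        cand = rng.uniform(-5, 5)
        if loglike(cand) > live_logL[worst]:
            live[worst], live_logL[worst] = cand, loglike(cand)
            break
    log_X_prev = log_X

# Contribution of the remaining live points
logZ = np.logaddexp(logZ, np.log(np.exp(live_logL).mean()) + log_X_prev)
Z = np.exp(logZ)
print(Z)   # true value is (1/10) x (Gaussian mass within [-5, 5]), about 0.1
```

The sequence of constrained draws is what makes each run expensive, which underlines the cost of repeating this over a huge universe of models.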

December 17, 2014 at 9:23 am

Dear Christian,

Many thanks for your discussion of our PNAS paper (Babtie et al.; Paul Kirk and Ann Babtie are joint first authors). The central question really is: how can we choose from among many models (thousands or millions)? Outside the physical sciences, it is hard to see how we can come up with good models ab initio. A bit of educated guesswork is typically involved in the biological (and social, economic, etc.) sciences. But these guesses, if incorrect, can cloud our analysis. In scientific analysis (unlike in machine learning/big data) we really do care about the models, as they provide the best way to gain mechanistic insights into the inner workings of, e.g., biological systems.

In the present context, the paper argues that analysis should be based on a universe of models; we can define this by proposing alterations to a base-line candidate model. These can be generated automatically (limiting ourselves to simple, e.g. 1-edge, 2-edge … alterations of the underlying network structure). Computational cost is a minor concern compared to getting the basic facts wrong because an inappropriate model was used.

How to deal with this many models from a purely Bayesian perspective is obviously challenging but almost certainly worthwhile.

It is probably useful to stress that TSA in its present form addresses a recurring scientific problem in a pragmatic (but, I believe, statistically sound) manner. One advantage, as in all parametric-bootstrap-flavoured approaches, is that it can deal with different inference techniques (ranging from optimization to proper inference). Nested sampling is for us a convenient approach, largely because it is implemented very efficiently for dynamical systems in SYSBIONS (see Johnson, Kirk and Stumpf, Bioinformatics); to cope with millions of models, however, we do need to determine a smaller set of models, and for this we use the simple optimization shown in Figure S1; after this we perform nested sampling on the smaller number of models that are compatible with the observed system behaviour. At this stage TSA should also, however, integrate straightforwardly with ABC and exact approaches, or with software such as Copasi.
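The two-stage strategy described above (a cheap optimization screen over many candidate models, followed by full inference on the survivors) can be caricatured as follows; the one-parameter model representation, the grid-based surrogate misfit, and the 0.01 threshold are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
data_mean = 1.0                            # toy "observed system behaviour"

def misfit(bias):
    """Cheap surrogate for the optimization screen: best achievable
    squared error over a crude parameter grid, for one candidate model."""
    params = np.linspace(0.0, 0.5, 51)
    return np.min((params + bias - data_mean) ** 2)

# Stage 1: optimization-based screen over many candidate models,
# each caricatured here by a single "bias" term.
biases = rng.normal(0.0, 1.0, size=1000)
scores = np.array([misfit(b) for b in biases])
survivors = np.flatnonzero(scores < 0.01)  # keep only compatible models

# Stage 2 (placeholder): run full inference, e.g. nested sampling,
# on the surviving models only.
print(f"{survivors.size} of {biases.size} models pass the cheap screen")
```

The point of the screen is that an optimizer's best-fit misfit is orders of magnitude cheaper than a full evidence computation, so only a fraction of the universe of models ever reaches the expensive stage.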

I agree with you that there is a lot of interesting work left to be done to map out the statistical foundations for dealing with many models.