Archive for February, 2009

Computational Methods for Bayesian Model Choice

Posted in Statistics, University life on February 26, 2009 by xi'an

Next Monday, I am starting a series of four (advanced graduate) lectures about computational methods for Bayesian model choice, which tries to summarise what we know about Bayes factors and evidence from a computational viewpoint. It is therefore quite related to the seminars I gave in Montréal and in Montpellier.

The textbook I use as support is Chen, Shao and Ibrahim’s Monte Carlo Methods in Bayesian Computation because it is more focussed on this specific issue than Monte Carlo Statistical Methods or Bayesian Core. The courses will take place at CREST-ENSAE, Monday 3 from 11am to 1pm, Thursday 5 from 10am till 1pm, Monday 9 from 10am till 1pm and Thursday 12 from 11am till 1pm.
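As a toy illustration of what computing the evidence involves (my own sketch, not material from the lectures or from Chen, Shao and Ibrahim), the marginal likelihood of a conjugate normal-normal model can be estimated by averaging the likelihood over prior draws and checked against its closed form:

```python
import numpy as np

# Toy example: y ~ N(theta, 1) with prior theta ~ N(0, 1), so the evidence
# m(y) = ∫ f(y|theta) pi(theta) dtheta is available exactly as N(y; 0, 2).
# The crude Monte Carlo estimate averages the likelihood over prior draws.
rng = np.random.default_rng(1)
y = 1.0
theta = rng.normal(0.0, 1.0, size=1_000_000)              # draws from the prior
lik = np.exp(-0.5 * (y - theta) ** 2) / np.sqrt(2 * np.pi)
evidence_mc = lik.mean()
evidence_exact = np.exp(-y**2 / 4) / np.sqrt(4 * np.pi)   # N(y; 0, 2) density
print(evidence_mc, evidence_exact)                        # both close to 0.22
```

This prior-sampling estimator is of course the most naive entry in the catalogue the lectures cover; its variance degrades quickly when the posterior concentrates away from the prior.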

Good size swans and turkeys

Posted in Books, Statistics on February 24, 2009 by xi'an

In connection with The Black Swan, Nassim Taleb wrote a small essay called The Fourth Quadrant on The Edge. I found it much more pleasant to read than the book because (a) it directly focuses on the difficulty of dealing with fat-tailed distributions and the prediction of extreme events, and (b) it is delivered in a much more serene tone than the book (imagine, just a single remark about the French!). The text contains insights on loss functions and inverse problems which, even though they are a bit vague, mostly make sense. As with The Black Swan, I deplore (a) the underlying determinism of the author, who still seems to believe in an unknown (and possibly unachievable) model that would rule the phenomenon under study, and (b) the lack of temporal perspective and of the possibility of modelling jumps as changepoints, i.e. model shifts. Time series have no reason to be stationary, all the less so as they depend on all kinds of exogenous factors. I actually agree with Taleb that, if there is no information about the form of the tails of the distribution corresponding to the phenomenon under study (assuming such a distribution does exist), estimating the shape of this tail from the raw data is impossible.

The essay is followed by a technical appendix that expands on fat tails, but not so deeply as to be extremely interesting. A surprising side note is that Taleb seems to associate stochastic volatility with mixtures of Gaussians. In my personal book of models, stochastic volatility is a noisy observation of the exponential of a random walk, something like \nu_t = \exp(a x_{t-1} + b\epsilon_t), thus with much higher variation (and possibly no moments). To state that Student's t distributions are more variable than stochastic volatility models is therefore unusual… There is also an analysis over a bazillion datasets of the insanity of computing the kurtosis when the underlying distribution may not even have a second moment. I could not agree more: trying to summarise fat-tailed distributions by their first four moments does not make sense, even though it may sell well. The last part of the appendix shows an equal lack of stability in estimates of the tail index \alpha, which again is not a surprising phenomenon: if the tail bound K is too low, the power law may not have kicked in yet while, if it is too large, we always end up with too little data. The picture shows how the estimate varies widely with K around its theoretical value for the log-normal and three Pareto distributions, based on a million simulations. (And this is under the same assumption of stationarity as above.) So I am not sure what the message is there. (As an aside, there seems to be a mistake in the tail expectation: it should be

\dfrac{\int_K^\infty x x^{-\alpha} dx}{\int_K^\infty x^{-\alpha} dx} = \dfrac{K(\alpha-1)}{(\alpha-2)}

if the density decreases as x^{-\alpha}… It is correct when \alpha is the tail power of the cdf.)
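A quick Monte Carlo check of this corrected formula (my own sketch, taking \alpha = 4 so that the conditional mean has finite variance):

```python
import numpy as np

# Check that, when the density decreases as x^{-alpha} beyond K, the tail
# expectation is E[X | X > K] = K(alpha - 1)/(alpha - 2).
rng = np.random.default_rng(2)
alpha, K, n = 4.0, 1.0, 1_000_000
u = rng.random(n)
x = K * u ** (-1.0 / (alpha - 1.0))   # inverse-cdf draws from f(x) ∝ x^{-alpha}, x >= K
mc_mean = x.mean()
exact = K * (alpha - 1.0) / (alpha - 2.0)
print(mc_mean, exact)                 # both close to 1.5
```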

Blade Runner

Posted in Books, Kids on February 23, 2009 by xi'an

Over the past weekend, I watched Blade Runner with my kids, as I was forced to inactivity by the demise of my mailbox! I had not watched the movie for twenty years, since the time I was a postdoc at Cornell and enjoying the student movie club, so it was almost like watching Blade Runner for the first time. (In particular, except for the cut of the final scene, I could not spot changes from the 1982 version.)

The atmosphere of the movie has not changed, though, in its oppressiveness. The play on lights is a major factor in this feeling, with no natural light ever used but instead side glares that enter buildings periodically, including the apartment of the detective, Deckard (which makes it appear less private, in a Big Brother kind of way), or wax candles for the magnate Tyrell. The sci-fi touch is somewhat light, except for the obligatory flying cars (in 2019?!), which is just as good because this does not age well (the computer screens already appear antiquated, for instance, and the phones are fixed phones, not cell phones). The themes are highly reminiscent of Philip K. Dick's universe, with an asianised LA, including origami (just as in The Man in the High Castle), a permanent ambiguity/paranoia about the status/feelings of the characters (it is never clear in the movie that Deckard is not a replicant), the dubious nature of humanity, and a pessimistic view of future civilisations. I did not remember, though, the strong connections with the films noirs of the 1950s, from the light (and the omnipresent cigarette smoke diffracting this light) to the costumes, and obviously to the hard-boiled attitude of Deckard. Even though I found the performance of Harrison Ford somewhat lacking in depth (but this may be part of the ambiguity about his true nature, human versus replicant), I still agree with my former impression of Blade Runner being truly a cult film. (Unsurprisingly, my kids found the movie terrible, if only for the "poor" special effects!)

Model choice by Kullback projection (2)

Posted in Statistics on February 20, 2009 by xi'an

Yesterday I talked about the paper of Nott and Cheng at the Bayesian model choice group and [with the help of the group] realised that my earlier comment on the paper

There is however one point with which I disagree, namely that the predictive on the submodel is obtained in the current paper by projecting a Monte Carlo or an MCMC sample from the predictive on the full model, while I think this is incorrect because the likelihood is then computed using the parameter for the full model. Using a projection of such a sample means at least reweighting by the ratio of the likelihoods…

was not completely accurate. The point is [I think] correct when considering the posterior distribution of the projected parameters. Thus, using a projection of an MCMC sample corresponding to the full model will not result in a sample from the posterior distribution of the projected parameters. On the other hand, projecting the MCMC sample in order to get the posterior distribution of the Kullback-Leibler distance, as done in the applications of Section 7 of the paper, is completely kosher, since this is a quantity that only depends on the full model parameters. Since Nott and Cheng do not consider the projected model at any time (even though Section 3 is slightly unclear, using a posterior on the projected parameter), there is nothing wrong in their paper, and I find quite interesting the idea that the lasso penalty allows for a simultaneous exploration of the most likely submodels without recourse to a more advanced technique like reversible jump. (The comparison is obviously biased as the method does not provide a true posterior on the most likely submodels, only an approximation of their probability. Simulating from the constrained projected posterior would require extra steps.)
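The reweighting step alluded to in my quoted comment is, in essence, self-normalised importance sampling: a sample from one distribution is turned into a weighted sample from another via a density ratio. A generic sketch of the mechanics, with toy Gaussian densities standing in for the full-model and projected posteriors (this is not Nott and Cheng's setting, just an illustration of the weight):

```python
import numpy as np

# Self-normalised importance sampling: draws from a "proposal" N(0, 2^2)
# (standing in for a sample from the wrong posterior) are reweighted by the
# density ratio to approximate expectations under a "target" N(1, 1).
rng = np.random.default_rng(3)

def logpdf_norm(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

x = rng.normal(0.0, 2.0, size=200_000)                   # proposal draws
logw = logpdf_norm(x, 1.0, 1.0) - logpdf_norm(x, 0.0, 2.0)
w = np.exp(logw - logw.max())
w /= w.sum()                                             # self-normalised weights
post_mean = np.sum(w * x)                                # ≈ E[X] under the target
print(post_mean)                                         # close to 1.0
```

In the actual model-choice setting the ratio would involve the submodel and full-model likelihoods (and priors), which is precisely the step the unweighted projection skips.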

ABC methods for model choice in Gibbs random fields

Posted in Statistics on February 19, 2009 by xi'an

[Figure: 1tqga from Thermotoga maritima]

We have resubmitted to Bayesian Analysis a revised version of our paper "ABC methods for model choice in Gibbs random fields", available on arXiv. The only major change is the addition of a second protein example in the biophysical illustration. The core idea in this paper is that, for Gibbs random fields and in particular for Ising models, when comparing several neighbourhood structures, the computation of the posterior probabilities of the models/structures under competition can be operated by likelihood-free simulation techniques akin to the Approximate Bayesian Computation (ABC) algorithm often discussed here. The point of this resolution is that, due to the specific structure of Gibbs random field distributions, there exists a sufficient statistic across models which allows for an exact (rather than approximate) simulation from the posterior probabilities of the models. Obviously, when the structures grow more complex, it becomes necessary to introduce a true ABC step with a tolerance threshold \epsilon in order to avoid running the algorithm for too long. Our toy example shows that the accuracy of the approximation of the Bayes factor can be greatly improved by resorting to the original ABC approach, since it allows for the inclusion of many more simulations. In the biophysical application to the choice of a folding structure for two proteins, we also demonstrate that we can implement the ABC solution on realistic datasets and, in the examples processed there, that the Bayes factors allow for a ranking that more standard methods (FROST, TM-score) do not.
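To see the mechanics of exact likelihood-free model choice through a statistic that is sufficient across models, here is a toy analogue (my own sketch with Bernoulli models rather than Gibbs random fields, so the exact answer is available in closed form): two priors on a Bernoulli probability share the sufficient statistic S = Σ y_i, and keeping only the simulations where S matches its observed value yields exact draws of the model index.

```python
import numpy as np
from math import comb, lgamma, exp

# Two models for n Bernoulli trials sharing the sufficient statistic S = sum(y):
#   M1: p ~ Uniform(0, 1)      M2: p ~ Beta(2, 2)      prior 1/2 on each model.
# Matching S exactly (no tolerance) gives exact draws from P(M | S = s_obs).
rng = np.random.default_rng(4)
n, s_obs, n_sims = 10, 7, 200_000

m = rng.integers(0, 2, size=n_sims)                  # simulated model index
p = np.where(m == 0, rng.random(n_sims), rng.beta(2, 2, size=n_sims))
s = rng.binomial(n, p)                               # simulated sufficient statistic
keep = s == s_obs                                    # exact matching step
abc_prob_m1 = np.mean(m[keep] == 0)

def beta_fn(a, b):
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))

# Closed-form marginal likelihoods of s_obs under each model, for comparison.
marg1 = comb(n, s_obs) * beta_fn(s_obs + 1, n - s_obs + 1)          # = 1/(n+1)
marg2 = comb(n, s_obs) * beta_fn(s_obs + 2, n - s_obs + 2) / beta_fn(2, 2)
exact_prob_m1 = marg1 / (marg1 + marg2)
print(abc_prob_m1, exact_prob_m1)                    # both close to 0.448
```

The Gibbs random field case in the paper works the same way, except that simulating the data given the parameter is itself the expensive step, which is why a tolerance \epsilon becomes necessary for larger structures.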
