Archive for model choice

Oxford, Oxfordshire

Posted in pictures, Statistics, University life on February 23, 2012 by xi'an

Second Oxonian post of the week! And second English trip of the year. I will give a seminar lecture this afternoon in the Statistics Department on ABC model choice, using the same slides as in Cambridge last month. (Following another ABC talk by Richard Wilkinson a few weeks ago.)

ABC [PhD] course

Posted in Books, R, Statistics, Travel, University life on January 26, 2012 by xi'an

As mentioned in the latest post on ABC, I am giving a short doctoral course on ABC methods and convergence at CREST next week. I have now made a preliminary collection of my slides (plus a few from Jean-Michel Marin’s), available on slideshare (as ABC in Roma, because I am also giving the course in Roma, next month, with an R lab on top of it!):

and I did manage to go over the book by Gouriéroux and Monfort on indirect inference over the weekend. I still need to beef up the slides before the course starts next Thursday! (The core version of the slides is actually from the course I gave in Wharton more than a year ago.)

English trip (1)

Posted in Statistics, Travel, University life on January 25, 2012 by xi'an

Today, I am attending a workshop on the use of graphics processing units in Statistics in Warwick, supported by CRiSM, presenting our recent work with Randal Douc, Pierre Jacob and Murray Smith. (I will use the same slides as in Telecom two months ago, hopefully avoiding the loss of integral and summation signs this time!) Pierre Jacob will talk about Wang-Landau.

Then, tomorrow, I am off to Cambridge to talk about ABC and model choice on Friday afternoon. (Presumably using the same slides as in Provo.)

The (1) in the title is in anticipation of a second trip to Oxford next month and another one to Bristol two months after! (The trip to Edinburgh does not count of course, since it is in Scotland!)

About Fig. 4 of Fagundes et al. (2007)

Posted in R, Statistics, University life on July 13, 2011 by xi'an

Yesterday, we had a meeting of our EMILE network on statistics for population genetics (in Montpellier) and we were discussing our respective recent advances in ABC model choice. One of our colleagues mentioned the constant request (from referees) to include the post-ABC processing devised by Fagundes et al. in their 2007 ABC paper. (This paper contains a wealth of statistical innovations, but I only focus here on this post-checking device.)

The method centres around the above figure, with the attached caption

Fig. 4. Empirical distributions of the estimated relative probabilities of the AFREG model when the AFREG (solid line), MREBIG (dashed line), and ASEG (dotted line) models are the true models. Here, we simulated 1,000 data sets under the AFREG, MREBIG, and ASEG models by drawing random parameter values from the priors. The density estimates of the three models at the AFREG posterior probability = 0.781 (vertical line) were used to compute the probability that AFREG is the correct model given our observation that PAFREG = 0.781. This probability is equal to 0.817.

which aims at computing a p-value based on the ABC estimate of the posterior probability of a model.

I am somewhat uncertain about the added value of this computation and about the paradox of the sentence “the probability that AFREG is the correct model [given] the AFREG posterior probability (..) is equal to 0.817”… If I understand correctly the approach followed by Fagundes et al., they simulate samples from the joint distribution over parameter and (pseudo-)data conditional on each model, then approximate the density of the [ABC estimated] posterior probabilities of the AFREG model by a non-parametric density estimate, presumably density(). In Bayesian terms, these densities are the marginal likelihoods (or evidences) of the posterior probability of the AFREG model under each of the models under comparison. The “probability that AFREG is the correct model given our observation that PAFREG = 0.781” is then completely correct in the sense that it is truly a posterior probability for this model based on the sole observation of the transform (or statistic) of the data x equal to PAFREG(x). However, if we only look at the Bayesian perspective and do not consider the computational aspects, there is no rationale in moving from the data (or from the summary statistics) to a single statistic equal to PAFREG(x), as this induces a loss of information. (Furthermore, it seems to me that the answer is not invariant to the choice of the model whose posterior probability is computed, if more than two models are compared. In other words, the posterior probability of the AFREG model given the sole observation of PAFREG(x) is not necessarily the same as the posterior probability of the AFREG model given the sole observation of PASEG(x)…) Although this is not at all advised by the paper, it seems to me that some users of this processing opt instead for simulations of the parameter taken from the ABC posterior, which amounts to using the “data twice”, i.e. the squared likelihood instead of the likelihood… So, while the procedure is formally correct (despite Templeton’s arguments against it), it has no added value. Obviously, one could alternatively argue that the computational precision in approximating the marginal likelihoods is higher with the (non-parametric) solution based on PAFREG(x) than with the (ABC) solution based on x, but this is yet to be demonstrated (and weighed against the information loss).
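To make the post-processing concrete, here is a minimal Python sketch on a toy problem. Everything in it is an assumption made for illustration and none of it comes from Fagundes et al.: the three models A, B, C are Gaussian models for a sample mean that differ only in their prior mean, the posterior probability of model A is computed exactly rather than estimated by ABC, the models get equal prior weights, and the kernel estimate uses a simple rule-of-thumb bandwidth rather than whatever density() would pick.

```python
import math
import random
import statistics

# Toy stand-ins (assumptions, not the models of Fagundes et al.):
# under model m, xbar | theta ~ N(theta, 1/n) with prior theta ~ N(MU[m], 1).
MU = {"A": 0.0, "B": 3.0, "C": -3.0}
N_OBS = 10
MARGINAL_SD = math.sqrt(1.0 + 1.0 / N_OBS)  # sd of xbar under every model


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def prob_model_a(xbar):
    """Posterior probability of model A given xbar, with equal prior weights.

    Exact in this toy; in the paper this role is played by an ABC estimate.
    """
    evidences = {m: normal_pdf(xbar, mu, MARGINAL_SD) for m, mu in MU.items()}
    return evidences["A"] / sum(evidences.values())


def simulate_statistic(model, rng):
    """Draw theta from the prior, then xbar, and return P_A(xbar)."""
    theta = rng.gauss(MU[model], 1.0)
    xbar = rng.gauss(theta, 1.0 / math.sqrt(N_OBS))
    return prob_model_a(xbar)


def kde(sample, point):
    """Gaussian kernel density estimate at `point` (rule-of-thumb bandwidth)."""
    h = 1.06 * statistics.stdev(sample) * len(sample) ** (-0.2)
    return sum(normal_pdf(point, s, h) for s in sample) / len(sample)


def posterior_given_statistic(p_obs, n_sim=1000, seed=1):
    """Probability of each model given only the observed value of P_A(x)."""
    rng = random.Random(seed)
    dens = {}
    for m in MU:
        sims = [simulate_statistic(m, rng) for _ in range(n_sim)]
        dens[m] = kde(sims, p_obs)  # estimated density of P_A under model m
    total = sum(dens.values())
    return {m: d / total for m, d in dens.items()}


if __name__ == "__main__":
    print(posterior_given_statistic(0.781))
```

In this toy, P_A(xbar) is symmetric in xbar, so conditioning on P_A(x) alone cannot tell apart a sample mean pulled towards B from one pulled towards C, which is one way of seeing the information loss. Replacing prob_model_a with the posterior probability of another model changes the statistic being conditioned upon, and hence possibly the answer.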

Just as a side remark on the polychotomous logistic regression approximation to the posterior probabilities introduced in Fagundes et al.: the idea is quite enticing, as a statistical regularisation of ABC simulations. It could be exploited further by using a standard model selection strategy in order to pick the summary statistics that truly contribute to explaining the model index.
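For readers unfamiliar with the model, the standard polychotomous (multinomial) logistic regression can be written as follows. The notation is mine (M for the model index, s for the vector of summary statistics, K for the number of models, α and β for the regression coefficients), and the exact specification and weighting used by Fagundes et al. may differ:

```latex
\[
  \mathbb{P}(M = m \mid s)
  = \frac{\exp\bigl(\alpha_m + \beta_m^{\mathsf{T}} s\bigr)}
         {\sum_{k=1}^{K} \exp\bigl(\alpha_k + \beta_k^{\mathsf{T}} s\bigr)},
  \qquad m = 1, \dots, K,
  \qquad \alpha_K = 0,\ \beta_K = 0 .
\]
```

The coefficients would be fitted on the simulated pairs (model index, summary statistics) and the fitted probabilities evaluated at the observed summary statistics. Selecting summary statistics then amounts to deciding which components of s keep non-zero coefficients.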

ABC model choice not to be trusted [3]

Posted in R, Statistics on January 31, 2011 by xi'an

On Friday, I received a nice but embarrassing email from Xavier Didelot. He reminded me that I had attended the talk he gave at the model choice workshop in Warwick last May, which, unfortunately but rather unsurprisingly given my short memory span, I had forgotten about! Looking at the slides he attached to his email, I do remember attending the talk and expecting to get back to the results after the meeting. As I went from Warwick to Paris, only to leave a day later for Benidorm and the Valencia 9 meeting, in such a hurry that I even forgot my current black notebook, the plan of getting back to the talk was forgotten so completely that even reading the tech report (which has now appeared in Bayesian Analysis) could not revive it!

Here are some of Xavier’s comments, followed by my answers: