Archive for semi-automatic ABC

summary statistics for ABC model choice

Posted in Statistics on March 11, 2013 by xi'an

[Picture: countryside near Kenilworth, England, March 5, 2013]

A few days ago, Dennis Prangle, Paul Fearnhead, and their co-authors from New Zealand posted on arXiv their (long-awaited) study of the selection of summary statistics for ABC model choice. I read it during my trip to England, in trains and planes, if not when strolling in the beautiful English countryside pictured above.

As posted several times on this ‘Og, the crux of the analysis is that the Bayes factor is a good type of summary statistic when comparing two models, a result that extends to more than two models by considering the vector of evidences instead. As in the initial Read Paper by Fearnhead and Prangle, there is, strictly speaking, no true optimality in using the Bayes factor or the vector of evidences, besides the fact that the vector of evidences is minimal sufficient for the marginal models (integrating out the parameters). (This was a point made in my discussion.) The implementation of the principle is also similar to the Read Paper setting: run a pilot ABC simulation, estimate the vector of evidences, and re-run the main ABC simulation using this estimate as the summary statistic. The paper contains a simulation study using some of our examples (in Marin et al., 2012), as well as an application to bacterial genetic data.
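To make the pilot-then-main scheme concrete, here is a minimal sketch in Python (standard library only). It is not the authors' implementation: the two toy models (Gaussian versus Laplace noise with a common N(0, 2²) prior on the location), the two candidate features, the gradient-ascent logistic regression and all function names are assumptions chosen for illustration. With two models and equal prior weights, the fitted probability of model 1 is a one-to-one function of an estimated Bayes factor, which is why it stands in here for the "estimated vector of evidences".

```python
import math
import random
import statistics

def simulate(model, n, rng):
    """Draw n observations from the prior predictive of one toy model.

    Assumed setting: both models share a location theta ~ N(0, 2^2);
    model 0 has N(0, 1) noise, model 1 has unit-variance Laplace noise.
    """
    theta = rng.gauss(0.0, 2.0)
    if model == 0:
        return [theta + rng.gauss(0.0, 1.0) for _ in range(n)]
    # Laplace noise as a random sign times an exponential with rate sqrt(2)
    return [theta + rng.choice((-1, 1)) * rng.expovariate(math.sqrt(2))
            for _ in range(n)]

def features(y):
    """Hypothetical candidate features: mean and median absolute deviation."""
    centre = statistics.median(y)
    dev = [abs(v - centre) for v in y]
    return [statistics.fmean(dev), statistics.median(dev)]

def prob_model1(w, x):
    """Logistic estimate of P(model 1 | features x), intercept in w[0]."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # guard against overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, labels, steps=300, rate=0.5):
    """Logistic regression fitted by plain gradient ascent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, m in zip(X, labels):
            err = m - prob_model1(w, x)
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj + rate * g / len(X) for wj, g in zip(w, grad)]
    return w

def abc_model_choice(y_obs, n_pilot=400, n_main=2000, keep=100, seed=0):
    """Return (summary at y_obs, ABC estimate of P(model 1 | y_obs))."""
    rng = random.Random(seed)
    n = len(y_obs)
    # Pilot stage: simulate (model, data) pairs from the prior predictive
    pilot_m = [rng.randrange(2) for _ in range(n_pilot)]
    pilot_f = [features(simulate(m, n, rng)) for m in pilot_m]
    mu = [statistics.fmean(col) for col in zip(*pilot_f)]
    sd = [statistics.pstdev(col) for col in zip(*pilot_f)]

    def standardise(f):
        return [(v - a) / s for v, a, s in zip(f, mu, sd)]

    # Regress the model index on the features: the fitted probability
    # plays the role of the estimated evidence-based summary statistic.
    w = fit_logistic([standardise(f) for f in pilot_f], pilot_m)

    def summary(y):
        return prob_model1(w, standardise(features(y)))

    # Main stage: rejection ABC keeping the simulations closest in summary
    s_obs = summary(y_obs)
    sims = []
    for _ in range(n_main):
        m = rng.randrange(2)
        sims.append((abs(summary(simulate(m, n, rng)) - s_obs), m))
    sims.sort()
    accepted = [m for _, m in sims[:keep]]
    return s_obs, sum(accepted) / keep

y_obs = simulate(1, 50, random.Random(1))  # pseudo-data drawn from model 1
s_obs, p1 = abc_model_choice(y_obs)
print(f"summary at y_obs: {s_obs:.3f}, ABC estimate of P(model 1 | y_obs): {p1:.3f}")
```

The one-dimensional summary is the point of the construction: the main ABC run compares simulated and observed data through a single number instead of a vector of hand-picked statistics.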

semi-automatic ABC [reply]

Posted in pictures, Statistics, Travel, University life on June 5, 2012 by xi'an

When I came back from LGM2012 in Trondheim, I found the latest issue of Series B on my desk. It is much thicker than in “my” days, with about 250 pages in this June 2012 issue! (One reason is that it contains two Read Papers with their discussions, amounting to 110 pages of the journal.) The first Read Paper is “Catching up faster by switching sooner” by van Erven, Grünwald and de Rooij, which we discussed with Nicolas Chopin. There are also comments (among others!) from Stephen Lauritzen, Iain Murray, and Aki Vehtari, who also spoke about Bayesian model evaluation tools at LGM2012. The second Read Paper is Fearnhead and Prangle’s semi-automatic ABC, which I discussed last December. I have already posted about this Read Paper and used some of the discussion in preparing my ABC PhD class in Roma. However, the remark we made in our discussion with Jean-Michel Marin, that the Bayes factor would not be a pertinent summary statistic for model choice, is wrong, as shown by Dennis Prangle in his poster at the workshop in Bristol. And, when reading the reply by Paul Fearnhead and Dennis Prangle, I do not see a satisfactory answer to my request for more formal conditions under which Theorem 2 and its corollary, the convergence of the noisy ABC posterior to the true parameter (page 425), apply. (Such results exist in indirect inference.)

ABC for missing data

Posted in Statistics, University life on February 2, 2012 by xi'an

I received this email a few days ago:

I am a hard-core reader of your blog and thanks to your posts I have recently started being interested in ABC (and Bayesian methods in general). I am writing you to ask for suggestions on the application of the semi-automatic ABC à la Fearnhead & Prangle. The reason why I am contacting you instead of addressing the authors is because (i) you have been involved in the RSS reading of their paper and (ii) you are an authority on ABC, and therefore you are probably best suited and less biased on such issues. I am applying ABC with the semi-automatic statistics selection provided in Fearnhead and Prangle (2012) to a problem which can be formalized as a hidden Markov model. However I am unsure of whether I am making a huge mistake on the following point: let’s suppose we have an unobserved (latent) system state X (depending on an unknown parameter θ) and a corresponding “corrupted” version which is observed with some measurement error, e.g.

Y = X + ε,

where ε is the measurement error independent of X, ε is N(0, σ²), say. Now their setup does not consider measurement error, so I wonder if the following is correct. Since I can simulate n times some X’ from p(X|θ) am I allowed to use the corresponding “simulated” n corrupted measurements

Y’ = X’ + ε’

(where ε’ is a draw from p(ε|σ)) into their regression approach to determine a (vector of) summary statistic S=(S1,S2) for (θ,σ)? I mean the Y’ are draws from a p(y|X’,θ,σ) which is conditional on X’. Is this allowed? Wilkinson (2008) is the only reference I have found considering ABC with measurement-error (the ones by Dean et al (2011) and Jasra et al (2011) being too technical in my opinion to allow a practical implementation) and since he does not consider a summary statistics-based approach (e.g. Algorithm D, page 10) of course he is not in need to simulate the corrupted measurements but only the latent ones. Therefore I am a bit unsure on whether it is statistically ok to simulate Y’ conditionally on X’ or if there is some theoretical issue that makes this nonsense.

to which I replied

…about your model and question, there is no theoretical difficulty in simulating x’ then y’ given x’ and a value of the parameters. The reason is that

y' \sim \int f(y',x'|\theta) \text{d}x' = f(y'|\theta),

the proper marginal as defined by the model. Using the intermediate x’ is a way to bypass the integral but this is 100% correct!…

a reply followed by a further request for clarification

Although your equation is clearly true, I am not sure I fully grasp the issue, so I am asking for confirmation. Yes, as you noticed I need a

y’ ~ f(y’|θ,σ)

Now it’s certainly true that I can generate a draw x’ from f(x’|θ,σ) and then plug such x’ into f(y’|x’,θ,σ) to generate y’. By proceeding this way I obtain a draw (y’,x’) from f(y’,x’|θ,σ). I think I understood your reasoning, on how by using the procedure above I am actually skipping the computation of the integral in:

\int f(y',x'|\theta, \sigma) \text{d}x'.

Is it basically the case that the mechanism above is just a classic simulation from a bivariate distribution, where since I am interested in the marginal f(y’|θ,σ) I simulate from the joint density f(y’,x’|θ,σ) and then discard the x’ output?

which is indeed a correct interpretation. When simulating from a joint, the marginals are by-products of the simulation.
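A quick numerical check of this point, as a sketch under toy assumptions that are not in the original exchange: the latent values are taken as independent x’ ~ N(θ, 1) rather than coming from a hidden Markov model, and the function name is made up. Under these assumptions the marginal is y’ ~ N(θ, 1 + σ²), so simulating the pair (x’, y’) and throwing x’ away should reproduce that mean and variance.

```python
import random
import statistics

def simulate_y(theta, sigma, n, rng):
    """Draw n noisy observations y' by simulating the pair (x', y')
    and keeping only y'."""
    ys = []
    for _ in range(n):
        x = rng.gauss(theta, 1.0)      # latent x' ~ f(x'|theta), toy choice N(theta, 1)
        y = x + rng.gauss(0.0, sigma)  # y' | x' ~ N(x', sigma^2), i.e. Y = X + eps
        ys.append(y)                   # x' is discarded here
    return ys

rng = random.Random(42)
theta, sigma = 2.0, 0.5
ys = simulate_y(theta, sigma, 20_000, rng)
mean_y = statistics.fmean(ys)
var_y = statistics.variance(ys)
# Under the toy assumptions, these should be close to theta and 1 + sigma**2
print(mean_y, var_y)
```

The same draws of y’ can then be fed to any summary-statistic construction, since they are genuine simulations from f(y’|θ,σ).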
