Matt Moores, Tony Pettitt, and Kerrie Mengersen arXived a paper yesterday comparing different computational approaches to the processing of hidden Potts models and of the intractable normalising constant in the Potts model. This is a very interesting paper, first because it provides a comprehensive survey of the main methods used in handling this annoying normalising constant Z(β), namely pseudo-likelihood, the exchange algorithm, path sampling (a.k.a. thermodynamic integration), and ABC. A massive simulation experiment with individual simulation times up to 400 hours leads to selecting path sampling (what else?!) as the (XL) method of choice, thanks to a pre-computation of the expectation of the sufficient statistic E[S(Z)|β]. I just wonder why the same was not done for ABC, as in the recent Statistics and Computing paper we wrote with Matt and Kerrie. As it happens, I was actually discussing yesterday in Columbia potential, if huge, improvements in processing Ising and Potts models by approximating first the distribution of S(X) for some or all β before launching ABC or the exchange algorithm. (In fact, this is a more generic desideratum for all ABC methods: simulating directly, if approximately, the summary statistics would bring huge gains in computing time, and thus possibly in final precision.) Simulating the distribution of the summary and sufficient Potts statistic S(X) reduces to simulating this distribution with a null correlation, as exploited in Cucala and Marin (2013, JCGS, Special ICMS issue). However, there does not seem to be an efficient way to do so, i.e. without reverting to simulating the entire grid X…
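To make the role of the pre-computed expectation concrete, here is a minimal sketch of path sampling for a toy Potts model. It relies on the identity d log Z(β)/dβ = E[S(X)|β], with log Z(0) = n log q for n sites and q labels. The 3×3 grid, the Gibbs sampler settings and all function names are assumptions chosen for illustration, not the setup of the paper; the exact enumeration is only there as a check and is feasible only on tiny grids.

```python
import itertools
import numpy as np

def potts_stat(x):
    """S(x): number of like-valued horizontal and vertical neighbour pairs."""
    return int(np.sum(x[:, 1:] == x[:, :-1]) + np.sum(x[1:, :] == x[:-1, :]))

def gibbs_expected_stat(beta, shape=(3, 3), q=2, sweeps=1000, burn=100, rng=None):
    """Monte Carlo estimate of E[S(X) | beta] by single-site Gibbs sampling."""
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = shape
    x = rng.integers(q, size=shape)
    total = 0.0
    for sweep in range(burn + sweeps):
        for i in range(rows):
            for j in range(cols):
                # Count the neighbours carrying each candidate label.
                counts = np.zeros(q)
                for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                    if 0 <= a < rows and 0 <= b < cols:
                        counts[x[a, b]] += 1
                # Full conditional is proportional to exp(beta * counts).
                cdf = np.cumsum(np.exp(beta * counts))
                x[i, j] = np.searchsorted(cdf, rng.random() * cdf[-1])
        if sweep >= burn:
            total += potts_stat(x)
    return total / sweeps

def exact_moments(beta, shape=(3, 3), q=2):
    """Exact (log Z(beta), E[S | beta]) by enumerating all q**n grids."""
    n = shape[0] * shape[1]
    stats = np.array([potts_stat(np.array(c).reshape(shape))
                      for c in itertools.product(range(q), repeat=n)])
    w = np.exp(beta * stats)
    return float(np.log(w.sum())), float((stats * w).sum() / w.sum())

def path_sampling_log_z(betas, expected_stats, n_sites, q):
    """log Z on a beta grid starting at 0, by the trapezoid rule."""
    betas = np.asarray(betas, dtype=float)
    e = np.asarray(expected_stats, dtype=float)
    increments = 0.5 * (e[1:] + e[:-1]) * np.diff(betas)
    return n_sites * np.log(q) + np.concatenate(([0.0], np.cumsum(increments)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    betas = np.linspace(0.0, 1.0, 6)
    est = [gibbs_expected_stat(b, rng=rng) for b in betas]
    print("path sampling:", path_sampling_log_z(betas, est, 9, 2)[-1])
    print("exact        :", exact_moments(1.0)[0])
```

Once the curve β ↦ E[S(X)|β] is stored on a grid, log Z(β) is available by interpolation for any β in the range at negligible cost, which is why the pre-computation pays off inside an MCMC loop.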
Dirk Kroese (from UQ, Brisbane) and Joshua Chan (from ANU, Canberra) just published a book entitled Statistical Modeling and Computation, distributed by Springer-Verlag (I cannot tell which series it is part of from the cover or frontpages…) The book is intended mostly for an undergrad audience (or for graduate students with no probability or statistics background). Given that prerequisite, Statistical Modeling and Computation is fairly standard in that it recalls probability basics, the principles of statistical inference, and classical parametric models. In a third part, the authors cover “advanced models” like generalised linear models, time series and state-space models. The specificity of the book lies in the inclusion of simulation methods, in particular MCMC methods, and illustrations by Matlab code boxes. (Codes that are available on the companion website, along with R translations.) It thus has a lot in common with our Bayesian Essentials with R, meaning that I am not the most appropriate or least
biased reviewer for this book.
Last evening, I read a nice paper with the above title by Drovandi, Pettitt and McCutchan, from QUT, Brisbane. Low count refers to observations taking a small number of integer values. The idea is to mix ABC with the unbiased estimators of the likelihood proposed by Andrieu and Roberts (2009) and with particle MCMC… And even with an RJMCMC version. The special feature that makes the proposal work is that the low count feature allows for a simulation of pseudo-observations (and auxiliary variables) that may sometimes authorise an exact constraint (that the simulated observation equals the true observation). And which otherwise borrows from the Jasra et al. (2013) “alive particle” trick that turns a negative binomial draw into an unbiased estimation of the ABC target… The current paper helped me realise how powerful this trick is. (The original paper was arXived at a time I was off, so I completely missed it…) The examples studied in the paper may sound a wee bit formal, but they could lead to a better understanding of the method since alternatives could be available (?). Note that none of those examples is ABC per se, in that the tolerance is always equal to zero.
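The negative binomial idea can be illustrated on a toy case. The sketch below uses the classical unbiased estimator under inverse sampling: keep simulating pseudo-observations until N of them match the observed count exactly, and return (N − 1)/(T − 1), where T is the total number of draws. The Poisson model, the values of θ and N, and the function name are assumptions for illustration only; this is not the particle algorithm of either paper.

```python
import math
import numpy as np

def exact_match_estimate(simulate, y_obs, n_hits, rng):
    """Unbiased estimate of P(Y = y_obs) from negative binomial sampling.

    simulate(rng) returns one pseudo-observation; sampling stops at the
    n_hits-th exact match (n_hits >= 2 is needed for unbiasedness).
    """
    if n_hits < 2:
        raise ValueError("n_hits must be at least 2")
    draws = hits = 0
    while hits < n_hits:
        draws += 1
        if simulate(rng) == y_obs:
            hits += 1
    return (n_hits - 1) / (draws - 1)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    theta, y_obs = 2.0, 2  # hypothetical low-count setting
    estimates = [exact_match_estimate(lambda r: r.poisson(theta), y_obs, 5, rng)
                 for _ in range(4000)]
    exact = math.exp(-theta) * theta ** y_obs / math.factorial(y_obs)
    print("average estimate:", np.mean(estimates), "exact pmf:", exact)
```

Because the estimate is unbiased for the exact-match probability, it can stand in for the likelihood inside a pseudo-marginal MCMC step, which is where the zero tolerance comes from.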
The paper also includes reversible jump implementations. While it is interesting to see that ABC (in the authors’ sense) can be mixed with RJMCMC, it is delicate to get a feeling about the precision of the results, without a benchmark to compare to. I am also wondering about less costly alternatives like empirical likelihood and other ABC alternatives. Since Chris is visiting Warwick at the moment, I am sure we can discuss this issue next week there.
Today, Ewan Cameron arXived a paper that generalises our Robert and Marin (2010) paper on the measure theoretic difficulties (or impossibilities) of the Savage-Dickey ratio and on the possible resolutions. (A paper of mine I like very much despite it having neither impact nor quotes, whatsoever! Until this paper.) I met Ewan last year when he was completing a PhD with Tony Pettitt at QUT in astrostatistics, but he did not work on this transdimensional ABC algorithm with application to worm invasion in Northern Alberta (an arXiv paper I reviewed last week)… Ewan also runs a blog called Another astrostatistics blog, full of goodies, incl. the one where he announces he moves to… zoology in Oxford! Anyway, this note extends our paper and a mathematically valid Savage-Dickey ratio representation to the case when the posterior distributions have no density against the Lebesgue measure. For instance for Dirichlet process or Gaussian process priors. Using generic Radon-Nikodym derivatives instead. The example is somewhat artificial, superimposing a Dirichlet process prior onto the Old Faithful benchmark. But this is an interesting entry, worth mentioning, into the computation of Bayes factors. And the elusive nature of the Savage-Dickey ratio representation.
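For readers who have not met the representation, here is a sketch of the textbook case where it is unproblematic: a normal mean with known variance, a N(0, τ²) prior and a point null θ = θ0, with no nuisance parameter. The Bayes factor B01 then equals the posterior density at θ0 divided by the prior density at θ0. The model, the numerical values and the function names are assumptions for illustration; the measure-theoretic issues discussed above arise precisely because this ratio involves densities evaluated on a set of measure zero.

```python
import math

def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def savage_dickey_b01(xbar, n, sigma2, tau2, theta0=0.0):
    """Posterior over prior density at theta0, for x_i ~ N(theta, sigma2)
    and theta ~ N(0, tau2)."""
    post_var = 1.0 / (n / sigma2 + 1.0 / tau2)
    post_mean = post_var * n * xbar / sigma2
    return normal_pdf(theta0, post_mean, post_var) / normal_pdf(theta0, 0.0, tau2)

def marginal_b01(xbar, n, sigma2, tau2, theta0=0.0):
    """Direct ratio of the marginal densities of xbar under H0 and H1."""
    s = sigma2 / n
    return normal_pdf(xbar, theta0, s) / normal_pdf(xbar, 0.0, s + tau2)

if __name__ == "__main__":
    args = dict(xbar=0.3, n=20, sigma2=1.0, tau2=4.0)
    print(savage_dickey_b01(**args), marginal_b01(**args))
```

Both functions return the same number, which is the whole content of the representation in this simple setting.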
We (Kerrie Mengersen, Pierre Pudlo, and myself) have now revised our ABC with empirical likelihood paper and resubmitted it to both arXiv and PNAS as “Approximate Bayesian computation via empirical likelihood”. The main issue raised by the referees was that the potential use of the empirical likelihood (EL) approximation is much less widespread than the possibility of simulating pseudo-data, because EL essentially relies on an iid sample structure, plus the availability of parameter-defining moments. This is indeed the case to some extent and also the reason why we used a compound likelihood for our population genetic model. There are in fact many instances where we simply cannot come up with a regular EL approximation… However, the range of applications of straight EL remains wide enough to be of interest, as it includes most dynamical models like hidden Markov models. To illustrate this point further, we added (in this revision) an example borrowed from the recent Biometrika paper by David Cox and Christiana Kartsonaki (which proposes a frequentist alternative to ABC based on fractional design). This model ended up being fairly appealing wrt our perspective: while the observed data is dependent in a convoluted way, being a superposition of N renewal processes with gamma waiting times, it is possible to recover an iid structure at the same cost as a regular ABC algorithm by using the pseudo-data to recover an iid process (the sequence of renewal process indicators)… The outcome is quite favourable to ABCel in this particular case, as shown by the graph below (top: ABCel, bottom: ABC, red line: truth):
This revision (started while visiting Kerrie in Brisbane) was thus quite beneficial to our perception of ABCel in that (a) it is indeed not as universal as regular ABC and this restriction should be spelled out (the advantage being that, when it can be implemented, it usually runs much much faster!), and (b) in cases where the pseudo-data must be simulated, EL provides a reference/benchmark for the ABC output that comes for free… Now I hope to soon get out of the “initial quality check” barrage and reach the Editorial Board!
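To fix ideas on what an EL-based sampler involves, here is a minimal sketch in the simplest iid setting, where the parameter is defined by the moment condition E[X − μ] = 0. Each prior draw is weighted by its empirical likelihood, with no pseudo-data simulated. The normal toy data, the N(0, 2²) prior, the bisection solver and all names are assumptions for illustration; this is not the implementation used in the paper.

```python
import numpy as np

def log_el_mean(x, mu, n_iter=60):
    """Log empirical likelihood ratio for the mean constraint E[X - mu] = 0.

    The optimal weights are 1 / (n * (1 + lam * z_i)) with z = x - mu, where
    lam solves sum(z / (1 + lam * z)) = 0; that function is decreasing in lam,
    so the root is found by bisection.
    """
    z = np.asarray(x, dtype=float) - mu
    if z.min() >= 0 or z.max() <= 0:
        return -np.inf  # mu outside the convex hull of the data
    lo, hi = -1.0 / z.max(), -1.0 / z.min()  # keeps every 1 + lam * z_i > 0
    for _ in range(n_iter):
        lam = 0.5 * (lo + hi)
        if np.sum(z / (1.0 + lam * z)) > 0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return -float(np.sum(np.log1p(lam * z)))

def abc_el(x, prior_sampler, n_samples, rng):
    """Prior draws with normalised empirical likelihood importance weights."""
    thetas = prior_sampler(rng, n_samples)
    logw = np.array([log_el_mean(x, t) for t in thetas])
    w = np.exp(logw - logw.max())
    return thetas, w / w.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    x = rng.normal(1.0, 1.0, size=50)  # hypothetical iid data
    thetas, w = abc_el(x, lambda r, m: r.normal(0.0, 2.0, size=m), 2000, rng)
    print("EL posterior mean:", np.sum(w * thetas), "sample mean:", x.mean())
```

The weighted sample approximates the posterior without a tolerance or a summary statistic to choose, which is where the speed gain comes from when a moment condition of this kind is available.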
Last year at this time, Peter Sarnak toured Australia talking about randomness in number theory and Moebius randomness in dynamics. Recently, he pointed to a paper on the arXiv in which he claims that the distribution of [integer coordinate] points on the sphere of radius √n which satisfy
x² + y² + z² = n
is random as n goes to infinity (the paper is much more precise). You mentioned tests which look for non-randomness. How does one test for a non-random distribution of points on the sphere?
Interesting question, both for linking two AMSI Lecture tours (Peter Sarnak’s schedule sounded more gruelling than mine!) and for letting me get a look at this paper. Plus for the connection with probabilistic number theory. This paper indeed stands within the area of randomness in number theory rather than random generation and I do not see an obvious connection here, but the authors of the paper undertake a study of the randomness of the solutions to the above equation for a fixed n using statistics and their limiting distribution. (I am not certain of the way points are obtained over a square on Fig. 1; presumably this is using the spherical coordinates of the projections over the unit sphere in R³.) Their statistics are the electrostatic energy, Ripley’s point pair statistic, the nearest neighbour spacing measure, minimum spacing, and the covering radius. The most surprising feature of this study is that this randomness seems to be specific to the dimension 3 case: when increasing the number of terms in the above equation, the distribution of the solutions seems more rigid and less random…
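As a rough way to experiment with two of these statistics, the sketch below enumerates the integer solutions of x² + y² + z² = n by brute force, projects them onto the unit sphere, and computes a Riesz-type energy and the minimum spacing, next to the same quantities for uniformly drawn points. The energy normalisation (sum over ordered pairs of 1/distance^s), the choice n = 101 and the function names are assumptions for illustration and may differ from the definitions in the paper.

```python
import math
import numpy as np

def lattice_points_on_sphere(n):
    """Integer solutions of x^2 + y^2 + z^2 = n, rescaled to the unit sphere."""
    r = math.isqrt(n)
    pts = [(x, y, z)
           for x in range(-r, r + 1)
           for y in range(-r, r + 1)
           for z in range(-r, r + 1)
           if x * x + y * y + z * z == n]
    return np.array(pts, dtype=float).reshape(-1, 3) / math.sqrt(n)

def uniform_points_on_sphere(m, rng):
    """m independent uniform points on the unit sphere (normalised Gaussians)."""
    g = rng.normal(size=(m, 3))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def pairwise_distances(p):
    """Euclidean distances over all unordered pairs of rows of p."""
    d = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)
    return d[np.triu_indices(len(p), k=1)]

def electrostatic_energy(p, s=1.0):
    """Sum over ordered pairs i != j of 1 / |p_i - p_j|^s."""
    return 2.0 * float(np.sum(pairwise_distances(p) ** (-s)))

def min_spacing(p):
    """Smallest distance between two distinct points."""
    return float(pairwise_distances(p).min())

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    lattice = lattice_points_on_sphere(101)
    random_pts = uniform_points_on_sphere(len(lattice), rng)
    for name, pts in (("lattice", lattice), ("uniform", random_pts)):
        print(name, len(pts), electrostatic_energy(pts), min_spacing(pts))
```

Repeating the uniform draw many times gives a reference distribution for each statistic, against which the lattice value for a given n can be compared.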