## Archive for ABC

## ABC at sea and at war

Posted in Books, pictures, Statistics, Travel with tags ABC, Approximate Bayesian computation, Battle of the Dogger Bank, counterfactuals, crêpes, first World War, history, Jutland, naval battle, Significance, The Fog of War, wargame on July 18, 2017 by xi'an

**W**hile preparing crêpes at home last night, I browsed through the most recent issue of Significance and, among many goodies, I spotted an article by McKay and co-authors discussing the simulation of a British vs. German naval battle from the First World War I had never heard of, the Battle of the Dogger Bank. The article was illustrated by a few historical pictures, but I quickly came across a more statistical description of the problem, which was not about creating wargames and alternate realities but rather about inferring the likelihood of the actual outcome, i.e., whether the naval battle outcome [which could be seen as a British victory, ending up with 0 to 1 sunk boat] was a lucky strike or to be expected. And the method behind solving this question was indeed both Bayesian and ABC-esque! I did not read the longer paper by McKay et al. (hard to do while flipping crêpes!) but the description in Significance was clear enough to understand that the six summary statistics used in this ABC implementation were the number of shots, hits, and lost turrets for both sides. (The answer to the original question is that indeed the British fleet was lucky to keep all its boats afloat. But it is also unlikely another score would have changed the outcome of WWI.) [As I found in this other history paper, ABC seems quite popular in historical inference! And there is another completely unrelated arXived paper with main title The Fog of War…]

## g-and-k [or -h] distributions

Posted in Statistics with tags ABC, ABC de Sevilla, benchmark, Dennis Prangle, g-and-k distributions, MCMC, numerical derivation, numerical integration, quantile distribution, Wasserstein distance on July 17, 2017 by xi'an

**D**ennis Prangle released last week an R package called gk and an associated arXived paper for running inference on the g-and-k and g-and-h quantile distributions. As should be clear from an earlier review of Karian’s and Dudewicz’s book on quantile distributions, I am not particularly fond of those distributions, whose construction seems very artificial to me, as mostly based on the production of a closed-form quantile function. But I agree they provide a neat benchmark for ABC methods, if nothing else. However, as recently pointed out in our Wasserstein paper with Espen Bernton, Pierre Jacob and Mathieu Gerber, and explained in a post of Pierre’s on Statisfaction, the pdf can be easily constructed by numerical means, hence allows for an MCMC resolution, which is also a point made by Dennis in his paper. He uses the closed-form derivative of the Normal form of the distribution [i.e., applied to Φ(x)] so that numerical differentiation is not necessary.
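To make the "pdf by numerical means" remark concrete, here is a minimal Python sketch, not taken from the gk package or from either paper. It assumes the usual g-and-k parameterisation x = a + b(1 + c tanh(gz/2))(1 + z²)^k z with z standard Normal and the conventional c = 0.8, and it assumes parameter values for which this map is increasing in z. All function names are hypothetical. The derivative in z is written in closed form, so only the inversion of the map is numerical (plain bisection).

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def gk_z_to_x(z, a, b, g, k, c=0.8):
    """Map a standard Normal value z to a g-and-k value (assumed parameterisation)."""
    return a + b * (1.0 + c * math.tanh(g * z / 2.0)) * (1.0 + z * z) ** k * z

def gk_quantile(u, a, b, g, k, c=0.8):
    """Quantile function: apply the map to the Normal quantile of u in (0, 1)."""
    return gk_z_to_x(_STD_NORMAL.inv_cdf(u), a, b, g, k, c)

def gk_dz(z, a, b, g, k, c=0.8):
    """Closed-form derivative of gk_z_to_x in z (product rule over three factors)."""
    t = math.tanh(g * z / 2.0)
    s = (1.0 + z * z) ** k
    d_t = (g / 2.0) * (1.0 - t * t)                   # derivative of tanh(g z / 2)
    d_s = 2.0 * k * z * (1.0 + z * z) ** (k - 1.0)    # derivative of (1 + z^2)^k
    return b * (c * d_t * s * z + (1.0 + c * t) * (d_s * z + s))

def gk_pdf(x, a, b, g, k, c=0.8):
    """Density at x: solve gk_z_to_x(z) = x by bisection, then change variables."""
    lo, hi = -1.0, 1.0
    while gk_z_to_x(lo, a, b, g, k, c) > x:           # widen the bracket to the left
        lo *= 2.0
    while gk_z_to_x(hi, a, b, g, k, c) < x:           # widen the bracket to the right
        hi *= 2.0
    for _ in range(100):                              # bisection on an increasing map
        mid = 0.5 * (lo + hi)
        if gk_z_to_x(mid, a, b, g, k, c) < x:
            lo = mid
        else:
            hi = mid
    z = 0.5 * (lo + hi)
    return _STD_NORMAL.pdf(z) / gk_dz(z, a, b, g, k, c)
```

With g = 0 and k = 0 the map reduces to x = a + bz, so `gk_pdf` should agree with the Normal density, which makes for a cheap sanity check before plugging the function into an MCMC sampler.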

## MCM 2017 snapshots [#2]

Posted in Books, pictures, Running, Statistics, University life with tags ABC, ABC consistency, abcrf, Art Owen, GNU C library, MCM 2017, Mersenne Twisters, Monte Carlo Statistical Methods, Montréal, R, random forests, ratio of uniform algorithm on July 7, 2017 by xi'an

**O**n the second day of MCM 2017, Emmanuel Gobet (from Polytechnique) gave the morning plenary talk on regression Monte Carlo methods, where he presented several ways of estimating conditional means of rv’s in nested problems where conditioning involves other conditional expectations. While interested in such problems in connection with ABC, I could not see how the techniques developed therein could apply to said problems.

By some random chance, I ended up attending a hard-core random generation session where the speakers were discussing discrepancies between GNU library generators [I could not understand the target of interest and using MCMC till convergence seemed prone to false positives!], and failed statistical tests of some 64-bit Mersenne Twisters, and low discrepancy on-line subsamples of Uniform samples. Most exciting of all, Josef Leydold gave a talk on ratio-of-uniforms, on which I spent some time a while ago (till ending up reinventing the wheel!), with highly refined cuts of the original box.
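For readers unfamiliar with ratio-of-uniforms, here is a minimal Python sketch of the textbook version for a standard Normal target, using the plain bounding box and none of the refined cuts discussed in the talk. The function name is hypothetical. With f(x) = exp(-x²/2), a pair (u, v) uniform over the region 0 < u ≤ √f(v/u) yields v/u distributed from f, and this region fits in the box (0, 1] × [-√(2/e), √(2/e)].

```python
import math
import random

def rou_standard_normal(rng):
    """One N(0,1) draw by ratio-of-uniforms, rejecting from the plain bounding box."""
    v_max = math.sqrt(2.0 / math.e)        # sup of |x| * sqrt(f(x)), reached at |x| = sqrt(2)
    while True:
        u = 1.0 - rng.random()             # uniform on (0, 1], so u is never zero
        v = rng.uniform(-v_max, v_max)
        x = v / u
        if x * x <= -4.0 * math.log(u):    # equivalent to u <= sqrt(exp(-x^2 / 2))
            return x

rng = random.Random(2017)
draws = [rou_standard_normal(rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, var)                           # should be close to 0 and 1
```

The refinements mentioned above amount to replacing this rectangle with a tighter envelope of the acceptance region, which lowers the rejection rate without changing the output distribution.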

My own 180 slides [for a 50mn talk] somewhat worried my chairman, Art Owen, who kindly enquired the day before about the likelihood I could go through all 184 of them!!! I had appended the ABC convergence slides to an earlier set of slides on ABC with random forests in case of questions about that aspect, although I did not plan to go through those slides [and I mostly covered the 64 other slides]. As the talk was in fine more about an inference method than a genuine Monte Carlo technique, plus involved random forests that sounded unfamiliar to many, I did not get many questions from the audience but had several deep discussions with people after the talk. Incidentally, we have just reposted our paper on ABC estimation via random forests, updated the abcrf R package, and submitted it to Peer Community in Evolutionary Biology!

## MCM 2017

Posted in Statistics with tags ABC, ABC algorithm, ABC consistency, Bayesian model choice, curse of dimensionality, Hilbert curve, MCM 2017, Montréal, population genetics, Québec, random forests, summary statistics, Wasserstein distance on July 3, 2017 by xi'an

## exciting week[s]

Posted in Mountains, pictures, Running, Statistics with tags ABC, ABC validation, École Normale Supérieure, Bayesian nonparametrics, BNP11, Domaine Coste Moynier, Grés de Montpellier, mixtures of distributions, PCI Comput Stats, PCI Evol Biol, Peer Community, Pic Saint Loup, Saint Christol, Université de Montpellier, Wasserstein distance on June 27, 2017 by xi'an

**T**he past week was quite exciting, despite the heat wave that hit Paris and kept me from sleeping and running! First, I made a two-day visit to Jean-Michel Marin in Montpellier, where we discussed the potential Peer Community In Computational Statistics (*PCI Comput Stats*) with the people behind PCI Evol Biol at INRA, hopefully taking shape in the coming months! And went one evening through a few vineyards in Saint Christol with Jean-Michel and Arnaud. Including a long chat with the owner of Domaine Coste Moynier. *[Whose domain includes the above parcel with views of Pic Saint-Loup.]* And last but not least, some work planning about approximate MCMC.

On top of this, we submitted our paper on ABC with Wasserstein distances [to be arXived in an extended version in the coming weeks], resubmitted our revised paper on ABC consistency, thanks to highly constructive comments from the editorial board, which induced a much improved version in my opinion, and received a very positive return from JCGS for our paper on weak priors for mixtures! Next week should be exciting as well, with BNP 11 taking place in downtown Paris, at École Normale!!!

## ACDC versus ABC

Posted in Books, Kids, pictures, Statistics, Travel with tags ABC, ACC, ACDC, Bayesian inference, frequentist coverage, Harvard University on June 12, 2017 by xi'an

**A**t the Bayes, Fiducial and Frequentist workshop last month, I discussed with the authors of this newly arXived paper, Approximate confidence distribution computing, Suzanne Thornton and Min-ge Xie. Which they abbreviate as ACC and not as ACDC. While I have discussed the notion of confidence distribution in some earlier posts, this paper aims at producing proper frequentist coverage within a likelihood-free setting. Given the proximity with our recent paper on the asymptotics of ABC, as well as with Li and Fearnhead’s (2016) parallel endeavour, it is difficult (for me) to spot the actual distinction between ACC and ABC, given that we also achieve (asymptotically) proper coverage when the limiting ABC distribution is Gaussian, which is the case for a tolerance decreasing quickly enough to zero (in the sample size).

“Inference from the ABC posterior will always be difficult to justify within a Bayesian framework.”

Indeed the ACC setting is eerily similar to ABC apart from the potential of the generating distribution to be data dependent. (Which is fine when considering that the confidence distributions have no Bayesian motivation but are a tool to ensure proper frequentist coverage.) That it is “able to offer theoretical support for ABC” (p.5) is unclear to me, given both this data dependence and the constraints it imposes on the [sampling and algorithmic] setting. Similarly, I do not understand how the authors “are not committing the error of doubly using the data” (p.5) and why they should be concerned about it, standing outside the Bayesian framework. If the prior involves the data as in the Cauchy location example, it literally *uses* the data [once], followed by an ABC comparison between simulated and actual data, that *uses* the data [a second time].

“Rather than engaging in a pursuit to define a moving target such as [a range of posterior distributions], ACC maintains a consistently clear frequentist interpretation (…) and thereby offers a consistently cohesive interpretation of likelihood-free methods.”

The frequentist coverage guarantee comes from a bootstrap-like assumption that [with tolerance equal to zero] the distribution of the ABC/ACC/ACDC random parameter around an estimate of the parameter *given* the summary statistic is identical to the [frequentist] distribution of this estimate around the true parameter [given the true parameter, although this conditioning makes no sense outside a Bayesian framework]. (There must be a typo in the paper when the authors define [p.10] the estimator as minimising the derivative of the density of the summary statistic, while still calling it an MLE.) That this bootstrap-like assumption holds is established (in Theorem 1) under a CLT on this MLE and assumptions on the data-dependent proposal that connect it to the density of the summary statistic. A connection that seems to imply a data-dependence as well as a certain knowledge about this density. What I find most surprising in this derivation is the total absence of conditions or even discussion on the tolerance level which, as we have shown, is paramount to the validation or invalidation of ABC inference. It sounds like the authors of Approximate confidence distribution computing are setting ε equal to zero for those theoretical derivations. While in practice they apply rules [for choosing ε] they do not spell out, but which result in very different acceptance rates for the ACC version they oppose to an ABC version. (In all illustrations, it seems that ε=0.1, which does not make much sense.) All in all, I am thus rather skeptical about the practical implications of the paper in that it seems to achieve confidence guarantees by first assuming proper if implicit choices of summary statistics and parameter generating distribution.
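To illustrate why the tolerance matters, here is a minimal Python sketch of plain ABC rejection on a toy model that is mine and not from the ACC paper: a Normal mean with unit variance, a N(0, 3²) prior, and the sample mean as summary statistic. The function name and all settings are assumptions made for the illustration. Running the same sampler with two values of ε shows how the acceptance rate, and hence the approximation being sampled, moves with the tolerance.

```python
import random
import statistics

def abc_rejection(observed, epsilon, n_draws, rng, prior_sd=3.0):
    """Keep prior draws whose simulated sample mean is within epsilon of the observed one."""
    n = len(observed)
    s_obs = statistics.fmean(observed)                 # observed summary statistic
    accepted = []
    for _ in range(n_draws):
        theta = rng.gauss(0.0, prior_sd)               # draw from the (toy) prior
        simulated = [rng.gauss(theta, 1.0) for _ in range(n)]
        if abs(statistics.fmean(simulated) - s_obs) <= epsilon:
            accepted.append(theta)
    return accepted

rng = random.Random(2017)
observed = [rng.gauss(1.0, 1.0) for _ in range(20)]    # pseudo-data with true mean 1
for eps in (1.0, 0.1):
    sample = abc_rejection(observed, eps, 5000, rng)
    # acceptance rate and spread of the accepted values both shrink with epsilon
    print(eps, len(sample) / 5000, statistics.pstdev(sample))
```

The accepted values are exact draws from the posterior only in the limit ε = 0, so any coverage statement about them has to come with a rule for how ε decreases with the sample size.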

## fast ε-free ABC

Posted in Books, Mountains, pictures, Running, Statistics, Travel, University life with tags ABC, ABC in Edinburgh, Arthur's Seat, arXiv, Edinburgh, empirical likelihood, Gaussian mixture, neural network, non-parametrics, Scotland on June 8, 2017 by xi'an

**L**ast Fall, George Papamakarios and Iain Murray from Edinburgh arXived an ABC paper on fast ε-free inference on simulation models with Bayesian conditional density estimation, a paper that I missed. The idea there is to approximate the posterior density by maximising the likelihood associated with a parameterised family of distributions on θ, conditional on the associated x. The data being then the ABC reference table. The family chosen there is a mixture of K Gaussian components, whose parameters are then estimated by a (Bayesian) neural network using x as input and θ as output. The parameter values are simulated from an adaptive proposal that aims at approximating the posterior better and better. As in population Monte Carlo, actually. Except for the neural network part, for which I fail to understand why it brings a significant improvement when compared with EM solutions. The overall difficulty with this approach is that I do not see a way out of the curse of dimensionality: when the dimension of θ increases, the approximation to the posterior distribution of θ does deteriorate, even in the best of cases, as with any other non-parametric resolution. It would have been of (further) interest to see a comparison with a most rudimentary approach, namely the one we proposed based on empirical likelihoods.