**T**oday, Pierre Jacob posted on arXiv a paper of ours on the use of the Wasserstein distance in statistical inference, whose main focus is exploiting this distance to create an automated measure of discrepancy for ABC. Hence the full title, Inference in generative models using the Wasserstein distance, with generative standing for the case when a model can be simulated from but cannot be associated with a closed-form likelihood. We had all discussed this notion together, with much excitement, when I visited Harvard and Pierre last March. (While I have not contributed much more than that round of discussions and ideas to the paper, the authors kindly included me!) The paper contains theoretical results on the consistency of statistical inference based on those distances, as well as computational results on how these distances can be computed in practice and on how the Hilbert space-filling curve used in sequential quasi-Monte Carlo can help. The notion further extends to dependent data via delay reconstruction and residual reconstruction techniques (as we did for some models in our empirical likelihood BCel paper). I am quite enthusiastic about this approach and look forward to discussing it at the 17w5025 BIRS ABC workshop, next month!
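To make the idea of a Wasserstein-based ABC discrepancy concrete, here is a minimal sketch of rejection ABC on a toy one-dimensional problem. It is not the algorithm of the paper: it relies on the standard fact that, for two univariate samples of equal size, the p-Wasserstein distance between their empirical distributions is obtained by matching sorted values. The function names (`wasserstein_1d`, `abc_rejection`), the Gaussian location model, the uniform prior and the keep-the-closest-draws rule are all assumptions made for the illustration.

```python
import random


def wasserstein_1d(x, y, p=1):
    """p-Wasserstein distance between two equal-size univariate samples.

    In one dimension the optimal coupling matches order statistics,
    so sorting both samples is enough.
    """
    if len(x) != len(y):
        raise ValueError("this sketch only handles samples of equal size")
    xs, ys = sorted(x), sorted(y)
    total = sum(abs(a - b) ** p for a, b in zip(xs, ys))
    return (total / len(xs)) ** (1.0 / p)


def abc_rejection(observed, simulate, sample_prior, n_draws, n_keep, rng):
    """Keep the n_keep prior draws whose synthetic data are closest
    to the observed data in Wasserstein distance (no summary statistics)."""
    scored = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        synthetic = simulate(theta, len(observed), rng)
        scored.append((wasserstein_1d(observed, synthetic), theta))
    scored.sort(key=lambda pair: pair[0])
    return [theta for _, theta in scored[:n_keep]]


# Toy example (hypothetical): Gaussian location model with unit variance,
# true location 2, and a uniform prior on (-5, 5).
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(100)]
accepted = abc_rejection(
    observed,
    simulate=lambda theta, n, r: [r.gauss(theta, 1.0) for _ in range(n)],
    sample_prior=lambda r: r.uniform(-5.0, 5.0),
    n_draws=2000,
    n_keep=100,
    rng=rng,
)
posterior_mean = sum(accepted) / len(accepted)
```

The distance is computed on the full samples, so no summary statistic has to be chosen. In more than one dimension sorting is no longer available, which is where transport solvers or the Hilbert-curve ordering mentioned above would come in.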

## Archive for 17w5025

## inference with Wasserstein distance

Posted in Books, Statistics, University life with tags 17w5025, adaptive Monte Carlo algorithm, Banff, BIRS, Canada, empirical distribution, Harvard University, numerical transport, optimal transport, statistical inference, synthetic data, Wasserstein distance on January 23, 2017 by xi'an

## rare events for ABC

Posted in Books, Mountains, pictures, Statistics, Travel, University life with tags 17w5025, ABC, Banff, Banff International Research Station, latent variable models, pseudo-marginal MCMC, rare events, SMC on November 24, 2016 by xi'an

**D**ennis Prangle, Richard G. Everitt and Theodore Kypraios just arXived a new paper on ABC, aiming at handling high dimensional data with latent variables, thanks to a cascading (or nested) approximation of the probability of a near coincidence between the observed data and the ABC simulated data. The approach amalgamates a rare event simulation method based on SMC, pseudo-marginal Metropolis-Hastings and of course ABC. The *rare* event is the near coincidence of the observed summary and of a simulated summary. This is so rare that regular ABC is forced to accept not so near coincidences, especially as the dimension increases. I mentioned *nested* above on purpose because I find that the rare event simulation method of Cérou et al. (2012) has a nested sampling flavour, in that each move of the particle system (in the sample space) is done according to a constrained MCMC move, the constraint being derived from the distance between observed and simulated samples. Finding an efficient move of that kind may prove difficult or impossible. The authors opt for a slice sampler proposed by Murray and Graham (2016). However, they assume that the distribution of the latent variables is uniform over a unit hypercube, an assumption I do not fully understand. For the pseudo-marginal aspect, note that while the approach produces a better and faster evaluation of the likelihood, it remains an ABC likelihood and not the original likelihood. Because the estimate of the ABC likelihood is monotonic in the number of terms, a proposal can be terminated earlier without inducing a bias in the method.

This is certainly an innovative approach of clear interest and I hope we will discuss it at length at our BIRS ABC 17w5025 workshop next February. At this stage of light reading, I am slightly overwhelmed by the combination of so many computational techniques into a single algorithm. The authors argue there is very little calibration involved, but so many steps must depend on as many configuration choices.
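As a footnote on the early termination remark above, here is a minimal sketch of why a monotonic estimate allows for stopping a proposal early at no cost. It assumes, purely for illustration, that the likelihood estimate is a running product of factors in [0, 1], hence can only decrease as terms are added, and that the uniform variate of the Metropolis-Hastings step is drawn before the estimate is computed. The function name and the `threshold` argument (the uniform draw multiplied by whatever sits on the current-state side of the acceptance ratio) are hypothetical, not taken from the paper.

```python
def early_reject_product(factors, threshold):
    """Multiply factors in [0, 1] one at a time and stop as soon as the
    running product falls below the threshold.

    Since later factors can only decrease the product, stopping early
    gives the same accept/reject decision as computing the full product.
    Returns (accepted, number_of_factors_used).
    """
    product = 1.0
    used = 0
    for factor in factors:
        if not 0.0 <= factor <= 1.0:
            raise ValueError("factors must lie in [0, 1]")
        product *= factor
        used += 1
        if product < threshold:
            # The final product can only be smaller: reject now.
            return False, used
    return True, used


# Hypothetical numbers: the third factor is never needed in the first call.
rejected = early_reject_product([0.5, 0.5, 0.5], threshold=0.3)
accepted = early_reject_product([0.9, 0.9], threshold=0.5)
```

Passing a generator as `factors` means the remaining (possibly expensive) terms are never computed once the decision is settled.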