A paper on ABC I read on my way back from Cambodia: Yanzhi Chen and Michael Gutmann arXived an ABC [in Edinburgh] paper on learning the target via Gaussian copulas, to be presented at AISTATS this year (in Okinawa!). Linking post-processing (regression) ABC and sequential ABC. The drawback in the regression approach is that the correction often relies on a homogeneity assumption on the distribution of the noise or residual, since this approach only applies a drift to the original simulated sample. Their method is based on two stages: a coarse-grained one, where the posterior is approximated by ordinary linear regression ABC, and a fine-grained one, which uses the above coarse Gaussian version as a proposal and returns a Gaussian copula estimate of the posterior. This proposal is somewhat similar to the neural network approach of Papamakarios and Murray (2016). And to the Gaussian copula version of Li et al. (2017). The major difference being the presence of two stages. The new method is compared with other ABC proposals at a fixed simulation cost, which does not account for the construction costs, although they should be relatively negligible. To compare these ABC avatars, the authors use a symmetrised Kullback-Leibler divergence I had not met previously, requiring a massive numerical integration (although this is not an issue for the practical implementation of the method, which only calls for the construction of the neural network(s)). Note also that sequential ABC is only run for two iterations, and also that none of the importance sampling ABC versions of Fearnhead and Prangle (2012) and of Li and Fearnhead (2018) are considered, all versions relying on the same vector of summary statistics with a dimension much larger than the dimension of the parameter. Except in our MA(2) example, where regression does as well.
I wonder about the impact of the dimension of the summary statistic on the performance of the neural network, i.e., whether or not it is able to manage the curse of dimensionality by ignoring all but the essential data statistics in the optimisation.
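To make the "drift" remark about regression ABC concrete, here is a minimal sketch of a linear regression adjustment on a toy Normal mean model. This is not the two-stage copula method of the paper: the uniform prior, the sample-mean summary, the sample sizes and the function name `regression_abc` are all assumptions picked for illustration.

```python
import random
import statistics


def regression_abc(s_obs, n_obs=20, n_sim=5000, n_keep=500, seed=1):
    """Toy regression-adjusted ABC (all settings here are illustrative assumptions)."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sim):
        theta = rng.uniform(-5.0, 5.0)  # assumed flat prior on the mean
        # summary statistic: average of n_obs draws from N(theta, 1)
        s = statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n_obs))
        sims.append((theta, s))
    # plain ABC step: keep the simulations whose summary is closest to s_obs
    sims.sort(key=lambda pair: abs(pair[1] - s_obs))
    kept = sims[:n_keep]
    thetas = [t for t, _ in kept]
    devs = [s - s_obs for _, s in kept]
    # least-squares slope of theta on (s - s_obs) over the accepted sample
    t_bar = statistics.fmean(thetas)
    d_bar = statistics.fmean(devs)
    slope = sum((d - d_bar) * (t - t_bar) for t, d in zip(thetas, devs)) / sum(
        (d - d_bar) ** 2 for d in devs
    )
    # the adjustment is a pure shift of each accepted theta: the residuals are
    # left untouched, which is where the homogeneity assumption enters
    adjusted = [t - slope * d for t, d in zip(thetas, devs)]
    return thetas, adjusted


if __name__ == "__main__":
    raw, adj = regression_abc(s_obs=1.0)
    print("raw      mean/var:", statistics.fmean(raw), statistics.variance(raw))
    print("adjusted mean/var:", statistics.fmean(adj), statistics.variance(adj))
```

In this toy case the adjusted sample should be noticeably tighter than the raw accepted sample, since the shift removes the spread due to the tolerance on the summary.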

## Archive for University of Edinburgh

## adaptive copulas for ABC

Posted in Statistics with tags ABC, ABC in Edinburgh, ABC-SMC, curse of dimensionality, Gaussian copula, neural network, post-processing, sequential ABC, University of Edinburgh on March 20, 2019 by xi'an

## from Arthur’s Seat [spot ISBA participants]

Posted in Mountains, pictures, Running, Travel with tags ABC in Edinburgh, Arthur's Seat, capital city, conference, Edinburgh, Holyrood, ISBA 2018, Lothians, morning run, Scotland, sunrise, University of Edinburgh, volcanoes on June 27, 2018 by xi'an

## Bayesian GANs [#2]

Posted in Books, pictures, R, Statistics with tags ABC in Edinburgh, Bayesian GANs, compatible conditional distributions, Edinburgh, GANs, generative adversarial networks, ISBA 2018, joint posterior, MCMC convergence, Metropolis-within-Gibbs algorithm, Monte Carlo Statistical Methods, normal model, University of Edinburgh on June 27, 2018 by xi'an

**A**s an illustration of the lack of convergence of the Gibbs sampler applied to the two “conditionals” defined in the Bayesian GANs paper discussed yesterday, I took the simplest possible example of a Normal mean generative model (one parameter) with a logistic discriminator (one parameter) and implemented the scheme (during an ISBA 2018 session). With flat priors on both parameters. And a Normal random walk as Metropolis-Hastings proposal. As expected, since there is no stationary distribution associated with the Markov chain, simulated chains do not exhibit a stationary pattern,

And they eventually reach an overflow error or a trapping state as the log-likelihood gets approximately to zero (red curve).
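For readers wishing to replay the experiment, here is a hedged Python sketch of such a Metropolis-within-Gibbs scheme. The exact "conditionals" are not reproduced in this post, so the two log-targets below are an assumption modelled on the usual GAN objectives: a generator x = θ_g + z, a discriminator D(x) = 1/(1+exp(−θ_d x)), flat priors, and fresh noise z at every sweep. Function names, the random walk scale and the starting values are hypothetical.

```python
import math
import random


def log_sigmoid(u):
    """Numerically stable log(1 / (1 + exp(-u)))."""
    if u >= 0:
        return -math.log1p(math.exp(-u))
    return u - math.log1p(math.exp(u))


def log_cond_generator(theta_g, theta_d, z):
    # assumed "conditional" of the generator: prod_i D(theta_g + z_i), flat prior
    return sum(log_sigmoid(theta_d * (theta_g + zi)) for zi in z)


def log_cond_discriminator(theta_d, theta_g, z, x):
    # assumed "conditional" of the discriminator:
    # prod_i D(x_i) * prod_i (1 - D(theta_g + z_i)), using 1 - sigmoid(u) = sigmoid(-u)
    real = sum(log_sigmoid(theta_d * xi) for xi in x)
    fake = sum(log_sigmoid(-theta_d * (theta_g + zi)) for zi in z)
    return real + fake


def metropolis_within_gibbs(x, n_iter=2000, scale=0.5, seed=0):
    rng = random.Random(seed)
    theta_g, theta_d = 0.0, 0.0  # arbitrary starting values
    chain = []
    for _ in range(n_iter):
        z = [rng.gauss(0.0, 1.0) for _ in x]  # fresh generator noise (assumption)
        # Normal random walk Metropolis step on the generator parameter
        prop = theta_g + rng.gauss(0.0, scale)
        log_ratio = log_cond_generator(prop, theta_d, z) - log_cond_generator(theta_g, theta_d, z)
        if math.log(1.0 - rng.random()) < log_ratio:
            theta_g = prop
        # Normal random walk Metropolis step on the discriminator parameter
        prop = theta_d + rng.gauss(0.0, scale)
        log_ratio = (log_cond_discriminator(prop, theta_g, z, x)
                     - log_cond_discriminator(theta_d, theta_g, z, x))
        if math.log(1.0 - rng.random()) < log_ratio:
            theta_d = prop
        chain.append((theta_g, theta_d, log_cond_discriminator(theta_d, theta_g, z, x)))
    return chain


if __name__ == "__main__":
    data_rng = random.Random(42)
    data = [data_rng.gauss(1.0, 1.0) for _ in range(50)]  # hypothetical dataset
    for theta_g, theta_d, loglik in metropolis_within_gibbs(data)[::400]:
        print(round(theta_g, 3), round(theta_d, 3), round(loglik, 3))
```

The stable `log_sigmoid` only postpones the numerical overflow; it does nothing about the absence of a stationary distribution, so plotting the stored chains is the way to look for the drifting behaviour.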

Too bad I missed the talk by Shakir Mohammed yesterday, being stuck on the Edinburgh by-pass at rush hour!, as I would have loved to hear his views about this rather essential issue…

## Alan Turing Institute

Posted in Books, pictures, Running, Statistics, University life with tags Alan Turing, Alan Turing Institute, British Library, London, UCL, United Kingdom, University of Cambridge, University of Edinburgh, University of Oxford, University of Warwick on February 10, 2015 by xi'an

**T**he University of Warwick is one of the five UK Universities (Cambridge, Edinburgh, Oxford, Warwick and UCL) to be part of the new Alan Turing Institute. To quote from the University press release, “The Institute will build on the UK’s existing academic strengths and help position the country as a world leader in the analysis and application of big data and algorithm research. Its headquarters will be based at the British Library at the centre of London’s Knowledge Quarter.” The Institute will gather researchers from mathematics, statistics, computer sciences, and connected fields towards collegial and focussed research, which means in particular that it will hire a fairly large number of researchers in stats and machine-learning in the coming months. The Department of Statistics at Warwick was strongly involved in answering the call for the Institute and my friend and colleague Mark Girolami will be the University’s leading figure at the Institute, alas meaning that we will meet even less frequently! Note that the call for the Chair of the Alan Turing Institute is now open, with deadline on March 15. [As a personal aside, I find that the recognition by the Business Secretary that “Alan Turing’s genius played a pivotal role in cracking the codes that helped us win the Second World War. It is therefore only right that our country’s top universities are chosen to lead this new institute named in his honour.” does not absolve the legal system that drove Turing to suicide….]

## brief stop in Edinburgh

Posted in Mountains, pictures, Statistics, Travel, University life, Wines with tags ABC, ABC model choice, Edinburgh, Fort William, quantile regression, random forests, Scotland, The Grog & Gruel, University of Edinburgh on January 24, 2015 by xi'an

**Y**esterday, I was all too briefly in Edinburgh for a few hours, to give a seminar in the School of Mathematics, on the random forests approach to ABC model choice (that was earlier rejected). (The slides are almost surely identical to those used at the NIPS workshop.) One interesting question at the end of the talk was on the potential bias in the posterior predictive expected loss, bias against some model from the collection of models being evaluated for selection. In the sense that the array of summaries used by the random forest could fail to capture features of a particular model and hence discriminate against it. While this is correct, there is no fundamental difference with implementing a posterior probability based on the same summaries. And the posterior predictive expected loss offers the advantage of testing, that is, for representative simulations from each model, of returning the corresponding model prediction error to highlight poor performances on some models. A further discussion over tea led me to ponder whether or not we could expand the use of random forests to Bayesian quantile regression. However, this would imply a monotonicity structure on a collection of random forests, which sounds daunting…

My stay in Edinburgh was quite brief as I drove to the Highlands after the seminar, heading to Fort William. Although the weather was rather ghastly, the traffic was fairly light and I managed to get there unscathed, without hitting any of the deer of Rannoch Moor (saw one dead by the side of the road though…) or the snow banks of the narrow roads along Loch Lubnaig. And, as usual, it still was a pleasant feeling to drive through those places associated with climbs and hikes, Crianlarich, Tyndrum, Bridge of Orchy, and Glencoe. And to get in town early enough to enjoy a quick dinner at The Grog & Gruel, reflecting I must have had half a dozen dinners there with friends (or not) over the years. And drinking a great heather ale to them!

## how many modes in a normal mixture?

Posted in Books, Kids, Statistics, University life with tags Chris Williams, EM algorithm, Miguel Carrera-Perpinan, mixture estimation, modes of a mixture, Scotland, University of Edinburgh on January 7, 2015 by xi'an

**A**n interesting question I spotted on Cross Validated today: How to tell if a mixture of Gaussians will be multimodal? Indeed, there is no known analytical condition on the parameters of a fully specified k-component mixture for the modes to number k or less than k… Googling around, I immediately came upon this webpage by Miguel Carrera-Perpinan, who studied the issue with Chris Williams when writing his PhD in Edinburgh. And upon this paper, which not only shows that

- unidimensional Gaussian mixtures with k components have at most k modes;
- unidimensional non-Gaussian mixtures with k components may have more than k modes;
- multidimensional mixtures with k components may have more than k modes.

but also provides ways of finding all the modes. Ways which seem to reduce to using EM from a wide variety of starting points (an EM algorithm set in the sampling rather than in the parameter space since all parameters are set!). Maybe starting EM from each mean would be sufficient. I still wonder if there are better ways, from letting the variances decrease down to zero until a local mode appears, to using some sort of simulated annealing…
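As a rough illustration of the "EM in the sampling space" idea, here is a sketch for a unidimensional Gaussian mixture. It iterates the fixed-point equation obtained by setting the derivative of the mixture density to zero, starting from each component mean. The function names, tolerances and the merging rule are my own assumptions, and nothing here guarantees that all modes are recovered.

```python
import math


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def climb_to_mode(x, weights, means, sigmas, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration x <- sum_k r_k mu_k / s_k^2 / sum_k r_k / s_k^2,
    with r_k proportional to w_k N(x; mu_k, s_k^2)."""
    for _ in range(max_iter):
        resp = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
        num = sum(r * m / s ** 2 for r, m, s in zip(resp, means, sigmas))
        den = sum(r / s ** 2 for r, s in zip(resp, sigmas))
        new_x = num / den
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


def find_modes(weights, means, sigmas, merge_tol=1e-6):
    """Start the iteration from each component mean and merge coinciding limits."""
    modes = []
    for start in means:
        mode = climb_to_mode(start, weights, means, sigmas)
        if all(abs(mode - other) > merge_tol for other in modes):
            modes.append(mode)
    return sorted(modes)


if __name__ == "__main__":
    # well-separated components versus overlapping components
    print(find_modes([0.5, 0.5], [-3.0, 3.0], [1.0, 1.0]))
    print(find_modes([0.5, 0.5], [0.0, 1.0], [1.0, 1.0]))
```

The iteration can be slow when components overlap heavily, in which case the crude merging tolerance may report near-duplicate modes.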

**Edit:** Following comments, let me stress this is not a statistical issue in that the parameters of the mixture are set and known and there is no observation(s) from this mixture from which to estimate the number of modes. The mathematical problem is to determine how many local maxima there are for the function