## Archive for NIPS

## Dirichlet process mixture inconsistency

Posted in Books, Statistics with tags Dirichlet process, mixtures of distributions, NIPS, overfitting, unknown number of components on February 15, 2016 by xi'an

**J**udith Rousseau pointed out to me this NIPS paper by Jeff Miller and Matthew Harrison on the possible inconsistency of Dirichlet mixture priors for estimating the (true) number of components in a (true) mixture model. The resulting posterior on the number of components does not concentrate on the right number. Which is not the case when setting a prior on the unknown number of components of a mixture, where consistency occurs. (The inconsistency results established in the paper actually focus on iid Gaussian observations, for which the estimated number of Gaussian components is almost never equal to 1.) In a more recent arXiv paper, they also show that a Dirichlet prior on the weights combined with a prior on the number of components can still reproduce the same features as a Dirichlet process mixture prior. Even the stick-breaking representation! (A paper that I already reviewed last Spring.)

## delayed in London [CFE 2015]

Posted in pictures, Statistics, Travel, University life with tags ABC, CFE 2015, delayed acceptance, econometrics, England, London, NIPS, random forests, Sunday, United Kingdom on December 13, 2015 by xi'an

**T**oday I am giving a talk at the 9th International Conference on Computational and Financial Econometrics (CFE 2015), in London. The number of parallel sessions there is astounding, which makes me [now] wonder at the appeal of such a large conference and the pertinence of giving a talk in parallel with so many others: I end up speaking at the same time as Pierre Pudlo, who is presenting our ABC with random forests paper (in the twin CMStatistics 2015!). While I may sound overly pessimistic, or just peeved from missing the second day of workshops at NIPS, there is no reason to doubt the quality of the talks, given the list of authors (and friends) there. So I am looking forward to seeing what I can get from this multipurpose econometrics and statistics conference.

## Je reviendrai à Montréal [D-2]

Posted in pictures, Statistics, Travel, University life with tags ABC, ABC in Montréal, Approximate Bayesian computation, Bayesian inference, Canada, London, MCMC, Monte Carlo integration, Monte Carlo Statistical Methods, Montréal, NIPS, NIPS 2015, probabilistic numerics, Robert Charlebois, scalability on December 9, 2015 by xi'an

**I** have spent the day and more completing and compiling slides for my contrapuntal perspective on probabilistic numerics, back in Montréal, for the NIPS 2015 workshop of December 11 on this theme. As I presume the kind invitation by the organisers was connected with my somewhat critical posts on the topic, I mostly kept to that critical line. The day after, I fly back to London for the CFE (Computational and Financial Econometrics) workshop, somewhat reluctantly as there will be another NIPS workshop that day on scalable Monte Carlo.

I want to see again the long desert

Of streets that never end

Streets that run to the very end of winter

Without a trace of footsteps

## Je reviendrai à Montréal [NIPS 2015]

Posted in pictures, Statistics, Travel, University life with tags ABC, ABC in Montréal, Approximate Bayesian computation, Bayesian inference, Canada, MCMC, Monte Carlo integration, Monte Carlo Statistical Methods, Montréal, NIPS, NIPS 2015, probabilistic numerics, Québec, Robert Charlebois, scalability on September 30, 2015 by xi'an

**I** will be back in Montréal, as the song by Robert Charlebois goes, for the NIPS 2015 meeting there, more precisely for the workshops of December 11 and 12, 2015, on probabilistic numerics and ABC [à Montréal]. I was invited by the organisers of the NIPS workshop on probabilistic numerics to give the first talk, presumably to present a contrapuntal perspective on this mix of Bayesian inference with numerical issues, following my somewhat critical posts on the topic. And I also plan to attend some lectures in the (second) NIPS workshop on ABC methods. Which does not leave much free space for yet another workshop on Approximate Bayesian Inference! The day after, while I am flying back to London, there will be a workshop on scalable Monte Carlo. All workshops are calling for contributed papers to be presented during central poster sessions. To be submitted to abcinmontreal@gmail.com and to probnum@gmail.com and to aabi2015. Before October 16.

Funny enough, I got a joking email from Brad, bemoaning my traitorous participation in the workshop on probabilistic numerics because of its “anti-MCMC” agenda, as reflected in the summary:

“Integration is the central numerical operation required for Bayesian machine learning (in the form of marginalization and conditioning). Sampling algorithms still abound in this area, although it has long been known that Monte Carlo methods are fundamentally sub-optimal. The challenges for the development of better performing integration methods are mostly algorithmic. Moreover, recent algorithms have begun to outperform MCMC and its siblings, in wall-clock time, on realistic problems from machine learning.

The workshop will review the existing, by now quite strong, theoretical case against the use of random numbers for integration, discuss recent algorithmic developments, relationships between conceptual approaches, and highlight central research challenges going forward.”
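As an aside, the rate claim in that summary is easy to check on a smooth one-dimensional integrand: plain Monte Carlo converges at the O(n^{-1/2}) rate, while even the humble trapezoidal rule achieves O(n^{-2}) (a toy illustration of the argument, not part of the workshop material):

```python
import math
import random

f = math.exp                 # smooth integrand on [0, 1]
TRUE = math.e - 1.0          # exact value of the integral of e^x over [0, 1]
n = 1024

# Monte Carlo estimate: average of f at n uniform draws, error O(n^{-1/2})
random.seed(42)
mc = sum(f(random.random()) for _ in range(n)) / n

# Trapezoidal rule on a regular grid, error O(n^{-2}) for smooth f
h = 1.0 / n
trap = h * (0.5 * f(0.0) + sum(f(i * h) for i in range(1, n)) + 0.5 * f(1.0))

mc_err, trap_err = abs(mc - TRUE), abs(trap - TRUE)
```

On this example the deterministic rule wins by orders of magnitude, the standard rejoinder being that this advantage degrades quickly with dimension and with non-smooth integrands.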

Position that I hope to water down in my talk! In any case,

I want to see again the long desert

Of streets that never end

Streets that run to the very end of winter

Without a trace of footsteps

## accelerating Metropolis-Hastings algorithms by delayed acceptance

Posted in Books, Statistics, University life with tags Andrew Gelman, Hamiltonian Monte Carlo, MALA, Metropolis-Hastings algorithm, Montréal, NIPS, Peskun ordering, prefetching, University of Warwick on March 5, 2015 by xi'an

**M**arco Banterle, Clara Grazian, Anthony Lee, and myself just arXived our paper “Accelerating Metropolis-Hastings algorithms by delayed acceptance“, which is a major revision and upgrade of our “Delayed acceptance with prefetching” paper of last June. Paper that we submitted at the last minute to NIPS, but which did not get accepted. The difference with this earlier version is the inclusion of convergence results, in particular that, while the original Metropolis-Hastings algorithm dominates the delayed version in Peskun ordering, the latter can improve upon the original for an appropriate choice of the early-stage acceptance step. We thus included a new section on optimising the design of the delayed step, by picking the optimal scaling à la Roberts, Gelman and Gilks (1997) in the first step and by proposing a ranking of the factors in the Metropolis-Hastings acceptance ratio that speeds up the algorithm. The algorithm thus becomes adaptive. Compared with the earlier version, we have not pursued the second thread of prefetching as much, simply mentioning that prefetching and delayed acceptance could be merged. We have also included a section on the alternative suggested by Philip Nutzman on the ‘Og of using a growing ratio rather than individual terms, the advantage being that the probability of acceptance stabilises as the number of terms grows, the drawback being that expensive terms are not always computed last. In addition to our logistic and mixture examples, we also study the MALA algorithm in this version, since we can postpone computing the ratio of the proposals until the second step. The gain observed in one experiment is of the order of a ten-fold increase in efficiency.
By comparison, and in answer to one comment on Andrew’s blog, we did not cover the HMC algorithm, since the preliminary acceptance step would require the construction of a proxy to the acceptance ratio, in order to avoid computing a costly number of derivatives in the discretised Hamiltonian integration.
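For readers unfamiliar with the scheme, here is a minimal sketch of the two-stage acceptance step, with a symmetric random-walk proposal and a made-up cheap/expensive factorisation of the target (my own toy version, not the adaptive algorithm of the paper):

```python
import math
import random

def delayed_mh(log_f1, log_f2, x0, n_iter, scale=1.0, rng=random):
    """Metropolis-Hastings for a target f1(x)*f2(x), with f1 cheap and f2
    expensive: proposals are screened on the f1 ratio first, so the costly
    f2 ratio is only evaluated for first-stage survivors."""
    x, chain = x0, []
    for _ in range(n_iter):
        y = x + scale * rng.gauss(0.0, 1.0)
        # stage one: cheap ratio; most rejections happen here, at low cost
        if math.log(rng.random()) < log_f1(y) - log_f1(x):
            # stage two: expensive ratio, evaluated far less often
            if math.log(rng.random()) < log_f2(y) - log_f2(x):
                x = y
        chain.append(x)
    return chain

# toy split of a standard normal target: each factor carries half the log-density,
# the second standing in for an expensive likelihood
cheap = lambda x: -0.25 * x * x
costly = lambda x: -0.25 * x * x
random.seed(7)
draws = delayed_mh(cheap, costly, 0.0, 20000, scale=2.0)
```

The overall acceptance probability min(1, r₁)·min(1, r₂) is never larger than the Metropolis-Hastings min(1, r₁r₂), which is exactly the Peskun domination mentioned above; the trade-off is that rejected proposals rarely pay for the expensive factor.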

## Bayesian optimization for likelihood-free inference of simulator-based statistical models [guest post]

Posted in Books, Statistics, University life with tags ABC, arXiv, Dennis Prangle, dimension curse, Gaussian processes, guest post, NIPS, nonparametric probability density estimation on February 17, 2015 by xi'an

[The following comments are from Dennis Prangle, about the second half of the paper by Gutmann and Corander that I commented on last week.]

**H**ere are some comments on the paper of Gutmann and Corander. My brief skim through it concentrated on the second half of the paper, the applied methodology, so my comments should be quite complementary to Christian’s on the theoretical part!

ABC algorithms generally follow the template of proposing parameter values, simulating datasets and accepting/rejecting/weighting the results based on similarity to the observations. The output is a Monte Carlo sample from a target distribution, an approximation to the posterior. The most naive proposal distribution for the parameters is simply the prior, but this is inefficient if the prior is highly diffuse compared to the posterior. MCMC and SMC methods can be used to provide better proposal distributions. Nevertheless they often still seem quite inefficient, requiring repeated simulations in parts of parameter space which have already been well explored.
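The template in its most naive prior-as-proposal form fits in a few lines (a generic sketch with a made-up Gaussian-mean example, taking the sample mean as summary statistic; not any specific paper's algorithm):

```python
import random
import statistics

def abc_rejection(s_obs, prior_draw, simulate, summary, eps, n_prop, rng=random):
    """Naive ABC: draw theta from the prior, simulate a dataset, and keep
    theta whenever the simulated summary lands within eps of the observed one."""
    kept = []
    for _ in range(n_prop):
        theta = prior_draw(rng)
        s_sim = summary(simulate(theta, rng))
        if abs(s_sim - s_obs) < eps:
            kept.append(theta)
    return kept

# made-up example: infer the mean of a unit-variance Gaussian
random.seed(3)
obs = [random.gauss(1.0, 1.0) for _ in range(50)]
post = abc_rejection(
    s_obs=statistics.fmean(obs),
    prior_draw=lambda rng: rng.uniform(-5.0, 5.0),   # diffuse prior
    simulate=lambda t, rng: [rng.gauss(t, 1.0) for _ in range(50)],
    summary=statistics.fmean,
    eps=0.2,
    n_prop=20000,
)
```

With a prior this diffuse the acceptance rate is a few percent at best, which is precisely the inefficiency that the MCMC, SMC and surrogate-model variants discussed here try to cure.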

The strategy of this paper is to instead attempt to fit a non-parametric model to the target distribution (or in fact to a slight variation of it). Hopefully this will require many fewer simulations. This approach is quite similar to Richard Wilkinson’s recent paper. Richard fitted a Gaussian process to the ABC analogue of the log-likelihood. Gutmann and Corander introduce two main novelties:

- They model the expected discrepancy (i.e. distance) Δ_θ between the simulated and observed summary statistics. This is then transformed to estimate the likelihood. This is in contrast to Richard, who transformed the discrepancy before modelling: the standard ABC approach of weighting the discrepancy depending on how close to 0 it is. The drawback of the latter approach is that it requires picking a tuning parameter (the ABC acceptance threshold or bandwidth) in advance of the algorithm. The new approach still requires a tuning parameter, but its choice can be delayed until the transformation is performed.
- They generate the θ values on-line using “Bayesian optimisation”. The idea is to pick θ to concentrate on the region near the minimum of the objective function, and also to reduce uncertainty in the Gaussian process. Thus well-explored regions can usually be neglected. This is in contrast to Richard, who chose θs using a space-filling design prior to performing any simulations.
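Novelty (2) can be made concrete with a toy stand-in: a pure-Python GP fitted to discrepancy evaluations, with the next θ picked by a lower-confidence-bound rule (my own illustrative acquisition criterion, not the one in the paper's Equation (45)):

```python
import math

def rbf(a, b, ell=1.0):
    return math.exp(-0.5 * (a - b) ** 2 / ell ** 2)

def solve(A, b):
    """Naive Gaussian elimination with partial pivoting (fine for tiny systems)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(xs, ys, x, noise=1e-6):
    """Posterior mean and variance at x of a zero-mean GP with RBF kernel."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, solve(K, ys)))
    var = rbf(x, x) - sum(k * w for k, w in zip(k_star, solve(K, k_star)))
    return mean, max(var, 1e-12)

def next_theta(xs, ys, grid, beta=2.0):
    """Lower confidence bound: favour low predicted discrepancy or high uncertainty."""
    def lcb(t):
        m, v = gp_predict(xs, ys, t)
        return m - beta * math.sqrt(v)
    return min(grid, key=lcb)

# toy deterministic discrepancy with its minimum at theta = 1
disc = lambda t: (t - 1.0) ** 2
xs = [-3.0, 0.0, 3.0]
ys = [disc(t) for t in xs]
grid = [-3.0 + 0.25 * i for i in range(25)]
for _ in range(10):
    t = next_theta(xs, ys, grid)     # acquire where the GP is low or uncertain
    xs.append(t)
    ys.append(disc(t))
```

The evaluations quickly concentrate near the minimum, which is the point of the approach: well-explored, clearly poor regions of parameter space stop receiving simulations.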

I didn’t read the paper’s theory closely enough to decide whether (1) is a good idea. Certainly the results for the paper’s examples look convincing. Also, one issue with Richard’s approach was that because the log-likelihood varied over such a wide range of magnitudes, he needed to fit several “waves” of GPs. It would be nice to know whether modelling the discrepancy instead removes this problem, or whether a single GP is still sometimes an insufficiently flexible model.

Novelty (2) is a very nice and natural approach to take here. I did wonder why the particular criterion in Equation (45) was used to decide on the next θ. Does this correspond to optimising some information theoretic quantity? Other practical questions were whether it’s possible to parallelise the method (I seem to remember talking to Michael Gutmann about this at NIPS but can’t remember his answer!), and how well the approach scales up with the dimension of the parameters.

## partly virtual meetings

Posted in Kids, pictures, Statistics, Travel, University life with tags ABC in Montréal, Benidorm, carbon impact, flight, Montréal, NIPS, online meeting, Statistics conference, travel support, world meeting on December 29, 2014 by xi'an

A few weeks ago, I read in the NYT an article about the American Academy of Religion cancelling its 2021 annual meeting as a sabbatical year, for environmental reasons.

“We could choose to not meet at a huge annual meeting in which we take over a city. Every year, each participant going to the meeting uses a quantum of carbon that is more than considerable. Air travel, staying in hotels, all of this creates a way of living on the Earth that is carbon intensive. It could be otherwise.”

While I am not in the least interested in the conference or in the topics covered by this society, or yet in the benevolent religious activities suggested as a substitute, the notion of cancelling the behemoths that are our national and international academic meetings holds some appeal. I have posted several times on the topic, especially about JSM, and I have no clear and definitive answer to the question. Still, there lies a lack of efficiency on top of the environmental impact that we could and should try to address. As I was thinking of those issues in the past week, I made another of my numerous “carbon footprints” by attending NIPS across the Atlantic for two workshops that ran in parallel with about twenty others. And hence could have taken place in twenty different places. Albeit without the same exciting feeling of constant intellectual simmering. And without the same mix of highly interactive scholars from all over the planet. (Although the ABC in Montréal workshop seemed predominantly European!) Since workshops are in my opinion the most profitable type of meeting, I would like to experiment with a large meeting made of such (focussed and intense) workshops, in such a way that academics would benefit without travelling long distances across the World. One idea would be to have local nodes where a large enough group of researchers could gather to attend video-conferences given from any of the other nodes and to interact locally in terms of discussions and poster presentations. This should even increase the feedback on selected papers, as small groups would more readily engage in discussing and criticising papers than a huge conference room would. If we could build a World-wide web (!) of such nodes, we could then dream of a non-stop conference, with no central node, no gigantic conference centre, no terrifying beach resort…