## Je reviendrai à Montréal [NIPS 2015]

**I** will be back in Montréal, as the song by Robert Charlebois goes, for the NIPS 2015 meeting there, more precisely for the workshops of December 11 and 12, 2015, on probabilistic numerics and ABC [à Montréal]. I was invited to give the first talk by the organisers of the NIPS workshop on probabilistic numerics, presumably to present a contrapuntal perspective on this mix of Bayesian inference with numerical issues, following my somewhat critical posts on the topic. And I also plan to attend some lectures in the (second) NIPS workshop on ABC methods. Which does not leave much free space for yet another workshop on Approximate Bayesian Inference! The day after, while I am flying back to London, there will be a workshop on scalable Monte Carlo. All workshops are calling for contributed papers to be presented during central poster sessions. Submissions should be sent to abcinmontreal@gmail.com, probnum@gmail.com, and aabi2015, respectively, before October 16.

Funny enough, I got a joking email from Brad, bemoaning my traitorous participation in the workshop on probabilistic numerics because of its “anti-MCMC” agenda, reflected in the summary:

“Integration is the central numerical operation required for Bayesian machine learning (in the form of marginalization and conditioning). Sampling algorithms still abound in this area, although it has long been known that Monte Carlo methods are fundamentally sub-optimal. The challenges for the development of better performing integration methods are mostly algorithmic. Moreover, recent algorithms have begun to outperform MCMC and its siblings, in wall-clock time, on realistic problems from machine learning.

The workshop will review the existing, by now quite strong, theoretical case against the use of random numbers for integration, discuss recent algorithmic developments, relationships between conceptual approaches, and highlight central research challenges going forward.”

A position that I hope to water down in my talk! In any case,

Je veux revoir le long désert

Des rues qui n’en finissent pas

Qui vont jusqu’au bout de l’hiver

Sans qu’il y ait trace de pas

October 5, 2015 at 4:39 pm

The probnum folks are well aware of quasi-MC and have also used it in a couple of papers, so this is not about pretending that determinism in general is in any way novel.

And why focus so much now on the specific function class/prior informativeness when *in practice* qMC often works nicely (after taking “some” care, independent of certain characteristics)? Aside from the methods explicitly exploiting smoothness, of course. The devil is (also) in the implicit constant, not just the asymptotic rate of convergence.

Anyway, at the moment I find this perspective intriguing, irrespective of whether it eventually turns out to be really oversold, or paradigm-changing (sorry for the p-word).

PS: Frances Kuo is a star, but there are far lesser-known brilliant minds in qMC too.

PPS: there’s also some quasi-MCMC work being done, although slowly.

October 2, 2015 at 11:59 am

I do appreciate the difference between a GP emulator and a smoothness assumption, even though I am sure you know much more than I do about this! In fact, one of the aims of probabilistic numerics is to make use of the fact that we have a probability distribution over functions to approximate our uncertainty over the numerical solution. Setting those priors properly is therefore easily the biggest challenge of the method as currently available (at least in my opinion), and I am definitely not claiming the priors being used are noninformative!

It would definitely be great if you wrote a short 4p mini-paper version of your opinion on this and submitted it to the workshop ;).

October 2, 2015 at 12:01 pm

I second François-Xavier’s suggestion, Dan!

October 5, 2015 at 1:59 am

Oh lord no!

Firstly, pretty much the totality of my thoughts on this is contained here or in my comment on X’s post on the Girolami et al. Proc. Royal Soc. A paper. I don’t like it. I think it’s horrifically oversold. And, like anyone else venturing an opinion “early” in something’s development, I’m perfectly happy to be shown to be wrong.

Secondly, I do my best not to spend time on professional things I don’t enjoy. Writing a four page screed against probabilistic numerics (or a four page measured reflection on why I’m unconvinced about the role of Bayesian analysis in this field) is about as far from “things I find fun” as I can imagine venturing professionally. And dear god going to a workshop around this topic would be, for me, a joyless drudge. (Obviously, those who find it fun should go crazy. I’m a decent statistician and a middling numericist who has his mind on a different set of problems. Not by accident)

Now, if the workshop organisers had (as Sondheim’s version of Georges Seurat suggested*) a link to their tradition and were instead organising a workshop on high dimensional integration and approximation (same problem, important distinction) that involved statisticians, machine learners, functional analysts, approximation theorists and numerical analysts**, then I would be there in a heartbeat. (I still wouldn’t submit because I have nothing not obvious to say on this topic) This is not that workshop.

*it’s in Montreal. Some North American Francophilia was in order.

** if anyone is looking for one, Frances Kuo from UNSW has produced some seriously interesting, if under-read, work on high dimensional integration for statistical problems.

September 30, 2015 at 6:50 pm

Of the many many things I think about that workshop description (other than the slightly weird thing that whoever wrote it appears to firmly believe no one has ever considered the problem of non-random numerical integration before), I was always pretty certain that the 1/sqrt(n) rate was unbeatable if your only assumption is that the integrand is L^2…

Or at least all of the things that I’ve seen that achieve better rates put much stricter controls on the space the integrand belongs to.
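[As an editorial aside, not part of the original comment: the trade-off Daniel describes is easy to see numerically. A plain Monte Carlo estimate converges at O(n^{-1/2}) for any square-integrable integrand, while a scrambled Sobol' quasi-MC rule converges near O(n^{-1}), but only by exploiting exactly the stricter assumptions he mentions (bounded variation / smoothness). A minimal sketch, using `exp` on [0, 1] as an assumed test integrand:]

```python
import numpy as np
from scipy.stats import qmc

# Smooth test integrand on [0, 1]; the true integral is e - 1.
f = lambda x: np.exp(x)
truth = np.e - 1.0
n = 4096  # 2**12 points for both estimators

# Plain Monte Carlo: O(n^{-1/2}) error for any L^2 integrand.
rng = np.random.default_rng(0)
mc_err = abs(f(rng.random(n)).mean() - truth)

# Scrambled Sobol' QMC: near O(n^{-1}) error, but only because the
# integrand lies in a much stricter function class (smooth / bounded
# variation) -- exactly the extra control referred to above.
sobol = qmc.Sobol(d=1, scramble=True, seed=0)
qmc_err = abs(f(sobol.random_base2(m=12).ravel()).mean() - truth)

print(f"MC error:  {mc_err:.2e}")
print(f"QMC error: {qmc_err:.2e}")
```

On this smooth 1-D example the QMC error is orders of magnitude smaller at the same budget; for a merely L² integrand that advantage evaporates, which is the point of the comment.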

September 30, 2015 at 7:23 pm

I am open to your detailed comments as I am invited as the skeptic spectator, I think!

October 1, 2015 at 2:13 pm

Daniel, I can confirm that all of the existing probabilistic integration methods do make some kind of assumption about which space the integrand belongs to (usually in terms of an RKHS). In the workshop description, this is referred to as a “prior assumption” (in ML, people often look at this problem from a Bayesian perspective, by putting a Gaussian process emulator on the integrand).

I think the point here is that people in ML tend to use Monte Carlo as a default tool without necessarily considering any information they have about the integrand. One of the main reasons is probably that Monte Carlo methods have been studied for a much longer time, so people feel safer using them. However, one could often use more elaborate tools that incorporate this prior information. The workshop therefore aims at discussing how to develop these methods and how much one could gain from using the additional knowledge.
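[Editorial illustration, not part of FX's comment: the “GP emulator on the integrand” idea can be sketched as Bayesian quadrature. Put a GP prior with a squared-exponential kernel on f, condition on a handful of evaluations, and read off the posterior mean of the integral against the uniform measure on [0, 1]. The kernel and its lengthscale are assumed choices here, and they *are* the prior assumption on the function space being discussed:]

```python
import numpy as np
from scipy.special import erf

# Bayesian quadrature sketch under an assumed squared-exponential
# kernel with lengthscale ell; the posterior mean of the integral is
# z^T K^{-1} f(X), where z_i = \int_0^1 k(t, x_i) dt.
ell = 0.3
k = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ell**2))

def kernel_mean(x):
    # Closed form of z_i = \int_0^1 exp(-(t - x_i)^2 / (2 ell^2)) dt
    s = ell * np.sqrt(2.0)
    return ell * np.sqrt(np.pi / 2) * (erf((1 - x) / s) + erf(x / s))

x = np.linspace(0.0, 1.0, 10)            # design points
fx = np.exp(x)                           # integrand evaluations
K = k(x, x) + 1e-8 * np.eye(len(x))      # jitter for numerical stability
w = np.linalg.solve(K, kernel_mean(x))   # BQ weights K^{-1} z
bq_estimate = w @ fx                     # posterior mean of the integral

print(bq_estimate, np.e - 1.0)           # estimate vs truth e - 1
```

With only ten evaluations the posterior mean is already very close to e − 1, precisely because the SE prior encodes strong smoothness information about the integrand; a GP posterior variance over the integral is available from the same quantities.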

I am looking forward to your highly critical talk, Christian!

FX

October 1, 2015 at 11:26 pm

FX – A Gaussian process emulator is *NOT* the same thing as a smoothness assumption. It is A LOT more informative (especially if you are doing anything more than MAP estimation).

October 1, 2015 at 11:50 pm

On the off chance that someone who doesn’t work with nonparametric models for a loving reads this, I feel I should be more expansive.

Noninformative priors on functions don’t exist.

This isn’t like Santa or a uniform prior on the real line. Neither of these strictly exist, but you can produce a decent facsimile of them.

Non-informative priors on function spaces don’t exist (except in the boring case where you only consider a finite dimensional function space)

So any assumption of a prior on a set of functions is EXTREMELY informative and not nearly like a smoothness assumption.

October 2, 2015 at 7:45 am

Noninformative priors do not exist full stop. For normal means or for functions. Obviously, the larger the space, the more concentrated the prior. And hence the more “informative”. Btw, you mean “for a living”, right?!

October 1, 2015 at 11:55 pm

That probably makes more sense if you replace “facsimile” with “simulacrum”. That’s what I get for being fancy…

October 2, 2015 at 12:36 pm

This has happened to me before. For some reason my phone really likes correcting words to “love”.

The most awkward was when I was sending someone an ad for a (still open) lectureship in Bath and I wanted to write “In case you’re looking for a foreign move…” and my phone decided that what I really wanted to write was “In case you’re looking for a foreign love…”. It was *very* awkward!