I am back from Montréal [NIPS 2015]

After the day trip to Montréal, a quick stop in Paris, and another one in London, I thought back on the probabilistic integration workshop of last week. First, I had a very good time discussing with people there, with no (apparent) adverse reaction to my talk on “estimating constants”. Second, I finally realised what Mark Berliner meant by saying that he was a Bayesian if not a statistician, in a discussion we had in the early 1990’s at Cornell. Third, I became [moderately] more open to the highly structured spaces used in the approaches discussed by François-Xavier Briol, Arthur Gretton, Roman Garnett, and Francis Bach. The (RKHS) functional assumptions made in those approaches allow for faster and sharper convergence rates, the question being what happens when the assumptions do not hold. A comment that applies just as well to the impact of taking a Gaussian process as the prior on the integrand in Bayesian quadrature.
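To make the Bayesian quadrature setting concrete, here is a minimal sketch, assuming a Gaussian kernel, a standard Gaussian target measure, and plain Monte Carlo nodes; the kernel, length-scale, and test integrand are illustrative choices of mine, not taken from any of the talks. The closed-form kernel mean below is standard for the Gaussian kernel / Gaussian measure pair.

```python
import numpy as np

# Minimal Bayesian quadrature sketch: integrate f against N(0,1) under a
# GP prior on f with a Gaussian (RBF) kernel. Illustrative values only.
ell = 1.0                                  # kernel length-scale (assumed)
kern = lambda a, b: np.exp(-(a[:, None] - b[None, :])**2 / (2 * ell**2))

rng = np.random.default_rng(0)
x = rng.normal(size=20)                    # quadrature nodes (plain MC draws here)

# Kernel mean z_i = E[kern(X, x_i)] for X ~ N(0,1): closed form for this pair.
z = ell / np.sqrt(ell**2 + 1) * np.exp(-x**2 / (2 * (ell**2 + 1)))

K = kern(x, x) + 1e-10 * np.eye(len(x))    # jitter for numerical stability
w = np.linalg.solve(K, z)                  # BQ weights: w = K^{-1} z

f = lambda t: np.sin(t)**2                 # test integrand
print(w @ f(x))                            # BQ estimate of E[f(X)]
print((1 - np.exp(-2)) / 2)                # exact value, about 0.4323
```

The contrast with plain Monte Carlo is that the weights w depend on the kernel, so when the integrand actually lies in (or near) the assumed RKHS the estimate converges much faster than the n^{-1/2} Monte Carlo rate; when it does not, those guarantees are off, which is the “what happens when the assumptions do not hold” question above.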

François-Xavier presented the recently arXived probabilistic integration paper that Andrew discussed a week ago. (While I obviously have no relevant remark to make about the maths in this paper, I wonder at the difficulty and cost of sequentially selecting the states behind the quadrature. Which presumably is covered in the earlier Frank-Wolfe paper by the same team.) Another discussion with Arthur clarified a wee bit how an RKHS can be perceived in practice, with a lingering question on the size of the RKHS within the entire space of functions and, more importantly, the significant impact of the kernel representation on the resulting approximations. Anyway, those are exciting times, when different branches of numerics, probability, and statistics come together to improve upon existing techniques, and I am once again glad I could take part in this workshop (although sorry I had to miss the ABC workshop that took place in parallel!).
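As to the cost of sequentially selecting those states, a naive greedy version already shows where the expense lies: at each step, pick the candidate that most reduces the posterior variance of the integral, a quantity that depends on the kernel and the points but not on the integrand. This is a hypothetical stand-in for, not a transcription of, the Frank-Wolfe construction in the paper; same Gaussian kernel and Gaussian target as in the sketch above.

```python
import numpy as np

# Greedy sequential design for Bayesian quadrature: at each step, add the
# candidate point that most reduces the posterior variance of the integral.
# Illustrative stand-in for the Frank-Wolfe scheme, not the paper's method.
ell = 1.0
kern = lambda a, b: np.exp(-(a[:, None] - b[None, :])**2 / (2 * ell**2))
kmean = lambda a: ell / np.sqrt(ell**2 + 1) * np.exp(-a**2 / (2 * (ell**2 + 1)))

def explained(pts):
    """z^T K^{-1} z: the share of the prior variance removed by pts."""
    K = kern(pts, pts) + 1e-10 * np.eye(len(pts))
    z = kmean(pts)
    return z @ np.linalg.solve(K, z)

rng = np.random.default_rng(1)
pool = rng.normal(size=200)            # candidate states
chosen = np.array([])                  # design, grown one point at a time

for _ in range(10):
    # Each step scans the whole pool and redoes an O(n^3) solve per
    # candidate: this is exactly the sequential cost wondered about above.
    scores = [explained(np.append(chosen, c)) for c in pool]
    chosen = np.append(chosen, pool[int(np.argmax(scores))])

print(np.sort(chosen))                 # selected points spread out over N(0,1)
```

Rank-one updates of a Cholesky factor would bring the per-candidate cost down from O(n^3) to O(n^2), but the scan over candidates at every step remains.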

2 Responses to “I am back from Montréal [NIPS 2015]”

  1. Dan Simpson Says:

    I’m not sure if this really helps, but the RKHS, H, of a GP is large enough to capture the mean (prior or posterior), yet not large enough to capture any of the variation: a sample path x lands in H with probability zero (Pr[x \in H] = 0).

    But even though H has zero prior mass, it “fills” the space in the sense that if B is any event with probability epsilon > 0, then Pr[H + B] = 1, with H + B the Minkowski sum (not the union, which would of course keep probability epsilon; this is the Cameron–Martin zero–one law).

    When I visualise it, I think of it as a bit like the set of rational numbers: it’s “everywhere” (dense), but it has too much extra regularity and structure to be able to cover everything. (See the numerical sketch after these comments.)

  2. After thinking a bit about it, I guess the main thing about FW-BQ (Frank-Wolfe Bayesian quadrature) is the following: if you want to actively pick informative points in this setting so as to maximally decrease the variance, you already have to be sure which RKHS your integrand lives in. If you aren’t sure, then BQ doesn’t help whatsoever.
    The recent paper can be applied after the samples have been acquired, which is a nice result to have.
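Dan’s point above can be checked numerically in the archetypal case of Brownian motion on [0,1], whose RKHS (the Cameron–Martin space) consists of the absolutely continuous functions h with squared norm \int h'(t)^2 dt. A sketch, with illustrative mesh sizes:

```python
import numpy as np

# Discretised Cameron-Martin norm sum((dx)^2 / dt) on [0, 1]: bounded for a
# smooth function in H, but growing like n for a Brownian sample path,
# illustrating Pr[x \in H] = 0.
rng = np.random.default_rng(2)
for n in (10**2, 10**3, 10**4):
    dt = 1.0 / n
    bm = rng.normal(scale=np.sqrt(dt), size=n)   # Brownian increments
    smooth = np.full(n, dt)                      # increments of h(t) = t
    print(n, np.sum(bm**2) / dt, np.sum(smooth**2) / dt)
```

The second column grows like n, so the squared norm of the Brownian path diverges as the mesh is refined and the path is not in H, while the third column stays at 1 for every n: exactly the “mean in, variation out” picture of the comment.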

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s