Archive for BNP11

off to BNP!

Posted in Mountains, Statistics, Travel, University life on October 23, 2022 by xi'an

Today I am off to Chile, to attend the 13th Bayesian non-parametric conference, BNP13. Which follows BNP11 that took place in Paris. And BNP12, which took place in Oxford (just prior to O’Bayes in Warwick, which in retrospect was the wrong strategy as most attendees did not extend their stay…). The programme is quite diverse and exciting, plus involving a lot of friends I had not seen for quite a while (as they weren’t at ISBA in Montréal). And the location is fabulous, sitting by Lake Llanquihue [whose waters may prove too cold!] and facing the [tantalizing] volcán Osorno (2652m). Which was observed by Darwin on his second trip, during an 1835 eruption. (The last eruption was in 1869, hopefully staying the same for the whole week!)

conference carbon footprint

Posted in Kids, pictures, Running, Travel, University life, Wines on August 1, 2017 by xi'an

As a local organiser of the recent BNP 11 conference in Paris, and hence involved in setting up and cleaning up coffee breaks and [now famous] wine&cheese poster sessions, I was rather shocked by the amount of waste generated by those events, albeit aware of the importance of the social exchanges they induced… And thus got to wonder how the impact of those conference events could be reduced. One solution is the drastic one, namely to provide exactly nothing at all during the breaks between talks and expect anyone hungry or thirsty enough to bring one’s own food or drink. Another one, as suggested by my daughter at the dinner table, is to provide Ecocups, namely reusable plastic glasses that can be given to all participants at the beginning of the conference. Or sold (or rented) to those who have not brought their own mug or bottle. (Of course, this may be a poor idea in that manufacturing and shipping a hard-plastic glass that most likely will be discarded after a few days may be more damaging than producing the equivalent number of “disposable” thin plastic glasses. And in the end all this agitation is peanuts compared with the impact of flying participants to the conference. For which I have no handy solution… As biking to the conference location is a privilege very few can enjoy.) Still, and even though this puts another stone in the already rocky organisers’ garden, I wish we could adopt more positive policies at the meetings we organise and sponsor.

Hamiltonian MC on discrete spaces [a reply from the authors]

Posted in Books, pictures, Statistics, University life on July 8, 2017 by xi'an

Q. Why not embed discrete parameters so that the resulting surrogate density function is smooth?

A. This is only possible in very special settings. Let’s say we have a target distribution π(θ, n), where θ is continuous and n is discrete. To construct a surrogate smooth density, we would need to somehow smoothly interpolate a collection of functions f_n(θ) = π(θ, n) for n = 1, 2, …. It is not clear to us how we can achieve this in a general and tractable way.

Q. How to generalize the algorithm to a more complex parameter space?

A. We provide a clear solution to dealing with a discontinuous target density defined on a continuous parameter space. We agree, however, that there remains the question of whether and how a more complex parameter space can be embedded into a continuous space. This certainly deserves further investigation. For example, a binary tree can be embedded into the interval [0,1] through the dyadic expansion of a real number.
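One possible reading of the dyadic embedding mentioned above, as a minimal sketch: a node of the binary tree is identified with its root-to-node path of left/right moves, and that path is read as the binary digits of a number in [0,1). The function names `path_to_point` and `point_to_path` are hypothetical and not taken from the paper, and the encoding only separates paths of a common fixed depth (the paths [1] and [1, 0] both map to 0.5).

```python
def path_to_point(bits):
    """Map a root-to-node path (0 = left, 1 = right) to a number in [0, 1).

    The k-th move contributes bit / 2**(k + 1), i.e. the path is read
    as a dyadic (binary) expansion.
    """
    return sum(b / 2 ** (k + 1) for k, b in enumerate(bits))


def point_to_path(x, depth):
    """Recover the first `depth` binary digits of x in [0, 1)."""
    bits = []
    for _ in range(depth):
        x *= 2
        bit = int(x)      # integer part is the next binary digit
        bits.append(bit)
        x -= bit          # keep the fractional part for the next digit
    return bits


# Example: the path right, left, right at depth 3.
point = path_to_point([1, 0, 1])
path = point_to_path(point, 3)
```

Whether such an embedding leads to a target density that a sampler can explore efficiently is a separate question, which this sketch does not address.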

Q. Physical intuition of discontinuous Hamiltonian dynamics is not clear from a theory of differential measure-valued equation and selection principle.

A. Hamiltonian dynamics with a discontinuous potential energy has long been used by physicists as a natural model for some physical phenomena (also known as “impulsive systems”). The main difference from a smooth system is that the gradient becomes a “delta function” at the discontinuity, causing an instantaneous “push” toward the direction of lower potential energy. A theory of differential measure-valued equation / inclusion and selection principle is only a mathematical formalization of such physical systems.

Q. (A special case of) DHMC looks like taking multiple Gibbs steps?

A. The crucial difference from Metropolis-within-Gibbs is the presence of momentum in DHMC, which helps guide a Markov chain toward a high density region.

The effect of momentum is evident in the Jolly-Seber example of Section 5.1, where DHMC shows 60-fold efficiency improvement over a sampler “NUTS-Gibbs” based on conditional updates. Also, a direct comparison of DHMC and Metropolis-within-Gibbs can be found in Section S4.1 where DHMC, thanks to the momentum, is about 7 times more efficient than Metropolis-within-Gibbs (with optimal proposal variances).

Q. Unlike HMC, DHMC does not seem to use structural information about the parameter space and local information about the target density?

A. It does. After all, other than the use of Laplace momentum and discontinuity in the target density, DHMC is based on the same principle as HMC — simulating Hamiltonian dynamics to generate a proposal.

The confusion is perhaps due to the fact that the coordinate-wise integrator of DHMC does not require gradients. The gradient of the log density — which may be a “delta” function at discontinuities — plays a clear role if you look at Hamilton’s equations Eq (10) corresponding to a Laplace momentum. It’s just that, thanks to a property of a Laplace momentum and conservation of energy principle, we can approximate the exact dynamics without ever computing the gradient. This is in fact a remarkable property of a Laplace momentum and our coordinate-wise integrator.
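To illustrate how a coordinate-wise update can avoid gradients, here is a minimal sketch written from the description above rather than taken from the authors' code. It assumes a unit mass for every coordinate, so that the kinetic energy is the sum of |p_i|, and the names `coordinate_update` and `step_potential` are hypothetical. The update only needs the difference in potential energy U (minus the log density) between the current and the proposed position: if the momentum suffices to pay for that difference, the coordinate moves and the momentum is reduced accordingly, otherwise the momentum is flipped.

```python
def coordinate_update(theta, p, i, eps, U):
    """One gradient-free update of coordinate i under a Laplace momentum.

    Assumes unit mass, i.e. kinetic energy sum(|p_j|). U is the potential
    energy and may be discontinuous (or infinite).
    """
    sign = 1.0 if p[i] >= 0 else -1.0
    proposal = list(theta)
    proposal[i] = theta[i] + eps * sign      # velocity is sign(p_i)
    delta_u = U(proposal) - U(theta)         # energy cost of the move
    new_p = list(p)
    if abs(p[i]) > delta_u:
        # Enough kinetic energy: move and pay delta_u out of |p_i|,
        # so that U + |p_i| is unchanged.
        new_p[i] = p[i] - sign * delta_u
        return proposal, new_p
    # Not enough kinetic energy: stay put and reverse the momentum.
    new_p[i] = -p[i]
    return list(theta), new_p


def step_potential(theta):
    """Toy discontinuous potential: a jump of height 2 at theta[0] = 1."""
    return 0.0 if theta[0] < 1.0 else 2.0


# Crossing the jump with enough momentum, then bouncing off it without.
crossed = coordinate_update([0.5], [3.0], 0, 1.0, step_potential)
bounced = coordinate_update([0.5], [1.0], 0, 1.0, step_potential)
```

In both cases the total energy U(θ) + |p| is the same before and after the update, which is the conservation property the answer refers to.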

Hamiltonian MC on discrete spaces

Posted in Statistics, Travel, University life on July 3, 2017 by xi'an

Following a lively discussion with Akihiko Nishimura during a BNP11 poster session last Tuesday, I took the opportunity of the flight to Montréal to read through the arXived paper (written jointly with David Dunson and Jianfeng Lu). The issue is thus one of handling discrete valued parameters in Hamiltonian Monte Carlo. The basic “trick” in handling this complexity consists in turning the discrete support into a continuous one via the inclusion of an auxiliary continuous variable whose discretisation is the discrete parameter, hence resembling to some extent the slice sampler. This removes the discreteness blockage but creates another difficulty, namely handling a discontinuous target density. (I idly wonder why the trick cannot be iterated to second or higher order so as to achieve the right amount of smoothness. Of course, the maths behind would be less cool!) The extension of the Hamiltonian to this setting by a convolution is a trick I had not seen since the derivation of the Central Limit Theorem during Neveu’s course at Polytechnique. What I find most exciting in the resolution is the move from a Gaussian momentum to a Laplace momentum, for the reason that I always wondered at alternatives [without trying anything myself!]. The Laplace version is indeed most appropriate here in that it avoids a computation of all discontinuity points and associated values along a trajectory. Since the moves are done component-wise, the method has a Metropolis-within-Gibbs flavour, which actually happens to be a special case. What is also striking is that the approach is both rejection-free and exact, provided ergodicity occurs, which is the case when the stepsize is random.
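The embedding trick can be sketched in a few lines, under assumptions of my own choosing: unit-width bins, so that the discrete parameter is the integer part of the auxiliary variable, and a Poisson distribution as the discrete target. The paper may well use different bins, and the names `poisson_pmf`, `embedded_density` and `discretise` are hypothetical. The resulting density on the real line is piecewise constant, which is exactly the discontinuous target mentioned above.

```python
import math


def poisson_pmf(n, rate=2.5):
    """Toy discrete target: Poisson probability mass at n >= 0."""
    return math.exp(-rate) * rate ** n / math.factorial(n)


def discretise(x):
    """Recover the discrete parameter from the auxiliary variable."""
    return math.floor(x)


def embedded_density(x, pmf=poisson_pmf):
    """Density of the continuous auxiliary variable x.

    With unit-width bins [n, n + 1), the density on each bin equals
    pmf(n), so it is piecewise constant and jumps at the integers.
    """
    if x < 0:
        return 0.0
    return pmf(discretise(x))


# The density is flat within a bin and jumps when crossing an integer.
inside_bin = (embedded_density(3.1), embedded_density(3.9))
across_jump = (embedded_density(0.999), embedded_density(1.0))
```

Since each bin has width one, the embedded density integrates to the total mass of the pmf, and discretising a draw from it returns a draw from the original discrete distribution.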

In addition to this resolution of the discrete parameter problem, the paper presents the further appeal of (re-)running an analysis of the Jolly-Seber capture-recapture model. Where the discrete parameter is the latent number of live animals [or whatever] in the system at any observed time. (Which we cover in Bayesian essentials with R as a neat entry to both dynamic and latent variable models.) I would have liked to see a comparison with the completion approach of Jérôme Dupuis (1995, Biometrika), since I figure the Metropolis version implemented here differs from Jérôme’s. The second example is built on Bissiri et al. (2016) surrogate likelihood (discussed earlier here) and Chopin and Ridgway (2017) catalogue of solutions for not analysing the Pima Indian dataset. (Replaced by another dataset here.)

BimPressioNs [BNP11]

Posted in Books, pictures, Statistics, Travel, University life, Wines on June 29, 2017 by xi'an

While my participation in BNP 11 has so far been more at the janitor level [although not gaining George Casella’s reputation on NPR!] than at the scientific one, since we had decided in favour of the least expensive and unstaffed option for coffee breaks, to keep the registration fees at a minimum [although I would have gladly gone all the way to removing all coffee breaks!, if only because such breaks produce much garbage], I had fairly good chats at the second poster session, in particular around empirical likelihoods and HMC for discrete parameters, the first one based on the general Cressie-Read formulation and the second around the recently arXived paper of Nishimura et al., which I wanted to read. Plus many other good chats full stop, around terrific cheese platters!

This morning, the coffee breaks were much more under control and I managed to enjoy [and chair] the entire session on empirical likelihood, with absolutely fantastic talks from Nils Hjort and Art Owen (the third speaker having gone AWOL, possibly a direct consequence of Trump’s travel ban).
