## yes, another potential satellite to ISBA 2018!

Posted in Statistics on May 22, 2018 by xi'an

On July 2-4, 2018, there will be an ISBA-sponsored workshop on Bayesian non-parametrics for signal and image processing, in Bordeaux, France. This is a wee bit further than Warwick (BAYsm) or Rennes (MCqMC), but still manageable from Edinburgh with direct flights (if on Ryanair). Deadline for free (yes, free!) registration is May 31.

## BimPressioNs [BNP11]

Posted in Books, pictures, Statistics, Travel, University life, Wines on June 29, 2017 by xi'an

While my participation in BNP 11 has so far been more at the janitor level [although not gaining George Casella’s reputation on NPR!] than at the scientific one, since we had decided in favour of the least expensive and unstaffed option for coffee breaks, to keep the registration fees at a minimum [although I would have gladly gone all the way to removing all coffee breaks, if only because such breaks produce much garbage!], I had fairly good chats at the second poster session, in particular around empirical likelihoods and HMC for discrete parameters, the first one based on the general Cressie-Read formulation and the second around the recently arXived paper of Nishimura et al., which I wanted to read. Plus many other good chats full stop, around terrific cheese platters!


This morning, the coffee breaks were much more under control and I managed to enjoy [and chair] the entire session on empirical likelihood, with absolutely fantastic talks from Nils Hjort and Art Owen (the third speaker having gone AWOL, possibly a direct consequence of Trump’s travel ban).

## exciting week[s]

Posted in Mountains, pictures, Running, Statistics on June 27, 2017 by xi'an

The past week was quite exciting, despite the heat wave that hit Paris and kept me from sleeping and running! First, I made a two-day visit to Jean-Michel Marin in Montpellier, where we discussed the potential Peer Community In Computational Statistics (PCI Comput Stats) with the people behind PCI Evol Biol at INRA, hopefully taking shape in the coming months! And went one evening through a few vineyards in Saint Christol with Jean-Michel and Arnaud. Including a long chat with the owner of Domaine Coste Moynier. [Whose domain includes the above parcel with views of Pic Saint-Loup.] And last but not least, some work planning about approximate MCMC.

On top of this, we submitted our paper on ABC with Wasserstein distances [to be arXived in an extended version in the coming weeks] and our revised paper on ABC consistency, thanks to highly constructive comments from the editorial board, which induced a much improved version in my opinion, and we received a very positive return from JCGS for our paper on weak priors for mixtures! Next week should be exciting as well, with BNP 11 taking place in downtown Paris, at École Normale!!!

## Moment conditions and Bayesian nonparametrics

Posted in R, Statistics, University life on August 6, 2015 by xi'an

Luke Bornn, Neil Shephard, and Reza Solgi (all from Harvard) have arXived a pretty interesting paper on simulating targets on a zero measure set. Although it is not initially presented this way, but rather in non-parametric terms as moment conditions

$\mathbb{E}_\theta[g(X,\beta)]=0$

where θ is the parameter of the sampling distribution, constrained by the value of β. (Which also contains quantile regression.) The very problem of simulating under a hard constraint has been bugging me for years and it is hence very exciting to see them come up with a proposal towards solving this difficulty! Even though it is restricted here to observations with a finite support (hence allowing for the use of a parametric Dirichlet prior). One interesting extension (Section 3.6) processed in the paper is the case when the support is unknown, but finite, with some points in the support being unobserved. Maybe connecting with non-parametrics if a prior is added on the number of unobserved points.

The setting of constraining θ via a parameterised moment condition relates to moment-defined econometric models, in a similar spirit to Gallant’s paper I recently discussed, but equally to empirical likelihood, which would then benefit from a fully Bayesian treatment thanks to the approach advocated by the authors.

Despite the zero-measure difficulty, or more exactly the non-linear manifold structure of the parameter space, for instance

$\beta=\log\{\theta/(1-\theta)\}$

the authors manage to define a “projected” [my words] measure on the set of admissible pairs (β,θ). In a sense this is related to the choice of a certain metric, but the so-called Hausdorff reference measure allows for an automated definition of the original prior. It took me a (wee) while to spot (p.7) that the starting point was not an (unconstrained) prior on that (unconstrained) pair (β,θ) but directly on the manifold

$\mathbb{E}_\theta[g(X,\beta)]=0.$

Which makes its construction a difficulty. Even though, as noted in Section 4, all that we need is a prior over θ since the Hausdorff-Jacobian identity defines the “joint”, in a sort of backward way. (This is a wee bit confusing in that β being a transform of θ, all we need is a prior over θ, but we nonetheless end up with a different density on the joint distribution on the pair (β,θ). Any connection with incompatible priors merged together into a consensus prior?) Another question extending the scope of the paper would be to define Jeffreys’ or reference priors in this manifold sense.

The authors also discuss (Section 4.3) the problem I originally thought they were processing, namely starting from an unconstrained pair (β,θ) and its corresponding prior. The projected prior can then be defined based on a version of the original density on the constrained space, but it is definitely arbitrary. In that sense the paper does not address the general problem.

“…traditional simulation algorithms will fail because the prior and the posterior of the model are supported on a zero Lebesgue measure set…” (p.10)

I somewhat resist this presentation through the measure zero set: once the prior is defined on a manifold, the fact that it is a measure zero set in a larger space is moot. Provided one can simulate a proposal over that manifold, e.g., a random walk, absolutely continuous wrt the same dominating measure, and compute or estimate a Metropolis-Hastings ratio of densities against a common measure, one can formally run MCMC on manifolds as well as regular Euclidean spaces. A first and theoretically straightforward (?) solution is to solve the constraint

$\mathbb{E}_\theta[g(X,\beta)]=0$

in β=β(θ). Then the joint prior p(β,θ) can be projected by the Hausdorff projection into p(θ). For instance, in the case of the above logit transform, the projected density is

$p(\theta)=p(\beta,\theta)\,\left\{1+\dfrac{1}{\theta^2(1-\theta)^2}\right\}^{1/2}$
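As a quick sanity check on that factor (my own sketch, not taken from the paper): for the curve θ ↦ (β(θ),θ), the one-dimensional Hausdorff measure is arc length, so the correction is the arc-length element √(1+β′(θ)²) with β′(θ)=1/θ(1-θ). The function names below are hypothetical and only serve the illustration, comparing the closed form with a finite-difference derivative of the logit.

```python
import math


def logit(theta):
    """beta(theta) = log{theta / (1 - theta)}, the transform used above."""
    return math.log(theta / (1.0 - theta))


def hausdorff_factor(theta):
    """Closed-form arc-length element sqrt(1 + 1 / (theta^2 (1 - theta)^2))."""
    return math.sqrt(1.0 + 1.0 / (theta ** 2 * (1.0 - theta) ** 2))


def numeric_factor(theta, h=1e-6):
    """Same quantity, with d(beta)/d(theta) obtained by central differences."""
    dbeta = (logit(theta + h) - logit(theta - h)) / (2.0 * h)
    return math.sqrt(1.0 + dbeta ** 2)


for theta in (0.1, 0.3, 0.5, 0.9):
    print(theta, hausdorff_factor(theta), numeric_factor(theta))
```

At θ=½ the derivative of the logit is 4, so the factor is √17, and it blows up near the boundaries 0 and 1, where the curve becomes nearly parallel to the β axis.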

In practice, the inversion may be too costly and Bornn et al. directly simulate the pair (β,θ) within the manifold capitalising on the fact that the constraint is linear in θ given β. Indeed, in this setting, β is unconstrained and θ can be simulated from a proposal restricted to the hyperplane. Gibbs-like.
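To make the first, "solve the constraint in β=β(θ)" route concrete, here is a minimal sketch under assumptions of my own, not the authors' sampler: a finite support, a mean constraint g(x,β)=x-β, and a plain Dirichlet posterior on θ (uniform prior plus counts), ignoring any Hausdorff correction. The support, the counts and all function names are made up for the illustration. Each draw of θ is completed by β(θ)=Σθᵢxᵢ, so the pair (β,θ) sits on the manifold by construction, and the residual check shows that the constraint is linear in θ for a given β.

```python
import random


def g(x, beta):
    """Moment function for a mean constraint: E_theta[g(X, beta)] = 0."""
    return x - beta


def dirichlet_draw(alphas, rng):
    """One Dirichlet(alphas) draw obtained by normalising Gamma variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [v / total for v in gammas]


def solve_beta(theta, support):
    """Solve sum_i theta_i g(x_i, beta) = 0 in beta; here beta is the mean."""
    return sum(t * x for t, x in zip(theta, support))


def moment_residual(theta, support, beta):
    """Left-hand side of the constraint, linear in theta for a fixed beta."""
    return sum(t * g(x, beta) for t, x in zip(theta, support))


def sample_pairs(counts, support, prior_weight=1.0, n_draws=500, seed=0):
    """Draw theta from the Dirichlet posterior, then complete with beta(theta)."""
    rng = random.Random(seed)
    alphas = [prior_weight + c for c in counts]
    pairs = []
    for _ in range(n_draws):
        theta = dirichlet_draw(alphas, rng)
        pairs.append((solve_beta(theta, support), theta))
    return pairs


# Hypothetical data: four support points and their observed counts.
support = [0.0, 1.0, 2.0, 3.0]
counts = [5, 12, 7, 1]
pairs = sample_pairs(counts, support)
worst = max(abs(moment_residual(theta, support, beta)) for beta, theta in pairs)
print("largest constraint violation:", worst)
```

The Gibbs-like scheme of the paper goes the other way round, moving θ within the hyperplane determined by the current β, which this sketch does not attempt.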

## Conditional love [guest post]

Posted in Books, Kids, Statistics, University life on August 4, 2015 by xi'an

[When Dan Simpson told me he was reading Terenin’s and Draper’s latest arXival in a nice Bath pub—and not a nice bath tub!—, I asked him for a blog entry and he agreed. Here is his piece, read at your own risk! If you remember to skip the part about Céline Dion, you should enjoy it very much!!!]

Probability has traditionally been described, as per Kolmogorov and his ardent follower Katy Perry, unconditionally. This is, of course, excellent for those of us who really like measure theory, as the maths is identical. Unfortunately mathematical convenience is not necessarily enough and a large part of the applied statistical community is working with Bayesian methods. These are unavoidably conditional and, as such, it is natural to ask if there is a fundamentally conditional basis for probability.

Bruno de Finetti—and later Richard Cox and Edwin Jaynes—considered conditional bases for Bayesian probability that are, unfortunately, incomplete. The critical problem is that they mainly consider finite state spaces and construct finitely additive systems of conditional probability. For a variety of reasons, neither of these restrictions hold much truck in the modern world of statistics.

In a recently arXiv’d paper, Alexander Terenin and David Draper devise a set of axioms that make the Cox-Jaynes system of conditional probability rigorous. Furthermore, they show that the complete set of Kolmogorov axioms (including countable additivity) can be derived as theorems from their axioms by conditioning on the entire sample space.

This is a deep and fundamental paper, which unfortunately means that I most probably do not grasp its complexities (especially as, for some reason, I keep reading it in pubs!). However I’m going to have a shot at having some thoughts on it, because I feel like it’s the sort of paper one should have thoughts on.

## Advances in scalable Bayesian computation [day #4]

Posted in Books, Mountains, pictures, R, Statistics, University life on March 7, 2014 by xi'an

Final day of our workshop Advances in Scalable Bayesian Computation already, since tomorrow morning is an open research time ½ day! Another “perfect day in paradise”, with the Banff Centre campus covered by a fine snow blanket, still falling…, and making work in an office of BIRS a dream-like moment.

Still looking for a daily theme, parallelisation could be the right candidate, even though other talks this week went into parallelisation issues, incl. Steve’s talk yesterday. Indeed, Anthony Lee gave a talk this morning on interactive sequential Monte Carlo, where he motivated the setting by a formal parallel structure. Then, Darren Wilkinson surveyed the parallelisation issues in Monte Carlo, MCMC, SMC and ABC settings, before arguing in favour of a functional language called Scala. (Neat entries to those topics can be found on Darren’s blog.) And in the afternoon session, Sylvia Frühwirth-Schnatter exposed her approach to the (embarrassingly) parallel problem, in the spirit of Steve’s, David Dunson’s and Scott’s (a paper posted on the day I arrived in Chamonix and hence I missed!). There was plenty to learn from that talk (do not miss the Yin-Yang moment at 25 mn!), but it also helped me to break a difficulty I had with the consensus Bayes representation for two weeks (more on that later!). And, even though Marc Suchard mostly talked about flu and trees in a very pleasant and broad talk, he also had a slide on parallelisation to fit the theme! Although unrelated to parallelism, Nicolas Chopin’s talk was on sequential quasi-Monte Carlo algorithms: while I had heard previous versions of this talk in Chamonix and BigMC, I found it full of exciting stuff. And it clearly got the room truly puzzled by this possibility, in a positive way! Similarly, Alex Lenkoski spoke about extreme rain events in Norway with no trace of parallelism, but the general idea behind the examples was to question the notion of the calibrated Bayesian (with possible connections with the cut models).

This has been a wonderful week and I am sure the participants got as much as I did from the talks and the informal exchanges. Thanks to BIRS for the sponsorship and the superb organisation of the week (and to the Banff Centre for providing such a paradisiacal environment). I feel very privileged to have benefited from this support, and I dearly hope to be back in Banff within a few years.

## Advances in scalable Bayesian computation [day #3]

Posted in Books, Mountains, pictures, R, Statistics, University life on March 6, 2014 by xi'an

We have now gone over the midpoint of our workshop Advances in Scalable Bayesian Computation with three talks in the morning and an open research or open air afternoon. (Maybe surprisingly I chose to stay indoors and work on a new research topic rather than trying cross-country skiing!) If I must give a theme for the day, it would be (jokingly) corporate Big data, as the three speakers spoke of problems and solutions connected with Google, Facebook and similar companies. First, Russ Salakhutdinov presented some hierarchical structures on multimedia data, like connecting images and text, with obvious applications on Google. The first part described Boltzmann machines with impressive posterior simulations of characters and images. (Check the video at 45:00.) Then Steve Scott gave us a Google motivated entry to embarrassingly parallel algorithms, along the lines of papers recently discussed on the ‘Og. (Too bad we forgot to start the video at the very beginning!) One of the novel things in the talk (for me) was the inclusion of BART in this framework, with the interesting feature that using the whole prior on each machine was way better than using a fraction of the prior, as predicted by the theory! And Joaquin Quinonero Candela provided examples of machine learning techniques used by Facebook to suggest friends and ads in a most efficient way (techniques remaining hidden!).

Even though the rest of the day was free, the two hours of exercising between the pool in the early morning and the climbing wall in the late afternoon left me with no energy to try curling with a large subsample of the conference attendees, much to my sorrow!