Dan Simpson’s seminar at CREST

Posted in Kids, Mountains, Statistics, Travel, University life on April 18, 2014 by xi'an

Daniel Simpson gave a seminar at CREST yesterday on his recently arXived paper, “Penalising model component complexity: A principled, practical approach to constructing priors”, written with Thiago Martins, Andrea Riebler, Håvard Rue, and Sigrunn Sørbye. A paper he should also have presented in Banff last month, had he not lost his passport in København airport… I have already commented at length on this exciting paper (hopefully to become a discussion paper in a top journal!), so I am just pointing out two things that came to my mind during the energetic talk Dan delivered to our group. The first is that those penalised complexity (PC) priors rely on some choices in the ordering of the relevance, complexity, nuisance level, etc., of the parameters, just like reference priors. While Dan already wrote a paper on Russian roulette, there is also a Russian doll principle at work behind (or within) PC priors. Each shell of the Russian doll corresponds to a further level of complexity, and the order of the shells needs to be decided by the modeller… This is not very realistic in a hierarchical model with several types of parameters having only local meaning.

My second point is that the construction of those “politically correct” (PC) priors reflects another Russian doll structure, namely one of embedded models, hence would and should lead to a natural multiple testing methodology. Except that Dan rejected this notion during his talk, by being opposed to testing per se. (A good topic for one of my summer projects, if nothing more, then!)

Posted in pictures, Statistics, Travel on April 15, 2014 by xi'an

“At equilibrium, we thus should not expect gains of several orders of magnitude.”

As was signaled to me several times during the MCqMC conference in Leuven, Rémi Bardenet, Arnaud Doucet and Chris Holmes (all from Oxford) just wrote a short paper for the proceedings of ICML on a way to speed up Metropolis-Hastings by reducing the number of terms one computes in the likelihood ratio involved in the acceptance probability, i.e.

$\prod_{i=1}^n\frac{L(\theta^\prime|x_i)}{L(\theta|x_i)}.$

The observations appearing in this likelihood ratio are a random subsample from the original sample. Even though this leads to an unbiased estimator of the true log-likelihood sum, the approach is not justified on a pseudo-marginal basis à la Andrieu-Roberts (2009). (Writing this on the train back to Paris, I am not convinced the pseudo-marginal argument is in fact applicable to this proposal, as the likelihood itself is not estimated in an unbiased manner…)

In the paper, the quality of the approximation is evaluated by Hoeffding-like inequalities, which serve as the basis for a stopping rule on the number of terms eventually evaluated in the random subsample. In fine, the method uses a sequential procedure to determine whether enough terms have been used to take the decision, and the probability of taking the same decision as with the whole sample is bounded from below. The sequential nature of the algorithm requires either recomputing the vector of likelihood terms for the previous value of the parameter or storing all of them for deriving the partial ratios. While the authors address the issue of self-evaluating whether or not this complication is worth the effort, I wonder (from my train seat) why they focus so much on recovering the same decision as with the complete likelihood ratio and the same uniform. It would suffice to get the same distribution for the decision (an alternative that is easier to propose than to create, of course). I also (idly) wonder if a Gibbs version would be manageable, i.e. by changing only some terms in the likelihood ratio at each iteration, in which case the method could be exact… (I found the above quote quite relevant as, in an alternative technique we are constructing with Marco Banterle, the speedup is particularly visible in the warmup stage.) Hence another direction in this recent flow of papers attempting to speed up MCMC methods against the incoming tsunami of “Big Data” problems.
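To make the stopping rule more concrete, here is a minimal sketch of a subsampled accept/reject decision, written from the description above rather than from the paper, so it should not be read as the authors' algorithm. It assumes the Metropolis-Hastings test has been rewritten as "average log-likelihood ratio larger than a threshold `psi`" (where `psi` absorbs the uniform draw and the prior and proposal ratios), that every log-ratio term lies in an interval of known width `C`, and that a plain Hoeffding bound with a crude union bound over the looks is good enough. The names `subsampled_accept`, `log_ratio`, `psi`, `C`, `delta` and `batch` are all hypothetical.

```python
import math
import numpy as np


def subsampled_accept(log_ratio, n, psi, C, delta=0.01, batch=32, rng=None):
    """Decide whether the mean of n log-ratio terms exceeds psi.

    log_ratio : callable mapping an index array to the terms
                log L(theta'|x_i) - log L(theta|x_i) (hypothetical interface)
    n         : total number of observations
    psi       : threshold on the mean log-ratio (assumed precomputed)
    C         : assumed known bound on the range of the terms
    delta     : allowed probability of a decision differing from the full one
    Returns (decision, number_of_terms_used).
    """
    rng = np.random.default_rng() if rng is None else rng
    perm = rng.permutation(n)  # subsample without replacement
    # Conservative count of the looks taken with doubling subsample sizes,
    # used to split delta across looks (union bound).
    looks = 1 + math.ceil(math.log2(max(n / batch, 1)))
    used, total = 0, 0.0
    size = min(batch, n)
    while True:
        total += float(np.sum(log_ratio(perm[used:size])))
        used = size
        mean = total / used
        if used == n:
            # Whole sample consumed: this is the exact decision.
            return mean > psi, used
        # Hoeffding half-width at level delta / looks for a mean of `used` terms.
        half_width = C * math.sqrt(math.log(2 * looks / delta) / (2 * used))
        if abs(mean - psi) > half_width:
            # The subsample mean is far enough from psi to decide early.
            return mean > psi, used
        size = min(2 * used, n)
```

A call would look like `subsampled_accept(lambda idx: terms[idx], len(terms), psi, C)`, where in a real sampler `log_ratio` would evaluate only the requested terms instead of indexing a precomputed array. When the mean log-ratio is close to `psi`, the loop runs all the way to `n`, which is consistent with the remark that large gains should not be expected at equilibrium.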

métro static

Posted in Kids, Travel on March 26, 2014 by xi'an

[heard in the métro this morning]

“…les équations à deux inconnues ça va encore, mais à trois inconnues, c’est trop dur!”

[“…equations with two unknowns are still OK, but with three unknowns it is too hard!”]

Paris snapshot #2

Posted in pictures on February 27, 2014 by xi'an

Carlin and Chib (1995) for fixed dimension problems

Posted in Books, Kids, Statistics, University life on February 25, 2014 by xi'an

Yesterday, I was part of a (public) thesis committee at the Université Pierre et Marie Curie, in downtown Paris. After a bit of a search for the defence room (as the campus is still undergoing a massive asbestos clean-up, 20 years after it started…!), I listened to Florian Maire delivering his talk on an array of work in computational statistics ranging from the theoretical (Peskun ordering) to the methodological (Monte Carlo online EM) to the applied (unsupervised learning of class shapes via deformable templates). The implementation of the online EM algorithm involved the use of pseudo-priors à la Carlin and Chib (1995), even though the setting was a fixed-dimension one, in order to fight the difficulty of exploring the space of templates by a regular Gibbs sampler. (As usual, the design of the pseudo-priors was crucial to the success of the method.) The thesis also included recent work with Randal Douc and Jimmy Olsson on ranking inhomogeneous Markov kernels of the type

$P \circ Q \circ P \circ Q \circ \cdots$

against alternatives with components (P’,Q’). The authors were able to characterise minimal conditions for a Peskun-ordering domination on the components to transfer to the combination. Quite an interesting piece of work for a PhD thesis!
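For readers unfamiliar with Peskun ordering, here is a small sketch of the classical (homogeneous) version on a finite state space, which is only the starting point of the result above and not the inhomogeneous setting of the thesis. A kernel P1 dominates P2 in the Peskun sense when every off-diagonal transition probability of P1 is at least that of P2, both kernels being reversible with respect to the same target. The sketch checks the textbook case that the Metropolis acceptance rule dominates Barker's rule. The helper names `mh_kernel` and `peskun_dominates`, the target `pi` and the proposal matrix `Q` are made up for the illustration.

```python
import numpy as np


def mh_kernel(pi, Q, accept):
    """Transition matrix of an accept/reject chain with target pi,
    proposal matrix Q and acceptance function accept(r), where
    r = pi[j] Q[j, i] / (pi[i] Q[i, j])."""
    n = len(pi)
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and Q[i, j] > 0:
                r = pi[j] * Q[j, i] / (pi[i] * Q[i, j])
                P[i, j] = Q[i, j] * accept(r)
        # Rejected proposals keep the chain where it is.
        P[i, i] = 1.0 - P[i].sum()
    return P


def peskun_dominates(P1, P2, tol=1e-12):
    """True if P1 >= P2 entrywise off the diagonal (Peskun ordering)."""
    off_diagonal = ~np.eye(len(P1), dtype=bool)
    return bool(np.all(P1[off_diagonal] >= P2[off_diagonal] - tol))


# Illustrative target and symmetric proposal on three states (assumed values).
pi = np.array([0.2, 0.3, 0.5])
Q = np.full((3, 3), 0.5) - 0.5 * np.eye(3)

metropolis = mh_kernel(pi, Q, lambda r: min(1.0, r))
barker = mh_kernel(pi, Q, lambda r: r / (1.0 + r))
```

With these definitions, `peskun_dominates(metropolis, barker)` holds because min(1, r) is never smaller than r/(1+r). The question addressed in the thesis is when such a componentwise domination carries over to alternating products of two kernels, which this sketch does not attempt to check.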