Archive for INLA

marginal likelihoods from MCMC

Posted in Books, pictures, Statistics, University life on April 26, 2017 by xi'an

A new arXiv entry on ways to approximate marginal likelihoods based on MCMC output, by astronomers (apparently). With an application to the 2015 Planck satellite analysis of cosmic microwave background radiation data, which reminded me of our joint work with the cosmologists of the Paris Institut d’Astrophysique ten years ago. In the literature review, the authors miss several surveys on the approximation of those marginals, including our San Antonio chapter, on Bayes factors approximations, but mention our ABC survey somewhat inappropriately since it is not advocating the use of ABC for such a purpose. (They mention as well variational Bayes approximations, INLA, powered likelihoods, if not nested sampling.)

The proposal of this paper is to identify the marginal m [actually denoted a there] as the normalising constant of an unnormalised posterior density. And to do so the authors estimate the posterior by a non-parametric approach, namely a k-nearest-neighbour estimate. With the additional twist of producing a sort of Bayesian posterior on the constant m. [And the unusual notion of number density, used for the unnormalised posterior.] The Bayesian estimation of m relies on a Poisson sampling assumption on the k-nearest neighbour distribution. (Sort of, since k is actually fixed, not random.)
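To make the basic identity concrete, here is a minimal Python sketch. It is not the authors' Poisson-based Bayesian estimator, only the simpler ratio m = q(θ)/p(θ|x) evaluated at each posterior draw, with p replaced by a k-nearest-neighbour density estimate. The function name `knn_log_evidence`, the toy two-dimensional Gaussian target and the choice k=20 are all assumptions made for illustration.

```python
import math
import numpy as np

def knn_log_evidence(samples, log_unnorm, k=20):
    """Crude estimate of log m from posterior draws (hypothetical helper).

    samples    : (n, d) array of draws from the normalised posterior
    log_unnorm : (n,) array, log of the unnormalised posterior at those draws
    """
    n, d = samples.shape
    # All pairwise Euclidean distances (O(n^2) memory, fine for a toy example).
    diff = samples[:, None, :] - samples[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Distance to the k-th nearest neighbour; column 0 after sorting is the point itself.
    r_k = np.sort(dist, axis=1)[:, k]
    # Log-volume of the d-dimensional ball of radius r_k.
    log_vol = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1) + d * np.log(r_k)
    # k-NN density estimate: p_hat = k / (n * volume).
    log_dens = math.log(k) - math.log(n) - log_vol
    # m = q(theta) / p(theta | x) at every draw; the median tames the noisy tails.
    return float(np.median(log_unnorm - log_dens))

# Toy check: q(theta) = 3 * exp(-|theta|^2 / 2) in dimension 2, so m = 3 * 2 * pi.
rng = np.random.default_rng(1)
draws = rng.normal(size=(1000, 2))
log_q = math.log(3.0) - 0.5 * (draws ** 2).sum(axis=1)
estimate = knn_log_evidence(draws, log_q, k=20)
truth = math.log(3.0 * 2 * math.pi)
print(estimate, truth)
```

In two dimensions the sketch lands close to the true value, but the radius r_k grows quickly with the dimension d for a fixed number of draws, which is where the curse of dimensionality enters.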

If the above sounds confusing and imprecise it is because I am myself rather mystified by the whole approach and find it difficult to see the point in this alternative. The Bayesian numerics do not seem to have any purpose other than producing a MAP estimate. And using a non-parametric density estimate opens a Pandora's box of difficulties, the most obvious one being the curse of dimension(ality). This reminded me of the commented paper of Delyon and Portier where they achieve super-efficient convergence when using a kernel estimator, but with a considerable cost and a similar sensitivity to dimension.

reflections on the probability space induced by moment conditions with implications for Bayesian Inference [refleXions]

Posted in Statistics, University life on November 26, 2014 by xi'an

“The main finding is that if the moment functions have one of the properties of a pivotal, then the assertion of a distribution on moment functions coupled with a proper prior does permit Bayesian inference. Without the semi-pivotal condition, the assertion of a distribution for moment functions either partially or completely specifies the prior.” (p.1)

Ron Gallant will present this paper at the Conference in honour of Christian Gouriéroux held next week at Dauphine and I have been asked to discuss it. What follows is a collection of notes I made while reading the paper, rather than a coherent discussion, which will come later. Hopefully prior to the conference.

The difficulty I have with the approach presented therein lies as much with the presentation as with the contents. I find it difficult to grasp the assumptions behind the model(s) and the motivations for only considering a moment and its distribution. Does it all come down to linking fiducial distributions with Bayesian approaches? In which case I am as usual sceptical about the ability to impose an arbitrary distribution on an arbitrary transform of the pair (x,θ), where x denotes the data. Rather than a genuine prior × likelihood construct. But I bet this is mostly linked with my lack of understanding of the notion of structural models.

“We are concerned with situations where the structural model does not imply exogeneity of θ, or one prefers not to rely on an assumption of exogeneity, or one cannot construct a likelihood at all due to the complexity of the model, or one does not trust the numerical approximations needed to construct a likelihood.” (p.4)

As often with econometrics papers, this notion of structural model leads me astray: does this mean any latent variable model or an incompletely defined model, and if so why is it incompletely defined? From a frequentist perspective anything random is not a parameter. The term exogeneity also hints at this notion of the parameter being not truly a parameter, but including latent variables and maybe random effects. Reading further (p.7) drives me to understand the structural model as defined by a moment condition, in the sense that

\mathbb{E}[m(\mathbf{x},\theta)]=0

has a unique solution in θ under the true model. However the focus then seems to make a major switch as Gallant considers the distribution of a pivotal quantity like

Z=\sqrt{n} W(\mathbf{x},\theta)^{-\frac{1}{2}} m(\mathbf{x},\theta)

as induced by the joint distribution on (x,θ), hence conversely inducing constraints on this joint, as well as an associated conditional. Which is something I have trouble understanding. First, where does this assumed distribution on Z stem from? And, second, exchanging randomness of terms in a random variable as if it were a linear equation is a pretty sure way to produce paradoxes and measure-theoretic difficulties.

The purely mathematical problem itself is puzzling: if one knows the distribution of the transform Z=Z(X,Λ), what does that imply on the joint distribution of (X,Λ)? It seems unlikely this will induce a single prior and/or a single likelihood… It is actually more probable that the distribution one arbitrarily selects on m(x,θ) is incompatible with a joint on (x,θ), isn’t it?
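A toy case (an illustration that is not taken from the paper, with a scalar X and a scalar Λ as assumptions) shows how little the distribution of the transform pins down. Suppose all one knows is that

Z = X - \Lambda \sim \mathcal{N}(0,1)

This holds when Λ follows an arbitrary proper prior π and X|Λ ~ N(Λ,1), but equally when X follows an arbitrary marginal g and Λ|X ~ N(X,1). Both families of joints induce the same distribution on Z, so that distribution alone determines neither a prior nor a likelihood.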

“The usual computational method is MCMC (Markov chain Monte Carlo) for which the best known reference in econometrics is Chernozhukov and Hong (2003).” (p.6)

While I had never heard of this reference before, it looks like a 50-page survey and may be sufficient as an introduction to MCMC methods for econometricians. What I do not get though is the connection between this reference to MCMC and the overall discussion of constructing priors (or not) out of fiducial distributions. The author also suggests using MCMC to produce the MAP estimate but this has always struck me as inefficient (unless one uses our SAME algorithm of course).

“One can also compute the marginal likelihood from the chain (Newton and Raftery (1994)), which is used for Bayesian model comparison.” (p.22)

Not the best solution to rely on harmonic means for marginal likelihoods… Definitely not. While the author actually uses the stabilised version (15) of the Newton and Raftery (1994) estimator, which in retrospect looks much like a bridge sampling estimator of sorts, it remains dangerously close to the original [harmonic mean solution], especially for a vague prior. And it only works when the likelihood is available in closed form.
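A minimal Python sketch of the original (unstabilised) harmonic mean estimator may help to see the danger. It relies on the identity that the posterior expectation of 1/L(θ) equals 1/m, and compares the resulting estimate with the exact marginal in a toy conjugate model: x_i ~ N(θ,1) with θ ~ N(0,τ²). The model, the vague prior τ²=100, the sample sizes and all function names are assumptions chosen for illustration, and the stabilised version (15) used in the paper is not reproduced here.

```python
import numpy as np

def log_lik(theta, x):
    # Log-likelihood of x_i ~ N(theta, 1) for each value in the 1-d array theta.
    sq = ((x[None, :] - theta[:, None]) ** 2).sum(axis=1)
    return -0.5 * len(x) * np.log(2 * np.pi) - 0.5 * sq

def exact_log_marginal(x, tau2):
    # Closed form: marginally x ~ N(0, I + tau2 * 1 1^T).
    n, s = len(x), x.sum()
    quad = (x ** 2).sum() - tau2 * s ** 2 / (1 + n * tau2)
    return -0.5 * n * np.log(2 * np.pi) - 0.5 * np.log(1 + n * tau2) - 0.5 * quad

def harmonic_mean_log_marginal(x, tau2, n_draws, rng):
    # Exact posterior draws stand in for MCMC output in this conjugate toy model.
    post_var = 1.0 / (len(x) + 1.0 / tau2)
    post_mean = post_var * x.sum()
    theta = rng.normal(post_mean, np.sqrt(post_var), size=n_draws)
    # log of the average of 1/L(theta_t), computed stably on the log scale.
    log_mean_inv = np.logaddexp.reduce(-log_lik(theta, x)) - np.log(n_draws)
    return -log_mean_inv

rng = np.random.default_rng(0)
x = rng.normal(0.5, 1.0, size=20)
tau2 = 100.0  # vague prior
exact = exact_log_marginal(x, tau2)
hm = harmonic_mean_log_marginal(x, tau2, 10_000, rng)
print("exact log m:", exact, "harmonic mean estimate:", hm)
```

Under a vague prior, 1/L(θ) has infinite variance under the posterior, so the average is dominated by rare draws in the tails and the estimate typically overshoots the true log marginal, however long the chain.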

“The MCMC chains were comprised of 100,000 draws well past the point where transients died off.” (p.22)

I wonder if the second statement (with a very nice image of those dying transients!) is intended as a consequence of the first one or independently.

“A common situation that requires consideration of the notions that follow is that deriving the likelihood from a structural model is analytically intractable and one cannot verify that the numerical approximations one would have to make to circumvent the intractability are sufficiently accurate.” (p.7)

This then is a completely different business, namely that defining a joint distribution by means of moment equations prevents regular Bayesian inference because the likelihood is not available. This is more exciting because (i) there are alternatives available! From ABC to INLA (maybe) to EP to variational Bayes (maybe). And beyond. In particular, the moment equations are strongly and even insistently suggesting that empirical likelihood techniques could be well-suited to this setting. And (ii) it is no longer a mathematical worry: there exists a joint distribution on m(x,θ), induced by a (or many) joint distribution(s) on (x,θ). So the question of finding whether or not it induces a single proper prior on θ becomes relevant. But, if I want to use ABC, being given the distribution of m(x,θ) seems to mean I can only generate new values of this transform while missing a natural distance between observations and pseudo-observations. Still, I entertain lingering doubts that this is the meaning of the study. Where does the joint distribution come from..?!

“Typically C is coarse in the sense that it does not contain all the Borel sets (…) The probability space cannot be used for Bayesian inference”

My understanding of that part is that defining a joint on m(x,θ) is not always enough to deduce a (unique) posterior on θ, which is fine and correct, but rather anticlimactic. This seems to be what Gallant calls a “partial specification of the prior” (p.9).

Overall, after this linear read, I remain very much puzzled by the statistical (or Bayesian) implications of the paper. The fact that the moment conditions are central to the approach would once again induce me to check the properties of an alternative approach like empirical likelihood.

Bayes 250th versus Bayes 2.5.0.

Posted in Books, Statistics, Travel, University life on July 20, 2013 by xi'an

More than a year ago Michael Sørensen (2013 EMS Chair) and Fabrizio Ruggeri (then ISBA President) kindly invited me to deliver the memorial lecture on Thomas Bayes at the 2013 European Meeting of Statisticians, which takes place in Budapest today and the following week. I gladly accepted, although with some worries at having to cover a much wider range of the field than my own research topic. And then set to work on the slides in the past week, borrowing from my most “historical” lectures on Jeffreys and Keynes, my reply to Spanos, as well as getting a little help from my nonparametric friends (yes, I do have nonparametric friends!). Here is the result, providing a partial (meaning both incomplete and biased) vision of the field.

Since my talk is on Thursday, and because the talk is sponsored by ISBA, hence representing its members, please feel free to comment and suggest changes or additions as I can still incorporate them into the slides… (Warning, I purposefully kept some slides out to preserve the most surprising entry for the talk on Thursday!)

R.I.P. Emile…

Posted in Mountains, pictures, Running, Statistics, Travel, University life, Wines on July 5, 2013 by xi'an

I was thus in Montpellier for a few days, working with Jean-Michel Marin and attending the very final meeting of our ANR research group called Emile… The very same group that introduced us to ABC in 2005. We had a great time, discussing DIYABC.2, ABC for SNPs, and other extensions with our friend Arnaud Estoup, enjoying an outdoor dinner on the slopes of Pic Saint-Loup and a wine tasting on the way there, listening to ecological modelling this morning from elephant tracking [using INLA] to shell decoration in snails [using massive MCMC], running around Crès lake in the warm rain, and barely escaping the Tour de France on my way to the airport!!!

ISBA on INLA [webinar]

Posted in R, Statistics, University life on April 3, 2013 by xi'an

If you have missed the item of information, Håvard Rue is giving an ISBA webinar tomorrow on INLA:

the ISBA Webinar on INLA is scheduled for April 4th, 2013
from 8:30 - 12:30 EDT.

To join the online meeting (Now from mobile devices using the Cisco WebEx
Meeting App)

1. Go to
2. Enter the meeting number  730 293 070 and click Join Now
3. Enter your name and email address, the meeting password and
click "Join Now"

A recording of the webinar will be provided shortly after the event.

Please verify that your computer is capable of connecting using WebEx at

or see  if you are having
trouble connecting.

latent Gaussian model workshop in Reykjavik

Posted in Mountains, R, Statistics, Travel, University life on March 29, 2013 by xi'an

An announcement for an Icelandic meeting next September, a meeting I would have loved to attend (darn!)… This meeting is sponsored by the BayesComp session, of course!!!

We are pleased to announce that the University of Iceland will host the 3rd Workshop on Bayesian Inference for Latent Gaussian Models with Applications (LGM).

The workshop will be held in Reykjavik, Iceland, on September 12-14 2013 at Harpa – Reykjavik Concert Hall and Conference Centre:

The emphasized topics of LGM 2013 are:
-Machine learning
-Spatial and spatio-temporal modeling
-Bayesian non-parametrics
-Latent Gaussian models
-The workshop is not restricted to these topics

The invited speakers are:
-Matthias Katzfuß at Universität Heidelberg
-Bani Mallick at Texas A&M University
-Peter Müller at University of Texas
-Michèle Sebag at INRIA Saclay, CNRS
-Matthias Seeger at École Polytechnique Fédérale de Lausanne
-Christopher Wikle at University of Missouri

Registration fees:
Early bird fee before May 21st: € 375
Registration fee after May 21st: € 440
Student fee: € 250

Detailed information on the scientific program, conference field trip, organizing committee, scientific committee and meeting registration is available on the conference web-site:

LGM 2012, Trondheim

Posted in Mountains, pictures, Running, Statistics, Travel, University life on May 31, 2012 by xi'an

A break from the “snapshots from Guérande” that will be a relief for all ‘Og readers, I am sure: I am now in Trondheim, Norway, for the second Latent Gaussian model meeting, organised by Håvard Rue and his collaborators. As in the earlier edition in Zürich, the main approach to those models (that is adopted in the talks) is the INLA methodology of Rue, Martino and Chopin. I nonetheless (given the theme) gave a presentation on Rao-Blackwellisation techniques for MCMC algorithms. As I had not printed the program of the meeting prior to my departure (blame Guérande!), I had not realised I had only 20 minutes for my talk and kept adding remarks and slides during the flight from Amsterdam to Trondheim [where the clouds prevented me from seeing Jotunheimen]. (So I had to cut the second half of the talk below on parallelisation. Even with this cut, the 20 minutes went awfully fast!) Apart from my talk, I am afraid I was not in a sufficient state of awareness [due to a really early start] to give a comprehensive account of the afternoon talks…

Trondheim is a nice city that sometimes feels like a village despite its size. Walking up to the university along typical wooden houses, then going around the town and along the river tonight while running a 10k loop left me with the impression of a very pleasant place (at least in the summer months).