## chance meeting

Posted in Statistics with tags , , , , , , , , , on July 10, 2018 by xi'an

As I was travelling to Coventry yesterday, I spotted this fellow passenger on the train from Birmingham with a Valencia 9 bag, and a chat with him. It was a pure chance encounter as he was not attending our summer school, but continued down the line. (These bags are quite sturdy and I kept mine until a zipper broke.)

## Particle learning [rejoinder]

Posted in R, Statistics, University life with tags , , , , , , , , , , on November 10, 2010 by xi'an

Following the posting on arXiv of the Statistical Science paper of Carvalho et al., and the publication by the same authors in Bayesian Analysis of Particle Learning for general mixtures I noticed on Hedibert Lopes’ website his rejoinder to the discussion of his Valencia 9 paper has been posted. Since the discussion involved several points made by members of the CREST statistics lab (and covered the mixture paper as much as the Valencia 9 paper), I was quite eager to read Hedie’s reply. Unsurprisingly, this rejoinder is however unlikely to modify my reservations about particle learning. The following is a detailed examination of the arguments found in the rejoinder but requires a preliminary reading of the above papers as well as our discussion.. Continue reading

## Parallel processing of independent Metropolis-Hastings algorithms

Posted in R, Statistics, University life with tags , , , , , , , , on October 12, 2010 by xi'an

With Pierre Jacob, my PhD student, and Murray Smith, from National Institute of Water and Atmospheric Research, Wellington, who actually started us on this project at the last and latest Valencia meeting, we have completed a paper on using parallel computing in independent Metropolis-Hastings algorithms. The paper is arXived and the abstract goes as follows:

In this paper, we consider the implications of the fact that parallel raw-power can be exploited by a generic Metropolis–Hastings algorithm if the proposed values are independent. In particular, we present improvements to the independent Metropolis–Hastings algorithm that significantly decrease the variance of any estimator derived from the MCMC output, for a null computing cost since those improvements are based on a fixed number of target density evaluations. Furthermore, the techniques developed in this paper do not jeopardize the Markovian convergence properties of the algorithm, since they are based on the Rao–Blackwell principles of Gelfand and Smith (1990), already exploited in Casella and Robert 91996), Atchadé and Perron (2005) and Douc and Robert (2010). We illustrate those improvement both on a toy normal example and on a classical probit regression model but insist on the fact that they are universally applicable.

I am quite excited about the results in this paper, which took advantage of (a) older works of mine on Rao-Blackwellisation, (b) Murray’s interests in costly likelihoods, and (c) our mutual excitement when hearing about GPU parallel possibilities from Chris Holmes’ talk in Valencia. (As well as directions drafted in an exciting session in Vancouver!) The (free) gains over standard independent Metropolis-Hastings estimates are equivalent to using importance sampling gains, while keeping the Markov structure of the original chain. Given that 100 or more parallel threads can be enhanced from current GPU cards, this is clearly a field with much potential! The graph below

gives the variance improvements brought by three Rao-Blackwell estimates taking advantage of parallelisation over the initial MCMC estimate (first entry) with the importance sampling estimate (last entry) using only 10 parallel threads.

## Multidimension bridge sampling (CoRe in CiRM [5])

Posted in Books, R, Statistics, University life with tags , , , , , on July 14, 2010 by xi'an

Since Bayes factor approximation is one of my areas of interest, I was intrigued by Xiao-Li Meng’s comments during my poster in Benidorm that I was using the “wrong” bridge sampling estimator when trying to bridge two models of different dimensions, based on the completion (for $\theta_2=(\mu,\sigma^2)$ and $\mu=\theta_1$ missing from the first model)

$B^\pi_{12}(x)= \dfrac{\displaystyle{\int\pi_1^*(\mu|\sigma^2){\tilde\pi}_1(\sigma^2|x) \alpha(\theta_2) {\pi}_2(\theta_2|x)\hbox{d}\theta_2}}{ \displaystyle{\int{\tilde\pi}_2(\theta_2|x)\alpha(\theta_2) \pi_1(\sigma^2|x)\hbox{d}\sigma^2 } \pi_1^*(\mu|\sigma^2) \hbox{d}\mu }\,.$

When revising the normal chapter of Bayesian Core,  here in CiRM, I thus went back to Xiao-Li’s papers on the topic to try to fathom what the “true” bridge sampling was in that case. In Meng and Schilling (2002, JASA), I found the following indication, “when estimating the ratio of normalizing constants with different dimensions, a good strategy is to bridge each density with a good approximation of itself and then apply bridge sampling to estimate each normalizing constant separately. This is typically more effective than to artificially bridge the two original densities by augmenting the dimension of the lower one”. I was unsure of the technique this (somehow vague) indication pointed at until I understood that it meant  introducing one artificial posterior distribution for each of the parameter spaces and processing each marginal likelihood as an integral ratio in itself. For instance, if $\eta_1(\theta_1)$ is an arbitrary normalised density on $\theta_1$, and $\alpha$ is an arbitrary function, we have the bridge sampling identity on $m_1(x)$:

$\int\tilde{\pi}_1(\theta_1|x) \,\text{d}\theta_1 = \dfrac{\displaystyle{\int \tilde{\pi}_1(\theta_1|x) \alpha(\theta_1) {\eta}_1(\theta_1)\,\text{d}\theta_1}}{\displaystyle{\int\eta_1(\theta_1) \alpha(\theta_1) \pi_1(\theta_1|x) \,\text{d}\theta_1}}$

Therefore, the optimal choice of $\alpha$ leads to the approximation

$\widehat m_1(x) = \dfrac{\displaystyle{\sum_{i=1}^N {\tilde\pi}_1(\theta^\eta_{1i}|x)\big/\left\{{m_1(x) \tilde\pi}_1(\theta^\eta_{1i}|x) + \eta(\theta^\eta_{1i})\right\}}}{\displaystyle{ \sum_{i=1}^{N} \eta(\theta_{1i}) \big/ \left\{{m_1(x) \tilde\pi}_1(\theta_{1i}|x) + \eta(\theta_{1i})\right\}}}$

when $\theta_{1i}\sim\pi_1(\theta_1|x)$ and $\theta^\eta_{1i}\sim\eta(\theta_1)$. More exactly, this approximation is replaced with an iterative version since it depends on the unknown $m_1(x)$. The choice of the density $\eta$ is obviously fundamental and it should be close to the true posterior $\pi_1(\theta_1|x)$ to guarantee good convergence approximation. Using a normal approximation to the posterior distribution of $\theta$ or a non-parametric approximation based on a sample from $\pi_1(\theta_1|\mathbf{x})$, or yet again an average of MCMC proposals are reasonable choices.

The boxplot above compares this solution of Meng and Schilling (2002, JASA), called double (because two pseudo-posteriors $\eta_1(\theta_1)$ and $\eta_2(\theta_2)$ have to be introduced), with Chen, Shao and Ibragim (2001) solution based on a single completion $\pi_1^*$ (using a normal centred at the estimate of the missing parameter, and with variance the estimate from the simulation), when testing whether or not the mean of a normal model with unknown variance is zero. The variabilities are quite comparable in this admittedly overly simple case. Overall, the performances of both extensions are obviously highly dependent on the choice of the completion factors, $\eta_1$ and $\eta_2$ on the one hand and $\pi_1^*$ on the other hand, . The performances of the first solution, which bridges both models via $\pi_1^*$, are bound to deteriorate as the dimension gap between those models increases. The impact of the dimension of the models is less keenly felt for the other solution, as the approximation remains local.

Posted in Statistics, University life with tags , , , , on June 23, 2010 by xi'an

Following discussions at CREST, we have contributed comments on the following papers

Bernardo, José M. (Universitat de València, Spain)
Integrated objective Bayesian estimation and hypothesis testing. [discussion]

Consonni, Guido (Università di Pavia, Italy)
On moment priors for Bayesian model choice with applications to directed acyclic graphs. [discussion]

Frühwirth-Schnatter, Sylvia (Johannes Kepler Universität Linz, Austria)
Bayesian variable selection for random intercept modeling of Gaussian and non-Gaussian data. [discussion]

Huber, Mark (Claremont McKenna College, USA)
Using TPA for Bayesian inference. [discussion]

Lopes, Hedibert (University of Chicago, USA)
Particle learning for sequential Bayesian computation. [discussion]

Polson, Nicholas (University of Chicago, USA)
Shrink globally, act locally: Sparse Bayesian regularization and prediction. [discussion]

Wilkinson, Darren (University of Newcastle, UK)
Parameter inference for stochastic kinetic models of bacterial gene regulation: a Bayesian approach to systems biology. [discussion]

(with a possible incoming update on Mark Huber’s comments if we manage to get the simulations running in due time).

## Vanilla Rao-Blackwellisation accepted

Posted in Statistics, University life with tags , , on June 13, 2010 by xi'an

The revision of our Vanilla Rao-Blackwellisation paper has been accepted by the Annals of Statistics as I was leaving for València 9. This is a very good news, indeed! Especially because I came back from València 9 with an idea on how to extend the Rao-Blackwellisation…

## València 9 papers on line

Posted in Books, Statistics, University life with tags , , on June 11, 2010 by xi'an

Just received this email from José Bernardo:

The pdf files of the Valencia 9 invited papers are now available online at the conference webpage, as a link placed by the author name in the V9 invited program list. These are the  last version sent to me by the author, and will be substituted by more current ones as they become available.

I remind you that  you are encouraged to submit written contributions to the discussion of any of these 24 papers even if you could not attend the meeting. Your discussions should be directly emailed by June 28th  to the author(s) of the invited papers, with a copy to me. I will also need the LaTeX source and the eps files of any figures used. Contributions should not exceed six typeset pages (including figures) for invited discussions, and three pages for contributed discussions.

This means anyone can send discussions on the papers presented at the meeting, to be published soon in the Valencia 9 proceedings by Oxford University Press. We are just out of a post-conference meeting with our students and colleagues here at CREST, where we discussed the invited papers by Ickstadt, Nicholls (actually, sadly not open to written discussions!), Meek, and Wilkinson . (On Monday, we plan to cover Dunson, Früwirth-Schnatter, Lopes, Polson, and Vanucci.)