## integral priors for binomial regression

Posted in pictures, R, Statistics, University life on July 2, 2013 by xi'an

Diego Salmerón and Juan Antonio Cano from Murcia, Spain (check the movie linked to the above photograph!), kindly included me in their recent integral prior paper, even though I mainly provided (constructive) criticism. The paper has just been arXived.

A few years ago (2008 to be precise), we wrote together an integral prior paper, published in TEST, where we exploited the implicit equation defining those priors (Pérez and Berger, 2002) to construct a Markov chain providing simulations from both integral priors. This time, we consider the case of a binomial regression model and the problem of variable selection. The integral equations are similarly defined and a Markov chain can again be used to simulate from the integral priors. However, the difficulty here stems from the regression structure, which makes selecting training datasets more elaborate and leads to a non-standard posterior. Most fortunately, because the training dataset has exactly the right dimension, a re-parameterisation allows for a simulation of Bernoulli probabilities, provided a Jeffreys prior is used on those. (This obviously makes the “prior” dependent on the selected training dataset, but it should not overly impact the resulting inference.)
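To make the re-parameterisation step concrete, here is a minimal sketch in Python, not the code of the paper. It assumes a logistic link with an intercept and a single covariate, so that a training sample of two observations has exactly the dimension of the coefficient vector, and it assumes independent Jeffreys Beta(1/2, 1/2) priors on the two Bernoulli probabilities. The function name `draw_coefficients` and the covariate values are purely illustrative.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def draw_coefficients(x, y, rng, eps=1e-12):
    """Draw (b0, b1) for the model logit(p_i) = b0 + b1 * x_i.

    x: two distinct covariate values, y: two binary responses.
    Assumption (not taken from the paper): Jeffreys Beta(1/2, 1/2) priors on
    p_1 and p_2, hence p_i | y_i ~ Beta(y_i + 1/2, 3/2 - y_i).
    """
    p = []
    for yi in y:
        pi = rng.betavariate(yi + 0.5, 1.5 - yi)
        # Clip away from 0 and 1 so that the logit stays finite.
        p.append(min(max(pi, eps), 1.0 - eps))
    l1, l2 = logit(p[0]), logit(p[1])
    # Two equations, two unknowns: solve the linear system by hand.
    b1 = (l1 - l2) / (x[0] - x[1])
    b0 = l1 - b1 * x[0]
    return b0, b1, p


rng = random.Random(0)
for _ in range(3):
    b0, b1, p = draw_coefficients((-1.0, 2.0), (0, 1), rng)
    print(round(b0, 3), round(b1, 3))
```

The point of the sketch is only that, when the training design is square and invertible, simulating the probabilities is equivalent to simulating the regression coefficients; with more covariates the hand-solved system would become a general linear solve.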

Posted in Books, Statistics, University life on December 14, 2012 by xi'an

This week, my student Dona Skanji gave a presentation of Hastings’ paper “Monte Carlo sampling methods using Markov chains and their applications”, which set the rules for running MCMC algorithms, much more so than the original paper by Metropolis et al., which presented an optimisation device, even though the latter clearly stated the Markovian principle of those algorithms and their use for integration. (This is definitely a classic, selected in the book Biometrika: One hundred years, by Mike Titterington and David Cox.) Here are her slides (the best Beamer slides so far!):

Given that I had already taught my lectures on Markov chains and on MCMC algorithms, the preliminary part of Dona’s talk was easier to compose, and understanding the principles of the method was certainly more straightforward than for the other papers in the series. I think she nonetheless did a rather good job in summing up the paper, running this extra simulation for the Poisson distribution (with the interesting “mistake” of including the burn-in time in the representation of the output and concluding about a poor convergence) and mentioning the Gibbs extension. I led the discussion of the seminar towards irreducibility conditions and Peskun’s ordering of Markov chains, which maybe could have been mentioned by Dona since she was aware Peskun was Hastings’ student.
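For readers who want to reproduce the burn-in effect, here is a small Python sketch of a Metropolis-Hastings chain targeting a Poisson distribution. It is not Dona's simulation nor the exact proposal of Hastings' paper: the symmetric ±1 random walk, the mean of 4, the starting value of 50 and the burn-in length of 1000 are all assumptions chosen for illustration.

```python
import math
import random


def poisson_log_pmf(k, lam):
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def mh_poisson(lam, n_iter, rng, start=0):
    """Random-walk Metropolis-Hastings chain with a Poisson(lam) target."""
    x = start
    chain = [x]
    for _ in range(n_iter):
        prop = x + rng.choice((-1, 1))  # symmetric proposal
        if prop >= 0:  # negative values have zero target mass: always rejected
            log_ratio = poisson_log_pmf(prop, lam) - poisson_log_pmf(x, lam)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x = prop
        chain.append(x)
    return chain


chain = mh_poisson(4.0, 20000, random.Random(1), start=50)
burn_in = 1000  # illustrative choice, not a recommendation
kept = chain[burn_in:]
print("mean with burn-in   :", sum(chain) / len(chain))
print("mean without burn-in:", sum(kept) / len(kept))
```

Starting far in the tail and plotting the whole of `chain` shows a long initial descent that can be misread as poor convergence, while the portion after the burn-in period fluctuates around the target mean.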

## Random construction of interpolating sets

Posted in Kids, Statistics, University life on January 5, 2012 by xi'an

One of the many arXiv papers I could not discuss earlier is Huber and Schott’s “Random construction of interpolating sets for high dimensional integration”, which relates to their earlier TPA paper at the València meeting. (A paper that we discussed with Nicolas Chopin.) TPA stands for tootsie pop algorithm. The paper is very pleasant to read, just like its predecessor. The principle behind TPA is that the number of steps in the algorithm is Poisson with a parameter connected to the unknown measure of the inner set:

$N\sim\mathcal{P}(\ln[\mu(B)/\mu(B^\prime)])$

Therefore, the variance of the estimation is known as well. This is a significant property of a mathematically elegant solution. As already argued in our earlier discussion, it however seems the paper is defending an integral approximation that sounds far from realistic, in my opinion. Indeed, the TPA method requires as a fundamental item the ability to simulate from a measure μ restricted to a level set A(β). Exact simulation seems close to impossible in any realistic problem, just as in Skilling’s (2006) nested sampling. Furthermore, the comparison with nested sampling is dismissed rather summarily: that the variance of this alternative cannot be computed “prior to running the algorithm” does not mean it is larger than the one of the TPA method. If the proposal is to become a realistic algorithm, some degree of comparison with the existing methods should appear in the paper. (A further if minor comment about the introduction is that the reason for picking the relative ideal balance α=0.2031 in the embedded sets is not clear. Not that it really matters in the implementation unless Section 5 on well-balanced sets is connected with this ideal ratio…)
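The Poisson property of the step count can be checked on a toy case where exact simulation is trivial, which is precisely what fails in realistic problems. The Python sketch below is an assumption-laden illustration, not the authors' code: μ is taken to be the Lebesgue measure, the level sets are centred balls in dimension d, the outer set B is the unit ball and the inner set B′ is the ball of radius `r_inner`. The name `tpa_steps` is hypothetical.

```python
import math
import random


def tpa_steps(d, r_inner, rng, r_outer=1.0):
    """One TPA run on nested centred balls.

    Returns the number of draws that land outside the inner ball, which
    should be Poisson with mean ln(mu(B) / mu(B')) = d * ln(r_outer / r_inner).
    """
    r = r_outer
    steps = 0
    while True:
        # The norm of a uniform draw in a d-ball of radius r is r * U**(1/d);
        # the next level set is the ball passing through that draw.
        r *= rng.random() ** (1.0 / d)
        if r <= r_inner:
            return steps
        steps += 1


rng = random.Random(2012)
d, r_inner, n_runs = 5, 0.5, 20000
counts = [tpa_steps(d, r_inner, rng) for _ in range(n_runs)]
mean = sum(counts) / n_runs
var = sum((c - mean) ** 2 for c in counts) / (n_runs - 1)
print("true log-ratio :", d * math.log(1.0 / r_inner))
print("average count  :", mean)
print("count variance :", var)
```

The empirical mean and variance of the counts should both be close to the true log-ratio, as expected from a Poisson variate, which is what makes the variance of the resulting estimate known in advance.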