## the HMC algorithm meets the exchange algorithm

Posted in Mountains, pictures, Statistics, Travel, University life on July 26, 2017 by xi'an

Julien Stoehr (now in Dublin, soon to join us as a new faculty in Paris-Dauphine!), Alan Benson and Nial Friel (both at UCD) arXived last week a paper entitled Noisy HMC for doubly-intractable distributions. Which considers solutions for adapting Hamiltonian Monte Carlo to target densities that involve a missing constant. In the sense of our workshop last year in Warwick. And in the theme pursued by Nial in the past years. The notion is thus to tackle a density π(θ) ∝ exp(V(X|θ))/Z(θ) when Z(θ) is intractable. In that case the gradient of log Z(θ) can be written as the expectation of the gradient of V(X|θ) under the sampling distribution [as a standard exponential family identity] and hence estimated by simulation. And the ratio of the Z(θ)’s appearing in the Metropolis ratio can be estimated by Iain Murray’s exchange algorithm, based on simulations from the sampling distribution attached to the parameter in the denominator.
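A sketch of the two identities behind this paragraph, in a notation that is an assumption and not necessarily the paper's: f(·|θ) = exp(V(·|θ))/Z(θ) denotes the sampling distribution, Y an auxiliary draw from it, and differentiation under the integral sign is assumed to be allowed.

$$
\begin{aligned}
Z(\theta) &= \int \exp\{V(y\mid\theta)\}\,\mathrm{d}y,\\
\nabla_\theta \log Z(\theta) &= \frac{1}{Z(\theta)}\int \nabla_\theta V(y\mid\theta)\,\exp\{V(y\mid\theta)\}\,\mathrm{d}y
 = \mathbb{E}_{Y\sim f(\cdot\mid\theta)}\big[\nabla_\theta V(Y\mid\theta)\big],\\
\frac{Z(\theta)}{Z(\theta')} &= \int \frac{\exp\{V(y\mid\theta)\}}{\exp\{V(y\mid\theta')\}}\,\frac{\exp\{V(y\mid\theta')\}}{Z(\theta')}\,\mathrm{d}y
 = \mathbb{E}_{Y\sim f(\cdot\mid\theta')}\big[\exp\{V(Y\mid\theta)-V(Y\mid\theta')\}\big].
\end{aligned}
$$

Both expectations can then be replaced with averages over simulations from f(·|θ) and from f(·|θ’), respectively, the second one being the importance sampling reading of the exchange ratio.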

The resulting algorithm proposed by the authors thus uses N simulations of auxiliary variables at each step of the leapfrog part, towards an approximation of the gradient term, plus another N simulations for approximating the ratio of the normalising constants Z(θ)/Z(θ’). While justified from an importance sampling perspective, this approximation is quite poor when θ and θ’ are far apart. A better solution [as shown in the paper] is to take advantage of all leapfrog steps and of the associated auxiliary simulations to build a telescopic product of ratios where the parameter values θ and θ’ are much closer. The main difficulty is in drawing a comparison with the exchange algorithm, since the noisy HMC version is computationally more demanding. (A secondary difficulty is in having an approximate algorithm that no longer leaves the target density stationary.)
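The telescopic product can be sketched as follows, again in a notation that is an assumption rather than the paper's: θ_0 = θ, θ_1, …, θ_L = θ’ are the successive leapfrog positions and the y_n^{(ℓ+1)} are N auxiliary draws from the sampling distribution f(·|θ_{ℓ+1}).

$$
\frac{Z(\theta)}{Z(\theta')} = \prod_{\ell=0}^{L-1}\frac{Z(\theta_\ell)}{Z(\theta_{\ell+1})},
\qquad
\frac{Z(\theta_\ell)}{Z(\theta_{\ell+1})} \approx \frac{1}{N}\sum_{n=1}^{N}\exp\big\{V(y_n^{(\ell+1)}\mid\theta_\ell)-V(y_n^{(\ell+1)}\mid\theta_{\ell+1})\big\},
\quad y_n^{(\ell+1)}\sim f(\cdot\mid\theta_{\ell+1}).
$$

Each factor only involves two consecutive leapfrog positions, hence an importance sampling estimate with much closer parameter values than the single ratio between θ and θ’.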

## Jeffreys priors for mixtures [or not]

Posted in Books, Statistics, University life on July 25, 2017 by xi'an

Clara Grazian and I have just arXived [and submitted] a paper on the properties of Jeffreys priors for mixtures of distributions. (An earlier version had not been deemed of sufficient interest by Bayesian Analysis.) In this paper, we consider the formal Jeffreys prior for a mixture of Gaussian distributions and examine whether or not it leads to a proper posterior with a sufficient number of observations. In general, it does not and hence cannot be used as a reference prior. While this is a negative result (and this is why Bayesian Analysis did not deem it of sufficient importance), I find it definitely relevant because it shows that the default reference prior [in the sense that the Jeffreys prior is the primary choice in unidimensional settings] does not operate in this wide class of distributions. What is surprising is that the use of a Jeffreys-like prior on a global location-scale parameter (as in our 1996 paper with Kerrie Mengersen or our recent work with Kaniav Kamary and Kate Lee) remains legit if proper priors are used on all the other parameters. (This may be yet another illustration of the tequila-like toxicity of mixtures!)

Francisco Rubio and Mark Steel already exhibited this difficulty of the Jeffreys prior for mixtures of densities with disjoint supports [which reveals the mixture latent variable and hence turns the problem into something different]. Which relates to another point of interest in the paper, derived from a 1988 [València Conference!] paper by José Bernardo and Javier Girón, where they show the posterior associated with a Jeffreys prior on a mixture is proper when (a) only estimating the weights p and (b) using densities with disjoint supports. José and Javier use in this paper an astounding argument that I had not seen before and which took me a while to ingest and accept. Namely, the Jeffreys prior on an observed model with latent variables is bounded from above by the Jeffreys prior on the corresponding completed model. Hence if the latter leads to a proper posterior for the observed data, so does the former. Very smooth, indeed!!!
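One way to read this bound, as a sketch under the usual regularity conditions and not necessarily the argument used by Bernardo and Girón, goes through the missing information decomposition of the Fisher information. The notation is an assumption: x is the observed variable, z the latent variable, and I^{obs}, I^{c} the Fisher information matrices of the observed and completed models.

$$
\begin{aligned}
I^{c}(\theta) &= I^{obs}(\theta) + \mathbb{E}_{\theta}\big[I^{z\mid x}(\theta)\big] \succeq I^{obs}(\theta),\\
\pi^{obs}_J(\theta) = \big|I^{obs}(\theta)\big|^{1/2} &\le \big|I^{c}(\theta)\big|^{1/2} = \pi^{c}_J(\theta),\\
\int \pi^{obs}_J(\theta)\,f(x\mid\theta)\,\mathrm{d}\theta &\le \int \pi^{c}_J(\theta)\,f(x\mid\theta)\,\mathrm{d}\theta .
\end{aligned}
$$

The second line follows from the monotonicity of the determinant over positive semi-definite matrices, and the third line shows that a finite right-hand side implies a proper posterior under the Jeffreys prior of the observed model.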

Actually, we still support the use of the Jeffreys prior but only for the mixture weights, because it has the property supported by Judith and Kerrie of a conservative prior about the number of components. Obviously, we cannot advocate its use over all the parameters of the mixture since it then leads to an improper posterior.

## forward event-chain Monte Carlo

Posted in Statistics, Travel, University life on July 24, 2017 by xi'an

One of the authors of this paper contacted me to point out their results arXived last February [and revised last month] as being related to our bouncy particle paper arXived two weeks ago. And to an earlier paper by Michel et al. (2014) published in the Journal of Chemical Physics. (The authors actually happen to work quite nearby, on a suburban road I take every time I bike to Dauphine!) I think one reason we missed this paper in our literature survey is the use of a vocabulary taken from Physics rather than from our Monte Carlo community, as in, e.g., using “event chain” instead of “bouncy particle”… The paper indeed contains schemes similar to ours, as did the on-going work by Chris Sherlock and co-authors that Chris presented last week at the Isaac Newton Institute workshop on scalability. (Although I had trouble reading its physics style, in particular the justification for stationarity or “global balance” and the use of “infinitesimals”.)

“…we would like to find the optimal set of directions {e} necessary for the ergodicity and allowing for an efficient exploration of the target distribution.”

The improvement sought is in the choice of the chain direction at each direction change. In order to avoid the random walk behaviour. The proposal is to favour directions close to the gradient of the log-likelihood, keeping the component orthogonal to this gradient constant in direction (as in our paper) if not in scale. (As indicated above I have trouble understanding the ergodicity proof, if not the irreducibility. I also do not see how solving (11), which should be (12), is feasible in general. And why (29) amounts to simulating from (27)…)

## and the travelling salesman is…

Posted in Books, pictures, Statistics, University life on July 21, 2017 by xi'an

Here is another attempt at using StippleGen on… Alan Turing’s picture. My reason for attempting a travelling salesman rendering of this well-known picture is to create a logo for PCI Comput Stats, the peer community project I am working on this summer. With the help of the originators of PCI Evol Biol.

## Midsummer dinner at Emmanuel College

Posted in Kids, pictures, Travel, University life, Wines on July 20, 2017 by xi'an

It just so happened that I was in Cambridge for the Midsummer dinner last Saturday at Emmanuel College and that a good friend, who happens to be a Fellow of that College, invited me to the dinner. Making it the second dinner in a Cambridge college in a week, after the workshop dinner at Trinity. Except the one at Emmanuel was a much more formal affair, with dress requirement (!) and elaborate dishes. The wines were also exceptional, with a remarkable 2002 Chassagne-Montrachet. While the dining room (or whatever it is called) is beautiful, it is also rather noisy and I could not engage in conversation with anyone but my immediate neighbours, but still managed to have a fairly interesting exchange with a biologist studying skuas on the Faroe Islands. The end of the meal was announced by a loud clap and Graces in Latin, followed by cheese and port (and a fabulous Sauternes!, not in the wine list) in an equally beautiful room, where it was easier to talk with my neighbours. All in all, a unique evening and opportunity for a glimpse into College traditions! [And a first wine post for the 20th of the month!!]

## RNG impact on MCMC [or lack thereof]

Posted in Books, R, Statistics, Travel, University life on July 13, 2017 by xi'an

Following the talk at MCM 2017 about the strange impact of the random generator on the outcome of an MCMC generator, I tried in Montréal airport the following code on the banana target of Haario et al. (1999), copied from Soetaert and Laine and using the modMCMC function of the FME package:

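```
library(FME)
# banana-shaped transform of the second coordinate
Banana <- function (x1, x2) {
  return(x2 - (x1^2+1)) }
# multivariate Normal density
pmultinorm <- function(vec, mean, Cov) {
  diff <- vec - mean
  ex <- -0.5*t(diff) %*% solve(Cov) %*% diff
  rdet <- sqrt(det(Cov))
  power <- -length(diff)*0.5
  return((2.*pi)^power / rdet * exp(ex)) }
# -2 log-density of the banana target, the form modMCMC expects
BananaSS <- function (p) {
  P <- c(p[1], Banana(p[1], p[2]))
  Cov <- matrix(nr = 2, data = c(1, 0.9, 0.9, 1))
  # end of the function missing from the post, assumed to match the FME vignette
  -2*sum(log(pmultinorm(P, mean = 0, Cov = Cov))) }
N=1e3
ejd=matrix(0,4,N) # 4 rows, presumably one per RNG kind compared
RNGkind("Mars") # partial match for "Marsaglia-Multicarry"
for (t in 1:N){
  MCMC <- modMCMC(f = BananaSS, p = c(0, 0.7),
                  jump = diag(nrow = 2, x = 5), niter = 1e3)
  # mean squared distance of the second coordinate from its first value
  ejd[1,t]=mean((MCMC$pars[-1,2]-MCMC$pars[1,2])^2)}
```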

since this divergence from the initial condition seemed to reflect the experiment of the speaker at MCM 2017. Unsurprisingly, no difference came from using the different RNGs in R (which may fail to contain those incriminated by the study)…
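For readers without FME at hand, here is a minimal base-R sketch of the same kind of comparison, looping over several of the generators available through RNGkind. It is not the experiment above: `log_banana` and `rw_metropolis` are hypothetical helpers, and a plain random-walk Metropolis sampler is swapped in for modMCMC, with much shorter runs.

```r
# Log-density of the banana target, up to an additive constant:
# (x1, x2 - (x1^2 + 1)) is bivariate Normal with unit variances, correlation 0.9.
prec <- solve(matrix(c(1, 0.9, 0.9, 1), nrow = 2))
log_banana <- function(p) {
  z <- c(p[1], p[2] - (p[1]^2 + 1))
  -0.5 * sum(z * (prec %*% z))
}

# Plain random-walk Metropolis (hypothetical helper, NOT FME's modMCMC).
rw_metropolis <- function(niter, init = c(0, 0.7), scale = 1) {
  pars <- matrix(NA_real_, niter, 2)
  cur <- init
  cur_lp <- log_banana(cur)
  for (i in seq_len(niter)) {
    prop <- cur + scale * rnorm(2)
    prop_lp <- log_banana(prop)
    # accept with probability min(1, target(prop) / target(cur))
    if (log(runif(1)) < prop_lp - cur_lp) {
      cur <- prop
      cur_lp <- prop_lp
    }
    pars[i, ] <- cur
  }
  pars
}

# Four of the uniform generators offered by base R.
kinds <- c("Mersenne-Twister", "Wichmann-Hill",
           "Marsaglia-Multicarry", "Knuth-TAOCP-2002")
old_kind <- RNGkind()      # remember the current generator
N <- 50                    # replications per generator (kept small here)
ejd <- matrix(0, length(kinds), N, dimnames = list(kinds, NULL))
for (k in seq_along(kinds)) {
  RNGkind(kinds[k])
  set.seed(42)
  for (t in 1:N) {
    pars <- rw_metropolis(200)
    # mean squared distance of the second coordinate from its first value
    ejd[k, t] <- mean((pars[-1, 2] - pars[1, 2])^2)
  }
}
RNGkind(old_kind[1])       # restore the original uniform generator
print(rowMeans(ejd))
```

Comparing the rows of `ejd` (by boxplots or a Kruskal-Wallis test, say) is then a matter of taste, with the caveat that N and the chain length are far too small here for any serious conclusion.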

## errors, blunders, and lies [book review]

Posted in Books, Kids, Statistics, University life on July 9, 2017 by xi'an

This new book by David Salsburg is the first one in the ASA-CRC Series on Statistical Reasoning in Science and Society. Which explains why I heard about it both from CRC Press [as suggested material for a review in CHANCE] and from the ASA [as mass emailing]. The name of the author did not ring a bell until I saw the line about his earlier The Lady Tasting Tea book, a best-seller in the category of “soft [meaning math- and formula-free] introduction to Statistics through picturesque characters”. Which I did not read either [but Bob Carpenter did].

The current book is of the same flavour, albeit with some maths formulas [each preceded by a lengthy apology for using maths and symbols]. The topic is the one advertised in the title, covering statistical errors and the way to take advantage of them, model mis-specification and robustness, and the detection of biases and data massaging. I read the short book in one quick go, waiting for the results of the French Legislative elections, and found no particular appeal in the litany of examples, historical entries, pitfalls, and models I feel I have already read so many times in the story-telling approach to statistics. (Naked Statistics comes to mind.)

It is not that there is anything terrible with the book, which is partly based on the author’s own experience in a pharmaceutical company, but it does not seem to bring out any novelty for engaging into the study of statistics or for handling data in a more rational fashion. And I do not see which portion of the readership is targeted by the book, which is too allusive for academics and too academic for a general audience, which is not necessarily fascinated by the finer details of the history (and stories) of the field. As in The Lady Tasting Tea, the chapters constitute a collection of vignettes, rather than a coherent discourse leading to a theory or defending an overall argument. Some chapters are rather poor, like the initial chapter explaining the distinction between lies, blunders, and errors through the story of the measure of the distance from Earth to Sun by observing the transit of Venus, not that the story is uninteresting, far from it!, but I find it lacking in connecting with statistics [e.g., the meaning of a “correct” observation is never explained]. Or the chapter on the Princeton robustness study, where little is explained about the nature of the wrong distributions, which end up as specific contaminations impacting mostly the variance. And some examples are hardly convincing, like those on text analysis (Chapters 13, 14, 15), where there is little backup for using Benford’s law on such short datasets. Big data is understood only under the focus of large p, small n, which is small data in my opinion! (Not to mention a minor crime de lèse-majesté in calling Pierre-Simon Laplace Simon-Pierre Laplace! I would also have left the Marquis de aside as this title came to him during the Bourbon Restoration, despite him having served Napoléon for his entire reign.) And, as mentioned above, the book contains apologetic mathematics, which never ceases to annoy me since apologies are not needed. While the maths formulas are needed.