Archive for Montréal

RNG impact on MCMC [or lack thereof]

Posted in Books, R, Statistics, Travel, University life on July 13, 2017 by xi'an

Following the talk at MCM 2017 about the strange impact of the random generator on the outcome of an MCMC generator, I tried in Montréal airport the following code on the banana target of Haario et al. (1999), copied from Soetaert and Laine and using the MCMC function of the FME package:

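library(FME)
# banana-shaped transform of the second coordinate
Banana <- function (x1, x2) {
 return(x2 - (x1^2+1)) }
# multivariate Normal density
pmultinorm <- function(vec, mean, Cov) {
 diff <- vec - mean
 ex <- -0.5*t(diff) %*% solve(Cov) %*% diff
 rdet <- sqrt(det(Cov))
 power <- -length(diff)*0.5
 return((2.*pi)^power / rdet * exp(ex)) }
# target passed to modMCMC
BananaSS <- function (p) {
 P <- c(p[1], Banana(p[1], p[2]))
 Cov <- matrix(nr = 2, data = c(1, 0.9, 0.9, 1))
 # -2 log-density, as in the FME vignette this code is copied from
 -2*sum(log(pmultinorm(P, mean = 0, Cov = Cov))) }
N=1e3
ejd=matrix(0,4,N) # four rows, presumably one per RNG tried
RNGkind("Mars") # partial match for "Marsaglia-Multicarry"
for (t in 1:N){
  MCMC <- modMCMC(f = BananaSS, p = c(0, 0.7), 
  jump = diag(nrow = 2, x = 5), niter = 1e3)
  # mean squared distance from the starting value, second component
  ejd[1,t]=mean((MCMC$pars[-1,2]-MCMC$pars[1,2])^2)}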

since this divergence from the initial condition seemed to reflect the experiment of the speaker at MCM 2017. Unsurprisingly, no difference came from using the different RNGs in R (which may fail to contain those incriminated by the study)…
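For readers without FME at hand, the same kind of check can be sketched in base R. This is a minimal sketch under assumptions of mine, not the experiment above: the target is a standard Normal rather than the banana, the sampler is a plain random-walk Metropolis, and the helper names rwm_msd and compare_rngs are hypothetical. Only RNGkind() and the generator names are standard R.

```r
# Random-walk Metropolis on a N(0,1) target (an assumed stand-in target),
# returning the mean squared distance between later states and the start
rwm_msd <- function(niter = 1e3, scale = 2.4, start = 0) {
  x <- numeric(niter)
  x[1] <- start
  for (i in 2:niter) {
    prop <- x[i - 1] + scale * rnorm(1)
    logratio <- dnorm(prop, log = TRUE) - dnorm(x[i - 1], log = TRUE)
    x[i] <- if (log(runif(1)) < logratio) prop else x[i - 1]
  }
  mean((x[-1] - x[1])^2)
}

# Average that distance over nrep chains, once per uniform generator
compare_rngs <- function(kinds, nrep = 50, seed = 1) {
  old <- RNGkind()                          # remember the current settings
  on.exit(do.call(RNGkind, as.list(old)))   # and restore them on exit
  sapply(kinds, function(k) {
    suppressWarnings(RNGkind(k))            # some kinds may raise a warning
    set.seed(seed)
    mean(replicate(nrep, rwm_msd()))
  })
}

kinds <- c("Mersenne-Twister", "Wichmann-Hill",
           "Marsaglia-Multicarry", "L'Ecuyer-CMRG")
res <- compare_rngs(kinds)
print(round(res, 3))
```

Since the chain starts at the mode of a N(0,1) target, each average should sit close to one whichever generator is used, up to Monte Carlo noise.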

MCM 2017 snapshots [#2]

Posted in Books, pictures, Running, Statistics, University life on July 7, 2017 by xi'an

On the second day of MCM 2017, Emmanuel Gobet (from Polytechnique) gave the morning plenary talk on regression Monte Carlo methods, where he presented several ways of estimating conditional means of rv’s in nested problems where conditioning involves other conditional expectations. While interested in such problems in connection with ABC, I could not see how the techniques developed therein could apply to said problems.

By some random chance, I ended up attending a hard-core random generation session where the speakers were discussing discrepancies between GNU library generators [I could not understand the target of interest and using MCMC till convergence seemed prone to false positives!], and failed statistical tests of some 64-bit Mersenne Twisters, and low discrepancy on-line subsamples of Uniform samples. Most exciting of all, Josef Leydold gave a talk on ratio-of-uniforms, on which I spent some time a while ago (till ending up reinventing the wheel!), with highly refined cuts of the original box.
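As a reminder of what the original box looks like, here is a minimal sketch of the basic ratio-of-uniforms method for a standard Normal target, without any of the refined cuts discussed in the talk. The function name rou_normal is hypothetical and the target is my choice for illustration.

```r
# Ratio-of-uniforms for the N(0,1) target with f(x) = exp(-x^2 / 2):
# draw (u, v) uniformly over the bounding box of the region
#   { (u, v) : 0 < u <= sqrt(f(v / u)) }
# and return v / u whenever the pair falls inside that region
rou_normal <- function(n) {
  out <- numeric(n)
  vmax <- sqrt(2 / exp(1))      # sup of |x| sqrt(f(x)), reached at x = sqrt(2)
  i <- 0
  while (i < n) {
    u <- runif(1)               # sup of sqrt(f) is 1, hence u in (0, 1)
    v <- runif(1, -vmax, vmax)
    # u^2 <= f(v / u) rewritten as v^2 <= -4 u^2 log(u)
    if (u > 0 && v^2 <= -4 * u^2 * log(u)) {
      i <- i + 1
      out[i] <- v / u
    }
  }
  out
}

set.seed(42)
z <- rou_normal(1e4)
```

The acceptance region covers a bit less than three quarters of this box, which is the waste that sharper cuts aim at reducing.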

My own 180-odd slides [for a 50mn talk] somewhat worried my chairman, Art Owen, who kindly enquired the day before about the likelihood I could go through all 184 of them!!! I had appended the ABC convergence slides to an earlier set of slides on ABC with random forests in case of questions about that aspect, although I did not plan to go through those slides [and I mostly covered the 64 other slides]. As the talk was ultimately more about an inference method than a genuine Monte Carlo technique, and involved random forests that sounded unfamiliar to many, I did not get many questions from the audience but had several deep discussions with people after the talk. Incidentally, we have just reposted our paper on ABC estimation via random forests, updated the abcrf R package, and submitted it to Peer Community in Evolutionary Biology!

MCM17 snapshots

Posted in Kids, Mountains, pictures, Running, Statistics, Travel, University life on July 5, 2017 by xi'an

At MCM2017 today, Radu Craiu presented a talk on adaptive Metropolis-within-Gibbs, using a family of proposals for each component of the target and weighting them by jumping distance, while managing the adaptation from the selection rate rather than from the acceptance rate as we did in population Monte Carlo. I find the approach quite interesting in that adaptation and calibration of Metropolis-within-Gibbs is quite challenging due to the conditioning, i.e., the optimal scale for one component depends on the other components. Some of the graphs produced by Radu during the talk showed a form of local adaptivity that seemed promising. This raised a question I could not ask for lack of time, namely that, with a large enough collection of proposals, it is unclear why this approach provides a gain compared with particle, sequential or population Monte Carlo algorithms. Indeed, when there are many parallel proposals, clouds of particles can be generated from all proposals in proportion to their appeal and merged together in an importance manner, leading to an easier adaptation. As it went, the notion of local scaling was also reflected in Mylène Bédard’s talk on another Metropolis-within-Gibbs study of optimal rates. The other interesting sessions I attended were the ones on importance sampling with stochastic gradient optimisation, organised by Ingmar Schuster, and on sequential Monte Carlo, with a divide-and-conquer resolution through trees by Lindsten et al. that I had missed.

Montréal street art [jatp]

Posted in Statistics on July 4, 2017 by xi'an

MCM 2017

Posted in Statistics on July 3, 2017 by xi'an

And thus I am back in Montréal, for MCM 2017, located in HEC Montréal, on the campus of Université de Montréal, for three days. My talk is predictably about ABC, what else?!, gathering diverse threads from different talks and papers.

Hamiltonian MC on discrete spaces

Posted in Statistics, Travel, University life on July 3, 2017 by xi'an

Following a lively discussion with Akihiko Nishimura during a BNP11 poster session last Tuesday, I took the opportunity of the flight to Montréal to read through the arXived paper (written jointly with David Dunson and Jianfeng Lu). The issue is thus one of handling discrete valued parameters in Hamiltonian Monte Carlo. The basic “trick” in handling this complexity goes by turning the discrete support into a continuous one via the inclusion of an auxiliary continuous variable whose discretisation is the discrete parameter, hence resembling to some extent the slice sampler. This removes the discreteness blockage but creates another difficulty, namely handling a discontinuous target density. (I idly wonder why the trick cannot be iterated to second or higher order so as to achieve the right amount of smoothness. Of course, the maths behind would be less cool!) The extension of the Hamiltonian to this setting by a convolution is a trick I had not seen since the derivation of the Central Limit Theorem during Neveu’s course at Polytechnique. What I find most exciting in the resolution is the move from a Gaussian momentum to a Laplace momentum, for the reason that I always wondered at alternatives [without trying anything myself!]. The Laplace version is indeed most appropriate here in that it avoids a computation of all discontinuity points and associated values along a trajectory. Since the moves are done component-wise, the method has a Metropolis-within-Gibbs flavour, which actually happens to be a special case. What is also striking is that the approach is both rejection-free and exact, provided ergodicity occurs, which is the case when the stepsize is random.
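To make the role of the Laplace momentum concrete, here is a minimal one-dimensional sketch in the spirit of the coordinate-wise update, as I understand it and not as a transcription of the paper. The assumptions are mine: a single discrete parameter N on 1,...,K embedded as x in (0,K] with N = ceiling(x), unit mass so that the kinetic energy is |p|, a stepsize drawn uniformly on (0.5,1.5), and hypothetical names make_U, laplace_step and dhmc_discrete.

```r
# Potential of the embedded target: x in (0, K] carries the mass of N = ceiling(x)
make_U <- function(pmf) {
  K <- length(pmf)
  function(x) if (x > 0 && x <= K) -log(pmf[ceiling(x)]) else Inf
}

# One update with Laplace momentum: the velocity is sign(p) whatever |p|,
# so only the potential at the two end points of the move is needed
laplace_step <- function(x, p, eps, U) {
  x_new <- x + eps * sign(p)
  dU <- U(x_new) - U(x)
  if (abs(p) > dU) {
    list(x = x_new, p = p - sign(p) * dU)   # enough momentum: move and pay dU
  } else {
    list(x = x, p = -p)                     # not enough momentum: bounce back
  }
}

dhmc_discrete <- function(pmf, niter = 1e4, nstep = 5, x0 = 0.5) {
  U <- make_U(pmf)
  x <- x0
  out <- integer(niter)
  for (t in seq_len(niter)) {
    p <- rexp(1) * sample(c(-1, 1), 1)      # fresh Laplace momentum
    eps <- runif(1, 0.5, 1.5)               # random stepsize (assumed range)
    for (s in seq_len(nstep)) {
      st <- laplace_step(x, p, eps, U)
      x <- st$x
      p <- st$p
    }
    out[t] <- as.integer(ceiling(x))        # no accept-reject step at the end
  }
  out
}

pmf <- c(0.1, 0.2, 0.3, 0.4)
set.seed(1)
draws <- dhmc_discrete(pmf, niter = 2e4)
print(round(table(draws) / length(draws), 3))
```

Each update leaves U(x) + |p| unchanged, which is why no Metropolis correction appears at the end of the trajectory, and the bounce on an infinite potential keeps x inside (0,K].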

In addition to this resolution of the discrete parameter problem, the paper presents the further appeal of (re-)running an analysis of the Jolly-Seber capture-recapture model, where the discrete parameter is the latent number of live animals [or whatever] in the system at any observed time. (Which we cover in Bayesian Essentials with R as a neat entry to both dynamic and latent variable models.) I would have liked to see a comparison with the completion approach of Jérôme Dupuis (1995, Biometrika), since I figure the Metropolis version implemented here differs from Jérôme’s. The second example is built on the surrogate likelihood of Bissiri et al. (2016) (discussed earlier here) and on the catalogue of solutions of Chopin and Ridgway (2017) for not analysing the Pima Indian dataset. (Replaced by another dataset here.)

complexity of the von Neumann algorithm

Posted in Statistics on April 3, 2017 by xi'an

“Without the possibility of computing infimum and supremum of the density f over compact subintervals of the domain of f, sampling absolutely continuous distribution using the rejection method seems to be impossible in total generality.”

The von Neumann algorithm is another name for the rejection method introduced by von Neumann circa 1951. It was thus most exciting to spot a paper by Luc Devroye and Claude Gravel appearing in the latest Statistics and Computing, assessing the method in terms of random bits and precision. Specifically, the authors assume that the only available random generator is one of random bits, which necessarily leads to an approximation when the target is a continuous density. They first propose a bisection algorithm for distributions defined on a compact interval, which compares random bits with recursive bisections of the unit interval and stops when the interval is small enough.

In higher dimension, for densities f over the unit hypercube, they recall that the original algorithm consisted in simulating uniforms x and u over the hypercube and [0,1], using the uniform as the proposal distribution and comparing the density at x, f(x), with the rescaled uniform. When using only random bits, the proposed method is based on a quadtree that subdivides the unit hypercube into smaller and smaller hypercubes until the selected hypercube is entirely above or below the density and is small enough for the desired precision. This obviously requires the computation of the upper and lower bounds of the density over the hypercubes to be feasible, with Devroye and Gravel considering that this is a necessary property as shown by the above quote. Densities with non-compact support can be re-expressed as densities on the unit hypercube thanks to the cdf transform. (Actually, this is equivalent to the general accept-reject algorithm, based on the associated proposal.)
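The subdivision idea can be sketched for a density on [0,1], where the quadtree lives on the unit square of (x,u) pairs. This is a toy sketch under assumptions of mine rather than the algorithm of the paper: the oracles g_inf and g_sup return the infimum and supremum of the rescaled density g = f/M over an interval, the names rbit and bit_rejection are hypothetical, and the random bits come from R's own generator.

```r
rbit <- function() sample.int(2, 1) - 1L      # one fair random bit

# Halve the current cell of the unit square with two random bits per level,
# until the cell is entirely above the graph of g (reject and start afresh)
# or entirely below it and narrow enough in x (return its x mid-point)
bit_rejection <- function(g_inf, g_sup, precision = 2^-20, max_depth = 60) {
  repeat {
    x_lo <- 0; x_hi <- 1; u_lo <- 0; u_hi <- 1
    below <- FALSE
    for (depth in seq_len(max_depth)) {
      mid <- (x_lo + x_hi) / 2
      if (rbit() == 0L) x_hi <- mid else x_lo <- mid
      if (!below) {                           # u only matters while undecided
        mid <- (u_lo + u_hi) / 2
        if (rbit() == 0L) u_hi <- mid else u_lo <- mid
        if (u_lo >= g_sup(x_lo, x_hi)) break  # cell above the graph: reject
        below <- u_hi <= g_inf(x_lo, x_hi)    # cell under the graph: keep
      }
      if (below && x_hi - x_lo <= precision) return((x_lo + x_hi) / 2)
    }
    # reaching this point means a rejection, or an undecided cell at
    # max_depth, which this sketch simply treats as a rejection
  }
}

# Example: f(x) = 2x on [0,1] with M = 2, so that g(x) = x is increasing
# and its infimum and supremum over [a,b] are a and b
set.seed(3)
x <- replicate(2000, bit_rejection(function(a, b) a, function(a, b) b))
print(mean(x))                                # theoretical mean is 2/3
```

The two oracles are the only place where the density enters, which is where the requirement of computable bounds over hypercubes shows up in practice.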

“With the oracles introduced in our modification of von Neumann’s method, we believe that it is impossible to design a rejection algorithm for densities that are not Riemann-integrable, so the question of the design of a universally valid rejection algorithm under the random bit model remains open.”

In conclusion, I enjoyed very much reading this paper, especially the reflection it proposes on the connection between Riemann integrability and rejection algorithms. (Actually, I cannot think straight away of a simulation algorithm that would handle non-Riemann-integrable densities, apart from nested sampling. Or of significant non-Riemann-integrable densities.)